Python3 BigQuery or Google Cloud Python through HTTP Proxy

How can I route BigQuery client calls through an HTTP proxy?

Before posting this, I tried the following, but the traffic is still not routed through the HTTP proxy. The Google Cloud service credentials are set through the shell environment variable GOOGLE_APPLICATION_CREDENTIALS.

import httplib2
import socks
import google.auth
from google.cloud import bigquery

credentials, _ = google.auth.default()
# Ask httplib2 to send requests through the HTTP proxy at someproxy:80
http_client = httplib2.Http(proxy_info=httplib2.ProxyInfo(socks.PROXY_TYPE_HTTP, 'someproxy', 80))

bigquery_client = bigquery.Client(credentials=credentials, _http=http_client)

Outgoing traffic (172.217.x.x belongs to googleapis.com) is not routed through the HTTP proxy:

$ netstat -nputw
Local Address           Foreign Address
x.x.x.x                 172.217.6.234:443       SYN_SENT

Answer

Answering the question myself as I found the reason/solution.

Reason:

The google-cloud-python library uses httplib2. As of this writing, httplib2 has two code bases, one for Python 2 and one for Python 3, and the Python 3 version does not implement SOCKS/proxy support. Please refer to httplib2’s repo#init_py.

Workaround:

There is a discussion about moving google-cloud-python from httplib2 to urllib3, but in the meantime one can use httplib2shim:

import os

import google.auth
import google.auth.credentials
import google_auth_httplib2
import httplib2shim
from google.cloud import bigquery

# A more declarative way exists, but this is kept for simplicity
os.environ["HTTP_PROXY"] = "someproxy:80"
os.environ["HTTPS_PROXY"] = "someproxy:80"
http_client = httplib2shim.Http()
credentials, _ = google.auth.default()

# IMO, the following 2 lines should be done inside google-cloud-python.
# This exposes client-specific logic, and it already does that.
credentials = google.auth.credentials.with_scopes_if_required(
    credentials, bigquery.Client.SCOPE)
authed_http = google_auth_httplib2.AuthorizedHttp(credentials, http_client)

bigquery_client = bigquery.Client(credentials=credentials, _http=authed_http)
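
Since the workaround relies on the HTTP_PROXY and HTTPS_PROXY environment variables, it can help to confirm that the Python process actually sees them before building the client. The snippet below is only a rough sketch using the standard library: the helper name set_proxy_env and the URL form "http://someproxy:80" (with an explicit scheme) are my own assumptions and not part of the original workaround. It does not prove that httplib2shim routes through the proxy, so netstat is still the real check.

```python
import os
import urllib.request


def set_proxy_env(proxy_url):
    """Export proxy_url for HTTP and HTTPS and report what Python detects.

    Hypothetical helper for illustration only; not part of any Google library.
    """
    os.environ["HTTP_PROXY"] = proxy_url
    os.environ["HTTPS_PROXY"] = proxy_url
    # getproxies() reads the *_proxy environment variables; lowercase
    # variants (http_proxy, https_proxy) take precedence if also set.
    detected = urllib.request.getproxies()
    return {scheme: detected.get(scheme) for scheme in ("http", "https")}


# "someproxy" is the placeholder host from the question; the scheme is assumed.
print(set_proxy_env("http://someproxy:80"))
```

If the printed values differ from what was set, a lowercase http_proxy or https_proxy variable is probably already defined in the shell and overriding them.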