
localhost refused to connect in a databricks notebook calling the google api

I read the Google API documentation pages (Drive API, PyDrive) and created a Databricks notebook to connect to Google Drive. I used the sample code from the documentation page, as follows:

from __future__ import print_function
import pickle
import os.path
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request

# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/drive.metadata.readonly']

def main():
    """Shows basic usage of the Drive v3 API.
    Prints the names and ids of the first 10 files the user has access to.
    """
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            # CRED_PATH points to the OAuth client-secrets file; it is
            # defined elsewhere in the notebook (see below).
            flow = InstalledAppFlow.from_client_secrets_file(
                CRED_PATH, SCOPES)
            creds = flow.run_local_server()
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)

    service = build('drive', 'v3', credentials=creds)

    # Call the Drive v3 API
    results = service.files().list(
        pageSize=10, fields="nextPageToken, files(id, name)").execute()
    items = results.get('files', [])

    if not items:
        print('No files found.')
    else:
        print('Files:')
        for item in items:
            print(u'{0} ({1})'.format(item['name'], item['id']))

if __name__ == '__main__':
    main()

The CRED_PATH variable holds the credential file path under /dbfs/FileStore/shared_uploads. The script prompts me with a URL to authorize the application, but immediately after I allow access it redirects to a page that says “This site can’t be reached: localhost refused to connect.”
localhost is listening on the default port (8080).
I checked the redirect URIs of the registered app in Google API Services, and they include localhost.
I’m not sure what I should check or set to access the Google API from Databricks. Any thoughts are appreciated.


Answer

Although I’m not sure whether this is the best workaround for your situation, how about using a service account instead of the OAuth2 flow you are using? With a service account, the access token can be retrieved without opening a URL to obtain an authorization code, so the browser never needs to redirect back to localhost on the remote cluster, and the Drive API can still be used with googleapis for Python. I thought this might resolve your issue.

The method for using the service account with your script is as follows.

Usage:

1. Create service account.

For this, see the official Google Cloud documentation on creating service accounts.
When the service account is created, a credential file in JSON format can be downloaded. This file is used by the script.
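For reference, the downloaded key is a plain JSON file; its main fields look roughly like this (all values below are placeholders):

```json
{
  "type": "service_account",
  "project_id": "my-project",
  "private_key_id": "0123456789abcdef",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
  "client_email": "my-service-account@my-project.iam.gserviceaccount.com",
  "client_id": "123456789012345678901",
  "token_uri": "https://oauth2.googleapis.com/token"
}
```

The `client_email` value is the address of the service account itself.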

2. Sample script:

The sample script for using the service account with googleapis for python is as follows.

from oauth2client.service_account import ServiceAccountCredentials
from googleapiclient.discovery import build

credentialFileOfServiceAccount = '###.json' # Please set the file path of the credential file of the service account.
creds = ServiceAccountCredentials.from_json_keyfile_name(credentialFileOfServiceAccount, ['https://www.googleapis.com/auth/drive.metadata.readonly'])
service = build('drive', 'v3', credentials=creds)

results = service.files().list(pageSize=10, fields="nextPageToken, files(id, name)").execute()
items = results.get('files', [])

if not items:
    print('No files found.')
else:
    print('Files:')
    for item in items:
        print(u'{0} ({1})'.format(item['name'], item['id']))
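As a side note, oauth2client is deprecated; if you prefer the newer google-auth package, the same credentials can be built with google.oauth2.service_account. A minimal sketch (the helper name and the example key path are mine, not from the original answer):

```python
def build_drive_service(key_path,
                        scopes=('https://www.googleapis.com/auth/drive.metadata.readonly',)):
    """Build a Drive v3 client from a service-account JSON key file.

    key_path is a placeholder -- point it at your own downloaded key,
    e.g. '/dbfs/FileStore/shared_uploads/service-account.json'.
    """
    # google-auth replacement for the deprecated oauth2client import.
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    creds = service_account.Credentials.from_service_account_file(
        key_path, scopes=list(scopes))
    return build('drive', 'v3', credentials=creds)
```

The returned service object is then used exactly as in the script above, e.g. `service.files().list(...)`.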

Note:

  • The Google Drive of the service account is different from your Google Drive. So in this case, share a folder on your Google Drive with the email address of the service account (this email address can be seen in the credential file). You can then get and put files in that folder using the service account, and see and edit the files in the folder on your Google Drive using the browser.
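Since the key file is plain JSON, the service account's email address mentioned above can also be read out programmatically. A small helper (the function name is mine), assuming the standard `client_email` field of a service-account key:

```python
import json

def service_account_email(key_path):
    """Return the client_email field from a service-account JSON key file."""
    with open(key_path) as f:
        return json.load(f)['client_email']
```

Share your Drive folder with the address this returns.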


User contributions licensed under: CC BY-SA