Skip to content
Advertisement

How do I update column description in BigQuery table using python script?

I can use SchemaField(f"{field_name}", f"{field_type}", mode="NULLABLE", description=...) while making a new table.

But I want to update the description of the column of the already uploaded table.

Advertisement

Answer

Unfortunately, we don’t have such a mechanism available yet to update a column description of the table through the client library. As a workaround, you can try the following available options to update your table column level description:

Option 1: Using the following ALTER TABLE ALTER COLUMN SET OPTIONS data definition language (DDL) statement:

ALTER TABLE `projectID.datasetID.tableID`
ALTER COLUMN Name
SET OPTIONS (
 description="Country Name"
);

Refer to this doc for more information about the ALTER COLUMN SET OPTIONS statement.

Option 2: Using the bq command-line tool’s bq update command:

Step 1: Get the JSON schema by running the following bq show command:

bq show --format=prettyjson projectID:datasetID.tableID > table.json

Step 2: Then copy the schema from the table.json to schema.json file.

Note: Don’t copy the entire data from the ‘table.json’ file, copy only the schema, it will look something like below:

[
    {
        "description": "Country Name",
        "mode": "NULLABLE",
        "name": "Name",
        "type": "STRING"
    }
]

Step 3: In the ‘schema.json’ file, modify the description name as you like. Then, run the following bq update command to update a table column description.

bq update projectID:datasetID.tableID schema.json

Refer to this doc for more information about bq update command.

Option 3: Calling the tables.patch API method:

Refer to this doc for more information about tables.patch API method.

As per your requirement, I took the following Python code from this medium article and not from the Google Cloud official docs. So Google Cloud will not provide any support for this code.

Step 1: Add the schema in the ‘schema.py’ file and modify the column description name as per your requirement:

#Add field schema
TableObject = {
   "tableReference": {
     "projectId": "projectID",
     "datasetId": "datasetID",
     "tableId": "tableID",
   },
   "schema": {
     "fields": [
         {
           "description": "Country Name",
           "mode": "NULLABLE",
           "name": "Name",
           "type": "STRING"
         }
       ],
   },
}

Step 2: Run the following code to get the expected result:

Note: keep that schema.py and following code file in the same directory.

#!/usr/bin/env python
#https://developers.google.com/api-client-library/python/
from googleapiclient import discovery
from oauth2client.client import GoogleCredentials
from schema import TableObject
# [START Table Creator]
def PatchTable(bigquery):
   tables = bigquery.tables()
   tables.patch(                        
         projectId=TableObject['tableReference']['projectId'],
         datasetId=TableObject['tableReference']['datasetId'],
         tableId=TableObject['tableReference']['tableId'], 
         body=TableObject).execute()
   print ("Table Patched")
# [END]
def main():
   #To get credentials
   credentials = GoogleCredentials.get_application_default()
   # Construct the service object for interacting with the BigQuery API.
   bigquery = discovery.build('bigquery', 'v2', credentials=credentials)
   PatchTable(bigquery)
 
if __name__ == '__main__':
   main()
   print ("BigQuery Table Patch")
User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement