I am uploading a newline-delimited JSON file from GCS to BigQuery. There are some fields in the JSON file which contain dicts for values, and I have no problem getting those values into BigQuery, as the nested fields are broken down into separate columns. So it all works if the following example is a line from the JSON file:
{"dict_field": {"value1": 1, "value2": 2}}
However, if one line from the file has an empty dict as the value for field_dict, like this:
{"dict_field": {}}
I get the following error message:
Exception: BigQuery job failed. Final error was: {‘reason’: ‘invalid’, ‘message’: “Unsupported empty struct type for field ‘dict_field'”} […]
I looked through the BigQuery documentation and couldn’t find any stated restriction regarding empty dicts as values. Does anyone know if there is a workaround to this issue, or if I have to manually clean the data before importing it in BigQuery?
Advertisement
Answer
As pointed out by @rodvictor and @Fcojavmelo, loading empty dicts from JSON files isn’t currently possible although it’ not explicitly mentioned anywhere in the BigQuery documentation, only in this issue/feature request.
In conclusion, the data has to be manually cleaned, and any empty dicts removed to avoid errors.