Skip to content
Advertisement

BigQuery – Best way to DROP date-sharded tables

I have a few date-sharded tables I want to delete but they already have more than 100 shards for each table and can’t drop them manually.

I’ve tried using wildcards

DROP TABLE my_dataset.my_table_*;

but it does not seem to work.

I finally used the python API:

for table_id in tables:
    table_ref = client.dataset(dataset_id).table(table_id)
    client.delete_table(table_ref)

And it works, but I needed to create the tables array with the names of the tables I wanted to drop.

Is there a way to DROP all date-shards of a date-sharded table in BigQuery form the UI?

Or using an SQL command in the UI?

Or using the command line with a wildcard?

Thanks

Advertisement

Answer

And what about instead of creating the tables array (with the names of the tables) you use…

from google.cloud import bigquery
client = bigquery.Client()
dataset_ref = client.dataset('my_dataset')

tables = list(client.list_tables(dataset_ref))  # API request(s), now you have the list of tables in this dataset
queried_tables=[]
for table in tables:
    print(table.table_id)
    if table.table_id.startswith("your_favourite_prefix"): #will perform the action only if the table has the desired prefix
        queried_tables.append(table.table_id)

print(queried_tables) #the list of the desired tables names, now you can use your script to delete them all
User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement