I am using python to send a query to Athena and get table DDL. I am using start_query_execution
and get_query_execution
functions in the awswrangler package.
import boto3 import awswrangler as wr import time import pandas as pd boto3.setup_default_session(region_name="us-east-1") sql="show create table 'table-name'" query_exec_id = wr.athena.start_query_execution(sql=sql, database='database-name') time.sleep(20) res=wr.athena.get_query_execution(query_execution_id=query_exec_id)
The code above creates a dict object that stores query results in an s3 link. The link can be accessed by
res['ResultConfiguration']['OutputLocation']
. It’s a text link: s3://…..txt
Can someone help me figure how to access the output in the link. I tried using readlines() but it seemes to error out.
Here is what I did
import urllib3 target_url = res['ResultConfiguration']['OutputLocation'] f = urllib3.urlopen(target_url) for l in f.readlines(): print (l)
Or if someone can suggest an easier way to get table DDL in python.
Advertisement
Answer
Keep in mind that the returned link will time out after a short while… and make sure your credentials allow you to get the data from the URL specified. If you drop the error message here we can help you better. –
Oh… “It’s a text link: s3://…..txt” is not a standard URL. You cant read that with urllib3. You can use awswrangler to read the bucket. –
I think the form is wr.s3.read_fwf(…)