I have a dataframe (test_df) that looks like this:
dq_code dq_sql Results ID_24 select 'A' as B, 'B' as C, ... ID_42 select * from dual
I want to assign the result from executing the query in dq_sql to the column Results. I tried this:
test_df['Results'] = '' for row in test_df.itertuples(): test_df['Results'] = pd.read_sql(row.dq_sql, engine)
which results in:
dq_code dq_sql Results 0 ID_24 select 'A' as B, 'B' as C, ... X 1 ID_42 select * from dual NaN
Strange thing is, the ‘X’ result should be assigned to row 1, whereas the query in row 0 does not seem to run. When I try to run the query in row 0 without the for loop, like this:
query = ''' select 'A' as B, 'B' as C, 'C' as D, 'E/'||to_char(Key_1,'YYYYMMDD') as Key, sysdate as Date1, sysdate as Date2, 'The date' as Reason, 'The date was ' || to_char(Key_1,'YYYYMMDD') as Wrong_val, 'Test' as Case, Null as Value, Null as Info_1, Null as Info_2, Null as Info_3 from schema.table TAB where Key_1 between sysdate-7 AND sysdate ''' query_test = pd.read_sql_query(query, engine)
It does give me the results I need. But somehow it does not when I include this query in the for-loop.
Any help would be appreciated!
Advertisement
Answer
In the end, it worked out by (1) creating an empty dataframe which contains a column, (2) appending the results from the query to that column. Used a dictionary to do so, and removed the index twice in order to remove redundant brackets. Not necessarily using any form of itertuples, iterrows or else.
Code:
df = pd.DataFrame(columns=['Results']) for query in test_df['dq_sql']: df = df.append({'Results': pd.read_sql_query(str(query),conn).values[0][0]}, ignore_index=True)