I have a dataframe that contains a few rows. I want to access the rows one at a time and create another dataframe from specific columns of each row. After that I run some other logic, but the code fails before it gets that far.
Dataframe df_input_data:
  src_table_name src_column_name src_business_key_name
0  banking_fraud         Acct_id               Acct_id
1      sale_mast       cust_code               bill_no
Access row using iterrows():
for index, df_input_single in df_input_data.iterrows():
    print("input", df_input_single)
Output:
input src_table_name           banking_fraud
src_column_name                Acct_id
src_business_key_name          Acct_id
Creating another dataframe:
df_src_input = pd.DataFrame().assign(table_name=df_input_single['src_table_name'],
                                     column_name=df_input_single['src_column_name'],
                                     business_key_name=df_input_single['src_business_key_name'])
The issue is that df_src_input is empty:
df_src_input
Empty DataFrame
Columns: [table_name, column_name, business_key_name, select_column_names, where_condition, end_date, load_dt_tm, src_tgt_validation_type, schema_name, schema_table]
Index: []
Is there another way to assign values to a different dataframe?
Answer
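If you print out
print('type', type(df_input_single['src_column_name']))
for example, you will see <class 'str'>: each value is a single string, i.e. a scalar. pd.DataFrame() has no rows, so assign() broadcasts that scalar across an empty index and the result stays empty. To create a row, each column must be given a list or a tuple, so put each value in square brackets.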
for index, df_input_single in df_input_data.iterrows():
    #print('input', df_input_single)
    #print('type', type(df_input_single['src_column_name']))
    df_src_input = pd.DataFrame().assign(table_name=[df_input_single['src_table_name']],
                                         column_name=[df_input_single['src_column_name']],
                                         business_key_name=[df_input_single['src_business_key_name']])
    print(df_src_input)
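
The behaviour described above can be reproduced in isolation. The following is only a sketch: the input frame is rebuilt from the two rows shown in the question, and the names row, empty, one_row, column_map and df_src_all are made up for illustration. It shows the scalar case staying empty, the same scalar filling a row once the frame has an index, and a possible loop-free alternative that renames the columns for all rows at once.

```python
import pandas as pd

# Input frame rebuilt from the two rows shown in the question.
df_input_data = pd.DataFrame({
    "src_table_name": ["banking_fraud", "sale_mast"],
    "src_column_name": ["Acct_id", "cust_code"],
    "src_business_key_name": ["Acct_id", "bill_no"],
})

row = df_input_data.iloc[0]  # a Series; each cell is a plain str

# A scalar is broadcast across the existing index. The index is empty,
# so the column is created but no row appears.
empty = pd.DataFrame().assign(table_name=row["src_table_name"])
print(len(empty))

# If the frame already has one index label, the same scalar fills one row.
one_row = pd.DataFrame(index=[0]).assign(table_name=row["src_table_name"])
print(len(one_row))

# Loop-free alternative (an assumption about what is needed downstream):
# rename the source columns for every row in a single step.
column_map = {
    "src_table_name": "table_name",
    "src_column_name": "column_name",
    "src_business_key_name": "business_key_name",
}
df_src_all = df_input_data.rename(columns=column_map)[list(column_map.values())]
print(df_src_all)
```

If the later logic really needs one single-row dataframe per input row, df_src_all.iloc[[i]] (note the double brackets) returns row i as a dataframe rather than a Series.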
