I have a dictionary of DataFrames that all share the same columns and data structure. I want to essentially 'union' all of them into a single DataFrame, with the dictionary keys converted into an additional column. The dictionary, df_list, looks like this:
{'A' :
       col1  col2  col3
  001  val1  val2  val3
  002  val3  val4  val5
 'B' :
       col1  col2  col3
  001  val1  val2  val3
  002  val3  val4  val5
…and so on
but what I want is:
key  Col1  Col2  Col3
A    val1  val2  val3
A    val4  val5  val6
B    val1  val2  val3
B    val4  val5  val6
I tried using pd.DataFrame.from_dict(), but either I am not using it correctly or I need something else:
final_df = pd.DataFrame.from_dict(df_list)
but I get: ValueError: If using all scalar values, you must pass an index
When I try passing an index, I get a single column back instead of the full DataFrame.
Answer
This should do it:
import pandas as pd

df1 = pd.DataFrame({
    "col1": ['val1', 'val3'],
    "col2": ['val2', 'val3'],
    "col3": ['val3', 'val5']
})
df2 = pd.DataFrame({
    "col1": ['val7', 'val3'],
    "col2": ['val2', 'val3'],
    "col3": ['val3', 'val5']
})

pd_dct = {"A": df1, "B": df2}

# adding the key in (this modifies df1 and df2 in place)
for key in pd_dct.keys():
    pd_dct[key]['key'] = key

# concatenating the DataFrames
df = pd.concat(pd_dct.values())
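If you would rather not modify the original DataFrames, one possible variation is to use DataFrame.assign, which returns a copy with the extra column. This is only a sketch: the names frames and combined are made up for illustration, and ignore_index=True is an optional choice that replaces the repeated 0, 1 row labels with a fresh 0..n-1 index.

```python
import pandas as pd

# Sample data mirroring the answer's example (names are illustrative).
df1 = pd.DataFrame({
    "col1": ["val1", "val3"],
    "col2": ["val2", "val3"],
    "col3": ["val3", "val5"],
})
df2 = pd.DataFrame({
    "col1": ["val7", "val3"],
    "col2": ["val2", "val3"],
    "col3": ["val3", "val5"],
})
frames = {"A": df1, "B": df2}

# assign() returns a new DataFrame, so df1 and df2 are left untouched.
combined = pd.concat(
    [frame.assign(key=name) for name, frame in frames.items()],
    ignore_index=True,  # optional: renumber the rows 0..n-1
)
print(combined)
```

Note that with this approach the key column ends up last; reorder the columns afterwards if you want it first.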
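Alternatively, we can do this in one line, starting from the original dictionary (if the loop above has already run, the frames already carry a 'key' column and the rename would produce a duplicate):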
pd.concat(pd_dct, axis=0).reset_index(level=0).rename({'level_0':'key'}, axis=1)
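A slightly different spelling of the same idea, offered as a sketch rather than the definitive way: pd.concat accepts a names argument that labels the index level built from the dictionary keys, so the rename step is not needed. The variable names frames and combined are hypothetical, and the final reset_index(drop=True) is optional.

```python
import pandas as pd

# Fresh sample frames without a 'key' column (illustrative data).
df1 = pd.DataFrame({
    "col1": ["val1", "val3"],
    "col2": ["val2", "val3"],
    "col3": ["val3", "val5"],
})
df2 = pd.DataFrame({
    "col1": ["val7", "val3"],
    "col2": ["val2", "val3"],
    "col3": ["val3", "val5"],
})
frames = {"A": df1, "B": df2}

combined = (
    pd.concat(frames, names=["key"])  # dict keys become an index level named 'key'
    .reset_index(level="key")         # move that level into a regular column
    .reset_index(drop=True)           # optional: discard the old row labels
)
print(combined)
```

Here the key column comes out first, matching the layout asked for in the question.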