
How to create a Pandas DataFrame from a dictionary of DataFrames?

I have a dictionary of DataFrames that all have the same columns and data structure. I want to essentially 'union' all of these into a single DataFrame, where the dictionary keys are converted into another column. The dictionary, df_list, looks like this:

{'A' : col1 col2 col3 
001    val1  val2  val3
002    val3  val4  val5

'B' : col1 col2 col3 
001    val1  val2  val3
002    val3  val4  val5

…and so on

but I want:

key  col1  col2  col3
A    val1  val2  val3
A    val3  val4  val5
B    val1  val2  val3
B    val3  val4  val5

I tried using pd.DataFrame.from_dict(), but either I am not using it correctly or I need something else:

final_df = pd.DataFrame.from_dict(df_list)

but I get: ValueError: If using all scalar values, you must pass an index

When I try passing an index, I get a single column back instead of a DataFrame.


Answer

This should do it:

import pandas as pd

df1 = pd.DataFrame({
    "col1":['val1','val3'],
    "col2":['val2','val3'],
    "col3":['val3','val5']
})


df2 = pd.DataFrame({
    "col1":['val7','val3'],
    "col2":['val2','val3'],
    "col3":['val3','val5']
})

pd_dct = {"A": df1, "B": df2}

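# Add each dictionary key to its DataFrame as a 'key' column
# (note: this modifies df1 and df2 in place)
for key, frame in pd_dct.items():
    frame['key'] = key

# Concatenate the DataFrames into a single DataFrame
df = pd.concat(pd_dct.values())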
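
Note that the loop above adds the 'key' column to df1 and df2 themselves. If the original DataFrames should stay untouched, one possible variant is sketched below. It uses DataFrame.assign, which returns a new DataFrame. The sample data mirrors the example above, and ignore_index=True is an optional choice made here to get a fresh 0..n-1 index:

```python
import pandas as pd

df1 = pd.DataFrame({
    "col1": ["val1", "val3"],
    "col2": ["val2", "val3"],
    "col3": ["val3", "val5"],
})
df2 = pd.DataFrame({
    "col1": ["val7", "val3"],
    "col2": ["val2", "val3"],
    "col3": ["val3", "val5"],
})
pd_dct = {"A": df1, "B": df2}

# assign() returns a copy with the extra column, so df1 and df2 are not modified
df = pd.concat(
    [frame.assign(key=key) for key, frame in pd_dct.items()],
    ignore_index=True,  # optional: replace the repeated 0, 1 index with 0..3
)
```

With this approach the 'key' column ends up last. Reorder the columns afterwards if it should come first.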


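Alternatively, passing the dictionary itself to pd.concat uses its keys as the outer index level, so we can do this in one line (run it on the original DataFrames, before any 'key' column has been added):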

pd.concat(pd_dct, axis=0).reset_index(level=0).rename({'level_0':'key'}, axis=1)
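
A slightly shorter form of the same idea, as a sketch: pd.concat accepts a names argument that labels the index level created from the dictionary keys, so the rename step should not be needed. The sample data below repeats the example above so the snippet runs on its own:

```python
import pandas as pd

df1 = pd.DataFrame({
    "col1": ["val1", "val3"],
    "col2": ["val2", "val3"],
    "col3": ["val3", "val5"],
})
df2 = pd.DataFrame({
    "col1": ["val7", "val3"],
    "col2": ["val2", "val3"],
    "col3": ["val3", "val5"],
})
pd_dct = {"A": df1, "B": df2}

# The dictionary keys become the outer index level, named 'key';
# reset_index then moves that level into an ordinary column.
result = pd.concat(pd_dct, names=["key"]).reset_index(level="key")
```

Here 'key' becomes the first column, and each row keeps its original index label (0 and 1 within each group). Append .reset_index(drop=True) if a continuous index is preferred.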

User contributions licensed under: CC BY-SA