I am trying to transform multiple dictionaries, each mapping keys to lists, into a pandas DataFrame and can't find the right way to do it.
How can I transform Python dictionaries with keys and corresponding lists (as values) into a pandas DataFrame with the keys as the index and each dictionary as one of the other columns?
Here is a sample set of dictionaries and one of my non-working solutions:
import pandas as pd
dict1 = {'key_1': [1, 2, 3, 4], 'key_2': [5, 6, 7, 8], 'key_3': [9, 10, 11, 12]}
dict2 = {'key_1': ['a', 'b', 'c', 'd'], 'key_2': ['e', 'f', 'g', 'h'], 'key_3': ['i', 'j', 'k', 'l']}
dict3 = {'key_1': ['DD', 'CC', 'BB', 'AA'], 'key_3': ['II', 'JJ', 'KK', 'LL']}
df = pd.DataFrame.from_dict({'dict1':pd.Series(dict1),
                            'dict2':pd.Series(dict2),
                            'dict3':pd.Series(dict3)})
print(df)
This is what I need the resulting dataframe to look like:
I tried using explode, which would work if I only had one dictionary, but repeating it for the other dictionaries did not work. Then I tried some of the solutions in this Stack Overflow transformation solution but couldn't get them to work, in some cases because of the NaNs in my example.
Answer
You would need to fill the NaN with a list of 4 items. Unfortunately, fillna doesn't accept a list as its value parameter.
But you can take advantage of a stack/unstack (and the fill_value parameter of unstack), then explode all columns:
(df
 .stack()
 .unstack(fill_value=[pd.NA]*4)
 .explode(list(df))
)
output:
      dict1 dict2 dict3
key_1     1     a    DD
key_1     2     b    CC
key_1     3     c    BB
key_1     4     d    AA
key_2     5     e  <NA>
key_2     6     f  <NA>
key_2     7     g  <NA>
key_2     8     h  <NA>
key_3     9     i    II
key_3    10     j    JJ
key_3    11     k    KK
key_3    12     l    LL
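The stack/unstack trick depends on stack() dropping the missing entry first, which may not hold in every pandas version. As a hedged alternative (not the method above, just a sketch of the same idea), you can replace each non-list cell with a placeholder list using map and then explode all columns at once. The names `list_length`, `filler`, `filled` and `result` are made up for this illustration, and it assumes every list has exactly 4 items and a pandas version where explode accepts a list of columns (1.3 or later).

```python
import pandas as pd

dict1 = {'key_1': [1, 2, 3, 4], 'key_2': [5, 6, 7, 8], 'key_3': [9, 10, 11, 12]}
dict2 = {'key_1': ['a', 'b', 'c', 'd'], 'key_2': ['e', 'f', 'g', 'h'], 'key_3': ['i', 'j', 'k', 'l']}
dict3 = {'key_1': ['DD', 'CC', 'BB', 'AA'], 'key_3': ['II', 'JJ', 'KK', 'LL']}

# One column per dictionary; each cell holds a list, or NaN where a key is missing.
df = pd.DataFrame({'dict1': pd.Series(dict1),
                   'dict2': pd.Series(dict2),
                   'dict3': pd.Series(dict3)})

# Assumption: every list has this many items.
list_length = 4
filler = [pd.NA] * list_length

# Swap any cell that is not a list (the NaN for the missing key) for the placeholder list.
filled = df.apply(lambda col: col.map(lambda v: v if isinstance(v, list) else filler))

# Explode all columns together; this needs equal list lengths within each row.
result = filled.explode(list(filled))
print(result)
```

Because the placeholder is inserted cell by cell, this does not rely on how stack() treats missing values, at the cost of a Python-level loop over the cells.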
