How to remove brackets from multi-value keys when converting to dataframes or extend values of a key without extraneous characters

Question

The above code converts a nested dictionary to a DataFrame perfectly well, but if the nested dictionary was built with the .append() or .extend() method, every value is a list, so the resulting DataFrame shows extraneous brackets [] and quotes '', which makes downstream analysis difficult. For example, a nested dictionary like this: created with the setup: and converted to a DataFrame with pd.DataFrame.from_dict() creates a

Accepted Answer

One option is to stack the columns, join the strings, then unstack:

    out = pd.DataFrame(my_data).stack().map(', '.join).unstack()

But it's probably more efficient to modify the input dictionary in vanilla Python first and then construct the DataFrame:

    for d in my_data.values():
        for k, v in d.items():
            d[k] = ', '.join(v)
    out = pd.DataFrame(my_data)

Output:

                  Ceratopteris richardii                    Arabidopsis thaliana
    superkingdom  Eukaryota                                 Eukaryota
    kingdom       Viridiplantae                             Viridiplantae
    phylum        Streptophyta                              Streptophyta
    subphylum     Streptophytina                            Streptophytina
    clade         Embryophyta, Tracheophyta, Euphyllophyta  Embryophyta, Tracheophyta, Euphyllophyta, Sper...
    class         Polypodiopsida                            Magnoliopsida
    subclass      Polypodiidae                              NaN
    order         Polypodiales                              Brassicales
    suborder      Pteridineae                               NaN
    family        Pteridaceae                               Brassicaceae
    subfamily     Parkerioideae                             NaN
    genus         Ceratopteris                              Arabidopsis
    tribe         NaN                                       Camelineae
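As a minimal, self-contained sketch of the second approach: the dictionary below is a made-up stand-in for my_data (the original input is not shown in full), and it assumes every inner value is a list of strings. Unlike the loop above, this variant builds a new dictionary rather than overwriting the lists in place, which may be preferable if the lists are needed later.

```python
import pandas as pd

# Hypothetical stand-in for my_data: each inner value is a list of strings,
# as produced by building the dictionary with .append() or .extend().
my_data = {
    "Ceratopteris richardii": {
        "kingdom": ["Viridiplantae"],
        "clade": ["Embryophyta", "Tracheophyta"],
        "subclass": ["Polypodiidae"],
    },
    "Arabidopsis thaliana": {
        "kingdom": ["Viridiplantae"],
        "clade": ["Embryophyta", "Tracheophyta", "Spermatophyta"],
        "tribe": ["Camelineae"],
    },
}

# Join each list into one string; my_data itself is left untouched.
joined = {
    species: {rank: ", ".join(names) for rank, names in ranks.items()}
    for species, ranks in my_data.items()
}

# Outer keys become columns, inner keys become the index;
# ranks missing for a species show up as NaN.
out = pd.DataFrame(joined)
print(out)
```

Cells now hold plain strings such as "Embryophyta, Tracheophyta" instead of lists, so no brackets or quotes appear when the frame is printed or exported.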

Current output:

Column one	Column two
Key1	['Value1', 'Value2', 'value3']
Key2	['Value2', 'value4', 'value5']

Desired output:

Column one	Column two
Key1	Value1,Value2,value3
Key2	Value2,value4,value5
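For the flat two-column case shown in the tables, the same join idea can be sketched as follows. The dictionary name data is an assumption, and the column names are taken from the table headers; the values are joined with a bare comma to match the desired output.

```python
import pandas as pd

# Hypothetical flat dictionary whose values are lists
# (for example, filled with .append() or .extend()).
data = {
    "Key1": ["Value1", "Value2", "value3"],
    "Key2": ["Value2", "value4", "value5"],
}

# Join each list into a single comma-separated string before building the frame.
df = pd.DataFrame(
    {
        "Column one": list(data),
        "Column two": [",".join(values) for values in data.values()],
    }
)
print(df)
```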
