I have a column in pandas dataframe that has the following structure (see example). I think I have a nested dictionary in a single column, and I want each key to have it’s own column. I want all the matching keys to be the same column. Run the examples for more details
import pandas as pd import numpy as np data = ["{'age': 59, 'gender': 'Female','pain': 'No', 'date': '2022-09-29'}", "{'gender': 'Male', 'date': '2022-10-11'}", "{'age': 18, 'date': '2022-10-11', 'pain': 'No'}", ] original_df = pd.DataFrame(data, columns=['test']) original_df.head()
I want to explode the dataframe so that it has the following structure:
data = [[59,'Female','No','2022-09-29'], [np.nan,'Male',np.nan,'2022-10-11'], [18,np.nan,'2022-10-11','No']] # Create the pandas DataFrame desired_df = pd.DataFrame(data, columns=['age','gender','date','pain']) desired_df.head()
Advertisement
Answer
Using json.normalize
from ast import literal_eval import pandas as pd original_df["test"] = original_df["test"].apply(literal_eval) df = original_df.join(pd.json_normalize(original_df.pop("test"))) print(df) age gender pain date 0 59.0 Female No 2022-09-29 1 NaN Male NaN 2022-10-11 2 18.0 NaN No 2022-10-11