Splitting object data into new columns in dataframe

Question

i have a dataframe with column business_id and attributes with thousands of rows like this: how do create new column for each attribute with the value to the business id ? and if it&#8217;s not applicable to that business id, it will specify false. example: while also noting that there are some attributes wit…

Accepted Answer

You could apply pd.json_normalize() to each of the dicts in attributes.Update: if the elements in attributes are JSON strings, then we need to turn them into dict first. The following function will do that (and also accept dict):import jsondef json_normalize(s):    if isinstance(s, str):        s = json.loads(s)    return pd.json_normalize(s)Now, use this instead of pd.json_normalize() directly:df2 = pd.concat(df['attributes'].apply(json_normalize).to_list(),                keys=df.business_id)>>> df2                         WiFi  HasTV RestaurantsTableService Caters Price  business_id                                                                 6iYb2HFDywm3zjuRg0q 0  u'free  False                    True   True   NaN   7f4z43MHAV-l-LsRYsa 0     NaN  False                     NaN    NaN     2                         Ambience.romantic Ambience.intimate  business_id                                                6iYb2HFDywm3zjuRg0q 0               NaN               NaN  7f4z43MHAV-l-LsRYsa 0             False             FalseNote, the above follows a reproducible setup:# first, reproducible setup, with dictsdf = pd.DataFrame({    'attributes': [        {"WiFi":"u'free","HasTV":"False","RestaurantsTableService":"True","Caters":"True"},        {"HasTV":"False", "Ambience":{'romantic': False, 'intimate': False,},"Price":"2"},    ],    'business_id': ['6iYb2HFDywm3zjuRg0q', '7f4z43MHAV-l-LsRYsa'],})# or, with stringsdf = pd.DataFrame({    'attributes': [        """{"WiFi":"free","HasTV":"False","RestaurantsTableService":"True","Price":"3"}""",        """{"HasTV":"False","Ambience":{"romantic": "False", "intimate": "False"},"Price":"2"}""",    ],    'business_id': ['6iYb2HFDywm3zjuRg0q', '7f4z43MHAV-l-LsRYsa'],})

Advertisement

Answer