import pandas as pd

dict1 = {"brand": "Ford", "model": "Mustang", "year": 1964}
dict2 = {"brand": "Ford", "model": "F150", "year": 1999}
dict3 = {"brand": "Chevy", "model": "Malibu", "year": 1972}

d = {
    "col0": ["GM", "GM", "Dodge"],
    "col1": [dict1, dict3, dict2],
    "col2": [dict3, dict2, dict2],
    "col3": [dict1, dict2, dict3],
}
df = pd.DataFrame(d)

grouped = df.groupby(['col0'], as_index=False)
first = lambda a: a[0]
df = grouped.agg({'col1': first, 'col2': first, 'col3': first})
When I call agg, it fails with a KeyError (the traceback shows raise KeyError(key) from err).
What I’m trying to do is combine the rows based on the column I’m grouping by, taking the first dict in each group.
I want the output to be what you see below and I don’t really care which “GM” is kept. I arbitrarily chose the first, which is fine.
d = {
    "col0": ["GM", "Dodge"],
    "col1": [dict1, dict2],
    "col2": [dict3, dict2],
    "col3": [dict1, dict3],
}
Answer
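Use .iloc to take the first element by position. With an integer index, a[0] looks up the label 0, and the "Dodge" group only contains the row labelled 2, hence the KeyError: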
grouped = df.groupby('col0')
first = lambda a: a.iloc[0]
df = grouped.agg({'col1': first, 'col2': first, 'col3': first})
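For reference, here is a self-contained sketch that reproduces the failing lookup and then applies the positional fix. Two things in it are my own additions and not part of the code above: as_index=False keeps col0 as a regular column, and sort=False keeps the groups in order of first appearance ("GM" before "Dodge"), which I am assuming is the layout the question wants. The names dodge_col1, label_lookup_failed and result are only illustrative.

```python
import pandas as pd

dict1 = {"brand": "Ford", "model": "Mustang", "year": 1964}
dict2 = {"brand": "Ford", "model": "F150", "year": 1999}
dict3 = {"brand": "Chevy", "model": "Malibu", "year": 1972}

df = pd.DataFrame({
    "col0": ["GM", "GM", "Dodge"],
    "col1": [dict1, dict3, dict2],
    "col2": [dict3, dict2, dict2],
    "col3": [dict1, dict2, dict3],
})

# The "Dodge" rows keep their original index label (2), so a label
# lookup for 0 fails on that group. This mirrors what a[0] does in agg.
dodge_col1 = df.loc[df["col0"] == "Dodge", "col1"]
try:
    dodge_col1[0]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# .iloc[0] is positional, so it works whatever the index labels are.
first = lambda s: s.iloc[0]

# Assumption: as_index=False and sort=False are added here to keep col0
# as a column and to preserve first-appearance order of the groups.
result = (
    df.groupby("col0", as_index=False, sort=False)
      .agg({"col1": first, "col2": first, "col3": first})
)
print(result)
```

Without sort=False the group keys are sorted, so "Dodge" would come before "GM"; the values picked within each group stay the same either way.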