import pandas as pd

dict1 = {
    "brand": "Ford",
    "model": "Mustang",
    "year": 1964
}
dict2 = {
    "brand": "Ford",
    "model": "F150",
    "year": 1999
}
dict3 = {
    "brand": "Chevy",
    "model": "Malibu",
    "year": 1972
}
d = {
    "col0": ["GM", "GM", "Dodge"],
    "col1": [dict1, dict3, dict2],
    "col2": [dict3, dict2, dict2],
    "col3": [dict1, dict2, dict3]
}
df = pd.DataFrame(d)
grouped = df.groupby(['col0'], as_index=False)
first = lambda a: a[0]
df = grouped.agg({'col1': first, 'col2': first, 'col3': first})

When I call agg, I get a KeyError (raise KeyError(key) from err).
What I'm trying to do is collapse the rows that share a value in the column I'm grouping by, keeping the first dict in each group.
I want the output below. I don't care which "GM" row is kept; I arbitrarily chose the first, which is fine.
d = {
    "col0": ["GM", "Dodge"],
    "col1": [dict1, dict2],
    "col2": [dict3, dict2],
    "col3": [dict1, dict3]
}
Answer
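Use .iloc. Inside agg, each group is passed to the function as a Series that keeps its original index labels, so a[0] looks up the label 0. The "Dodge" group only contains the row labelled 2, which is why a[0] raises KeyError. a.iloc[0] selects by position instead: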
grouped = df.groupby('col0')
first = lambda a : a.iloc[0]
df = grouped.agg({'col1':first, 'col2': first, 'col3': first})
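
For reference, here is a self-contained sketch that reproduces the failing label lookup and then applies the .iloc fix to the question's data. The helper name first_item and the variables by_iloc and by_dedup are made up for illustration. The as_index=False and sort=False arguments are an assumption about what the asker wants (col0 kept as a column, groups in first-seen order), and the drop_duplicates variant is an alternative suggestion that is not part of the answer above.

```python
import pandas as pd

dict1 = {"brand": "Ford", "model": "Mustang", "year": 1964}
dict2 = {"brand": "Ford", "model": "F150", "year": 1999}
dict3 = {"brand": "Chevy", "model": "Malibu", "year": 1972}

df = pd.DataFrame({
    "col0": ["GM", "GM", "Dodge"],
    "col1": [dict1, dict3, dict2],
    "col2": [dict3, dict2, dict2],
    "col3": [dict1, dict2, dict3],
})

# The "Dodge" group holds only the row labelled 2, so looking up label 0 fails.
dodge_col1 = df.loc[df["col0"] == "Dodge", "col1"]
try:
    dodge_col1[0]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True


def first_item(s):
    """Return the first element of a group by position, not by label."""
    return s.iloc[0]


# as_index=False keeps col0 as a regular column;
# sort=False keeps groups in the order they first appear (GM, then Dodge).
by_iloc = (
    df.groupby("col0", as_index=False, sort=False)
      .agg({"col1": first_item, "col2": first_item, "col3": first_item})
)

# Alternative (assumption, not from the answer): keep the first row per col0 value.
# Only col0 is compared, so the dict columns are never hashed.
by_dedup = df.drop_duplicates(subset="col0").reset_index(drop=True)
```

Both by_iloc and by_dedup should contain one row per col0 value with the first dict of each group. If only the first row per key is needed and no other aggregation is involved, drop_duplicates may be the simpler choice.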