DataFrame:
  c_os_family_ss c_os_major_is  l_customer_id_i
0      Windows 7                          90418
1      Windows 7                          90418
2      Windows 7                          90418
Code:
print df
for name, group in df.groupby('l_customer_id_i').agg(lambda x: ','.join(x)):
    print name
    print group
I’m trying to just loop over the aggregated data, but I get the error:
ValueError: too many values to unpack
@EdChum, here’s the expected output:
c_os_family_ss
l_customer_id_i
131572 Windows 7,Windows 7,Windows 7,Windows 7,Window
135467 Windows 7,Windows 7,Windows 7,Windows 7,Window
c_os_major_is
l_customer_id_i
131572 ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
135467 ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
The output is not the problem; I want to loop over every group.
Answer
df.groupby('l_customer_id_i').agg(lambda x: ','.join(x)) already returns a DataFrame, so you can no longer loop over the groups.
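To see where the ValueError comes from, here is a minimal sketch. It assumes a small made-up frame shaped like the one in the question, with c_os_major_is assumed to hold empty strings. Iterating over a DataFrame yields its column labels, so each loop step tries to unpack a single string such as 'c_os_family_ss' into name and group.

```python
import pandas as pd

# Illustrative data shaped like the frame in the question (values are assumptions)
df = pd.DataFrame({
    "c_os_family_ss": ["Windows 7", "Windows 7", "Windows 7"],
    "c_os_major_is": ["", "", ""],
    "l_customer_id_i": [90418, 90418, 90418],
})

# The aggregation returns a plain DataFrame indexed by l_customer_id_i
aggregated = df.groupby("l_customer_id_i").agg(lambda x: ",".join(x))

# Iterating over a DataFrame yields its column labels, not (name, group) pairs
labels = list(aggregated)
print(labels)

# Unpacking one label (a string) into two names raises the error from the question
try:
    for name, group in aggregated:
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```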
In general, df.groupby(...) returns a GroupBy object (a DataFrameGroupBy or SeriesGroupBy), and with this you can iterate through the groups (as explained in the docs). You can do something like:

grouped = df.groupby('A')

for name, group in grouped:
    ...
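Applied to the question's columns, looping over the GroupBy object might look like the sketch below. The data is made up for illustration (the customer IDs are borrowed from the expected output), and the joined dictionary is a hypothetical helper, not part of the original code.

```python
import pandas as pd

# Illustrative data; the customer IDs are taken from the question's expected output
df = pd.DataFrame({
    "c_os_family_ss": ["Windows 7", "Windows 7", "Windows 7"],
    "c_os_major_is": ["", "", ""],
    "l_customer_id_i": [131572, 131572, 135467],
})

# Keep the GroupBy object itself; do not call .agg() before looping
grouped = df.groupby("l_customer_id_i")

joined = {}  # hypothetical container for the per-group results
for name, group in grouped:
    # name is the group key, group is the sub-DataFrame for that key
    joined[name] = ",".join(group["c_os_family_ss"])
    print(name)
    print(group)
```

Any per-group work, including the join from the question, can be done inside the loop body.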
When you apply a function on the groupby, in your example df.groupby(...).agg(...) (but this can also be transform, apply, mean, …), you combine the results of applying the function to the different groups into one DataFrame (the apply and combine steps of the ‘split-apply-combine’ paradigm of groupby). So the result will always be a DataFrame again (or a Series, depending on the applied function).
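If the goal is to loop over the aggregated result rather than the raw groups, one option is to iterate over its rows. The sketch below assumes made-up data and uses DataFrame.iterrows(), which yields (index, row) pairs. It also shows that selecting a single column before aggregating gives a Series instead of a DataFrame.

```python
import pandas as pd

# Illustrative data; the customer IDs are taken from the question's expected output
df = pd.DataFrame({
    "c_os_family_ss": ["Windows 7", "Windows 7", "Windows 7"],
    "c_os_major_is": ["", "", ""],
    "l_customer_id_i": [131572, 131572, 135467],
})

grouped = df.groupby("l_customer_id_i")

# Aggregating all remaining columns combines the groups into one DataFrame
aggregated = grouped.agg(lambda x: ",".join(x))

# Aggregating a single selected column gives a Series
single = grouped["c_os_family_ss"].agg(lambda x: ",".join(x))

# iterrows() yields (index label, row as a Series); the index is the group key
rows = {}
for customer_id, row in aggregated.iterrows():
    rows[customer_id] = row["c_os_family_ss"]
    print(customer_id, row["c_os_family_ss"])
```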