I have a pandas dataframe:
col1  col2  col3
a     NaN   NaN
b     1     2
b     3     4
c     5     6
I would like to change it to a dataframe like this:
col1  col2   col3
a     NaN    NaN
b     [1,3]  [2,4]
c     5      6
Is there a simple way to achieve this?
Answer
You need a custom lambda function that returns a list only when a group has more than one row:
df1 = df.groupby('col1').agg(lambda x: list(x) if len(x) > 1 else x).reset_index()
print(df1)
  col1        col2        col3
0    a         NaN         NaN
1    b  [1.0, 3.0]  [2.0, 4.0]
2    c         5.0         6.0
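For reference, here is a self-contained sketch of the same approach. The construction of df is an assumption reconstructed from the question, and list_if_many is a hypothetical helper name. For single-row groups it returns the scalar with .iloc[0] rather than the one-element Series itself, which should give the same result as the lambda above.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the frame from the question.
df = pd.DataFrame({
    "col1": ["a", "b", "b", "c"],
    "col2": [np.nan, 1, 3, 5],
    "col3": [np.nan, 2, 4, 6],
})

def list_if_many(s):
    # s holds one column's values for a single group.
    # Groups with several rows become a list; single-row groups stay scalar.
    return list(s) if len(s) > 1 else s.iloc[0]

df1 = df.groupby("col1").agg(list_if_many).reset_index()
print(df1)
```

Because a column that mixes lists and scalars has object dtype, numeric operations on col2 and col3 will no longer be vectorized after this step.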
This is necessary because aggregating with list alone also produces one-element lists for groups with a single row:
print(df.groupby('col1').agg(list))
            col2        col3
col1                        
a          [nan]       [nan]
b     [1.0, 3.0]  [2.0, 4.0]
c          [5.0]       [6.0]
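If you already have the list-aggregated frame, the one-element lists can also be unwrapped afterwards. This is only a sketch of an alternative, not part of the original answer: the df construction is assumed from the question, and unwrap is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the frame from the question.
df = pd.DataFrame({
    "col1": ["a", "b", "b", "c"],
    "col2": [np.nan, 1, 3, 5],
    "col3": [np.nan, 2, 4, 6],
})

# Every cell is a list here, including one-element lists.
lists = df.groupby("col1").agg(list)

def unwrap(values):
    # Replace a one-element list with its single element; keep longer lists.
    return values[0] if len(values) == 1 else values

# Apply unwrap to every cell, column by column.
df2 = lists.apply(lambda col: col.map(unwrap)).reset_index()
print(df2)
```

This makes two passes over the data, so the single-pass conditional aggregation is usually the simpler choice.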