
Grouping column values in pandas and making other column values into a list

I have a pandas dataframe:

 col1   col2   col3

  a      NaN    NaN
  b      1       2
  b      3       4
  c      5       6

I would like to change it to a dataframe like this:

 col1    col2  col3
  a      NaN      NaN
  b     [1,3]    [2,4]
  c       5       6

Is there a simple way to achieve this?

Answer

You need a custom lambda function that returns a list only when a group has more than one row:

df1 = df.groupby('col1').agg(lambda x: list(x) if len(x) > 1 else x).reset_index()
print(df1)
  col1        col2        col3
0    a         NaN         NaN
1    b  [1.0, 3.0]  [2.0, 4.0]
2    c         5.0         6.0

This is needed because aggregating with list alone also produces one-element lists for the single-row groups:

print(df.groupby('col1').agg(list))
            col2        col3
col1                        
a          [nan]       [nan]
b     [1.0, 3.0]  [2.0, 4.0]
c          [5.0]       [6.0]
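
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the table in the question, and the helper name list_if_many is only an illustration, not something pandas provides. It is a variation on the lambda above that pulls out the single value with .iloc[0] instead of returning the one-row Series, so the function always hands back either a list or a scalar.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed to match the asker's frame).
df = pd.DataFrame({
    "col1": ["a", "b", "b", "c"],
    "col2": [np.nan, 1, 3, 5],
    "col3": [np.nan, 2, 4, 6],
})


def list_if_many(values):
    """Hypothetical helper: list for multi-row groups, scalar otherwise."""
    # values is the Series of one column within one group.
    return list(values) if len(values) > 1 else values.iloc[0]


# Apply the helper to every non-key column, then restore col1 as a column.
df1 = df.groupby("col1").agg(list_if_many).reset_index()
print(df1)
```

Note that col2 and col3 end up with object dtype, because each column now mixes plain numbers and Python lists.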
User contributions licensed under: CC BY-SA