
Update columns with duplicate values from the DataFrame in Pandas

I have a data set in which the values for each person are spread across several rows, with the first name identifying who each row belongs to. For instance, James's gender is in the first row and James's age is in the fourth row.

DataFrame df1:

Index  First Name  Age  Gender  Weight in lb  Height in cm
0      James            Male
1      John                     175
2      Patricia    23
3      James       22
4      James                    185
5      John        29
6      John                                   176

I am trying to combine these rows into a single DataFrame with one row per person, as below:

Index  First Name  Age  Gender  Weight in lb  Height in cm
0      James       22   Male    185
1      John        29           175           176
2      Patricia    23

I tried using groupby, but it did not work.

Answer

Assuming the empty cells contain NaN, you can use groupby.first, which returns the first non-null value of each column within every group:

df.groupby('First Name', as_index=False).first()

output:

  First Name   Age Gender  Weight in lb  Height in cm
0      James  22.0   Male         185.0           NaN
1       John  29.0   None         175.0         176.0
2   Patricia  23.0   None           NaN           NaN
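
For anyone who wants to reproduce this, here is a minimal, self-contained sketch. The DataFrame is a reconstruction of the question's table, with each number placed in the column implied by the desired output, and the blank cells are assumed to be NaN as stated above.

```python
import numpy as np
import pandas as pd

# Reconstruction of the question's data (an assumption: blank cells are NaN,
# and each number sits in the column implied by the desired output).
df = pd.DataFrame({
    "First Name": ["James", "John", "Patricia", "James", "James", "John", "John"],
    "Age": [np.nan, np.nan, 23, 22, np.nan, 29, np.nan],
    "Gender": ["Male", np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
    "Weight in lb": [np.nan, 175, np.nan, np.nan, 185, np.nan, np.nan],
    "Height in cm": [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, 176],
})

# first() skips missing values, so each group collapses to a single row
# holding the first non-null entry of every column.
result = df.groupby("First Name", as_index=False).first()
print(result)
```

If the blanks in the real data are empty strings rather than NaN, first() will treat them as valid values and return them. In that case, converting them beforehand, for example with df.replace("", np.nan), should give the same result.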
User contributions licensed under: CC BY-SA