Skip to content
Advertisement

replace/change duplicate columns values where column name is same but values are different, then drop duplicate columns

Is there any way to drop duplicate columns, but replacing their values depending upon conditions like in table below, I would like to remove duplicate/second A and B columns, but want to replace the value of primary A and B (1st and 2nd column) where value is 0 but 1 in duplicate columns. Ex – In 3rd row, where A, B have value 0 , should replace with 1 with their respective duplicate columns value..

Input Data :

enter image description here

Output Data:

enter image description here

This is an example of a problem I’m working on, my real data have around 200 columns so i’m hoping to find an optimal solution without hardcoding columns names for removal..

Advertisement

Answer

Use DataFrame.any per duplicated columns names if only 1,0 values in columns:

df = df.any(axis=1, level=0).astype(int)

Or if need maximal value per duplicated columns names:

df = df.max(axis=1, level=0)
User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement