Is there any way to drop duplicate columns, but replacing their values depending upon conditions like in table below, I would like to remove duplicate/second A and B columns, but want to replace the value of primary A and B (1st and 2nd column) where value is 0 but 1 in duplicate columns. Ex – In 3rd row, where A, B have value 0 , should replace with 1 with their respective duplicate columns value..
Input Data :
Output Data:
This is an example of a problem I’m working on, my real data have around 200 columns so i’m hoping to find an optimal solution without hardcoding columns names for removal..
Advertisement
Answer
Use DataFrame.any
per duplicated columns names if only 1,0
values in columns:
df = df.any(axis=1, level=0).astype(int)
Or if need maximal value per duplicated columns names:
df = df.max(axis=1, level=0)