Find the column name of the second largest value of each row in a Pandas DataFrame

I am trying to find the column names associated with the largest and second largest values in each row of a DataFrame. Here’s a simplified example (the real one has over 500 columns):

Needs to become:

I can find the column name with the largest value (i.e., 1larg above) with idxmax, but how can I find the second largest?

Answer

(You don’t have any duplicate maximum values in your rows, so I’ll guess that if you have [1,1,2,2] you want val3 and val4 to be selected.)

One way would be to use the result of argsort as an index into a Series with the column names.

produces

(where I’ve added an extra 1995 [1,1,2,2] row.)
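The argsort idea can be sketched roughly as follows. This is a minimal illustration rather than the exact code from the answer: the sample values, the NumPy-based indexing, and the output column names 1larg and 2larg are assumptions made for the example. A stable sort is requested so that the [1,1,2,2] row resolves its ties the same way on every run.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (values invented for illustration),
# including a 1995 row with tied values [1, 1, 2, 2].
df = pd.DataFrame(
    {
        "val1": [5, 2, 10, 1],
        "val2": [7, 1, 9, 1],
        "val3": [1, 10, 6, 2],
        "val4": [10, 3, 1, 2],
    },
    index=pd.Index([1990, 1991, 1992, 1995], name="Date"),
)

# Column positions ordered by ascending value within each row.
order = np.argsort(df.to_numpy(), axis=1, kind="stable")

# Reverse each row so the largest comes first, then keep the top two positions.
top2 = order[:, ::-1][:, :2]

# Use the positions to look up column names.
result = pd.DataFrame(
    df.columns.to_numpy()[top2],
    index=df.index,
    columns=["1larg", "2larg"],
)
print(result)
```

Because this works on the underlying array in one pass, it should scale reasonably to a frame with several hundred columns.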

Alternatively, you could probably melt the frame into a long (flat) format, pick out the two largest values in each Date group, and then pivot the result back into a wide format.
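A rough sketch of that melt-based route is below. The sample values and the helper column names (column, value, place) are assumptions made for illustration, and the data deliberately has no ties, since tie order is not something this sketch tries to control.

```python
import pandas as pd

# Hypothetical sample data (values invented for illustration).
df = pd.DataFrame(
    {
        "val1": [5, 2, 10],
        "val2": [7, 1, 9],
        "val3": [1, 10, 6],
        "val4": [10, 3, 1],
    },
    index=pd.Index([1990, 1991, 1992], name="Date"),
)

# Wide to long: one row per (Date, column, value).
long = df.reset_index().melt(id_vars="Date", var_name="column", value_name="value")

# Within each Date, order by value descending and keep the top two rows.
top = (
    long.sort_values(["Date", "value"], ascending=[True, False])
    .groupby("Date")
    .head(2)
)

# Label each kept row by its position within its Date group.
top = top.assign(place=top.groupby("Date").cumcount().map({0: "1larg", 1: "2larg"}))

# Long back to wide: one row per Date, one column per place.
result = top.pivot(index="Date", columns="place", values="column")
print(result)
```

This is likely slower than the argsort version on wide frames, but it extends easily to the top three or more by changing the argument to head and the label mapping.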

User contributions licensed under: CC BY-SA