Find the column name of the second largest value of each row in a Pandas DataFrame

Question

I am trying to find the column names associated with the largest and second largest values in each row of a DataFrame. Here's a simplified example (the real one has over 500 columns):

Needs to become:

I can find the column name with the largest value (i.e., 1larg above) with idxmax, but how can I find the second largest?

Accepted Answer

(You don't have any duplicate maximum values in your rows, so I'll guess that if you have [1,1,2,2] you want val3 and val4 to be selected.)

One way would be to use the result of argsort as an index into an array of the column names.

    import numpy as np
    import pandas as pd

    df = df.set_index("Date")
    arank = df.apply(np.argsort, axis=1)  # column positions, ascending by value
    # Reverse to descending, keep the top two, and look up the column names.
    ranked_cols = df.columns.to_numpy()[arank.values[:, ::-1][:, :2]]
    new_frame = pd.DataFrame(ranked_cols, index=df.index)

produces

             0     1
    Date
    1990  val4  val2
    1991  val3  val4
    1992  val1  val2
    1993  val1  val4
    1994  val2  val4
    1995  val4  val3

(where I've added an extra 1995 [1,1,2,2] row.)

Alternatively, you could probably melt into a flat format, pick out the largest two values in each Date group, and then pivot it back again.
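The melt alternative mentioned above might look roughly like the sketch below. The sample values are made up for illustration (the question's original table is not reproduced here), and the names long_df, top2, col, value and rank are hypothetical. Treat it as one possible reading of "melt, pick the top two per Date, pivot back", not as the answerer's own code.

```python
import pandas as pd

# Made-up sample data, only for illustration.
df = pd.DataFrame(
    {
        "Date": [1990, 1991, 1992],
        "val1": [1, 2, 9],
        "val2": [5, 1, 7],
        "val3": [2, 8, 3],
        "val4": [7, 6, 1],
    }
)

# Wide -> long: one row per (Date, column name, value).
long_df = df.melt(id_vars="Date", var_name="col", value_name="value")

# Put the largest values first within each Date, then keep two rows per Date.
top2 = (
    long_df.sort_values(["Date", "value"], ascending=[True, False])
    .groupby("Date")
    .head(2)
)

# Label the picks 0 (largest) and 1 (second largest), then pivot back to wide.
top2 = top2.assign(rank=top2.groupby("Date").cumcount())
new_frame = top2.pivot(index="Date", columns="rank", values="col")
print(new_frame)
```

For this sample data the rows come out as val4/val2 for 1990, val3/val4 for 1991 and val1/val2 for 1992. With tied values the order of the two picks may differ from the argsort version, so it is worth checking a row like [1,1,2,2] if ties matter to you. With 500+ columns the long frame gets large, so the argsort approach is likely the cheaper of the two.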
