Take the average of the n largest values per row, ignoring non-numeric columns

Question

Is there a function in pandas to find the 'n' largest numbers in each row and store their average in a new column? In the example below, n = 2. Note: time or any other non-numeric column should be ignored.

    time   n1  n2  n3  n4  average_largest_2
    11:50   1   2   3   4                3.5
    12:50   5   6   7   8                7.5
    13:50   8   7   6   5                7.5

Accepted Answer

You can use nlargest per row and get the mean:

    df1['average_largest_2'] = (df1.select_dtypes('number')
                                   .apply(lambda r: r.nlargest(2).mean(), axis=1)
                                )

Or using the underlying numpy array:

    import numpy as np

    a = df1.select_dtypes('number').to_numpy()
    # sort each row ascending, keep the last two values, average them
    df1['average_largest_2'] = np.sort(a)[:, -2:].mean(1)

Output:

        time  n1  n2  n3  n4  average_largest_2
    0  11:50   1   2   3   4                3.5
    1  12:50   5   6   7   8                7.5
    2  13:50   8   7   6   5                7.5
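For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and wraps the numpy approach so that n is a parameter. The helper name mean_of_n_largest is made up for illustration and is not part of pandas. It assumes the numeric columns contain no NaN: np.sort places NaN at the end of each row, so a missing value would land among the "largest" entries and make that row's mean NaN.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df1 = pd.DataFrame({
    "time": ["11:50", "12:50", "13:50"],
    "n1": [1, 5, 8],
    "n2": [2, 6, 7],
    "n3": [3, 7, 6],
    "n4": [4, 8, 5],
})


def mean_of_n_largest(df, n=2):
    """Row-wise mean of the n largest values across the numeric columns.

    Hypothetical helper for illustration; assumes no NaN in the numeric columns.
    """
    a = df.select_dtypes("number").to_numpy()
    # Sort each row ascending, keep the last n entries, then average them.
    return np.sort(a, axis=1)[:, -n:].mean(axis=1)


# Compute both variants before adding the new column, so the result
# column is not itself picked up by select_dtypes("number").
via_numpy = mean_of_n_largest(df1, n=2)
via_nlargest = (df1.select_dtypes("number")
                   .apply(lambda r: r.nlargest(2).mean(), axis=1))

df1["average_largest_2"] = via_numpy
print(df1)
```

Both variants should agree on this data. Note that if you rerun the assignment on a frame that already has average_largest_2, that column is numeric too and will be included in the next selection, so drop it or select n1..n4 explicitly first.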


