Skip to content
Advertisement

Pandas: New column with values greater than 0 and operate with these values

I have a big dataframe with more than 2500 columns but the structure is very similar than this:

      A     B     C     D      E
0     1     0     8     0      0
1     0     0     0     0      5
2     1     2     3     0      0
3     0     2     0     1      0

I need to detect all the columns with a value greater than 0 and I have done with this:

df['X'] = df.gt(0).dot(df.columns + ',')

And then I get this:

      A     B     C     D      E    X
0     1     0     8     0      0    A,C
1     0     0     0     0      5    E
2     1     2     9     0      0    A,B,C
3     0     3     0     1      0    B,D

The problem is that I don’t need in ‘X’ the name of the columns but the value of the columns and I need to perform the following mathematical operation:

I want it to detect only the 2 lowest values and subtract the lowest value from the second lowest. In the case that there were not more than 2 values in ‘X’ it would be enough for me to show the value of the column only.

The final result in my example would be the following:

      A     B     C     D      E    X
0     1     0     8     0      0    7
1     0     0     0     0      5    5
2     1     2     9     0      0    1
3     0     3     0     1      0    2

Any idea how to solve it or any orientation?

Advertisement

Answer

We can do nsmallest then follow by np.ptp and condition for those row only have one value not equal to 0

df['new'] = df.apply(lambda x :  np.ptp(pd.Series.nsmallest(x[x!=0],2)) if sum(x!=0) != 1 else x[x!=0].iloc[0],axis=1)
Out[520]: 
0    7
1    5
2    1
3    1
dtype: int64

Or doing two steps

df['new'] = df[df.ne(0).sum(1)>1].apply(lambda x :  np.ptp(pd.Series.nsmallest(x,2)),axis=1)
df['new'].fillna(df.max(1),inplace=True)
df
Out[530]: 
   A  B  C  D  E  new
0  1  0  8  0  0  7.0
1  0  0  0  0  5  5.0
2  1  2  3  0  0  1.0
3  0  2  0  1  0  1.0
User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement