Below is the code and output, what I’m trying to get is shown in the “exp” column, as you can see the “countif” column just counts 5 columns, but I want it to only count negative values.
So for example: index 0, df1[0]
should equal 2
What am I doing wrong?
Python
import pandas as pd import numpy as np a = ['A','B','C','B','C','A','A','B','C','C','A','C','B','A'] b = [2,4,1,1,2,5,-1,2,2,3,4,3,3,3] c = [-2,4,1,-1,2,5,1,2,2,3,4,3,3,3] d = [-2,-4,1,-1,2,5,1,2,2,3,4,3,3,3] exp = [2,1,0,2,0,0,1,0,0,0,0,0,0,0] df1 = pd.DataFrame({'b':b,'c':c,'d':d,'exp':exp}, columns=['b','c','d','exp']) df1['sumif'] = df1.where(df1<0,0).sum(1) df1['countif'] = df1.where(df1<0,0).count(1) df1 # df1.sort_values(['a','countif'], ascending=[True, True])
Output
Advertisement
Answer
First DataFrame.where
working different, it replace False
values to 0
here by condition (here False are greater of equal 0
), so cannot be used for count:
print (df1.iloc[:, :3].where(df1<0,0)) b c d 0 0 -2 -2 1 0 0 -4 2 0 0 0 3 0 -1 -1 4 0 0 0 5 0 0 0 6 -1 0 0 7 0 0 0 8 0 0 0 9 0 0 0 10 0 0 0 11 0 0 0 12 0 0 0 13 0 0 0
You need compare first 3 columns for less like 0
and sum
:
df1['exp1'] = (df1.iloc[:, :3] < 0).sum(1) #If need compare all columns #df1['exp1'] = (df1 < 0).sum(1) print (df1) b c d exp exp1 0 2 -2 -2 2 2 1 4 4 -4 1 1 2 1 1 1 0 0 3 1 -1 -1 2 2 4 2 2 2 0 0 5 5 5 5 0 0 6 -1 1 1 1 1 7 2 2 2 0 0 8 2 2 2 0 0 9 3 3 3 0 0 10 4 4 4 0 0 11 3 3 3 0 0 12 3 3 3 0 0 13 3 3 3 0 0