Let’s take this sample dataframe :
df = pd.DataFrame({'Category':['A','B','B','B','A'], 'Subcategory':['C','C','E','D','D']}) Category Subcategory 0 A C 1 B C 2 B E 3 B D 4 A D
I would like to replace the “B” values in Category by “B2” where there is a C or a D in Subcategory. I tried the following but I get the error “The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()” :
df['Category']=np.where((df['Category']=='B') and (df['Subcategory']=='C' or df['Subcategory']== 'D'),'B2',df['Category'])
I know that some similar questions are already asked on Stackoverflow but I can’t figure out where I am wrong. Could you please help me to understand my error ?
Expected output :
Category Subcategory 0 A C 1 B2 C 2 B E 3 B2 D 4 A D
Advertisement
Answer
Use bitwise operator and brackets with every conditions. Refer this for explanation
df['Category']=np.where((df['Category']=='B') & ((df['Subcategory']=='C') | (df['Subcategory']== 'D')),'B2',df['Category']) df Category Subcategory 0 A C 1 B2 C 2 B E 3 B2 D 4 A D
And better to use isin
to check multiple values, df["Subcategory"].isin(['C', 'D'])