My dataframe is like this:
df = pd.DataFrame({'A': [1,2,3], 'B': [1,4,5]})
If column A has the same value as column B, output 1, else 0.
I want to output like this:
   A  B  is_equal
0  1  1         1
1  2  4         0
2  3  5         0
I figured out that this works fine:
df['is_equal'] = np.where((df['A'] == df['B']), 1, 0)
But I want to use a lambda here because I used a similar line in another case before. This won't work:
df['is_equals'] = df.apply(lambda x: 1 if df['A']==1 else 0, axis=1)
It threw the error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Why did this error happen, and how can I fix the code?
Thank you in advance.
Answer
What you are attempting is very inefficient. Do not do it. .apply with axis=1 calls a Python function once per row, so it
should not be used when a vectorized solution is possible. The best solution is:
df['is_equal'] = (df['A'] == df['B']).astype(int)
But if you insist:
df.apply(lambda row: int(row['A'] == row['B']), axis=1)
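As for why the original lambda raised the error: inside the lambda, df['A'] == 1 still refers to the whole column rather than the current row, so the condition is a Series of three booleans, and a Python if cannot reduce that to a single True or False. Comparing values taken from the row argument gives a single boolean instead. A minimal sketch using the DataFrame from the question (the variable names are only for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [1, 4, 5]})

# df['A'] == 1 compares the whole column, so it yields a boolean Series.
whole_column = df['A'] == 1

# Using a multi-element Series as an `if` condition raises ValueError.
try:
    if whole_column:
        pass
except ValueError as exc:
    error_message = str(exc)

# Inside apply(axis=1) the lambda argument is a single row,
# so row['A'] == row['B'] is a single boolean.
via_apply = df.apply(lambda row: 1 if row['A'] == row['B'] else 0, axis=1)

# Vectorized equivalents for comparison.
via_astype = (df['A'] == df['B']).astype(int)
via_where = np.where(df['A'] == df['B'], 1, 0)
```

All three results hold the same values, so the choice between them is about speed and readability rather than correctness.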
The latter is 2.5 times slower than the vectorized comparison. The original np.where
is the fastest.
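Relative timings depend on the pandas version, the machine, and the size of the DataFrame, so it is worth measuring on your own data. A rough sketch with timeit follows; the 10,000-row frame, the random values, and the repeat count are arbitrary choices for illustration, not part of the original measurement:

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical larger frame; the size is an arbitrary choice for illustration.
rng = np.random.default_rng(0)
big = pd.DataFrame({
    'A': rng.integers(0, 5, size=10_000),
    'B': rng.integers(0, 5, size=10_000),
})

candidates = {
    'np.where': lambda: np.where(big['A'] == big['B'], 1, 0),
    'astype': lambda: (big['A'] == big['B']).astype(int),
    'apply': lambda: big.apply(lambda row: int(row['A'] == row['B']), axis=1),
}

# Time each approach a few times; absolute numbers vary by machine.
timings = {name: timeit.timeit(fn, number=3) for name, fn in candidates.items()}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s for 3 runs")
```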