I want to create a binary column that is 1 if the values of both columns in the following table fall within the same range. For example, if the value in cat_1 is between 5 and 10 and the value in cat_2 is also between 5 and 10, the new column should be 1; otherwise, it should be 0.
| cat_1 | cat_2 | [5-10] (new column to be created) |
| ----- | ----- | --------------------------------- |
| 5     | 10    | 1                                 |
| 7     | 9     | 1                                 |
| 1     | 7     | 0                                 |
So far, I have tried the following code, but it returns an error:
    df.loc[((df['cat_1'] >= 5 & df['cat_1'] <= 10)
            & (df['cat_2'] >= 5 & df['cat_2'] <= 10)), '[5-10]'] = 1
Here is the error:

    ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Answer
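You're getting the error because `&` binds more tightly than `>=`. Python therefore evaluates `5 & df['cat_1']` first and treats the rest as a chained comparison, which needs the truth value of a Series. To fix your snippet, add parentheses around each column comparison: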
    df.loc[((df['cat_1'] >= 5) & (df['cat_1'] <= 10)
            & (df['cat_2'] >= 5) & (df['cat_2'] <= 10)), '[5-10]'] = 1
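As a minimal sketch of why the parentheses matter, the snippet below rebuilds the sample table from the question (the DataFrame construction is an assumption, since the post does not show it), checks that the unparenthesized expression raises the `ValueError`, and then applies the parenthesized mask. The column is initialised to 0 first, because assigning through `.loc` only touches the matching rows and would otherwise leave the others as NaN.

```python
import pandas as pd

# Sample data reconstructed from the table in the question (assumed).
df = pd.DataFrame({"cat_1": [5, 7, 1], "cat_2": [10, 9, 7]})

# Without parentheses, `5 & df["cat_1"]` is evaluated first and the rest
# becomes a chained comparison, which needs the truth value of a Series.
try:
    df["cat_1"] >= 5 & df["cat_1"] <= 10
    raised = False
except ValueError:
    raised = True

# With parentheses, each comparison yields a boolean Series and `&`
# combines them element-wise.
mask = (
    (df["cat_1"] >= 5) & (df["cat_1"] <= 10)
    & (df["cat_2"] >= 5) & (df["cat_2"] <= 10)
)

df["[5-10]"] = 0            # default for rows outside the range
df.loc[mask, "[5-10]"] = 1  # flag rows where both columns are in [5, 10]
```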
Even better, define the new column as a whole rather than assigning to a subset with `.loc`, which only sets the matching rows. For example:
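    df['[5-10]'] = df['cat_1'].between(5, 10) & df['cat_2'].between(5, 10)

This produces a boolean column; `between` includes both endpoints by default. Append `.astype(int)` if you need 1 and 0 instead of True and False.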
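A runnable sketch of the `between` approach on the question's sample data might look like this. The DataFrame construction is assumed from the table, and the variable name `in_range` is only illustrative.

```python
import pandas as pd

# Sample data reconstructed from the table in the question (assumed).
df = pd.DataFrame({"cat_1": [5, 7, 1], "cat_2": [10, 9, 7]})

# Each `between` call returns a boolean Series (endpoints included);
# `&` combines them row by row.
in_range = df["cat_1"].between(5, 10) & df["cat_2"].between(5, 10)

# Cast True/False to 1/0 to match the requested binary column.
df["[5-10]"] = in_range.astype(int)

print(df)
```

Because the whole column is assigned at once, every row gets a value and no separate default of 0 is needed.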