I want to filter my dataframe with an `or` condition to keep rows with a particular column’s values that are outside the range [-0.25, 0.25]. I tried:
df = df[(df['col'] < -0.25) or (df['col'] > 0.25)]
But I get the error:
Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
Answer
Python’s `or` and `and` operators require truth values. For pandas objects, truth values are considered ambiguous, so you should use the “bitwise” `|` (or) or `&` (and) operators instead:
df = df[(df['col'] < -0.25) | (df['col'] > 0.25)]
These are overloaded for these kinds of data structures to yield the element-wise `or` or `and`.
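As a minimal sketch of the fix on concrete data: the sample values below are made up for illustration, and the two alternative formulations (`abs()` and `between()`) are not from the original answer, just equivalent ways to express the same range test under the assumption that the range is symmetric and inclusive.

```python
import pandas as pd

# Hypothetical sample data; 'col' matches the column name used in the question.
df = pd.DataFrame({"col": [-0.5, -0.25, 0.0, 0.1, 0.25, 0.3]})

# Element-wise OR: keep rows strictly outside [-0.25, 0.25].
outside = df[(df["col"] < -0.25) | (df["col"] > 0.25)]

# Assumed-equivalent alternatives for this particular (symmetric) range:
outside_abs = df[df["col"].abs() > 0.25]
outside_between = df[~df["col"].between(-0.25, 0.25)]  # between() is inclusive by default

print(outside["col"].tolist())
```

All three selections should keep only the rows with -0.5 and 0.3, since the boundary values themselves lie inside the closed range.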
Just to add some more explanation to this statement:
The exception is thrown when you want to get the `bool` of a `pandas.Series`:
>>> import pandas as pd
>>> x = pd.Series([1])
>>> bool(x)
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
You hit a place where the operator implicitly converted the operands to `bool` (you used `or`, but it also happens for `and`, `if` and `while`):
>>> x or x
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> x and x
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> if x:
...     print('fun')
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> while x:
...     print('fun')
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Besides these four statements, there are several Python functions that hide some `bool` calls (like `any`, `all`, `filter`, …). These are normally not problematic with `pandas.Series`, but for completeness I wanted to mention these.
In your case, the exception isn’t really helpful, because it doesn’t mention the right alternatives. For `and` and `or`, if you want element-wise comparisons, you can use:
- `numpy.logical_or`:
  >>> import numpy as np
  >>> np.logical_or(x, y)
  or simply the `|` operator:
  >>> x | y
- `numpy.logical_and`:
  >>> np.logical_and(x, y)
  or simply the `&` operator:
  >>> x & y
If you’re using the operators, then be sure to set your parentheses correctly because of operator precedence.
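To illustrate why the parentheses matter, here is a small sketch with a made-up integer Series. `&` and `|` bind more tightly than comparison operators, so without parentheses Python parses the expression as a chained comparison, which again needs the truth value of a Series.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])  # hypothetical data

# With parentheses: each comparison is evaluated first, then combined element-wise.
ok = (s > 1) & (s < 4)

# Without parentheses, this is read as the chained comparison
#   s > (1 & s) < 4
# and chaining implicitly calls bool() on an intermediate Series.
try:
    bad = s > 1 & s < 4
    raised = False
except (TypeError, ValueError):
    raised = True

print(ok.tolist(), raised)
```

The exact exception type can depend on the dtypes involved, which is why the sketch catches both `TypeError` and `ValueError`.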
There are several logical NumPy functions which should work on `pandas.Series`.
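A short sketch of a few of these NumPy functions applied to Boolean Series (the data is made up for illustration). For Boolean data, the operators `|`, `&`, `^` and `~` should give the same element-wise results.

```python
import numpy as np
import pandas as pd

# Hypothetical Boolean Series covering all four input combinations.
x = pd.Series([True, True, False, False])
y = pd.Series([True, False, True, False])

either = np.logical_or(x, y)        # same as x | y
both = np.logical_and(x, y)         # same as x & y
exactly_one = np.logical_xor(x, y)  # same as x ^ y
not_x = np.logical_not(x)           # same as ~x

print(either.tolist(), both.tolist(), exactly_one.tolist(), not_x.tolist())
```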
The alternatives mentioned in the exception are better suited if you encountered it in an `if` or `while` statement. I’ll briefly explain each of these:
If you want to check if your Series is empty:
>>> x = pd.Series([])
>>> x.empty
True
>>> x = pd.Series([1])
>>> x.empty
False
Python normally interprets the length of containers (like `list`, `tuple`, …) as their truth value if they have no explicit Boolean interpretation. So if you want the Python-like check, you could use `if x.size` or `if not x.empty` instead of `if x`.
If your Series contains one and only one Boolean value:
>>> x = pd.Series([100])
>>> (x > 50).bool()
True
>>> (x < 50).bool()
False
If you want to check the first and only item of your Series (like `.bool()`, but it works even for non-Boolean contents):
>>> x = pd.Series([100])
>>> x.item()
100
If you want to check if all or any item is not-zero, not-empty or not-False:
>>> x = pd.Series([0, 1, 2])
>>> x.all()  # because one element is zero
False
>>> x.any()  # because one (or more) elements are non-zero
True
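Putting these together, the following sketch shows how an ambiguous `if x:` might be replaced by an explicit test. The `describe` helper is hypothetical and exists only for illustration; which test is appropriate depends on what "truthy" should mean for your data.

```python
import pandas as pd

def describe(s: pd.Series) -> str:
    """Hypothetical helper: pick an explicit truth test instead of `if s:`."""
    if s.empty:      # no elements at all
        return "empty"
    if s.all():      # every element is truthy
        return "all truthy"
    if s.any():      # at least one element is truthy
        return "some truthy"
    return "none truthy"

print(describe(pd.Series([], dtype=float)))
print(describe(pd.Series([1, 2])))
print(describe(pd.Series([0, 1, 2])))
print(describe(pd.Series([0, 0])))
```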