I have some code that has been running as a cron job (via Kubernetes) for several months now.
Yesterday, part of my code that normally works stopped working.
This statement, all of a sudden, wasn’t true (both df_temp and df_temp4 have data in them):
if ( len(df_temp > 0) & len(df_temp4 > 0)): print "HERE"
however, this worked:
if ( len(df_temp > 0) and len(df_temp4 > 0)): print "HERE"
Was there some sort of code push that would cause this change? Since I’ve run this code for months, I’m not sure what would cause this statement to fail all of a sudden.
Answer
The len(df_temp > 0) and len(df_temp4 > 0) probably don’t do what you expect. Comparison operators on pandas DataFrames return element-wise results: they create a boolean DataFrame in which each value indicates whether the corresponding value in the original DataFrame is greater than zero:
>>> import pandas as pd
>>> df = pd.DataFrame({'a': [-1,0,1], 'b': [-1,0,1]})
>>> df
   a  b
0 -1 -1
1  0  0
2  1  1
>>> df > 0
       a      b
0  False  False
1  False  False
2   True   True
So the len of df is the same as the len of df > 0:
>>> len(df)
3
>>> len(df > 0)
3
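If the goal was to check that both DataFrames contain rows, the comparison probably belongs outside the len call. A minimal sketch, assuming that intent; the two small DataFrames are hypothetical stand-ins for df_temp and df_temp4 from the question:

```python
import pandas as pd

# Hypothetical stand-ins for the question's df_temp and df_temp4.
df_temp = pd.DataFrame({'a': [-1, 0, 1]})
df_temp4 = pd.DataFrame({'a': [5, 6]})

# len(df > 0) counts the rows of the boolean frame, so it equals len(df).
same_length = len(df_temp > 0) == len(df_temp)

# Probably what was intended: compare the row count itself to zero.
both_have_rows = len(df_temp) > 0 and len(df_temp4) > 0

# An equivalent check using the DataFrame.empty attribute.
both_non_empty = not df_temp.empty and not df_temp4.empty

print(same_length, both_have_rows, both_non_empty)
```

With the comparison outside len, each operand is a plain bool, so "and" behaves as expected.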
Difference between “&” and “and”
They mean different things:
- & is bitwise and
- and is logical and (and short-circuiting)
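On plain integers, such as the values len returns, the two operators can disagree even when both operands are non-zero. This could explain a check that passes for a long time and then stops once the row counts change, though that is only a guess without the actual data. A small sketch with hypothetical row counts:

```python
# Hypothetical row counts standing in for len(df_temp) and len(df_temp4).
n_rows_a = 2   # binary 010
n_rows_b = 4   # binary 100

# Bitwise and: the two numbers share no set bits, so the result is 0.
bitwise_result = n_rows_a & n_rows_b

# Logical and: both operands are truthy, so the last operand is returned.
logical_result = n_rows_a and n_rows_b

print(bitwise_result, bool(bitwise_result))  # 0 False
print(logical_result, bool(logical_result))  # 4 True
```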
Since you asked specifically about pandas (assuming at least one operand is a NumPy array, pandas Series, or pandas DataFrame):
- & also refers to the element-wise “bitwise and”.
- The element-wise “logical and” for pandas isn’t and; one has to use a function instead, i.e. numpy.logical_and.
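To illustrate the pandas case, here is a short sketch using two hypothetical boolean Series (the names left and right are made up for this example). For boolean input, & and numpy.logical_and give the same element-wise result, while plain and raises because a multi-element Series has no single truth value:

```python
import numpy as np
import pandas as pd

# Hypothetical boolean Series for illustration.
left = pd.Series([True, True, False])
right = pd.Series([True, False, False])

# Element-wise "bitwise and" via the & operator.
elementwise_bitwise = left & right

# Element-wise "logical and" via the NumPy function.
elementwise_logical = np.logical_and(left, right)

# Plain `and` asks for the truth value of the whole Series, which is ambiguous.
try:
    left and right
    raised = False
except ValueError:
    raised = True

print(elementwise_bitwise.tolist())
print(np.asarray(elementwise_logical).tolist())
print(raised)
```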
For more explanation you can refer to “Difference between ‘and’ (boolean) vs. ‘&’ (bitwise) in python. Why difference in behavior with lists vs numpy arrays?”
“not sure what would cause this statement to fail all of a sudden.”
You did not describe the failure or the expected behavior, so unfortunately I cannot help you there.