difference between “&” and “and” in pandas

I have some code that has been running on a cron (via Kubernetes) for several months now.

Yesterday, part of my code that normally works didn’t:

This statement, all of a sudden, wasn’t ‘True’ (both df_temp and df_temp4 have data in them):

if ( len(df_temp > 0) & len(df_temp4 > 0)):
    print "HERE"

however, this worked:

if ( len(df_temp > 0) and len(df_temp4 > 0)):
    print "HERE"

Was there some sort of code push that would cause this change? Since I’ve run this code for months, not sure what would cause this statement to fail all of a sudden.

Answer

The expressions len(df_temp > 0) and len(df_temp4 > 0) probably don’t do what you expect. Comparison operators on pandas DataFrames return element-wise results. That is, df > 0 creates a boolean DataFrame in which each value indicates whether the corresponding value in the original DataFrame is greater than zero:

>>> import pandas as pd
>>> df = pd.DataFrame({'a': [-1,0,1], 'b': [-1,0,1]})
>>> df
   a  b
0 -1 -1
1  0  0
2  1  1
>>> df > 0
       a      b
0  False  False
1  False  False
2   True   True

So the len of df is the same as the len of df > 0:

>>> len(df)
3
>>> len(df > 0)
3

difference between “&” and “and”

They mean different things:

  • & is the “bitwise and”.
  • and is the “logical and”, which short-circuits.
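
For plain Python integers, such as the values returned by len, the two operators can disagree even when both operands are non-zero. A small sketch with hypothetical row counts (the question does not state the real ones):

```python
# Hypothetical row counts; the question does not state the real ones.
n_rows_a = 2   # binary 010
n_rows_b = 4   # binary 100

bitwise = n_rows_a & n_rows_b      # compares the bit patterns
logical = n_rows_a and n_rows_b    # returns the last operand evaluated

print(bitwise)        # 0
print(bool(bitwise))  # False
print(logical)        # 4
print(bool(logical))  # True
```

If the row counts happened to change to values with no bits in common, the & version would start evaluating as false even though both DataFrames hold data. That is only a guess, since the actual counts are not given.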

Since you asked specifically about pandas (assuming at least one operand is a NumPy array, pandas Series, or pandas DataFrame):

  • & also refers to the element-wise “bitwise and”.
  • The element-wise “logical and” for pandas isn’t and; one has to use a function instead, e.g. numpy.logical_and.
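
The two points above can be illustrated with a short sketch. The Series names left and right are made up for the example:

```python
import numpy as np
import pandas as pd

# Made-up boolean Series for illustration.
left = pd.Series([True, True, False])
right = pd.Series([True, False, False])

# Element-wise "and" via the & operator.
print((left & right).tolist())               # [True, False, False]

# Element-wise logical "and" via NumPy.
print(np.logical_and(left, right).tolist())  # [True, False, False]

# Plain `and` asks for the truth value of a whole Series,
# which pandas refuses with a ValueError.
try:
    left and right
except ValueError as exc:
    print("ValueError:", exc)
```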

For more explanation you can refer to “Difference between ‘and’ (boolean) vs. ‘&’ (bitwise) in python. Why difference in behavior with lists vs numpy arrays?”

not sure what would cause this statement to fail all of a sudden.

You provided neither the failure itself nor the expected behavior, so unfortunately I cannot help you there.

User contributions licensed under: CC BY-SA