Skip to content
Advertisement

pandas: filter rows of DataFrame with operator chaining

Most operations in pandas can be accomplished with operator chaining (groupby, aggregate, apply, etc), but the only way I’ve found to filter rows is via normal bracket indexing

df_filtered = df[df['column'] == value]

This is unappealing as it requires I assign df to a variable before being able to filter on its values. Is there something more like the following?

df_filtered = df.mask(lambda x: x['column'] == value)

Advertisement

Answer

I’m not entirely sure what you want, and your last line of code does not help either, but anyway:

“Chained” filtering is done by “chaining” the criteria in the boolean index.

In [96]: df
Out[96]:
   A  B  C  D
a  1  4  9  1
b  4  5  0  2
c  5  5  1  0
d  1  3  9  6

In [99]: df[(df.A == 1) & (df.D == 6)]
Out[99]:
   A  B  C  D
d  1  3  9  6

If you want to chain methods, you can add your own mask method and use that one.

In [90]: def mask(df, key, value):
   ....:     return df[df[key] == value]
   ....:

In [92]: pandas.DataFrame.mask = mask

In [93]: df = pandas.DataFrame(np.random.randint(0, 10, (4,4)), index=list('abcd'), columns=list('ABCD'))

In [95]: df.ix['d','A'] = df.ix['a', 'A']

In [96]: df
Out[96]:
   A  B  C  D
a  1  4  9  1
b  4  5  0  2
c  5  5  1  0
d  1  3  9  6

In [97]: df.mask('A', 1)
Out[97]:
   A  B  C  D
a  1  4  9  1
d  1  3  9  6

In [98]: df.mask('A', 1).mask('D', 6)
Out[98]:
   A  B  C  D
d  1  3  9  6
Advertisement