Skip to content
Advertisement

Lambda function apply on dataframe DOES not contain str

I’m trying to add a condition .apply with a lambda function that pickups up a keyword but ignores values if they contain a specific word “total.”

My dataframe:

Section Budget      Hours   Budget  Hours       Total Budget
1       70460.04    702.08  0       0           81581.02
2       137149.03   1141.3  0       0           159823.28
3       33958.35    227.05  0       0           55243.23
4       160688.52   1062    0       0           176156.52
5       1340843.5   12311.650       0           1367133.42
6       73295.25    480     0       0           73295.25
7       0           0       0       0           0
8       1131911.04  6537.96 0       0           1131911.04
9       2216733.4   16425.640       0           3708886.3

Here’s my code:

keyword = budget
row, col = np.where((headers_df.apply(lambda x: x.astype(str).str.contains(keyword, case=False)))
                    & (~headers_df.apply(lambda x: x.astype(str).str.contains('total', case=False))))

current col output:

col = (1, 3, 5)

Desired col output:

col = (1, 3)

Question

What am I doing wrong? I have tried the ~ to make it a NOT but it’s still picking up rows and columns even if they contain the keyword “total.”

Advertisement

Answer

You can use regex like this:

df.columns[df.columns.str.contains('(?=.*Budget)^(?!Total)', regex=True)]

Output:

Index(['Section Budget', 'Budget'], dtype='object')

OR

df.columns[df.columns.str.contains('budget', case=False) & 
           ~df.columns.str.contains('total', case=False)]

Output:

Index(['Section Budget', 'Budget'], dtype='object')
User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement