I'm trying to add a condition .apply with a lambda function that pickups up a keyword but ignores values if they contain a specific word "total." My dataframe: Here's my code: current col output: Desired col output: Question What am I doing wrong? I have tried the ~ to make it a NOT but it's still picking up rows and

You can use regex like this:df.columns[df.columns.str.contains('(?=.*Budget)^(?!Total)', regex=True)]Output:Index(['Section Budget', 'Budget'], dtype='object')ORdf.columns[df.columns.str.contains('budget', case=False) & ~df.columns.str.contains('total', case=False)]Output:Index(['Section Budget', 'Budget'], dtype='object')

I’m trying to add a condition .apply with a lambda function that pickups up a keyword but ignores values if they contain a specific word “total.”

My dataframe:

Section Budget      Hours   Budget  Hours       Total Budget
1       70460.04    702.08  0       0           81581.02
2       137149.03   1141.3  0       0           159823.28
3       33958.35    227.05  0       0           55243.23
4       160688.52   1062    0       0           176156.52
5       1340843.5   12311.650       0           1367133.42
6       73295.25    480     0       0           73295.25
7       0           0       0       0           0
8       1131911.04  6537.96 0       0           1131911.04
9       2216733.4   16425.640       0           3708886.3

JavaScript
​x
 
Section Budget      Hours   Budget  Hours       Total Budget
1       70460.04    702.08  0       0           81581.02
2       137149.03   1141.3  0       0           159823.28
3       33958.35    227.05  0       0           55243.23
4       160688.52   1062    0       0           176156.52
5       1340843.5   12311.650       0           1367133.42
6       73295.25    480     0       0           73295.25
7       0           0       0       0           0
8       1131911.04  6537.96 0       0           1131911.04
9       2216733.4   16425.640       0           3708886.3
​

Here’s my code:

keyword = budget
row, col = np.where((headers_df.apply(lambda x: x.astype(str).str.contains(keyword, case=False)))
                    & (~headers_df.apply(lambda x: x.astype(str).str.contains('total', case=False))))

JavaScript
 
keyword = budget
row, col = np.where((headers_df.apply(lambda x: x.astype(str).str.contains(keyword, case=False)))
                    & (~headers_df.apply(lambda x: x.astype(str).str.contains('total', case=False))))
​

current col output:

col = (1, 3, 5)

JavaScript
 
col = (1, 3, 5)
​

Desired col output:

col = (1, 3)

JavaScript
 
col = (1, 3)
​

Question

What am I doing wrong? I have tried the ~ to make it a NOT but it’s still picking up rows and columns even if they contain the keyword “total.”

Answer

You can use regex like this:

df.columns[df.columns.str.contains('(?=.*Budget)^(?!Total)', regex=True)]

JavaScript
 
df.columns[df.columns.str.contains('(?=.*Budget)^(?!Total)', regex=True)]
​

Output:

Index(['Section Budget', 'Budget'], dtype='object')

JavaScript
 
Index(['Section Budget', 'Budget'], dtype='object')
​

df.columns[df.columns.str.contains('budget', case=False) & 
           ~df.columns.str.contains('total', case=False)]

JavaScript
 
df.columns[df.columns.str.contains('budget', case=False) & 
           ~df.columns.str.contains('total', case=False)]
​

Output:

Index(['Section Budget', 'Budget'], dtype='object')

JavaScript
 
Index(['Section Budget', 'Budget'], dtype='object')
​

Lambda function apply on dataframe DOES not contain str

Question

Advertisement

Answer