I’m trying to add a condition .apply
with a lambda function that pickups up a keyword but ignores values if they contain a specific word “total.”
My dataframe:
JavaScript
x
11
11
1
Section Budget Hours Budget Hours Total Budget
2
1 70460.04 702.08 0 0 81581.02
3
2 137149.03 1141.3 0 0 159823.28
4
3 33958.35 227.05 0 0 55243.23
5
4 160688.52 1062 0 0 176156.52
6
5 1340843.5 12311.650 0 1367133.42
7
6 73295.25 480 0 0 73295.25
8
7 0 0 0 0 0
9
8 1131911.04 6537.96 0 0 1131911.04
10
9 2216733.4 16425.640 0 3708886.3
11
Here’s my code:
JavaScript
1
4
1
keyword = budget
2
row, col = np.where((headers_df.apply(lambda x: x.astype(str).str.contains(keyword, case=False)))
3
& (~headers_df.apply(lambda x: x.astype(str).str.contains('total', case=False))))
4
current col
output:
JavaScript
1
2
1
col = (1, 3, 5)
2
Desired col
output:
JavaScript
1
2
1
col = (1, 3)
2
Question
What am I doing wrong? I have tried the ~
to make it a NOT
but it’s still picking up rows and columns even if they contain the keyword “total.”
Advertisement
Answer
You can use regex like this:
JavaScript
1
2
1
df.columns[df.columns.str.contains('(?=.*Budget)^(?!Total)', regex=True)]
2
Output:
JavaScript
1
2
1
Index(['Section Budget', 'Budget'], dtype='object')
2
OR
JavaScript
1
3
1
df.columns[df.columns.str.contains('budget', case=False) &
2
~df.columns.str.contains('total', case=False)]
3
Output:
JavaScript
1
2
1
Index(['Section Budget', 'Budget'], dtype='object')
2