I’m trying to add a condition .apply
with a lambda function that pickups up a keyword but ignores values if they contain a specific word “total.”
My dataframe:
Section Budget Hours Budget Hours Total Budget 1 70460.04 702.08 0 0 81581.02 2 137149.03 1141.3 0 0 159823.28 3 33958.35 227.05 0 0 55243.23 4 160688.52 1062 0 0 176156.52 5 1340843.5 12311.650 0 1367133.42 6 73295.25 480 0 0 73295.25 7 0 0 0 0 0 8 1131911.04 6537.96 0 0 1131911.04 9 2216733.4 16425.640 0 3708886.3
Here’s my code:
keyword = budget row, col = np.where((headers_df.apply(lambda x: x.astype(str).str.contains(keyword, case=False))) & (~headers_df.apply(lambda x: x.astype(str).str.contains('total', case=False))))
current col
output:
col = (1, 3, 5)
Desired col
output:
col = (1, 3)
Question
What am I doing wrong? I have tried the ~
to make it a NOT
but it’s still picking up rows and columns even if they contain the keyword “total.”
Advertisement
Answer
You can use regex like this:
df.columns[df.columns.str.contains('(?=.*Budget)^(?!Total)', regex=True)]
Output:
Index(['Section Budget', 'Budget'], dtype='object')
OR
df.columns[df.columns.str.contains('budget', case=False) & ~df.columns.str.contains('total', case=False)]
Output:
Index(['Section Budget', 'Budget'], dtype='object')