I have the following list:
    pre = ["unable to", "would not", "was not", "did not", "there is not", "could not", "failed to"]
I want to search a DataFrame column for texts that contain any of the phrases in the list, and generate a new column holding the matched phrase along with the word that follows it. For example, if a cell contains the text WOULD NOT PRIME CORRECTLY DURING VIRECTOMY., the new column should contain WOULD NOT PRIME.
I have tried something like this:

    def matcher(Event_Description):
        for i in pre:
            if i in Event_Description:
                return i + 1
        return "Not found"
Answer
You can loop over every prefix in the list and look for it in the text with .find(). If it is found, change the prefix to match the case of event and append the next word, like this:
    def matcher(event):
        pres = ["unable to", "would not", "was not", "did not", "there is not", "could not", "failed to"]
        for pre in pres:
            # Case-insensitive search: look for the lower-case prefix in the lower-cased text
            i = event.lower().find(pre)
            if i != -1:
                # Match the prefix to the event's case, then append the single next word
                return ' '.join([pre.upper() if event.isupper() else pre, event[i + len(pre) + 1:].split(' ')[0]])
        return "Not found"
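To make the indexing in the return line easier to follow, here is the same computation unrolled step by step for the example sentence from the question. The intermediate names (rest, next_word, prefix, result) are only illustrative and are not part of the function above.

```python
event = "WOULD NOT PRIME CORRECTLY DURING VIRECTOMY."
pre = "would not"

# Position of the prefix in the lower-cased text (0 here, since the text starts with it)
i = event.lower().find(pre)

# Everything after the prefix and the single space that follows it
rest = event[i + len(pre) + 1:]

# The first space-separated token of the remainder is the "next word"
next_word = rest.split(' ')[0]

# The event is entirely upper case, so the prefix is upper-cased to match
prefix = pre.upper() if event.isupper() else pre

result = ' '.join([prefix, next_word])
print(result)  # WOULD NOT PRIME
```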
If you want to include the next two words, just change this line:
    return ' '.join([pre.upper() if event.isupper() else pre, event[i + len(pre) + 1:].split(' ')[0]])
to a slice like this:
    return ' '.join([pre.upper() if event.isupper() else pre, *event[i + len(pre) + 1:].split(' ')[0:2]])
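If the number of following words needs to vary, one alternative is a regular expression instead of .find() and slicing. The sketch below is a different technique from the answer above, not a drop-in copy of it: the names match_prefix, PREFIXES and n_words are hypothetical, it returns the text exactly as it appears in the event (so no case conversion is needed), and it picks the match that occurs earliest in the text rather than the first prefix in list order.

```python
import re

PREFIXES = ["unable to", "would not", "was not", "did not",
            "there is not", "could not", "failed to"]

def match_prefix(event, n_words=1, prefixes=PREFIXES):
    """Return the first matching prefix plus up to n_words following words."""
    # Build an alternation of the (escaped) prefixes.
    alternation = "|".join(re.escape(p) for p in prefixes)
    # A prefix at a word boundary, then up to n_words whitespace-separated tokens.
    pattern = rf"\b(?:{alternation})(?:\s+\S+){{0,{n_words}}}"
    found = re.search(pattern, event, flags=re.IGNORECASE)
    return found.group(0) if found else "Not found"

print(match_prefix("WOULD NOT PRIME CORRECTLY DURING VIRECTOMY."))
print(match_prefix("WOULD NOT PRIME CORRECTLY DURING VIRECTOMY.", n_words=2))
```

Assuming the texts live in a pandas column named Event_Description (the name is taken from the question's function parameter), the new column could then be filled with something like df["New_Column"] = df["Event_Description"].apply(match_prefix).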