Python regex to match many tokens in sequence

Question

I have a test string that looks like this:

    These are my food preferences mango and I also like bananas and I like grapes too.

I am trying to write a regex in Python that returns the text according to these rules: search for the keyword preferences, then make a group (words 1:7) until the word 'like' >> Repeat this step as much

Accepted Answer

You can use

    (?P<Start>\bpreferences\b)(?P<Mid>(?:\s+\w+(?:\s+\w+){0,6}?\s+like)+)(?:\s+(?P<Last>\w+(?:\s+\w+){1,7}))?

See the regex demo.

Details:

- (?P<Start>\bpreferences\b) – Group “Start”: a whole word preferences
- (?P<Mid>(?:\s+\w+(?:\s+\w+){0,6}?\s+like)+) – Group “Mid”: one or more repetitions of
  - \s+ – one or more whitespaces
  - \w+(?:\s+\w+){0,6}? – one or more word chars and then zero to six occurrences of one or more whitespaces and then one or more word chars, as few as possible
  - \s+like – one or more whitespaces and then the word like
- (?:\s+(?P<Last>\w+(?:\s+\w+){1,7}))? – an optional occurrence of
  - \s+ – one or more whitespaces
  - (?P<Last>\w+(?:\s+\w+){1,7}) – Group “Last”: one or more word chars and then one to seven occurrences of one or more whitespaces and one or more word chars

See the Python demo:

    import re

    text = "These are my food preferences mango and I also like bananas and I like grapes too."
    pattern = r"(?P<Start>\bpreferences\b)(?P<Mid>(?:\s+\w+(?:\s+\w+){0,6}?\s+like)+)(?:\s+(?P<Last>\w+(?:\s+\w+){1,7}))?"
    match = re.search(pattern, text)
    if match:
        print(match.group("Start"))
        print(re.split(r"\s*\blike\b\s*", match.group("Mid").strip()))
        print(match.group("Last"))

Output:

    preferences
    ['mango and I also', 'bananas and I', '']
    grapes too
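The breakdown above can also be kept next to the pattern itself. The sketch below is one possible layout, not part of the original answer: it writes the same pattern with re.VERBOSE so each piece carries its own comment, and wraps the search in a helper. The names PATTERN and extract_groups are hypothetical, and dropping the empty string that re.split leaves after the final "like" is an assumption about what the caller wants.

```python
import re

# Same pattern as in the answer, spread out with re.VERBOSE.
# In verbose mode, whitespace inside the pattern is ignored and "#" starts a comment.
PATTERN = re.compile(
    r"""
    (?P<Start>\bpreferences\b)        # the whole word "preferences"
    (?P<Mid>                          # one or more chunks that each end in "like"
        (?:
            \s+\w+                    # first word of the chunk
            (?:\s+\w+){0,6}?          # up to six more words, as few as possible
            \s+like                   # the chunk ends at the word "like"
        )+
    )
    (?:                               # optional tail after the last "like"
        \s+
        (?P<Last>\w+(?:\s+\w+){1,7})  # two to eight trailing words
    )?
    """,
    re.VERBOSE,
)


def extract_groups(text):
    """Return (start, chunks, last), or None when the pattern does not match.

    Hypothetical helper: `chunks` holds the word groups found before each
    "like"; `last` is None when nothing usable follows the final "like".
    """
    match = PATTERN.search(text)
    if match is None:
        return None
    pieces = re.split(r"\s*\blike\b\s*", match.group("Mid").strip())
    # Assumption: the empty piece after the final "like" is not wanted.
    chunks = [piece for piece in pieces if piece]
    return match.group("Start"), chunks, match.group("Last")


text = "These are my food preferences mango and I also like bananas and I like grapes too."
print(extract_groups(text))
# ('preferences', ['mango and I also', 'bananas and I'], 'grapes too')
```

Compiling the pattern once is only a convenience for repeated calls; re.search with the one-line pattern behaves the same way.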

Answer