Split a string by regex and keep the separator AS A PART OF ITEMS in Python

Question

I want to split a WhatsApp chat backup text by date and keep the date as part of each message. I tried, but couldn't get the exact result I want. If anyone can suggest a way to achieve this, that would be a big help. (I don't know much about regex.) The above code does the job and keep the

Accepted Answer

That happened because you used re.split with a capturing group: re.split keeps the captured chunks in the resulting list as separate items.

Your regex makes sense only if your matches can span several lines; otherwise, extracting every line that starts with a date-like pattern would be enough.

That is why I'd suggest

    regex = r"\b\d+/\d+/\d.*?(?=\s*\b\d+/\d+/\d+|$)"
    results = re.findall(regex, chat, re.S)

See the Python demo:

    import re

    chat = '''27/01/2019, 08:58 - Member 01 created group "Python Lovers ❤️"
    27/01/2019, 08:58 - You were added
    19/03/2019, 19:29 - Member 02: Hello guys,,,
    19/03/2019, 19:29 - Member 03: Hi there..'''

    regex = r"\b\d+/\d+/\d.*?(?=\s*\b\d+/\d+/\d+|$)"
    results = re.findall(regex, chat, re.S)
    for r in results:
        print(r)

Output:

    27/01/2019, 08:58 - Member 01 created group "Python Lovers ❤️"
    27/01/2019, 08:58 - You were added
    19/03/2019, 19:29 - Member 02: Hello guys,,,
    19/03/2019, 19:29 - Member 03: Hi there..

Note the absence of the redundant capturing group, and that there is no * after the positive lookahead (which made it optional). Whitespace at the end of each match is left out by the \s* pattern inside the lookahead.

The re.S flag allows . to match any character, including line break characters.
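To illustrate the re.split behaviour described in this answer, here is a minimal sketch. The sample chat and the names DATE, parts, messages and found are made up for illustration and are not from the original question. It shows that a capturing group makes re.split return the dates as separate items, one way to glue them back onto the text that follows, and that the lookahead-based re.findall gives the same messages directly.

```python
import re

# Hypothetical sample data; the second message spans two lines.
chat = (
    "27/01/2019, 08:58 - You were added\n"
    "19/03/2019, 19:29 - Member 02: Hello guys,,,\n"
    "second line of the same message\n"
    "19/03/2019, 19:29 - Member 03: Hi there.."
)

DATE = r"\b\d+/\d+/\d+"

# With a capturing group, re.split keeps each separator as its own list item.
parts = re.split("(" + DATE + ")", chat)
print(parts[:3])  # ['', '27/01/2019', ', 08:58 - You were added\n']

# Re-join each date (odd indexes) with the text that follows it (even indexes).
messages = [(date + body).strip() for date, body in zip(parts[1::2], parts[2::2])]

# The lookahead-based findall returns whole messages without any re-joining.
found = re.findall(r"\b\d+/\d+/\d.*?(?=\s*\b\d+/\d+/\d+|$)", chat, re.S)

for message in found:
    print(repr(message))
```

Both approaches treat any date-like token as the start of a new message, so a date quoted inside a message body would also split it. Anchoring the pattern to the start of a line would be one way to tighten that, if the backup format allows it.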

Answer