Skip to content
Advertisement

How to make my regex match stop after a lookahead?

I have some text from a pdf in one string, I want to break it up so that I have a list where every string starts with a digit and a period, and then stops before the next number.

For example I want to turn this:

JavaScript

Into this:

JavaScript

The issue is that the original string has ‘n’ scattered in the middle of the titles (for example in 4.1 theres a n before the word encumbrances.

JavaScript

This is the regex I’ve been trying to use but it matches the whole string instead of each number line. Is there any way for my regex to stop the match right before the next number line?

Advertisement

Answer

Something like:

list = re.findall(r"^d+..*?(?=^d+.|Z)", text, re.MULTILINE | re.DOTALL)

Further explanation on request.

User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement