
Regular Expression split w/ Lookbehind loses second half

I have a string that contains a number of keywords. I would like to split the string into a list of segments, one per keyword, while keeping the keywords themselves, because each keyword identifies what the data following it means.

Take the following string for example:

JavaScript

The important keywords are “ttyp”, “pfil”, “tsng”, “tart”. I would like to split the string so the output looks like this:

JavaScript

I’ve been researching regular expressions and thought this expression would work, but when I test it in Python, I lose the part that I want to keep. According to the Python re.split documentation, this should work.

Check out my regex on regex101: https://regex101.com/r/FOlgv8/1

Note: I’m trying to get the first part to work. Then I’ll add the rest of the keywords using |.

JavaScript

This is my example code:

JavaScript

Console Output:

JavaScript

I’ve tried positive lookahead and positive lookbehind with no luck. I could just split on a literal ‘ttyp’, but then I lose the keyword.

Any help would be appreciated, I’ve been researching, trial and erroring (mostly erroring) for hours now.


Answer

Here ya go:

JavaScript

The reason yours didn’t work is that you split on .*, so everything after the keyword is matched as part of the separator itself, and re.split discards separators.
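To make the difference concrete, here is a rough sketch of both behaviours. The sample string and the pattern `(?<=ttyp).*` are assumptions made up for illustration (only the four keywords come from the question), and `split_keep_keywords` is a hypothetical helper name. The idea is to split on a zero-width lookahead, which matches the position just before a keyword without consuming any characters, so nothing gets thrown away.

```python
import re

# Keywords are from the question; the payloads (AAA, BBB, ...) are made up.
KEYWORDS = ["ttyp", "pfil", "tsng", "tart"]
sample = "ttypAAApfilBBBtsngCCCtartDDD"


def split_keep_keywords(text, keywords=KEYWORDS):
    """Split text in front of each keyword, keeping the keyword with its data."""
    # Zero-width lookahead: matches the position before a keyword, consumes nothing.
    pattern = "(?=" + "|".join(re.escape(k) for k in keywords) + ")"
    # Splitting on empty matches needs Python 3.7+. A keyword at the very start
    # yields an empty first piece, so empty pieces are filtered out.
    return [part for part in re.split(pattern, text) if part]


# Assumed stand-in for the failing approach: .* is consumed as the separator,
# so everything after "ttyp" is discarded by re.split.
lossy = re.split(r"(?<=ttyp).*", sample)

kept = split_keep_keywords(sample)
print(lossy)
print(kept)
```

Joining the pieces in `kept` gives back the original string, which is a quick way to check that the split did not drop anything.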

User contributions licensed under: CC BY-SA