
Split string with a certain keyword outside a string but not inside a string

I have a question about how to use regex for the following case (a solution in plain Python is fine too):

What I want to achieve is to split on the colon ‘:’ when it appears outside a quoted string, but not when it is inside one, as in the example below:

Regex I use: (?!\B"[^"]*):(?![^"]*"\B)

string_to_split: str = '"A: String 1": "B: String 2": C: "D: String 4"'

Output > ["A: String 1", "B: String 2", 'C', "D: String 4"]

That gives what I expected, but it stops working if a quoted string starts with anything other than a letter or a number (the regex does not split when the string begins with a symbol, a space, etc.), as in this one:

string_to_split: str = '"A: String 1": "B: String 2": C: " D: String 4"' (space before letter ‘D’)

Output > ["A: String 1", "B: String 2": C: " D: String 4"]

I am asking because I want to get more comfortable with regex in Python (I barely use regex when coding). I think it might need a look-ahead or look-behind, but I don't know much about them. I would really appreciate any solution, thank you.

Answer

Would you please try the following:

import re

# A quoted segment (colons allowed inside the quotes) OR a plain run of non-colon characters
pat = r'(?:[^:]*"[^"]+"[^:]*)|[^:]+'
text = '"A: String 1": "B: String 2": C: " D: String 4"'

m = [x.strip() for x in re.findall(pat, text)]
#m = [x.strip('" ') for x in re.findall(pat, text)]      # removes double quotes too
print(m)

Output:

['"A: String 1"', '"B: String 2"', 'C', '" D: String 4"']
  • The regex pat matches any sequence of characters other than a colon, while allowing colons inside double quotes.
  • The matches keep their leading/trailing whitespace, which is then removed by strip().
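To make the two points above easier to follow, here is the same pattern written with re.VERBOSE so each part can carry a comment. This is only an annotated sketch of the pattern from the answer; the names PAT_VERBOSE and raw_pieces are introduced here for illustration. It prints the matches before strip() is applied, so the leftover whitespace is visible.

```python
import re

# Same pattern as in the answer, spread out with re.VERBOSE
# (whitespace and "#" comments inside the pattern are ignored).
PAT_VERBOSE = re.compile(r'''
    (?: [^:]* " [^"]+ " [^:]* )   # a quoted segment; colons are allowed between the quotes
    |                             # or, if there is no quoted segment here,
    [^:]+                         # a plain run of characters other than a colon
''', re.VERBOSE)

text = '"A: String 1": "B: String 2": C: " D: String 4"'

# Matches before strip(): note the leading spaces on all but the first item.
raw_pieces = PAT_VERBOSE.findall(text)
print(raw_pieces)

# After strip(), the result matches the output shown in the answer.
print([piece.strip() for piece in raw_pieces])
```

Note that [^"]+ requires at least one character between the quotes, so an empty quoted string ("") is not treated as a quoted segment by this pattern.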

If you want to remove the surrounding double quotes as well, apply strip('" ') instead. Then the output will be:

['A: String 1', 'B: String 2', 'C', 'D: String 4']
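Since the question also allows a non-regex solution, the same split can be sketched with a small character scanner that tracks whether it is currently inside double quotes. This is a different technique from the regex above, not part of the original answer; the function name split_outside_quotes and its parameters are hypothetical, and it assumes quotes are balanced and never escaped.

```python
def split_outside_quotes(text, sep=":", quote='"'):
    """Split text on sep, ignoring any sep found between quote characters.

    Assumes quotes are balanced and not escaped (an assumption for this sketch).
    """
    parts = []
    current = []
    in_quotes = False
    for ch in text:
        if ch == quote:
            # Toggle the state and keep the quote character in the piece.
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == sep and not in_quotes:
            # A separator outside quotes ends the current piece.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    # Whatever remains after the last separator is the final piece.
    parts.append("".join(current).strip())
    return parts


text = '"A: String 1": "B: String 2": C: " D: String 4"'
pieces = split_outside_quotes(text)
print(pieces)

# Stripping quotes and spaces afterwards, as with strip('" ') in the answer.
print([p.strip('" ') for p in pieces])
```

Unlike the regex, this version keeps empty pieces (for example between two adjacent colons), which may or may not be what you want.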
User contributions licensed under: CC BY-SA