I would like to split a string like this:
str = "$$Node_Type<block=begin>Blabla$$Node_Type<block=end>"
into something like this:
tab = ["$$Node_Type<block=begin>", "Blabla", "$$Node_Type<block=end>"]
But I can also have nested blocks like this:
str = "$$Node_Type1<block=begin>Blabla1$$Node_Type2<block=begin>Blabla2$$Node_Type2<block=end>$$Node_Type1<block=end>"
which should become:
tab = ["$$Node_Type1<block=begin>", "Blabla1", "$$Node_Type2<block=begin>", "Blabla2", "$$Node_Type2<block=end>", "$$Node_Type1<block=end>"]
The goal in the end is to print it like this, indented by nesting level:
$$Node_Type1<block=begin>
 Blabla1
 $$Node_Type2<block=begin>
  Blabla2
 $$Node_Type2<block=end>
$$Node_Type1<block=end>
Does anyone have an idea? Thanks.
Answer
You can take advantage of the fact that re.split keeps the separator in the results when the pattern is wrapped in a capturing group, and then track the nesting level as you iterate over the pieces:
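import re

example = "Hello$$Node_Type1<block=begin>Blabla1$$Node_Type2<block=begin>Blabla2$$Node_Type2<block=end>$$Node_Type1<block=end>"
level = 0
# "$" is a regex metacharacter, so it must be escaped to match a literal "$$".
for bit in re.split(r'(\$\$[^>]+>)', example):
    # Dedent before printing a closing tag.
    if bit.startswith('$$') and bit.endswith('block=end>'):
        level -= 1
    # re.split yields empty strings between adjacent tags; skip them.
    if bit:
        print(' ' * level + bit)
    # Indent everything that follows an opening tag.
    if bit.startswith('$$') and bit.endswith('block=begin>'):
        level += 1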
This prints out:
Hello
$$Node_Type1<block=begin>
 Blabla1
 $$Node_Type2<block=begin>
  Blabla2
 $$Node_Type2<block=end>
$$Node_Type1<block=end>
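If only the flat list from the question is needed, the same capturing-group split can be reused. The following is a minimal sketch: the helper name split_blocks is hypothetical (it is not part of the answer above), and the pattern assumes every tag starts with "$$" and ends at the first ">" that follows.

```python
import re

# Assumption: a tag is "$$", then anything up to and including the next ">".
TAG = re.compile(r'(\$\$[^>]+>)')


def split_blocks(text):
    """Split text into tags and the text between them (hypothetical helper)."""
    # The capturing group makes split() keep the tags in the result.
    # Empty strings show up between adjacent tags and at the ends, so drop them.
    return [bit for bit in TAG.split(text) if bit]


s = ("$$Node_Type1<block=begin>Blabla1$$Node_Type2<block=begin>Blabla2"
     "$$Node_Type2<block=end>$$Node_Type1<block=end>")
tab = split_blocks(s)
print(tab)
```

Filtering with "if bit" also drops the leading and trailing empty strings that appear when the input starts or ends with a tag.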