I would like to split a string like this
str = "$$Node_Type<block=begin>Blabla$$Node_Type<block=end>"
to something like this:
tab = ["$$Node_Type<block=begin>", "Blabla", "$$Node_Type<block=end>"]
but I can also have this:
str = "$$Node_Type1<block=begin>Blabla1$$Node_Type2<block=begin>Blabla2$$Node_Type2<block=end>$$Node_Type1<block=end>"
to something like this:
tab = ["$$Node_Type1<block=begin>", "Blabla1", "$$Node_Type2<block=begin>", "Blabla2", "$$Node_Type2<block=end>", "$$Node_Type1<block=end>"]
The end goal is to print it like this:
$$Node_Type1<block=begin>
   Blabla1
   $$Node_Type2<block=begin>
      Blabla2
   $$Node_Type2<block=end>
$$Node_Type1<block=end>
Does anyone have an idea? Thanks.
Answer
You can take advantage of the fact that re.split retains the “splitter” in the results if it’s a capturing group, and then track the nesting level while printing each piece:
import re
example = "Hello$$Node_Type1<block=begin>Blabla1$$Node_Type2<block=begin>Blabla2$$Node_Type2<block=end>$$Node_Type1<block=end>"
level = 0
# The dollar signs must be escaped: an unescaped $ is an end-of-string anchor.
for bit in re.split(r'(\$\$[^>]+>)', example):
    if bit.startswith('$$') and bit.endswith('block=end>'):
        level -= 1
    if bit:
        print('  ' * level + bit)
    if bit.startswith('$$') and bit.endswith('block=begin>'):
        level += 1
This prints out
Hello
$$Node_Type1<block=begin>
  Blabla1
  $$Node_Type2<block=begin>
    Blabla2
  $$Node_Type2<block=end>
$$Node_Type1<block=end>
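
If you also want the intermediate list from the question (the `tab` variable), the same idea can be wrapped in two small functions. This is only a sketch: the names `split_blocks` and `indent_blocks` are made up for illustration, and it assumes every tag starts with `$$` and ends at the first `>`, and that begin/end tags are properly balanced.

```python
import re

# Assumption: a tag is "$$" followed by anything up to the first ">".
# The capturing group makes re.split keep the tags in the result.
TAG = re.compile(r'(\$\$[^>]+>)')


def split_blocks(text):
    """Return tags and the text between them, dropping empty pieces."""
    return [piece for piece in TAG.split(text) if piece]


def indent_blocks(text, indent='   '):
    """Return the pieces one per line, indented by nesting depth."""
    lines = []
    level = 0
    for piece in split_blocks(text):
        is_tag = piece.startswith('$$')
        # An end tag closes a block, so dedent before emitting it.
        if is_tag and piece.endswith('block=end>'):
            level -= 1
        lines.append(indent * level + piece)
        # A begin tag opens a block, so indent everything after it.
        if is_tag and piece.endswith('block=begin>'):
            level += 1
    return '\n'.join(lines)


if __name__ == '__main__':
    s = "$$Node_Type<block=begin>Blabla$$Node_Type<block=end>"
    print(split_blocks(s))
    print(indent_blocks(s))
```

Filtering out empty strings matters because re.split returns an empty piece whenever the string starts or ends with a tag, or when two tags are adjacent. The default indent is three spaces to match the layout in the question; pass a different `indent` to change it.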