
Extract string inside nested brackets

I need to extract strings from nested brackets like so:

[ this is [ hello [ who ] [what ] from the other side ] slim shady ]

Result (Order doesn’t matter):

This is slim shady
Hello from the other side
Who 
What

Note, the string could have N brackets, and they will always be valid, but may or may not be nested. Also, the string doesn’t have to start with a bracket.

The solutions I have found online to a similar problem suggest a regex, but I’m not sure it will work in this case.

I was thinking of implementing this similar to how we check if a string has all valid parentheses:

Walk through the string. When we see a [, we push its index onto the stack; when we see a ], we pop the most recent index and take the substring from there to the current position.

However, we’d need to erase that substring from the original string so it doesn’t show up as part of any other output. So, instead of just pushing the index onto the stack, I was thinking of building a LinkedList as we go along and, when we find a [, inserting that Node into the LinkedList. This would allow us to easily delete the substring from the LinkedList.

Would this be a good approach or is there a cleaner, known solution?

EDIT:

'[ this is [ hello [ who ] [what ] from the other [side] ] slim shady ][oh my [g[a[w[d]]]]]'

Should return (Order doesn’t matter):

this is slim shady
hello from the other
who 
what 
side
oh my
g
a
w
d

White spaces don’t matter, that’s trivial to remove afterwards. What matters is being able to distinguish the different contents within the brackets. Either by separating them in new lines, or having a list of strings.

Answer

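This code scans the text character by character. It pushes an empty list onto the stack for every opening [ and pops the most recently pushed list off the stack for every closing ]. Every other character is appended to the list on top of the stack, so each popped list holds exactly the text of one bracket pair, without its nested contents.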

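text = '[ this is [ hello [ who ] [what ] from the other side ] slim shady ]'
edit_text = '[ this is [ hello [ who ] [what ] from the other [side] ] slim shady ][oh my [g[a[w[d]]]]]'

def parse(text):
    # one list of characters per currently open bracket
    stack = []
    for char in text:
        if char == '[':
            # stack push: start collecting a new bracket's contents
            stack.append([])
        elif char == ']':
            # stack pop: this bracket is complete, emit its contents
            yield ''.join(stack.pop())
        elif stack:
            # stack peek: the character belongs to the innermost open bracket
            stack[-1].append(char)
        # characters outside every bracket are skipped, so the string
        # does not have to start with a bracket

print(tuple(parse(text)))
print(tuple(parse(edit_text)))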

Output (the first tuple is for the original string, the second for the string from the EDIT):

(' who ', 'what ', ' hello   from the other side ', ' this is  slim shady ')
(' who ', 'what ', 'side', ' hello   from the other  ', ' this is  slim shady ', 'd', 'w', 'a', 'g', 'oh my ')
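
The question mentions that whitespace can be cleaned up afterwards and that the string may not start with a bracket. Below is a minimal sketch of the same stack scan with both points folded in. The name extract_groups is made up for this illustration, and skipping characters that sit outside every bracket is an assumption, since the question does not say what should happen to them.

```python
def extract_groups(text):
    """Yield the whitespace-normalized contents of every bracket pair."""
    stack = []  # one list of characters per currently open bracket
    for char in text:
        if char == '[':
            stack.append([])
        elif char == ']':
            # ' '.join(s.split()) trims both ends and collapses inner runs of spaces
            yield ' '.join(''.join(stack.pop()).split())
        elif stack:
            stack[-1].append(char)
        # characters outside all brackets are ignored (an assumption)


sample = '[ this is [ hello [ who ] [what ] from the other [side] ] slim shady ][oh my [g[a[w[d]]]]]'
for group in extract_groups(sample):
    print(group)
```

Groups come out in the order their closing brackets appear, so inner groups are printed before the groups that contain them. That is fine here because the question says order doesn't matter.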
User contributions licensed under: CC BY-SA