
How to get the whole parenthesized part of a string with an arbitrary number of nested parentheses inside

I already saw this answer: How to get parentheses inside parentheses, but it doesn’t really work if I don’t know the number of nesting levels in advance.

I’m actually trying to deobfuscate a JS file with Python, and I have this kind of string that I want to extract:

String.fromCharCode
        (
            (010 * 12 + 6),
            (06 * (0x1 * (1 * 0xa + 6) + 1) + 12),
            (4 * 27 + 3),
            (01 * 0x3b + 50),
            (1 * 0x34 + 15),
            (1 * (1 * (3 * ((0x1 * 8 + 7) * 1 + 0) + 8) + 24) + 27),
            (0x1 * (2 * 0x25 + 7) + 16),
            (1 * 0112 + 40),
            (1 * 0x2c + 23),
            (0x3 * 042 + 9),
            (1 * ((05 * 4 + 1) * 03 + 0) + 37),
            (0x2 * (1 * 0x1f + 4) + 31)
        )

When I run re.findall(r"String.fromCharCode\((.+?)\)", content), the first result is String.fromCharCode((03 * (07 * 4 + 3). So it seems my pattern stops at the first closing parenthesis it finds. I didn’t try the answer from the link above, but it does not look “infinite”: it seems we would need to know the number of levels beforehand.

And what I want to get is the whole parenthesized part, like this: ((010 * 12 + 6),(06 * (0x1 * (1 * 0xa + 6) + 1) + 12),(4 * 27 + 3),(01 * 0x3b + 50),(1 * 0x34 + 15),(1 * (1 * (3 * ((0x1 * 8 + 7) * 1 + 0) + 8) + 24) + 27),(0x1 * (2 * 0x25 + 7) + 16),(1 * 0112 + 40),(1 * 0x2c + 23),(0x3 * 042 + 9),(1 * ((05 * 4 + 1) * 03 + 0) + 37),(0x2 * (1 * 0x1f + 4) + 31))

EDIT:

To clarify, the code has many other occurrences of “String.fromCharCode” like the one above. If I delete the ? in the regex, the match becomes greedy and runs to the last closing parenthesis, so it retrieves the entire code.
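A minimal sketch of the two behaviours described so far, using the standard re module on a short made-up input (the sample string and variable names are assumptions for illustration, not taken from the real file). Neither quantifier tracks nesting depth: the lazy one stops at the first closing parenthesis, the greedy one at the last.

```python
import re

# Hypothetical, shortened input with two separate calls.
content = "a = String.fromCharCode((1 * (2 + 3)), (4)); b = String.fromCharCode((5));"

# Lazy quantifier: each match ends at the first ')' after the call.
lazy = re.findall(r"String\.fromCharCode\((.+?)\)", content)

# Greedy quantifier: the single match runs to the last ')' in the input.
greedy = re.findall(r"String\.fromCharCode\((.+)\)", content)

print(lazy)    # ['(1 * (2 + 3', '(5']
print(greedy)  # ['(1 * (2 + 3)), (4)); b = String.fromCharCode((5)']
```

Both results cut the expression at the wrong place, which is why the pattern needs some way to count or balance parentheses.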

EDIT2:

I’ve made a thing : https://pastebin.com/BVtD8R51 It seems to work.


Answer

I wonder if this is really the right way to tackle the problem, but you might get along with a recursive approach and the newer regex module:

String\.fromCharCode[^()]*
(
    \(
        (?:[^()]|(?1))*
    \)
)

See a demo on regex101.com.


Which in Python could be:
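import regex as re  # third-party "regex" module (pip install regex); the stdlib re has no recursion

rx = re.compile(r'''
    String\.fromCharCode[^()]*   # the call name, then anything up to the first parenthesis
    (                            # group 1: one balanced pair of parentheses
        \(
            (?:[^()]|(?1))*      # non-parenthesis characters, or group 1 again (recursion)
        \)
    )
''', re.VERBOSE)

# your_string_here is a placeholder for the contents of the JS file
for snippet in rx.finditer(your_string_here):
    # group(0) is the whole call; group(1) is only the parenthesized part
    print(snippet.group(0))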
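
If installing the third-party regex module is not an option, the same balanced span can probably be found with the standard library by counting parentheses. This is a different technique from the recursive pattern above, offered only as a sketch: the helper name find_call_args and the sample string are hypothetical, and it assumes no parentheses appear inside string literals or comments.

```python
import re

def find_call_args(source, name="String.fromCharCode"):
    """Yield the balanced '(...)' span that follows each occurrence of name."""
    for match in re.finditer(re.escape(name) + r"\s*\(", source):
        start = match.end() - 1  # index of the opening parenthesis
        depth = 0
        for i in range(start, len(source)):
            if source[i] == "(":
                depth += 1
            elif source[i] == ")":
                depth -= 1
                if depth == 0:
                    # Back at depth 0: this ')' closes the call's argument list.
                    yield source[start:i + 1]
                    break

# Hypothetical input: two calls, the second with a line break before '('.
sample = "a = String.fromCharCode((1 * (2 + 3)), (4)); b = String.fromCharCode\n  ((5));"
spans = list(find_call_args(sample))
print(spans)  # ['((1 * (2 + 3)), (4))', '((5))']
```

The depth counter plays the role of the recursion: it works for any nesting level without knowing the depth beforehand.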
User contributions licensed under: CC BY-SA