What is the correct way to grab an inner string with Python regular expressions when there are multiple conditions?

I would like to return every substring that sits between a specified start string and end string.

Given the string libs = 'libr(lib1), libr(lib2), libr(lib3), req(reqlib), libra(nonlib)', I would like to find the text between libr( and ), or between req( and ).

The expected result is ['lib1', 'lib2', 'lib3', 'reqlib'].

import re

libs = 'libr(lib1), libr(lib2), libr(lib3), req(reqlib), libra(nonlib)'
# \( and \) match literal parentheses; (.*?) captures the text between them
pat1 = r'libr+\((.*?)\)'
pat2 = r'req+\((.*?)\)'
pat = f"{pat1}|{pat2}"
re.findall(pat, libs)

The code above currently returns [('lib1', ''), ('lib2', ''), ('lib3', ''), ('', 'reqlib')] and I am not sure how I should fix this.

Answer

Try this regex:

(?:(?<=libr\()|(?<=req\())[^)]+

Explanation:

  • (?:(?<=libr\()|(?<=req\()) – a non-capturing group containing two alternative lookbehinds
    • (?<=libr\() – positive lookbehind that matches the position immediately preceded by the text libr(
    • | – or
    • (?<=req\() – positive lookbehind that matches the position immediately preceded by the text req(
  • [^)]+ – matches one or more characters other than ), so the match runs up to, but does not include, the next )
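
As a minimal sketch, the regex above can be used in Python with re.findall. Because the pattern contains no capturing groups, findall returns the matched substrings directly rather than tuples. The variable names pattern, matches, alt_pattern and alt_matches are illustrative only. The second pattern is an alternative that is not part of the answer above: it assumes a single capturing group shared by both prefixes is acceptable.

```python
import re

libs = 'libr(lib1), libr(lib2), libr(lib3), req(reqlib), libra(nonlib)'

# Lookbehind version from the answer: no capturing groups, so
# findall returns a flat list of the matched text.
pattern = r'(?:(?<=libr\()|(?<=req\())[^)]+'
matches = re.findall(pattern, libs)
print(matches)  # ['lib1', 'lib2', 'lib3', 'reqlib']

# Alternative sketch (an assumption, not from the answer): put both
# prefixes in one non-capturing group and keep a single capturing
# group, so findall again returns plain strings.
alt_pattern = r'\b(?:libr|req)\(([^)]+)\)'
alt_matches = re.findall(alt_pattern, libs)
print(alt_matches)  # ['lib1', 'lib2', 'lib3', 'reqlib']
```

Neither pattern matches libra(nonlib), because the opening parenthesis there is preceded by libra, not libr or req.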