I want to match cc dd, but only on lines that don't start with aa.
    import re

    s = 'bb cc dd ee\naa : bb cc dd ee\n11 cc dd ee'
    pp = re.compile(r'(?P<n1>ee)|(?P<n2>^(?!aa\b).*\bcc dd\b)', re.MULTILINE)

    def _rep(x):
        print(x.groupdict())
        # Return the name of the group that actually matched, wrapped in <>
        return [f'<{k}>' for k, v in x.groupdict().items() if v is not None][0]

    rr = pp.sub(_rep, s)
    print(rr)
Result: Current
    # print(x.groupdict())
    {'n1': None, 'n2': 'bb cc dd'}
    {'n1': 'ee', 'n2': None}
    {'n1': 'ee', 'n2': None}
    {'n1': None, 'n2': '11 cc dd'}
    {'n1': 'ee', 'n2': None}

    # print(rr)
    <n2> <n1>
    aa : bb cc dd <n1>
    <n2> <n1>
Result: What I want
    # print(x.groupdict())
    {'n1': None, 'n2': 'cc dd'}
    {'n1': 'ee', 'n2': None}
    {'n1': 'ee', 'n2': None}
    {'n1': None, 'n2': 'cc dd'}
    {'n1': 'ee', 'n2': None}

    # print(rr)
    bb <n2> <n1>
    aa : bb cc dd <n1>
    11 <n2> <n1>
Answer
With re, a single pattern cannot achieve what you need: you expect multiple occurrences per string that will be replaced later, and that requires variable-width lookbehind support, which re does not have.
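To illustrate the limitation, here is a minimal sketch that tries to compile a variable-width lookbehind with the standard re module. The pattern is the lookbehind-based one used in this answer; the names compiled and message are only illustrative.

```python
import re

# The lookbehind contains `.*`, so its width is not fixed.
pattern = r'(?<!^aa\b.*)\b(?P<n2>cc dd)\b'

try:
    re.compile(pattern, re.MULTILINE)
    compiled = True
except re.error as exc:
    # The standard library only accepts fixed-width lookbehinds
    compiled = False
    message = str(exc)
    print(f're rejected the pattern: {message}')
```

The same pattern is accepted by the third-party regex module, which is why it is used in this answer.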
You need to install the PyPI regex module by running pip install regex in your terminal/console and then use
    import regex

    s = 'bb cc dd ee\naa : bb cc dd ee\n11 cc dd ee'
    pp = regex.compile(r'(?P<n1>ee)|(?<!^aa\b.*)\b(?P<n2>cc dd)\b', regex.MULTILINE)

    def _rep(x):
        #print(x.groupdict())
        # Return the name of the group that actually matched, wrapped in <>
        return [f'<{k}>' for k, v in x.groupdict().items() if v is not None][0]

    rr = pp.sub(_rep, s)
    print(rr)
See the Python demo.
Here, (?<!^aa\b.*)\b(?P<n2>cc dd)\b matches the whole-word sequence cc dd, capturing it into the n2 group, only when the current line does not begin with the whole word aa. With regex.MULTILINE, the ^ anchor matches at any line start position, and .* makes sure the check is performed even if cc dd is not immediately preceded by aa.
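If installing regex is not an option, a different technique gives the same output for this sample: process the string line by line with the standard re module and pick the pattern per line. This is only a sketch, not the lookbehind approach above; the helper names both, n1_only, starts_with_aa and replace_line are hypothetical.

```python
import re

s = 'bb cc dd ee\naa : bb cc dd ee\n11 cc dd ee'

# Lines that do not start with the whole word aa: both groups apply
both = re.compile(r'(?P<n1>ee)|\b(?P<n2>cc dd)\b')
# Lines that start with the whole word aa: only n1 applies
n1_only = re.compile(r'(?P<n1>ee)')
starts_with_aa = re.compile(r'aa\b')

def _rep(m):
    # m.lastgroup holds the name of the named group that matched
    return f'<{m.lastgroup}>'

def replace_line(line):
    # Choose the pattern based on how the line begins
    pat = n1_only if starts_with_aa.match(line) else both
    return pat.sub(_rep, line)

rr = '\n'.join(replace_line(line) for line in s.split('\n'))
print(rr)
```

The trade-off is that the line-start condition lives in Python code rather than in the pattern, so it no longer works as a single regex passed to sub.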