Is there a way to use the same name for more than one named group in a Python regex?
E.g. (?P<n>foo)|(?P<n>bar)
Use case:
I am trying to capture type and id with this regex:
/(?=videos)((?P<type>videos)/(?P<id>\d+))|(?P<type>\w+)/?(?P<v>v)?/?(?P<id>\d+)?
from these strings:
- /channel/v/123
- /ch/v/41500082
- /channel
- /videos/41500082
For now I am getting this error:
redefinition of group name 'id' as group 6; was group 3
Answer
The answer is: Python's standard-library re module does not support identically named groups.
The regex module from PyPI does support identically named capturing groups:
The same name can be used by more than one group, with later captures ‘overwriting’ earlier captures. All of the captures of the group will be available from the captures method of the match object.
And here is a live Python 2.7 demo:
import regex

s = "foo bar"
rx = regex.compile(r"(?P<n>foo)|(?P<n>bar)")
print([x.group("n") for x in rx.finditer(s)])
# => ['foo', 'bar']
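Applied to the URLs from the question, the same idea might look like the sketch below. It assumes the third-party regex module is installed (pip install regex). The pattern is a simplified variant written for illustration, not the asker's exact regex, and URL_RX and results are hypothetical names.

```python
import regex  # third-party module from PyPI, not the stdlib re

# Both alternatives reuse the names "type" and "id". The stdlib re module
# rejects such a pattern with a "redefinition of group name" error.
# Illustrative pattern (an assumption, simplified from the question).
URL_RX = regex.compile(
    r"^/(?:(?P<type>videos)/(?P<id>\d+)"
    r"|(?P<type>\w+)(?:/v)?(?:/(?P<id>\d+))?)$"
)

paths = ["/channel/v/123", "/ch/v/41500082", "/channel", "/videos/41500082"]

# Map each path to its (type, id) pair; id is None when the path has no id.
results = {}
for path in paths:
    m = URL_RX.match(path)
    results[path] = (m.group("type"), m.group("id"))

for path, pair in results.items():
    print(path, pair)
```

Whichever alternative matches, the value is read back through the single shared name, so the calling code does not need to know which branch was taken.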
Also, in other cases, when you want to match several alternatives and capture just part of each into one group, you may resort to the branch reset feature:
Branch reset
(?|...|...)
Capture group numbers will be reused across the alternatives, but groups with different names will have different group numbers.
Examples:
>>> regex.match(r"(?|(first)|(second))", "first").groups()
('first',)
>>> regex.match(r"(?|(first)|(second))", "second").groups()
('second',)
Note that there is only one group.
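If installing the third-party module is not an option, one possible workaround with the standard re module is to give each alternative its own group names and merge them after matching. This is only a sketch: the group names type_a, id_a, type_b, id_b, the parse helper, and the simplified pattern are all assumptions made for illustration.

```python
import re

# Each alternative gets distinct group names, because stdlib re raises
# "redefinition of group name" when a name is reused.
URL_RX = re.compile(
    r"^/(?:(?P<type_a>videos)/(?P<id_a>\d+)"
    r"|(?P<type_b>\w+)(?:/v)?(?:/(?P<id_b>\d+))?)$"
)

def parse(path):
    """Return (type, id) for a path, or None if the path does not match."""
    m = URL_RX.match(path)
    if m is None:
        return None
    # Only one alternative participates in a match, so at most one group
    # of each pair is set; "or" picks whichever one is not None.
    return (
        m.group("type_a") or m.group("type_b"),
        m.group("id_a") or m.group("id_b"),
    )

for p in ["/channel/v/123", "/ch/v/41500082", "/channel", "/videos/41500082"]:
    print(p, parse(p))
```

The merge step is the price of staying on the standard library: the pattern itself cannot share a name, so the sharing happens in Python code after the match.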