I have a string that is a file name, for example:
'20220213-0000-FSC-814-SC_VIRG_REFBAL_PRES_NPMINMAX-v1.xml'
'20220213-0000-F814-SC_VIRG_REFBAL_PRES_NPMINMAX-v1.xml'
I want to use re.search to find the substring that matches Fddd or FSC-ddd, where d is a digit.
My regex looks like this:
type_match = re.search(r'(F(\d{3}))|(FSC-(\d{3}))', string)
Later, once I have found a match such as FSC-814, I want to extract only the number from it. I used:
int(type_match.group(1))
but it stopped working after I added the alternation (|) to the pattern passed to re.search.
Answer
You can use
F(?:SC)?-?(\d{3})
See the regex demo.
Details:
- F - an F char
- (?:SC)? - an optional SC char sequence (non-capturing group)
- -? - an optional hyphen
- (\d{3}) - Capturing group 1: three digits.
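Since the goal in the question is an integer, the pattern above can be wrapped in a small helper. This is only a sketch: the names TYPE_PATTERN and extract_type_number are made up for illustration, and returning None when nothing matches is an assumption about the desired behavior.

```python
import re

# Pattern from the answer; the only capturing group holds the three digits.
TYPE_PATTERN = re.compile(r'F(?:SC)?-?(\d{3})')

def extract_type_number(filename):
    """Return the digits after F or FSC- as an int, or None if there is no match."""
    match = TYPE_PATTERN.search(filename)
    # group(1) is always the digits, whichever prefix form matched.
    return int(match.group(1)) if match else None

print(extract_type_number('20220213-0000-FSC-814-SC_VIRG_REFBAL_PRES_NPMINMAX-v1.xml'))
print(extract_type_number('20220213-0000-F814-SC_VIRG_REFBAL_PRES_NPMINMAX-v1.xml'))
```

Because the prefix is handled by a non-capturing group, the group number does not depend on which form of the prefix was found.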
See the Python demo:
import re
texts = ['20220213-0000-FSC-814-SC_VIRG_REFBAL_PRES_NPMINMAX-v1.xml',
'20220213-0000-F814-SC_VIRG_REFBAL_PRES_NPMINMAX-v1.xml']
pattern = r'F(?:SC)?-?(\d{3})'
for text in texts:
    match = re.search(pattern, text)
    if match:
        print(match.group(1))
Output:
814
814
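As for why the attempt in the question fails: each alternative of an alternation has its own group numbers, so the digits land in group 2 or group 4 depending on which branch matched, and group 1 is either None or the whole 'F814' text. The sketch below inspects the groups of that pattern. It also shows a slightly stricter variant, which is a suggestion rather than part of the answer above, for the case where forms such as FSC814 or F-814 should be rejected.

```python
import re

# Pattern from the question: four capturing groups, two per alternative.
original = r'(F(\d{3}))|(FSC-(\d{3}))'

m = re.search(original, '20220213-0000-FSC-814-SC_VIRG_REFBAL_PRES_NPMINMAX-v1.xml')
fsc_groups = m.groups()  # second alternative matched: groups 1 and 2 are None

m = re.search(original, '20220213-0000-F814-SC_VIRG_REFBAL_PRES_NPMINMAX-v1.xml')
f_groups = m.groups()    # first alternative matched: group 1 is 'F814', not the digits

print(fsc_groups)
print(f_groups)

# Stricter variant (an assumption about the requirement): "SC-" is optional
# only as a unit, so 'FSC814' and 'F-814' no longer match.
strict = re.compile(r'F(?:SC-)?(\d{3})')
print(strict.search('FSC-814').group(1))
print(strict.search('FSC814'))
```

In both cases int(type_match.group(1)) raises an error: a TypeError for None, or a ValueError for 'F814'.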