I have a string that is a file name, for example:
'20220213-0000-FSC-814-SC_VIRG_REFBAL_PRES_NPMINMAX-v1.xml'
'20220213-0000-F814-SC_VIRG_REFBAL_PRES_NPMINMAX-v1.xml'
I want to use re.search to find the substring that matches either Fddd or FSC-ddd, where d is a digit.
I have a regex like this:
type_match = re.search(r'(F(\d{3}))|(FSC-(\d{3}))', string)
Later, once I have found a match such as FSC-814, I want to extract only the number from it. I used:
int(type_match.group(1))
but this stopped working after I added the alternation (|) to the pattern.
Answer
You can use
F(?:SC)?-?(\d{3})
See the regex demo.
Details:
F – an F char
(?:SC)? – an optional SC char sequence
-? – an optional hyphen
(\d{3}) – Capturing group 1: three digits.
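Because SC and the hyphen are each optional on their own, the pattern above also accepts spellings such as FSC814 and F-814, which the question does not list. If only Fddd and FSC-ddd should match, one possible tightening is to make SC- optional as a single unit. The sketch below compares the two; the names loose, strict and samples are made up for illustration, and the stricter pattern is an assumption about what the asker wants, not part of the original answer.

```python
import re

# Pattern from the answer: "SC" and "-" are independently optional.
loose = re.compile(r'F(?:SC)?-?(\d{3})')
# Hypothetical stricter variant: "SC-" is optional only as a whole.
strict = re.compile(r'F(?:SC-)?(\d{3})')

samples = ['FSC-814', 'F814', 'FSC814', 'F-814']

# Map each sample to (matched by loose, matched by strict).
results = {s: (bool(loose.search(s)), bool(strict.search(s))) for s in samples}

for sample, (is_loose, is_strict) in results.items():
    print(sample, is_loose, is_strict)
```

Both patterns keep the digits in capturing group 1, so the rest of the code does not change whichever one is used.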
See the Python demo:
import re

texts = ['20220213-0000-FSC-814-SC_VIRG_REFBAL_PRES_NPMINMAX-v1.xml',
         '20220213-0000-F814-SC_VIRG_REFBAL_PRES_NPMINMAX-v1.xml']
# One capturing group holds the digits for both the F and FSC- forms
pattern = r'F(?:SC)?-?(\d{3})'
for text in texts:
    match = re.search(pattern, text)
    if match:
        print(match.group(1))
Output:
814
814
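As for why int(type_match.group(1)) failed with the pattern in the question: each branch of the alternation has its own capturing groups, numbered left to right, and groups in the branch that did not match are None. A minimal sketch, assuming the question's regex was meant to use \d and using the first example file name (the variable names are made up for illustration):

```python
import re

name = '20220213-0000-FSC-814-SC_VIRG_REFBAL_PRES_NPMINMAX-v1.xml'

# Pattern from the question (with \d restored): four groups across two branches.
original = re.search(r'(F(\d{3}))|(FSC-(\d{3}))', name)
print(original.groups())  # (None, None, 'FSC-814', '814')

# Pattern from the answer: the digits are always in group 1.
fixed = re.search(r'F(?:SC)?-?(\d{3})', name)
number = int(fixed.group(1))
print(number)  # 814
```

Here group(1) is None because the second branch matched, so int() raises a TypeError. For a name containing F814 the first branch matches instead, but group(1) is then 'F814' rather than '814', which int() cannot parse either.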