I have a list of strings coming from os.listdir()
that looks like the following:
    ['foo',
     'bar',
     'backup_20180406',
    ]
Out of those entries, I want to get the ones that match the "backup_YYYYMMDD" pattern. The regex for that, with a named group, would be:
    regex = r"backup_(?P<date>\d+)"
I am trying to create a list that contains only the date from the above (i.e. the .group('date') value), but I cannot find a way to do it without parsing the strings twice:
    r = re.compile(regex)
    res = [re.search(regex, x).group('date') for x in filter(r.match, os.listdir(folder))]
I am sure that I am missing something really obvious and concise here, so is there a better way?
Answer
I usually do:
    import re

    regex = re.compile(r"BACKUP_(?P<date>\d+)")
    a = ['foo', "BACKUP_20180406", 'xxx']
    # Match each string once; entries that do not match yield None
    matches = [regex.match(x) for x in a]
    valid = [x.group('date') for x in matches if x]
Or just
    valid = [x.group('date') for x in (regex.match(y) for y in a) if x]
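On Python 3.8 or newer, an assignment expression should let you do the same thing in a single comprehension. This is only a sketch: it assumes the BACKUP_ pattern used in this answer, and the sample list `names` is made up for illustration.

```python
import re

# Pattern from the answer; `names` is a hypothetical sample list.
regex = re.compile(r"BACKUP_(?P<date>\d+)")
names = ['foo', 'BACKUP_20180406', 'xxx', 'BACKUP_20190101']

# The := operator binds the match object inside the condition,
# so each string is matched exactly once (requires Python 3.8+).
valid = [m.group('date') for name in names if (m := regex.match(name))]
print(valid)
```

Whether this reads better than the nested generator is a matter of taste; both run the regex once per string.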
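Also note that regex.match is generally faster than regex.search when applicable, i.e. when the pattern has to match at the beginning of the string: match only tries position 0, while search retries the pattern at later positions until it finds a match or runs out of string.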
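To make the difference in behaviour concrete (this is not a benchmark), here is a small sketch. The sample strings are made up for illustration.

```python
import re

regex = re.compile(r"BACKUP_(?P<date>\d+)")

# match() only succeeds if the pattern matches at position 0.
at_start = regex.match("BACKUP_20180406")
not_at_start = regex.match("old_BACKUP_20180406")

# search() scans forward and finds the pattern anywhere in the string.
found_anywhere = regex.search("old_BACKUP_20180406")

print(at_start.group('date'))
print(not_at_start)
print(found_anywhere.group('date'))
```

So match is only a drop-in replacement for search when the names you care about start with the pattern.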