How can I generate a list with regex in Python for countries with 4 or 5 letters?
names = ['Azerbaijan', 'Zimbabwe', 'Cuba', 'Cambodia', 'Somalia','Mali', 'Belarus', "Côte d'Ivoire", 'Venezuela', 'Syria', 'Kazakhstan', 'Austria', 'Malawi', 'Romania', 'Congo (Brazzaville)']
I was trying to do this but it returns an empty list:
import re
n = [w for w in names if re.findall(r'[A-Z]{4,6},', str(names))]
print(n)
Output:
[]
This is an exercise, which is why I can only use the re module. Any help will be appreciated.
Answer
You can use len(w).
>>> names = ['Azerbaijan', 'Zimbabwe', 'Cuba', 'Cambodia', 'Somalia','Mali', 'Belarus', "Côte d'Ivoire", 'Venezuela', 'Syria', 'Kazakhstan', 'Austria', 'Malawi', 'Romania', 'Congo (Brazzaville)']
>>> [w for w in names if w.isalpha() and (len(w) in range(4,6))]
['Cuba', 'Mali', 'Syria']
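But if you want to solve it with regex, you can use re.search and check that the matched text equals the whole word.
If the list may contain entries with no letters at all (such as numbers), re.search returns None for them, so on Python >= 3.8 you can use the walrus operator (:=) to test the match before comparing it: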
import re
names = ['1234', 'Azerbaijan', 'Cuba']
print([w for w in names if (tmp := re.search(r'[A-Za-z]{4,5}', w)) and (tmp.group(0) == w)])
# ['Cuba']
For Python < 3.8, you can use try/except instead: calling .group(0) on a None result raises AttributeError, which is caught and skipped.
import re
names = ['1234', 'Azerbaijan', 'Cuba']
res = []
for w in names:
    try:
        if re.search(r'[A-Za-z]{4,5}', w).group(0) == w:
            res.append(w)
    except AttributeError:
        continue
print(res)
# ['Cuba']
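As an alternative to comparing the match with the word, a minimal sketch using re.fullmatch (available since Python 3.4) is shown below. It is not part of the approach above; it assumes that only unaccented ASCII letters should count, as in the patterns used so far, and the names pattern and short_names are chosen here for illustration. Because fullmatch returns None when the whole string does not match, no walrus operator or try/except is needed.

```python
import re

names = ['Azerbaijan', 'Zimbabwe', 'Cuba', 'Cambodia', 'Somalia', 'Mali',
         'Belarus', "Côte d'Ivoire", 'Venezuela', 'Syria', 'Kazakhstan',
         'Austria', 'Malawi', 'Romania', 'Congo (Brazzaville)', '1234']

# fullmatch succeeds only if the entire string consists of 4 or 5 ASCII letters;
# otherwise it returns None, which is falsy and filters the entry out.
pattern = re.compile(r'[A-Za-z]{4,5}')
short_names = [w for w in names if pattern.fullmatch(w)]
print(short_names)
# ['Cuba', 'Mali', 'Syria']
```

The same effect can be had with re.search by anchoring the pattern, for example r'^[A-Za-z]{4,5}$'.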