Repeat pattern using Python regex

Question

I'm cleaning a dataset using Pandas. I have a column called "Country" where some rows contain numbers or extra information in parentheses that I have to remove, for example: Australia1, Perú (country), 3Costa Rica, etc. To do this, I take the column and map a function over it. But I have a problem with this regex,

Accepted Answer

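In this situation, I would clean the data step by step.

import io
import pandas as pd

df_str = '''
Country
Australia1
Perú (country)
3Costa Rica
United States of America
'''
# One value per line and no commas in the data, so the default separator
# yields a single "Country" column (recent pandas versions reject sep='\n').
df = pd.read_csv(io.StringIO(df_str.strip()))

# handle the data
(df['Country']
 .str.replace(r'\d+', '', regex=True)  # remove numbers
 .str.split('(').str[0]                # keep the text before `(`
 .str.strip()                          # strip surrounding spaces
)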
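Since the question mentions mapping a regex over the column, here is a minimal sketch of that route using only the standard `re` module. The names `clean_country` and `_NOISE` are made up for illustration, and the pattern assumes the unwanted parts are always runs of digits or a parenthesized note. Unlike the split on `(`, it keeps any text that follows the closing parenthesis.

```python
import re

# Matches either a run of digits or a parenthesized note (plus any spaces before it).
_NOISE = re.compile(r'\d+|\s*\([^)]*\)')

def clean_country(name: str) -> str:
    """Remove digits and parenthesized notes, then trim surrounding whitespace."""
    return _NOISE.sub('', name).strip()

raw = ['Australia1', 'Perú (country)', '3Costa Rica', 'United States of America']
cleaned = [clean_country(name) for name in raw]
print(cleaned)
# ['Australia', 'Perú', 'Costa Rica', 'United States of America']
```

With a DataFrame, the same function could be applied with df['Country'].map(clean_country), provided the column holds only strings.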

Answer