
pandas rename multiple columns using regex pattern

I have a dataframe like the one shown below:

ID,US-Test1,US-Test2,US-Test3
1,11,12,13
2,13,16,18
3,15,19,21

I would like to remove the prefix US- from all my column names.

I tried the code below, but there should be a better way to do this:

newNames = {
    'US-Test1':'Test1',
    'US-Test2':'Test2'
}
df.rename(columns=newNames,inplace=True)

My real data has more than 70 columns, so listing every name by hand is not practical.

Is there a regex approach that renames columns by stripping the unwanted pattern and keeping only the part I want?

I expect my output to look like this:

ID,Test1,Test2,Test3
1,11,12,13
2,13,16,18
3,15,19,21


Answer

You could use a regex that matches the “US-” at the beginning like this:

df.columns = df.columns.str.replace("^US-", "", regex=True)

It replaces the matching “US-” with an empty string.
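As a quick check, here is a minimal sketch that rebuilds the sample data from the question and applies the same replacement. The DataFrame construction is my own illustration of the question's table, not part of the original post.

```python
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "US-Test1": [11, 13, 15],
    "US-Test2": [12, 16, 19],
    "US-Test3": [13, 18, 21],
})

# "^" anchors the match to the start of the name, so only a leading "US-" is removed
df.columns = df.columns.str.replace("^US-", "", regex=True)

print(list(df.columns))  # ['ID', 'Test1', 'Test2', 'Test3']
```

Because the pattern is anchored, the ID column is left untouched.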

Also, if you know the columns that you want to transform you could apply slicing on their names to remove the first 3 characters:

df.columns = df.columns.str.slice(3)

Of course, this strips the first 3 characters from every column name, including those that do not begin with “US-” (in the sample data, ID would become an empty string).
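If you prefer slicing over a regex but want to leave the other columns alone, one option is to guard the slice with a prefix check. This is only a sketch: the `prefix` variable and the lambda are my own illustration, and the sample data is rebuilt from the question.

```python
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "US-Test1": [11, 13, 15],
    "US-Test2": [12, 16, 19],
    "US-Test3": [13, 18, 21],
})

prefix = "US-"  # illustrative name, not from the original answer

# rename() accepts a function; slice only the names that start with the prefix
renamed = df.rename(
    columns=lambda name: name[len(prefix):] if name.startswith(prefix) else name
)

print(list(renamed.columns))  # ['ID', 'Test1', 'Test2', 'Test3']
```

Here rename returns a new DataFrame and leaves df unchanged, so you can compare the two before overwriting anything.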

User contributions licensed under: CC BY-SA