I have a dataframe like the one shown below:
ID,US-Test1,US-Test2,US-Test3
1,11,12,13
2,13,16,18
3,15,19,21
I would like to remove the prefix US- from all my column names.
I tried the code below, but there should be a better way to do this:
newNames = {'US-Test1': 'Test1', 'US-Test2': 'Test2'}
df.rename(columns=newNames, inplace=True)
But my real data has 70 plus columns and this is not efficient.
Is there a regex-based way to rename the columns that strips the pattern and keeps only the part I want?
I expect my output to look like this:
ID,Test1,Test2,Test3
1,11,12,13
2,13,16,18
3,15,19,21
Answer
You could use a regex that matches the “US-” at the beginning like this:
df.columns = df.columns.str.replace("^US-", "", regex=True)
It replaces the matching “US-” with an empty string.
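As a minimal sketch, here is that approach applied to the sample data from the question. The DataFrame construction is only there to make the snippet self-contained.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "US-Test1": [11, 13, 15],
    "US-Test2": [12, 16, 19],
    "US-Test3": [13, 18, 21],
})

# Strip a leading "US-" from every column name.
# Names that do not start with "US-" (such as "ID") are left unchanged.
df.columns = df.columns.str.replace(r"^US-", "", regex=True)

print(list(df.columns))  # ['ID', 'Test1', 'Test2', 'Test3']
```

Because the pattern is anchored with ^, a "US-" that appears in the middle of a column name is not touched.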
Alternatively, if you know which columns you want to transform, you could slice their names to remove the first 3 characters:
df.columns = df.columns.str.slice(3)
Of course, this also affects columns that do not begin with “US-”: for example, “ID” would become an empty string.
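If you prefer slicing but want to leave the other columns alone, one option is to check for the prefix before cutting it off. This is only a sketch and not part of the original answer: the prefix variable is a name introduced here for illustration, and the sample data is taken from the question.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "US-Test1": [11, 13, 15],
    "US-Test2": [12, 16, 19],
    "US-Test3": [13, 18, 21],
})

prefix = "US-"  # hypothetical variable; the value comes from the question

# Slice off the prefix only where it is actually present,
# so columns such as "ID" keep their full name.
df = df.rename(columns=lambda c: c[len(prefix):] if c.startswith(prefix) else c)

print(list(df.columns))  # ['ID', 'Test1', 'Test2', 'Test3']
```

Passing a function to rename applies it to every column label, so this scales to 70 or more columns without listing them by hand.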