Skip to content
Advertisement

Python Trim Multiple Column Names of a dataframe

I have data frame of 20 columns. All of them have a common text and a serial number. I want to trim the text part and make the name shorter. Below is an example:

xdf = pd.DataFrame({'Column1':[10,20],'Column2':[80,90]})

   Column1  Column2
0       10       80
1       20       90

Expected output:

        C1       C2
0       10       80
1       20       90

Solution1:

oldcols = ['Column1','Column2']
newcols = ['C1','C2']
xdf.rename(columns=dict(zip(oldcols,newcols)),inplace=True)

        C1       C2
0       10       80
1       20       90

Solution2:

for i in range(len(oldcols)):
    xdf.rename(columns={'%s'%(xdf[i]):'%s'%(xdf[i].replace('Column','C'))},inplace=True)

raise KeyError(key) from err

Solution1 works fine but I have to prepare an old and new column names list. Instead, I want to iterate through each column name and replace the column text. However, solution2 is not working.

Advertisement

Answer

You could use str.findall on the columns to split into text and number; then use a list comprehension to take only the first letter and join it with the numbers for each column name:

xdf.columns = [x[0]+y for li in xdf.columns.str.findall(r'([A-Za-z]+)(d+)') for x,y in li]

Output:

   C1  C2
0  10  80
1  20  90
User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement