I have data frame of 20 columns. All of them have a common text and a serial number. I want to trim the text part and make the name shorter. Below is an example:
xdf = pd.DataFrame({'Column1':[10,20],'Column2':[80,90]}) Column1 Column2 0 10 80 1 20 90
Expected output:
C1 C2 0 10 80 1 20 90
Solution1:
oldcols = ['Column1','Column2'] newcols = ['C1','C2'] xdf.rename(columns=dict(zip(oldcols,newcols)),inplace=True) C1 C2 0 10 80 1 20 90
Solution2:
for i in range(len(oldcols)): xdf.rename(columns={'%s'%(xdf[i]):'%s'%(xdf[i].replace('Column','C'))},inplace=True) raise KeyError(key) from err
Solution1 works fine but I have to prepare an old and new column names list. Instead, I want to iterate through each column name and replace the column text. However, solution2 is not working.
Advertisement
Answer
You could use str.findall
on the columns to split into text and number; then use a list comprehension to take only the first letter and join it with the numbers for each column name:
xdf.columns = [x[0]+y for li in xdf.columns.str.findall(r'([A-Za-z]+)(d+)') for x,y in li]
Output:
C1 C2 0 10 80 1 20 90