Elegant way to replace multiple lists of values, each with a single value

I have a dataframe as shown below:

import pandas as pd
df = pd.DataFrame()
df['text'] = ['p', 'S', 'primary','PRI','SECONDARY', 'SEC', 'S', 'TERTIARY','T','third']

I would like to replace lists of values as follows:

a) Replace P, PRIMARY, PRI with primary
b) Replace S, SECONDARY, SEC with secondary
c) Replace T, TERTIARY, THIRD with tertiary

I tried the following:

df['text'] = df['text'].replace(['P','PRIMARY','PRI'],'primary')
df['text'] = df['text'].replace(['S','SECONDARY','SEC'],'secondary')
df['text'] = df['text'].replace(['T','TERTIARY','THIRD'],'tertiary')

Is there a more efficient and elegant way to write this, ideally in a single line?

I expect my output to look like this:

     text
0   primary
1   secondary
2   primary
3   primary
4   secondary
5   secondary
6   secondary
7   tertiary
8   tertiary
9   tertiary

Answer

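One way to avoid multiple replace calls is to define a dictionary that maps each target value to its list of variants, then flatten it into a second dictionary whose keys are the individual variants. So that the keys match regardless of case, convert the column to uppercase with Series.str.upper first: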

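# Map each target value to the list of variants it should replace
d = {'primary': ['P','PRIMARY','PRI'],
     'secondary':['S','SECONDARY','SEC'],
     'tertiary':['T','TERTIARY','THIRD']}

# Flatten to a variant -> target lookup, e.g. 'PRI' -> 'primary'
d1 = {x: k for k, v in d.items() for x in v}
# Uppercase the column so mixed-case values match the keys, then replace
df['text'] = df['text'].str.upper().replace(d1)
print(df)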
        text
0    primary
1  secondary
2    primary
3    primary
4  secondary
5  secondary
6  secondary
7   tertiary
8   tertiary
9   tertiary
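
One thing worth checking with this approach is what happens to values that are not in the mapping. The sketch below is a self-contained illustration, not part of the original answer: the value 'unknown' is a hypothetical addition to the data, and Series.map is shown only as a possible alternative to replace. With replace, unmatched values are left as they are (here, already uppercased); with map, they become NaN.

```python
import pandas as pd

# Target label -> variants to be replaced by it (same mapping as in the answer)
d = {'primary': ['P', 'PRIMARY', 'PRI'],
     'secondary': ['S', 'SECONDARY', 'SEC'],
     'tertiary': ['T', 'TERTIARY', 'THIRD']}

# Flatten to a variant -> target label lookup
d1 = {x: k for k, v in d.items() for x in v}

# 'unknown' is a hypothetical value that does not appear in the mapping
s = pd.Series(['p', 'PRI', 'sec', 'third', 'unknown'])
upper = s.str.upper()

replaced = upper.replace(d1)  # unmatched values are kept unchanged
mapped = upper.map(d1)        # unmatched values become NaN

print(replaced.tolist())
print(mapped.tolist())
```

If unexpected values can occur in the column, map followed by a check for NaN may make them easier to spot, whereas replace silently passes them through in uppercase.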
User contributions licensed under: CC BY-SA