I have a dataframe like the one shown below:
import pandas as pd

df = pd.DataFrame()
df['text'] = ['p', 'S', 'primary', 'PRI', 'SECONDARY', 'SEC', 'S', 'TERTIARY', 'T', 'third']
I would like to replace several groups of values (regardless of case) as follows:
a) Replace P, PRIMARY, PRI with primary
b) Replace S, SECONDARY, SEC with secondary
c) Replace T, TERTIARY, THIRD with tertiary
I tried the following:
df['text'] = df['text'].replace(['P','PRIMARY','PRI'],'primary')
df['text'] = df['text'].replace(['S','SECONDARY','SEC'],'secondary')
df['text'] = df['text'].replace(['T','TERTIARY','THIRD'],'tertiary')
Is there a more efficient and elegant way to write this, ideally in a single line?
I expect my output to look like this:
        text
0    primary
1  secondary
2    primary
3    primary
4  secondary
5  secondary
6  secondary
7   tertiary
8   tertiary
9   tertiary
Answer
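One way to avoid multiple replace calls is to define a dictionary that maps each target value to its list of variants, then flatten it into a second dictionary keyed by each variant. To make the lowercase inputs match, convert the column to uppercase first with Series.str.upper: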
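d = {'primary': ['P', 'PRIMARY', 'PRI'],
     'secondary': ['S', 'SECONDARY', 'SEC'],
     'tertiary': ['T', 'TERTIARY', 'THIRD']}
# Invert the mapping so each variant points to its canonical label
d1 = {x: k for k, v in d.items() for x in v}
# Uppercase the column so lowercase inputs such as 'p' also match
df['text'] = df['text'].str.upper().replace(d1)
print(df)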
        text
0    primary
1  secondary
2    primary
3    primary
4  secondary
5  secondary
6  secondary
7   tertiary
8   tertiary
9   tertiary
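One caveat with the approach above: any value that is not a key in d1 stays in its uppercased form, because replace is applied to the result of str.upper(). If the column may contain other values, a possible variation is to look up the uppercased text with Series.map and fall back to the original text where no match is found. The sketch below assumes an extra, hypothetical value 'other' and a hypothetical output column named 'clean', neither of which is in the question:

```python
import pandas as pd

# 'other' is a made-up value added only to show the fallback behaviour.
df = pd.DataFrame({'text': ['p', 'S', 'primary', 'PRI', 'SECONDARY',
                            'SEC', 'S', 'TERTIARY', 'T', 'third', 'other']})

d = {'primary': ['P', 'PRIMARY', 'PRI'],
     'secondary': ['S', 'SECONDARY', 'SEC'],
     'tertiary': ['T', 'TERTIARY', 'THIRD']}

# Flatten: each variant becomes a key pointing to its canonical label.
d1 = {variant: label for label, variants in d.items() for variant in variants}

# map() yields NaN for keys missing from d1; fillna() restores the original text.
df['clean'] = df['text'].str.upper().map(d1).fillna(df['text'])
print(df)
```

With this variation the unmatched value keeps its original casing, whereas str.upper().replace(d1) would leave it as 'OTHER'.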