I have a dataframe as shown below:
import pandas as pd

df = pd.DataFrame()
df['text'] = ['p', 'S', 'primary', 'PRI', 'SECONDARY', 'SEC', 'S', 'TERTIARY', 'T', 'third']
I would like to replace several lists of values, as shown below:
a) Replace P, PRIMARY, PRI with primary
b) Replace S, SECONDARY, SEC with secondary
c) Replace T, TERTIARY, THIRD with tertiary
I tried the following:
df['text'] = df['text'].replace(['P','PRIMARY','PRI'],'primary')
df['text'] = df['text'].replace(['S','SECONDARY','SEC'],'secondary')
df['text'] = df['text'].replace(['T','TERTIARY','THIRD'],'tertiary')
Is there a more efficient and elegant way to write this, ideally in a single line?
I expect my output to look like this:
        text
0    primary
1  secondary
2    primary
3    primary
4  secondary
5  secondary
6  secondary
7   tertiary
8   tertiary
9   tertiary
Answer
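One way to avoid multiple replace calls is to use a dictionary that maps each target label to its list of variants, then flatten it into a second dict whose keys are the individual variants. So that the lowercase and mixed-case values also match, convert the column to uppercase with Series.str.upper first: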
d = {'primary': ['P','PRIMARY','PRI'],
     'secondary': ['S','SECONDARY','SEC'],
     'tertiary': ['T','TERTIARY','THIRD']}

d1 = {x: k for k, v in d.items() for x in v}
df['text'] = df['text'].str.upper().replace(d1)
print (df)

        text
0    primary
1  secondary
2    primary
3    primary
4  secondary
5  secondary
6  secondary
7   tertiary
8   tertiary
9   tertiary
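One thing to be aware of with the approach above: any value that is not a key in the flattened dict is left as-is by replace, which means it stays in its uppercased form. If the column may contain values outside the three groups, a possible variant is Series.map followed by fillna, which puts the original text back for unmatched rows. This is only a sketch; the extra 'Other' row is a hypothetical value added for illustration and is not part of the question's data.

```python
import pandas as pd

# Same data as in the question, plus a hypothetical unmatched value ('Other').
df = pd.DataFrame({'text': ['p', 'S', 'primary', 'PRI', 'SECONDARY', 'SEC',
                            'S', 'TERTIARY', 'T', 'third', 'Other']})

d = {'primary': ['P', 'PRIMARY', 'PRI'],
     'secondary': ['S', 'SECONDARY', 'SEC'],
     'tertiary': ['T', 'TERTIARY', 'THIRD']}

# Flatten: each variant becomes a key pointing at its canonical label.
d1 = {x: k for k, v in d.items() for x in v}

# map() yields NaN for values missing from d1;
# fillna() then restores the original (non-uppercased) text for those rows.
df['text'] = df['text'].str.upper().map(d1).fillna(df['text'])
print(df)
```

With replace, the last row would come out as 'OTHER'; with map and fillna it keeps its original spelling, 'Other'.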