I have a dataframe as shown below:

    import pandas as pd

    df = pd.DataFrame()
    df['text'] = ['p', 'S', 'primary','PRI','SECONDARY', 'SEC', 'S', 'TERTIARY','T','third']

I would like to replace lists of values as shown below:
a) Replace P, PRIMARY, PRI with primary
b) Replace S, SECONDARY, SEC with secondary
c) Replace T, TERTIARY, THIRD with tertiary
I tried the following:

    df['text'] = df['text'].replace(['P','PRIMARY','PRI'],'primary')
    df['text'] = df['text'].replace(['S','SECONDARY','SEC'],'secondary')
    df['text'] = df['text'].replace(['T','TERTIARY','THIRD'],'tertiary')

But is there a more efficient and elegant way to write this in a single line?
I expect my output to be as shown below:

            text
    0    primary
    1  secondary
    2    primary
    3    primary
    4  secondary
    5  secondary
    6  secondary
    7   tertiary
    8   tertiary
    9   tertiary

Answer
One way to avoid multiple replace calls is to use a dictionary and flatten it into another dict whose keys are the values from the lists. So that lowercase entries also match, convert the column to uppercase with Series.str.upper first:

    d = {'primary': ['P','PRIMARY','PRI'],
         'secondary':['S','SECONDARY','SEC'],
         'tertiary':['T','TERTIARY','THIRD']}

    # flatten: every value in each list becomes a key pointing to its replacement
    d1 = {x: k for k, v in d.items() for x in v}
    df['text'] = df['text'].str.upper().replace(d1)
    print (df)
            text
    0    primary
    1  secondary
    2    primary
    3    primary
    4  secondary
    5  secondary
    6  secondary
    7   tertiary
    8   tertiary
    9   tertiary
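
One thing to be aware of with the dictionary approach above: any value that is not in one of the lists stays uppercased, because `str.upper()` is applied to the whole column before `replace`. If that matters, a possible variation is to look the values up with `Series.map` and fall back to the original text where nothing matched. This is only a sketch under that assumption; the helper name `normalize_levels` and the sample value `'other'` are made up for illustration and are not part of the question.

```python
import pandas as pd

# Same grouping as in the answer: replacement label -> accepted spellings.
groups = {
    'primary': ['P', 'PRIMARY', 'PRI'],
    'secondary': ['S', 'SECONDARY', 'SEC'],
    'tertiary': ['T', 'TERTIARY', 'THIRD'],
}

# Flatten to spelling -> replacement label, as d1 does in the answer.
lookup = {alias: label for label, aliases in groups.items() for alias in aliases}


def normalize_levels(s: pd.Series) -> pd.Series:
    """Hypothetical helper: map known spellings, keep unknown values as they were."""
    # map() yields NaN for keys missing from the dict ...
    mapped = s.str.upper().map(lookup)
    # ... so fill those positions from the original, un-uppercased series.
    return mapped.fillna(s)


sample = pd.Series(['p', 'SEC', 'third', 'other'])  # 'other' is an assumed unmatched value
result = normalize_levels(sample)
print(result.tolist())
```

On the dataframe from the question this would be used as `df['text'] = normalize_levels(df['text'])`, and for that data it should give the same result as the `replace` version, since every value there belongs to one of the three groups.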