I have a DataFrame, df, in which I would like to replace several values:
| user1 | user2 | user3 | 
|---|---|---|
| apple | yoo | apple | 
| mango | ram | mango | 
Instead of doing

    df['user1'] = df['user1'].replace(['apple', 'mango'], [0, 1])
    df['user3'] = df['user3'].replace(['apple', 'mango'], [0, 1])
    df['user2'] = df['user2'].replace(['yoo', 'ram'], [2, 3])

to get the final DataFrame of
| user1 | user2 | user3 | 
|---|---|---|
| 0 | 2 | 0 | 
| 1 | 3 | 1 | 
Is there any way I can make the code above more efficient, so that I can change the values of apple, mango, yoo and ram with one line of code?
Answer
If you need to number the unique values across the columns with consecutive integers (in order of first appearance, column by column), use:

    cols = ['user1', 'user2', 'user3']
    s = df[cols].unstack()
    df[cols] = pd.Series(pd.factorize(s)[0], index=s.index).unstack(0)
    print(df)

       user1  user2  user3
    0      0      2      0
    1      1      3      1
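
For reference, here is a self-contained sketch of the approach above that rebuilds the sample frame from the question and keeps the labels returned by `pd.factorize`, so you can see which value received which code. It also shows a possible alternative for when you want to choose the codes yourself rather than rely on order of appearance: an explicit dictionary applied with `Series.map`. The names `mapping`, `factorized` and `mapped` are illustrative and not part of the original question or answer.

```python
import pandas as pd

# Sample data taken from the question
df = pd.DataFrame({
    "user1": ["apple", "mango"],
    "user2": ["yoo", "ram"],
    "user3": ["apple", "mango"],
})
cols = ["user1", "user2", "user3"]

# Approach from the answer: flatten the columns into one Series,
# factorize it, then reshape back to the original layout.
s = df[cols].unstack()
codes, uniques = pd.factorize(s)  # uniques holds the labels in code order
factorized = pd.Series(codes, index=s.index).unstack(0)

# Alternative sketch (assumption: you want to pick the codes explicitly).
# Values missing from the mapping would become NaN with Series.map.
mapping = {"apple": 0, "mango": 1, "yoo": 2, "ram": 3}
mapped = df[cols].apply(lambda col: col.map(mapping))

print(list(uniques))
print(factorized)
print(mapped)
```

With this sample data both approaches give the same result, because the order of first appearance happens to match the codes requested in the question. With other data the factorized codes depend on column and row order, so the explicit mapping is the safer choice when specific numbers matter.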