I have two data frames: first:
JavaScript
x
8
1
l1=[123,345,546,245]
2
l2=["a","a","b","b"]
3
l3=["j","k","l","pp"]
4
l4=["m","n","m","n"]
5
6
df1=pd.DataFrame(list(zip(l1,l2,l3,l4)),columns=['id','X','Y','Z'])
7
df1.head()
8
second:
JavaScript
1
7
1
l1=["X","X","Y","Y","Y","Y","Z","Z"]
2
l2=["a","b","j","k","l","pp","m","n"]
3
l3=["1","2","1","2","3","4","1","2"]
4
5
df2=pd.DataFrame(list(zip(l1,l2,l3)),columns=["labelnames","levels","labels"])
6
df2
7
I need to do label encoding to the first data frame values by using the second data frame. I tried converting the second data frame to a dictionary and then replacing the first data frame values.
such as;
JavaScript
1
4
1
dict= {"X" : { a:1 , b:2}
2
"Y": {j:1,k:2,l:3, pp:4},
3
"Z" : {m:1,n:2}}
4
I couldn’t print the proper dictionary format.
I want to see the output like:
JavaScript
1
8
1
l1=[123,345,546,245]
2
l2=["1","1","2","2"]
3
l3=["1","2","3","4"]
4
l4=["1","2","1","2"]
5
6
df1=pd.DataFrame(list(zip(l1,l2,l3,l4)),columns=['id','X','Y','Z'])
7
df1.head()
8
So, How can I replace the first data frame values( X, Y, Z) with second data frame labels?
Advertisement
Answer
here is one way to do it, using melt, and pivot
JavaScript
1
13
13
1
# melt the dataframe to make it columner
2
# merge with the reference dataframe
3
# finally pivot to get back to the original format
4
(df.melt(id_vars=['id'], var_name='key' ).merge(df2,
5
left_on=['key','value'],
6
right_on=['labelnames','levels'],
7
how='left'
8
)
9
.pivot(index='id', columns='key', values='labels')
10
.reset_index()
11
.rename_axis(columns=None)
12
)
13
JavaScript
1
7
1
id X Y Z
2
0 123 1 1 1
3
1 245 2 4 2
4
2 345 1 2 2
5
3 546 2 3 1
6
7