Skip to content
Advertisement

how can I apply label encoding by using a dataframe without using labelencoder etc. in Python?

I have two data frames: first:

l1=[123,345,546,245]
l2=["a","a","b","b"]
l3=["j","k","l","pp"]
l4=["m","n","m","n"]

df1=pd.DataFrame(list(zip(l1,l2,l3,l4)),columns=['id','X','Y','Z'])
df1.head()

enter image description here

second:

l1=["X","X","Y","Y","Y","Y","Z","Z"]
l2=["a","b","j","k","l","pp","m","n"]
l3=["1","2","1","2","3","4","1","2"]

df2=pd.DataFrame(list(zip(l1,l2,l3)),columns=["labelnames","levels","labels"])
df2

enter image description here

I need to do label encoding to the first data frame values by using the second data frame. I tried converting the second data frame to a dictionary and then replacing the first data frame values.

such as;

 dict= {"X" : { a:1 , b:2}
     "Y": {j:1,k:2,l:3, pp:4},
      "Z" : {m:1,n:2}}

I couldn’t print the proper dictionary format.

I want to see the output like:

l1=[123,345,546,245]
l2=["1","1","2","2"]
l3=["1","2","3","4"]
l4=["1","2","1","2"]

df1=pd.DataFrame(list(zip(l1,l2,l3,l4)),columns=['id','X','Y','Z'])
df1.head()

enter image description here

So, How can I replace the first data frame values( X, Y, Z) with second data frame labels?

Advertisement

Answer

here is one way to do it, using melt, and pivot

# melt the dataframe to make it columner
# merge with the reference dataframe
# finally pivot to get back to the original format
(df.melt(id_vars=['id'], var_name='key' ).merge(df2,
                                                left_on=['key','value'],
                                                right_on=['labelnames','levels'],
                                                how='left'
                                               ) 
 .pivot(index='id', columns='key', values='labels')
 .reset_index()
 .rename_axis(columns=None)
)
         id     X   Y   Z
    0   123     1   1   1
    1   245     2   4   2
    2   345     1   2   2
    3   546     2   3   1

User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement