How to reverse Label Encoder from sklearn for multiple columns?

Question

I would like to use the inverse_transform function of LabelEncoder on multiple columns. This is the code I use to apply LabelEncoder to more than one column of a dataframe: Is there a way to modify the code so that it can be used to invert the labels from the encoder? Thanks

Accepted Answer

In order to inverse transform the data, you need to remember the encoders that were used to transform every column. A possible way to do this is to save the LabelEncoders in a dict inside your object. The way it would work:

- when you call fit, the encoders for every column are fit and saved
- when you call transform, they are used to transform the data
- when you call inverse_transform, they are used to do the inverse transformation

Example code:

    import pandas as pd
    from sklearn.preprocessing import LabelEncoder

    class MultiColumnLabelEncoder:

        def __init__(self, columns=None):
            self.columns = columns  # array of column names to encode

        def fit(self, X, y=None):
            # fit one LabelEncoder per column and keep it for later use
            self.encoders = {}
            columns = X.columns if self.columns is None else self.columns
            for col in columns:
                self.encoders[col] = LabelEncoder().fit(X[col])
            return self

        def transform(self, X):
            output = X.copy()
            columns = X.columns if self.columns is None else self.columns
            for col in columns:
                output[col] = self.encoders[col].transform(X[col])
            return output

        def fit_transform(self, X, y=None):
            return self.fit(X, y).transform(X)

        def inverse_transform(self, X):
            # reuse the saved encoders to map the codes back to labels
            output = X.copy()
            columns = X.columns if self.columns is None else self.columns
            for col in columns:
                output[col] = self.encoders[col].inverse_transform(X[col])
            return output

You can then use it like this:

    multi = MultiColumnLabelEncoder(columns=['city', 'size'])
    df = pd.DataFrame({'city':     ['London', 'Paris', 'Moscow'],
                       'size':     ['M',      'M',     'L'],
                       'quantity': [12,       1,       4]})
    X = multi.fit_transform(df)
    print(X)
    #    city  size  quantity
    # 0     0     1        12
    # 1     2     1         1
    # 2     1     0         4
    inv = multi.inverse_transform(X)
    print(inv)
    #      city size  quantity
    # 0  London    M        12
    # 1   Paris    M         1
    # 2  Moscow    L         4

There could be a separate implementation of fit_transform that would call the same method of LabelEncoders.
Just make sure to keep the encoders around for when you need the inverse transformation.
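
As a rough illustration of those last two remarks, here is a minimal sketch that calls LabelEncoder.fit_transform directly for each column and then keeps the fitted encoders around by serializing them. The helper names fit_transform_columns and inverse_transform_columns are hypothetical, and using pickle is only one assumed way of storing the encoders; it is not part of the answer above.

```python
import pickle

import pandas as pd
from sklearn.preprocessing import LabelEncoder


def fit_transform_columns(df, columns):
    """Encode the given columns; return the encoded copy and the fitted encoders."""
    encoders = {}
    output = df.copy()
    for col in columns:
        encoders[col] = LabelEncoder()
        # LabelEncoder.fit_transform fits and encodes in a single call
        output[col] = encoders[col].fit_transform(df[col])
    return output, encoders


def inverse_transform_columns(df, encoders):
    """Map the encoded columns back to their original labels."""
    output = df.copy()
    for col, encoder in encoders.items():
        output[col] = encoder.inverse_transform(df[col])
    return output


df = pd.DataFrame({'city': ['London', 'Paris', 'Moscow'],
                   'size': ['M', 'M', 'L'],
                   'quantity': [12, 1, 4]})

encoded, encoders = fit_transform_columns(df, ['city', 'size'])

# Keep the encoders around: here they are serialized to bytes and restored,
# standing in for saving them in one script and loading them in another.
restored = pickle.loads(pickle.dumps(encoders))

decoded = inverse_transform_columns(encoded, restored)
```

Only load pickled encoders from a source you trust, and preferably with the same scikit-learn version that created them.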
