How to build a custom scaler based on StandardScaler?

Question

I am trying to build a custom scaler that scales only the continuous variables of a dataset (the US Adult Income: https://www.kaggle.com/uciml/adult-census-income), using StandardScaler as a base. Here is the Python code that I used: However, when I tried to run the scaler, I ran into this problem: So what is the error in how I built the scaler? And

Accepted Answer

I agree with @AntoineDubuis that ColumnTransformer is a better (built-in!) way to do this. That said, I'd like to address where your code goes wrong.

In fit, you have self.scaler.fit(X[self.columns], y); this indicates that self.columns should be a list of column names (or one of a few other valid options). But you've defined the parameter as continuous = df.iloc[:, np.r_[0,2,10:13]], which is a dataframe, not a list of names.

A couple of other issues:

- You should only set attributes in __init__ that come from its signature, or cloning will fail. Move the creation of self.scaler into fit, and store its parameters (copy, etc.) directly as attributes in __init__. Don't initialize mean_ or var_.
- You never actually use mean_ or var_. You can keep them if you want, but the relevant statistics are already stored in the scaler object.
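To make the points above concrete, here is a minimal sketch of what a corrected scaler might look like, next to the ColumnTransformer equivalent. It is not the asker's original code (which is not shown here): the class body, the toy dataframe, its column names and values, and the attribute name scaler_ (trailing underscore, following the scikit-learn convention for fitted attributes) are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler


class CustomScaler(BaseEstimator, TransformerMixin):
    """Standardize only the listed columns; leave the others untouched."""

    def __init__(self, columns, copy=True, with_mean=True, with_std=True):
        # Store only the constructor arguments, unchanged, so clone() works.
        self.columns = columns
        self.copy = copy
        self.with_mean = with_mean
        self.with_std = with_std

    def fit(self, X, y=None):
        # The inner scaler is created at fit time, not in __init__.
        self.scaler_ = StandardScaler(
            copy=self.copy, with_mean=self.with_mean, with_std=self.with_std
        )
        self.scaler_.fit(X[self.columns], y)
        return self

    def transform(self, X):
        X_out = X.copy()
        X_out[self.columns] = self.scaler_.transform(X[self.columns])
        return X_out


# Toy data with made-up values; the column names are illustrative only.
df = pd.DataFrame({
    "age": [25.0, 38.0, 52.0, 41.0],
    "hours_per_week": [40.0, 50.0, 35.0, 45.0],
    "sex": ["Male", "Female", "Female", "Male"],
})

# A list of column names, not a dataframe slice such as df.iloc[...].
continuous = ["age", "hours_per_week"]

scaled = CustomScaler(columns=continuous).fit_transform(df)

# The built-in alternative: scale the same columns, pass the rest through.
# Note that the output is an array with the scaled columns placed first.
ct = ColumnTransformer(
    [("scale", StandardScaler(), continuous)], remainder="passthrough"
)
scaled_ct = ct.fit_transform(df)
```

If you start from positional indices as in the question, something like df.columns[np.r_[0, 2, 10:13]].tolist() should give the list of names to pass as columns. The fitted statistics are then available as scaler_.mean_ and scaler_.var_ on the fitted object, so there is no need to duplicate them on the wrapper.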

Answer