Scaling / Normalizing pandas column

Question

I have a dataframe like: I'd like to create a newly scaled column in the dataframe called SIZE, where SIZE is a number between 5 and 50. For example: I've tried but got: Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a sin…

Accepted Answer

Option 1: sklearn

You see this problem time and time again, and the error message really does tell you what you need to do. You're basically missing a superfluous dimension on the input: scikit-learn transformers expect a 2-D array, and df["TOTAL"] is a 1-D Series, whereas df[["TOTAL"]] is a one-column DataFrame. Change df["TOTAL"] to df[["TOTAL"]].

    df['SIZE'] = scaler.fit_transform(df[["TOTAL"]])
    df

       TOTAL   Name       SIZE
    0   3232   Jane  24.413959
    1    382   Jack  10.000000
    2   8291  Jones  50.000000

Option 2: pandas

Preferably, I would bypass sklearn and just do the min-max scaling myself.

    a, b = 10, 50
    x, y = df.TOTAL.min(), df.TOTAL.max()
    df['SIZE'] = (df.TOTAL - x) / (y - x) * (b - a) + a
    df

       TOTAL   Name       SIZE
    0   3232   Jane  24.413959
    1    382   Jack  10.000000
    2   8291  Jones  50.000000

This is essentially what the min-max scaler does, but without the overhead of importing scikit-learn (don't do it unless you have to; it's a heavy library).
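For anyone who wants to run this end to end, here is a minimal self-contained sketch of the pandas approach. The sample frame is reconstructed from the output printed in the answer, and the helper name min_max_scale is hypothetical, introduced only for illustration. It also prints the two shapes, which is the difference behind the reshape error.

```python
import pandas as pd

# Sample data reconstructed from the output printed in the answer (an assumption).
df = pd.DataFrame({"TOTAL": [3232, 382, 8291],
                   "Name": ["Jane", "Jack", "Jones"]})

# Single brackets give a 1-D Series; double brackets give a 2-D one-column DataFrame.
series_shape = df["TOTAL"].shape    # 1-D
frame_shape = df[["TOTAL"]].shape   # 2-D, the form scikit-learn expects


def min_max_scale(s: pd.Series, a: float, b: float) -> pd.Series:
    """Hypothetical helper: linearly map s from [s.min(), s.max()] onto [a, b]."""
    lo, hi = s.min(), s.max()
    # Note: if every value is equal, hi - lo is 0 and the result is NaN.
    return (s - lo) / (hi - lo) * (b - a) + a


# Same target range as the answer (10 to 50).
df["SIZE"] = min_max_scale(df["TOTAL"], 10, 50)

print(series_shape, frame_shape)
print(df)
```

Wrapping the arithmetic in a function makes the target range explicit, so changing the lower bound is a one-argument change, for instance min_max_scale(df["TOTAL"], 5, 50). A column where all values are identical would need separate handling, since the denominator becomes zero.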

Answer