Why doesn’t the MinMaxScaler change the sns.pairplot of the dataset?

Question

I'm trying to create a pairplot of my dataset, where the variables are on vastly different scales (some are in the 0-1 range; others, like age and Monthly Income, can go much higher). I want to scale the variables that go above 1 to the 0-1 range using the following code: My problem is that after scaling the vari…

Accepted Answer

The plots are expected to look almost the same; only the tick labels should differ. The scaler applies a linear transformation to each column, and seaborn chooses the axis limits from the range of the values, so the arrangement of points in the scatter plots does not change.

Since I do not have your data, here is the same effect with Ronald Fisher's classic iris dataset:

```python
import pandas as pd
import seaborn as sns; sns.set()
from sklearn.datasets import load_iris
from sklearn.preprocessing import MinMaxScaler

# Load iris as a DataFrame and attach the class labels for coloring
iris_dict = load_iris(as_frame=True)
iris = iris_dict['data']
iris['species'] = iris_dict['target']

# Pairplot of the unscaled data
g = sns.pairplot(iris, hue='species', diag_kws={'bw_method':0.2})

# Rescale the four measurement columns to the 0-1 range
scale_vars = ['sepal length (cm)', 'sepal width (cm)',
              'petal length (cm)', 'petal width (cm)']
scaler = MinMaxScaler(copy=False)
iris[scale_vars] = scaler.fit_transform(iris[scale_vars])

# Pairplot of the scaled data: same point layout, different tick labels
g = sns.pairplot(iris, hue='species', diag_kws={'bw_method':0.2})
```

Note that the column names should have been changed when the scaling was done, because the values are no longer in centimeters.
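To see why a linear rescaling leaves the picture unchanged, here is a minimal pure-Python sketch. It assumes the axis limits are set to the data range padded by a 5% margin on each side, which approximates matplotlib's automatic limits. The helper names `min_max_scale` and `axis_fraction` and the sample ages are made up for illustration; `min_max_scale` mirrors the per-column formula `(x - min) / (max - min)` that `MinMaxScaler` uses with its default `feature_range=(0, 1)`.

```python
def min_max_scale(values):
    """Map values linearly onto [0, 1] (hypothetical helper, same formula
    as MinMaxScaler's default per-column scaling)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def axis_fraction(values, margin=0.05):
    """Fraction of the axis width at which each point is drawn, assuming
    the limits are the data range padded by `margin` on each side."""
    lo, hi = min(values), max(values)
    pad = margin * (hi - lo)
    left, right = lo - pad, hi + pad
    return [(v - left) / (right - left) for v in values]


ages = [22, 35, 47, 58, 63]          # made-up sample values
scaled_ages = min_max_scale(ages)    # now between 0 and 1

before = axis_fraction(ages)         # positions on the unscaled axis
after = axis_fraction(scaled_ages)   # positions on the scaled axis

# The drawn positions agree up to floating-point rounding
same_layout = all(abs(b - a) < 1e-9 for b, a in zip(before, after))
print(same_layout)
```

Each point lands at the same fraction of the axis before and after scaling, so only the numbers printed on the ticks change. To make the difference in scale visible, the scaled and unscaled variables would have to share an axis with fixed limits.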

Answer