What is the difference between MinMaxScaler() and StandardScaler()?
mms = MinMaxScaler(feature_range = (0, 1))
(Used in a machine learning model)
sc = StandardScaler()
(Another machine learning model used StandardScaler instead of MinMaxScaler.)
Answer
From the scikit-learn site:
StandardScaler removes the mean and scales the data to unit variance. However, the outliers have an influence when computing the empirical mean and standard deviation which shrink the range of the feature values as shown in the left figure below. Note in particular that because the outliers on each feature have different magnitudes, the spread of the transformed data on each feature is very different: most of the data lie in the [-2, 4] range for the transformed median income feature while the same data is squeezed in the smaller [-0.2, 0.2] range for the transformed number of households. StandardScaler therefore cannot guarantee balanced feature scales in the presence of outliers.
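To make the StandardScaler behaviour concrete, here is a minimal sketch. It assumes scikit-learn and NumPy are installed, and the toy array `X` is made up for illustration (it is not the housing data from the quoted example). Each column gets one outlier of a different magnitude.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: two features, the last row is an outlier in both,
# but the outlier is far more extreme in the second feature.
X = np.array([
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
    [4.0, 40.0],
    [100.0, 100000.0],
])

sc = StandardScaler()
X_std = sc.fit_transform(X)

# After scaling, every column has mean 0 and unit variance.
col_means = X_std.mean(axis=0)
col_stds = X_std.std(axis=0)

# Spread of the inliers (all rows except the outlier) after scaling.
# The outlier inflates each column's standard deviation by a different
# amount, so the inliers end up in bands of very different width.
inlier_spread = X_std[:-1].max(axis=0) - X_std[:-1].min(axis=0)
print(inlier_spread)
```

Both columns end up with zero mean and unit variance, yet the inliers of the second column occupy a much narrower band than those of the first, which is the imbalance the quoted passage describes.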
MinMaxScaler rescales the data set such that all feature values are in the range [0, 1] as shown in the right panel below. However, this scaling compresses all inliers into the narrow range [0, 0.005] for the transformed number of households.
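The same effect can be sketched for MinMaxScaler. Again this assumes scikit-learn and NumPy are available, and the one-column array `X` is a made-up example rather than the data set from the quoted page.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data: four inliers and one large outlier.
X = np.array([[1.0], [2.0], [3.0], [4.0], [1000.0]])

mms = MinMaxScaler(feature_range=(0, 1))
X_mm = mms.fit_transform(X)

# The minimum maps to 0 and the maximum (the outlier) maps to 1.
scaled_min = X_mm.min()
scaled_max = X_mm.max()

# Largest scaled value among the inliers: they are squeezed close to 0
# because the outlier defines the top of the range.
inlier_max = X_mm[:-1].max()
print(X_mm.ravel())
```

Here the output is bounded to [0, 1] as promised, but the four inliers all land very close to 0. In short, StandardScaler centers and scales by mean and standard deviation without bounding the output, while MinMaxScaler maps the observed minimum and maximum to a fixed range; both are sensitive to outliers.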