SGDRegressor() constantly not increasing validation performance

Question

The fit of my SGDRegressor won't increase or decrease its performance on the validation set (test) after around 20,000 training records. Even if I switch penalty, early_stopping (True/False), or set alpha and eta0 to extremely high or low values, the "stuck" validation score on test does not change. I used StandardScaler and shuffled the data for

Accepted Answer

EDIT: Regarding your output, my guess is that your results are so close for the validation set because a linear model like SGDRegressor tends to underfit on complex data. To see this, you can check the weights output by the model at every iteration. You'll see that they are the same or very close. To get more variability in the output you need to introduce non-linearity and complexity. What you are observing is referred to as "bias" in machine learning (as opposed to "variance").

I think I got it now, SamAmani. In the end I think the problem is underfitting, combined with the fact that you are using incremental sizes of the dataset. The model underfits quite quickly (which means it settles on a more or less fixed model early on). Only the first training run gives a different result on the test set, because it hasn't reached the final model yet, more or less. The underlying variability is in the incremental training sets. Simply speaking, the test results are a more accurate estimate of the performance of the underfitted model, and adding training samples will eventually bring the test and training results close together without improving them much. You can check that it is the incremental training subsets that differ from the test set. What you did wrong was to check the statistics on the whole training set.

First of all, why are you training on incremental training set sizes? The strange results are due to the fact that you are training on your dataset in an incremental fashion. When you do this:

    for m in my_rng:
        modelSGD = SGDRegressor(alpha=0.00001, penalty='l1')
        modelSGD.fit(X_train[:m], y_train[:m])
        [...]

you are basically training your model in an incremental fashion, with these incremental sizes:

    for m in range(10, 180001, 30000):
        print(m)

    10
    30010
    60010
    90010
    120010
    150010

If you are trying to do mini-batch gradient descent, you should split your dataset into independent batches instead of making incremental batches.
Something like this:

    previous = 0
    for m in range(30000, 180001, 30000):
        modelSGD.partial_fit(X_train[previous:m], y_train[previous:m])
        previous = m

    # training set ranges
    0 30000
    30000 60000
    60000 90000
    90000 120000
    120000 150000
    150000 180000

Also note that I am using the partial_fit method instead of fit (because I am not retraining the model from scratch, only taking one more step of gradient descent on each batch), and I am not initializing a new model every time (the SGDRegressor is created outside the for loop). The full code should be something like this:

    from sklearn.linear_model import SGDRegressor
    from sklearn.metrics import mean_squared_error

    scores_train, scores_test = [], []
    previous = 0
    modelSGD = SGDRegressor(alpha=0.00001, penalty='l1')
    # Start at the end of the first batch so that no slice is empty.
    # If len(X_train) is not a multiple of 30000, the trailing partial batch is skipped.
    for m in range(30000, len(X_train) + 1, 30000):
        modelSGD.partial_fit(X_train[previous:m], y_train[previous:m])
        ypred_train = modelSGD.predict(X_train[previous:m])
        ypred_test = modelSGD.predict(X_test)
        mse_train = mean_squared_error(y_train[previous:m], ypred_train)
        mse_test = mean_squared_error(y_test, ypred_test)
        scores_train.append(mse_train)
        scores_test.append(mse_test)
        previous = m  # move the window to the next independent batch

In this way you are simulating one epoch of mini-batch stochastic gradient descent. To run more epochs, an outer loop is needed.

From sklearn:

    "SGD allows minibatch (online/out-of-core) learning via the partial_fit method. For best results using the default learning rate schedule, the data should have zero mean and unit variance."

Details are in the scikit-learn documentation.
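The outer loop for several epochs could look roughly like the sketch below. It is not the asker's setup: the data comes from make_regression as a stand-in, and batch_size, n_epochs, the per-epoch reshuffling, and random_state are all assumptions chosen only to make the example self-contained.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption; replace with your own X and y).
X, y = make_regression(n_samples=6000, n_features=10, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Standardize features, fitting the scaler on the training data only.
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

batch_size = 500  # hypothetical value
n_epochs = 5      # hypothetical value
rng = np.random.default_rng(0)

# The model is created once, outside both loops, so its weights carry over.
model = SGDRegressor(alpha=0.00001, penalty='l1', random_state=0)
scores_test = []

for epoch in range(n_epochs):
    # Reshuffle the sample order at the start of every epoch.
    order = rng.permutation(len(X_train))
    for start in range(0, len(X_train), batch_size):
        idx = order[start:start + batch_size]
        # One pass of SGD over this independent batch.
        model.partial_fit(X_train[idx], y_train[idx])
    # Record the test MSE once per epoch.
    scores_test.append(mean_squared_error(y_test, model.predict(X_test)))

print(scores_test)
```

If the test MSE is already flat after the first epoch, that is consistent with the underfitting explanation above: a linear model converges quickly and more passes over the data do not add capacity.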

Answer