Can evaluation metrics for test set be better than training set?

I am doing some Logistic Regression homework.

I am wondering whether the evaluation metrics for the test set can ever be a bit better than those for the training set (as in my results below). If so, how large a gap is acceptable?

Below are my evaluation results for the test set and the training set; both sets were extracted from the same dataset.

EVALUATION METRICS FOR Test Dataset:
Confusion Matrix:
                 Predicted Negative  Predicted Positive
Actual Negative                  82                 20
Actual Positive                  10                 93

Accuracy = 0.8536585365853658
Precision = 0.8230088495575221
Recall = 0.9029126213592233
F1 score = 0.8380535530381049

EVALUATION METRICS FOR Training Dataset:
Confusion Matrix:
                 Predicted Negative  Predicted Positive
Actual Negative                 279                 70
Actual Positive                  44                324

Accuracy = 0.8410041841004184
Precision = 0.8223350253807107
Recall = 0.8804347826086957
F1 score = 0.8315648343229267

Answer

Usually this is not the case, but it is not impossible. If you randomly split your data into a test set and a training set, it can happen that the test data fit your model better than the training data. Imagine the extreme case below, where your data consist of values with different levels of noise added to them. If the test data points happen to be the ones with less noise, they will fit the model better. However, such a split into test and training data is very unlikely.

If the test score is just slightly above the training score, this is absolutely normal. If, by chance, the noise not captured by the model is slightly lower in the test data, the test samples will fit the model better. Actually, this is a good sign, because it means that you are not overfitting. You may even be able to increase overall performance by increasing the degrees of freedom of your model.
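
As a rough illustration (a minimal sketch using scikit-learn on a synthetic dataset, not your actual data), you can compare the training and test accuracy of a logistic regression directly; depending on the random split, the test score can come out slightly higher than the training score:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# synthetic binary classification data as a stand-in for the real dataset
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

print('train accuracy:', accuracy_score(y_train, model.predict(X_train)))
print('test accuracy: ', accuracy_score(y_test, model.predict(X_test)))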

If the test score is much higher than the training score, you should check whether the split between test and training data has been done in a reasonable way.
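
One way to check this (a sketch, assuming scikit-learn; X and y stand in for your full feature matrix and labels) is to repeat the train/test evaluation over several stratified splits. If the test score is consistently far above the training score, something about the split or the data is likely off:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate

# replace this synthetic data with your real X and y
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_validate(LogisticRegression(max_iter=1000), X, y,
                        cv=cv, return_train_score=True)
print('train accuracy per fold:', scores['train_score'])
print('test accuracy per fold: ', scores['test_score'])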

Plot of data and models (generated by the code below):

import numpy as np
import matplotlib.pyplot as plt

def f(x):
    # true underlying relationship
    return 2*x + 3

# extreme case: the training points get much more noise than the test points
x_test = np.random.rand(30)
x_training = np.random.rand(70)
y_test = f(x_test) + 1e-3 * np.random.randn(30)
y_training = f(x_training) + 0.2 * np.random.randn(70)

# fit a straight line to the (noisy) training data only
k, d = np.polyfit(x_training, y_training, 1)

x = np.linspace(0, 1)

plt.plot(x_training, y_training, 'o', label='training data')
plt.plot(x_test, y_test, 'o', label='test data')
plt.plot(x, k*x + d, 'k--', label='fitted model')
plt.plot(x, f(x), 'k:', label='real model')
plt.legend()
plt.show()
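
Continuing from the snippet above, evaluating the fitted line on both sets (mean squared error is just one possible metric here) shows the effect numerically: the nearly noise-free test points lie much closer to the model than the noisy training points.

# mean squared error of the fitted line on both sets
mse_training = np.mean((k*x_training + d - y_training)**2)
mse_test = np.mean((k*x_test + d - y_test)**2)
print('training MSE:', mse_training)
print('test MSE:    ', mse_test)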