
Sklearn – Best estimator from GridSearchCV with refit = True

I’m trying to find the best estimator using GridSearchCV, and I’m using refit=True, which is the default. The documentation states:

The refitted estimator is made available at the best_estimator_ attribute and permits using predict directly on this GridSearchCV instance

Should I call .fit on the training data and then predict, like this:

    classifier = GridSearchCV(estimator=model, param_grid=parameter_grid['param_grid'],
                              scoring='balanced_accuracy', cv=5, verbose=3, n_jobs=4,
                              return_train_score=True, refit=True)

    classifier.fit(x_training, y_train_encoded_local)

    predictions = classifier.predict(x_testing)

    balanced_error = balanced_accuracy_score(y_true=y_test_encoded_local, y_pred=predictions)

Or should I skip .fit and call predict directly, like this:

    classifier = GridSearchCV(estimator=model, param_grid=parameter_grid['param_grid'],
                              scoring='balanced_accuracy', cv=5, verbose=3, n_jobs=4,
                              return_train_score=True, refit=True)

    predictions = classifier.predict(x_testing)

    balanced_error = balanced_accuracy_score(y_true=y_test_encoded_local, y_pred=predictions)


Answer

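You should do it like your first version. You always need to call classifier.fit: constructing the GridSearchCV object only stores the settings, so until fit is called no cross-validation or training has happened, and classifier.predict raises an error because there is no fitted estimator. refit=True means that once cross-validation is done, the estimator is retrained on the entire training set with the best parameters found. That refitted model is stored in best_estimator_ and is what classifier.predict uses.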
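To make this concrete, here is a minimal sketch of the behaviour described above. The synthetic dataset, the LogisticRegression model, the parameter grid, and all variable names are assumptions chosen for illustration; they are not taken from the question. It checks three things: predict fails before fit, predict on the search object matches best_estimator_.predict after fit, and refitting by hand on the full training set with best_params_ gives the same predictions.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Hypothetical toy data standing in for x_training / x_testing in the question.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Hypothetical model and grid; refit=True is the default but is spelled out here.
base_model = LogisticRegression(max_iter=1000)
search = GridSearchCV(
    estimator=base_model,
    param_grid={"C": [0.1, 1.0, 10.0]},
    scoring="balanced_accuracy",
    cv=5,
    refit=True,
)

# Before fit: nothing has been trained, so predict cannot work.
try:
    search.predict(X_test)
    predict_before_fit_failed = False
except NotFittedError:
    predict_before_fit_failed = True

# fit runs the cross-validated search, then refits the best configuration
# on all of X_train / y_train.
search.fit(X_train, y_train)

# After fit: predict on the search object delegates to best_estimator_.
predictions = search.predict(X_test)
same_as_best = bool((predictions == search.best_estimator_.predict(X_test)).all())

# Refitting by hand with the best parameters on the full training set
# should reproduce the same predictions for this deterministic model.
manual = clone(base_model).set_params(**search.best_params_).fit(X_train, y_train)
same_as_manual = bool((predictions == manual.predict(X_test)).all())

test_score = balanced_accuracy_score(y_true=y_test, y_pred=predictions)
print(predict_before_fit_failed, same_as_best, same_as_manual)
```

Note that test_score here is a balanced accuracy, not an error, so a higher value is better.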

User contributions licensed under: CC BY-SA