“Not enough values to unpack” in sklearn.fit

Question

Here's the piece of the code:

This says:

The train and test datasets had been prepared before, and they behave nicely with other classifiers. Such a generic error message tells me nothing. What is the problem here?

Accepted Answer

In short, the issue was that you passed the result of skf.split(titanic_dataset, surv_titanic) to the cv argument of LogisticRegressionCV, when you needed to pass StratifiedKFold(n_splits=5) directly instead.

Below I show the code that reproduced your error, followed by two alternative methods that accomplish what I believe you were trying to do.

    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegressionCV
    from sklearn.model_selection import StratifiedKFold

    # Some example data
    data = load_breast_cancer()
    X = data['data']
    y = data['target']

    # Set up the StratifiedKFold
    skf = StratifiedKFold(n_splits=5)

    # Don't do this... only here to reproduce the error
    skf_indices = skf.split(X, y)

    # Some regularization
    ls_1 = np.logspace(-1.0, 2.0, num=5)

    # This creates your error
    clf_error = LogisticRegressionCV(Cs=ls_1,
                                     cv=skf_indices,
                                     scoring="roc_auc",
                                     n_jobs=-1,
                                     random_state=17)

    # Error created by passing result of skf.split to cv
    clf_model = clf_error.fit(X, y)

    # This is probably what you meant to do
    clf_using_skf = LogisticRegressionCV(Cs=ls_1,
                                         cv=skf,
                                         scoring="roc_auc",
                                         n_jobs=-1,
                                         random_state=17,
                                         max_iter=1_000)

    # This will now fit without the error
    clf_model_skf = clf_using_skf.fit(X, y)

    # This is the easiest method, and from the docs also does the
    # same thing as StratifiedKFold
    # https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegressionCV.html
    clf_easiest = LogisticRegressionCV(Cs=ls_1,
                                       cv=5,
                                       scoring="roc_auc",
                                       n_jobs=-1,
                                       random_state=17,
                                       max_iter=1_000)

    # This will now fit without the error
    clf_model_easiest = clf_easiest.fit(X, y)
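One plausible reason the splitter object is the safer thing to hand to cv (this is my reading, not something the question confirms) is that skf.split(...) returns a generator, and a generator can only be walked once. The sketch below uses a small synthetic dataset and the hypothetical names X_demo, y_demo, skf_demo and clf_demo, none of which come from the question. It shows the generator running dry on the second pass, and that a list of the index pairs can be passed as cv if you really do need explicit indices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV
from sklearn.model_selection import StratifiedKFold

# Small synthetic dataset (an assumption, used only for illustration)
X_demo, y_demo = make_classification(n_samples=200, n_features=10,
                                     random_state=17)

skf_demo = StratifiedKFold(n_splits=5)

# split() returns a generator, which can be iterated only once
split_gen = skf_demo.split(X_demo, y_demo)
first_pass = list(split_gen)    # consumes the generator
second_pass = list(split_gen)   # nothing left to yield
print(len(first_pass), len(second_pass))

# The splitter object itself can produce the folds again on demand
fresh_pass = list(skf_demo.split(X_demo, y_demo))
print(len(fresh_pass))

# If explicit indices are needed, pass a materialised list of
# (train, test) index pairs rather than the one-shot generator
clf_demo = LogisticRegressionCV(Cs=np.logspace(-1.0, 2.0, num=5),
                                cv=first_pass,
                                scoring="roc_auc",
                                max_iter=1_000)
clf_demo.fit(X_demo, y_demo)
```

Passing cv=skf_demo or cv=5 remains simpler. The list form is only worth it when the same folds have to be reused across several estimators.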

Answer