Here’s the relevant piece of code:
    from sklearn.model_selection import StratifiedKFold
    from sklearn.linear_model import LogisticRegressionCV

    skf = StratifiedKFold(n_splits=5)
    skf_1 = skf.split(titanic_dataset, surv_titanic)
    ls_1 = np.logspace(-1.0, 2.0, num=500)
    clf = LogisticRegressionCV(Cs=ls_1, cv=skf_1, scoring="roc_auc", n_jobs=-1, random_state=17)
    clf_model = clf.fit(x_train, y_train)
Calling fit raises:
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-130-b99a5912ff5a> in <module>
    ----> 1 clf_model = clf.fit(x_train, y_train)

    H:\Anaconda_3\lib\site-packages\sklearn\linear_model\_logistic.py in fit(self, X, y, sample_weight)
       2098         # (n_classes, n_folds, n_Cs . n_l1_ratios) or
       2099         # (1, n_folds, n_Cs . n_l1_ratios)
    -> 2100         coefs_paths, Cs, scores, n_iter_ = zip(*fold_coefs_)
       2101         self.Cs_ = Cs[0]
       2102         if multi_class == 'multinomial':

    ValueError: not enough values to unpack (expected 4, got 0)
The train and test datasets were prepared beforehand, and they work fine with other classifiers.
Such a generic error message tells me nothing. What is the problem here?
Answer
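In short, the issue is that you passed the result of skf.split(titanic_dataset, surv_titanic) to the cv argument of LogisticRegressionCV, when you needed to pass the StratifiedKFold(n_splits=5) object itself. skf.split returns a one-shot generator of index arrays, whereas cv expects an integer or a splitter object, which the estimator then applies to the data given to fit.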
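To see how an already-consumed split generator could lead to exactly this message, here is a minimal sketch. It relies only on the line visible in the traceback (coefs_paths, Cs, scores, n_iter_ = zip(*fold_coefs_)); the toy arrays X_toy and y_toy are hypothetical, and this is an illustration of the failure mode, not a trace through scikit-learn's internals.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical toy data: 20 samples, two balanced classes
X_toy = np.arange(40).reshape(20, 2)
y_toy = np.array([0, 1] * 10)

skf = StratifiedKFold(n_splits=5)
split_gen = skf.split(X_toy, y_toy)  # a generator, not a reusable splitter

first_pass = list(split_gen)   # consumes the generator: one (train, test) pair per fold
second_pass = list(split_gen)  # nothing left to yield, so this is an empty list

print(len(first_pass), len(second_pass))

# Unpacking zip(*[]) into four names fails the same way as in the traceback
try:
    coefs_paths, Cs, scores, n_iter_ = zip(*second_pass)
except ValueError as exc:
    message = str(exc)
    print(message)
```

A StratifiedKFold object, by contrast, can produce fresh splits every time split is called, which is why passing the object itself is the safer choice.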
The code below first reproduces your error, then shows two alternatives that accomplish what I believe you were trying to do.
    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegressionCV
    from sklearn.model_selection import StratifiedKFold

    # Some example data
    data = load_breast_cancer()
    X = data['data']
    y = data['target']

    # Set up the StratifiedKFold
    skf = StratifiedKFold(n_splits=5)

    # Don't do this... only here to reproduce the error
    skf_indices = skf.split(X, y)

    # Some regularization
    ls_1 = np.logspace(-1.0, 2.0, num=5)

    # This creates your error
    clf_error = LogisticRegressionCV(Cs=ls_1, cv=skf_indices, scoring="roc_auc",
                                     n_jobs=-1, random_state=17)

    # Error created by passing result of skf.split to cv
    # (caught here so the rest of the script still runs)
    try:
        clf_model = clf_error.fit(X, y)
    except ValueError as err:
        print(err)

    # This is probably what you meant to do
    clf_using_skf = LogisticRegressionCV(Cs=ls_1, cv=skf, scoring="roc_auc",
                                         n_jobs=-1, random_state=17, max_iter=1_000)

    # This will now fit without the error
    clf_model_skf = clf_using_skf.fit(X, y)

    # This is the easiest method, and from the docs also does the
    # same thing as StratifiedKFold
    # https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegressionCV.html
    clf_easiest = LogisticRegressionCV(Cs=ls_1, cv=5, scoring="roc_auc",
                                       n_jobs=-1, random_state=17, max_iter=1_000)

    # This will now fit without the error
    clf_model_easiest = clf_easiest.fit(X, y)
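The claim that an integer cv behaves like StratifiedKFold can be checked with the public helper sklearn.model_selection.check_cv. This is only a sketch: it assumes LogisticRegressionCV resolves an integer cv the same way this helper does for a classifier, and the toy arrays X_toy and y_toy are hypothetical.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, check_cv

# Hypothetical toy data: 20 samples, two balanced classes
X_toy = np.arange(40).reshape(20, 2)
y_toy = np.array([0, 1] * 10)

# Resolve cv=5 for a classification target
resolved = check_cv(5, y_toy, classifier=True)
is_stratified = isinstance(resolved, StratifiedKFold)

# Compare fold by fold against an explicitly constructed splitter
explicit = StratifiedKFold(n_splits=5)
same_folds = all(
    np.array_equal(train_a, train_b) and np.array_equal(test_a, test_b)
    for (train_a, test_a), (train_b, test_b) in zip(
        resolved.split(X_toy, y_toy), explicit.split(X_toy, y_toy)
    )
)

print(is_stratified, resolved.get_n_splits(), same_folds)
```

If the assumption holds, cv=5 and cv=StratifiedKFold(n_splits=5) should yield the same folds, so the integer form is simply the shorter way to write it.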