Sklearn ROC AUC Score : ValueError: y should be a 1d array, got an array of shape (15, 2) instead

Question

I have this dataset with target LULUS, it's an imbalance dataset. I'm trying to print roc auc score if I could for each fold of my data but in every fold somehow it's always raise error saying ValueError: y should be a 1d array, got an array of shape (15, 2) instead.. I'm kind of confused which part I did

Accepted Answer

Your output from model.predict_proba() is a matrix with 2 columns, one for each class. To calculate roc, you need to provide the probability of the positive class:Using an example dataset:from sklearn.datasets import make_classificationfrom sklearn.ensemble import RandomForestClassifierfrom sklearn.metrics import roc_auc_scorefrom sklearn.model_selection import train_test_splitX, y = make_classification(n_classes=2)X_train, X_test, y_train, y_test = train_test_split(    X, y, test_size=0.33, random_state=42)rf = RandomForestClassifier()model = rf.fit(X_train, y_train)y_proba = model.predict_proba(X_test)It looks like this:array([[0.69, 0.31],       [0.13, 0.87],       [0.94, 0.06],       [0.94, 0.06],       [0.07, 0.93]])Then do:roc_auc_score(y_test, y_proba[:,1])

Advertisement

Answer