I was trying to plot a confusion matrix nicely, so I used the built-in plot_confusion_matrix function introduced in scikit-learn 0.22. However, one of the values in my confusion matrix is 153, and it appears as 1.5e+02 in the plot.
Reading the scikit-learn documentation, I found a parameter called values_format, but I do not know how to use it to suppress the scientific notation. My code is as follows.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import plot_confusion_matrix

# Load the features and labels
X = pd.read_csv("datasets/X.csv")
y = pd.read_csv("datasets/y.csv")
class_names = ['Not Fraud (positive)', 'Fraud (negative)']

# Split the data into a training set and a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

# Fit the classifier
logreg = LogisticRegression()
logreg.fit(X_train, y_train)

np.set_printoptions(precision=2)

# Plot the non-normalized and the normalized confusion matrix
titles_options = [("Confusion matrix, without normalization", None),
                  ("Normalized confusion matrix", 'true')]
for title, normalize in titles_options:
    disp = plot_confusion_matrix(logreg, X_test, y_test,
                                 display_labels=class_names,
                                 cmap=plt.cm.Greens,
                                 normalize=normalize,
                                 values_format='{:.5f}'.format)
    disp.ax_.set_title(title)
    print(title)
    print(disp.confusion_matrix)

plt.show()
Answer
values_format expects a format specification string, not a function. Just remove “.format”, the {} brackets and the colon from the argument:
disp = plot_confusion_matrix(logreg, X_test, y_test,
                             display_labels=class_names,
                             cmap=plt.cm.Greens,
                             normalize=normalize,
                             values_format='.5f')
In addition, you can use '.5g' to avoid trailing decimal zeros: 153 is then shown as 153 rather than 153.00000.
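To see what each specification does before plotting, you can try it with Python's built-in format(), which takes the same kind of format string. This is only a small sketch: format_cell is a hypothetical helper name, and it assumes scikit-learn renders each cell with the given specification and that '.2g' is the default in 0.22, which would explain the 1.5e+02 in the question.

```python
def format_cell(value, spec):
    """Render one confusion-matrix cell with a format specification string.

    Hypothetical helper for experimenting; it only wraps the built-in format().
    """
    return format(value, spec)


# '.2g' keeps two significant digits, so 153 falls back to scientific notation
print(format_cell(153, '.2g'))          # 1.5e+02

# '.5f' always prints five decimals, even for whole counts
print(format_cell(153, '.5f'))          # 153.00000

# '.5g' keeps up to five significant digits and drops trailing zeros
print(format_cell(153, '.5g'))          # 153
print(format_cell(0.5, '.5g'))          # 0.5
print(format_cell(0.123456789, '.5g'))  # 0.12346

# 'd' prints plain integers, but it rejects floats such as normalized values
print(format_cell(153, 'd'))            # 153
try:
    format_cell(0.5, 'd')
except ValueError as exc:
    print("ValueError:", exc)
```

Note that '.5g' still switches to scientific notation once a count needs more than five significant digits, so raise the precision if your matrix holds larger numbers.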
Taken from source