I am replicating code from this page, and F1, precision, and recall all come out as 0, while accuracy matches what the author reports. What could be the reason?
I looked into the compute_metrics function and it seems to be correct. I tried some toy data, as below, and precision_recall_fscore_support seems to give the correct answer:
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_pred = [1, 1, 2]
y_true = [1, 2, 2]
print(accuracy_score(y_true, y_pred))
precision_recall_fscore_support(y_true, y_pred, average='binary')

which outputs:

0.6666666666666666
(0.5, 1.0, 0.6666666666666666, None)
Since I am getting the correct accuracy, the part below seems to be working as expected:
labels = pred.label_ids
preds = pred.predictions.argmax(-1)
acc = accuracy_score(labels, preds)
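For reference, my full compute_metrics follows the usual Hugging Face Trainer pattern, roughly like this (a sketch; the tutorial's exact version may differ slightly):

from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(pred):
    # pred is an EvalPrediction: label_ids holds the true labels,
    # predictions holds the per-class logits
    labels = pred.label_ids
    preds = pred.predictions.argmax(-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average='binary'
    )
    acc = accuracy_score(labels, preds)
    return {'accuracy': acc, 'f1': f1, 'precision': precision, 'recall': recall}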
Answer
My guess is that the transformation of your dependent variable was somehow messed up. I think this because all of your metrics that depend on TP (True Positives) are 0:
Both precision and sensitivity (recall) have TP in the numerator:

Precision = TP / (TP + FP)
Sensitivity = TP / (TP + FN)
The F1-score depends on both of those metrics and therefore also has TP in the numerator:

F1-Score = 2 * (Precision * Sensitivity) / (Precision + Sensitivity) = TP / (TP + 1/2(FN + FP))
If the numerator is 0 because you have no TP, the result will be 0 as well!
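For example, suppose the model produces TP = 0, FP = 2, FN = 3 (made-up numbers for illustration):

Precision = 0 / (0 + 2) = 0
Sensitivity = 0 / (0 + 3) = 0
F1-Score = 0 / (0 + 1/2(3 + 2)) = 0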
A good or moderate accuracy can also be achieved if you only get the TN right:

Accuracy = (TP + TN) / Total

That is why you can have a valid-looking accuracy and 0 for all the other metrics.
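You can reproduce this with a quick toy example (a sketch with made-up data): a model that always predicts the negative class gets a decent accuracy but 0 for everything else:

from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0]  # never predicts the positive class -> TP = 0

print(accuracy_score(y_true, y_pred))
# 0.75 - looks acceptable
print(precision_recall_fscore_support(y_true, y_pred, average='binary', zero_division=0))
# (0.0, 0.0, 0.0, None) - precision, recall and F1 are all 0

(zero_division=0 just silences the warning for the undefined 0/0 precision; it requires a reasonably recent scikit-learn.)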
So peek into your test and training sets and check whether the split was successful and whether both possible outcomes of your binary variable are present in both sets! If one of them is missing from the training set, that might explain a complete misclassification and the lack of TP in the test set.
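A quick way to check is to count the classes in each split (a sketch; train_labels and test_labels stand in for whatever your label arrays are called):

from collections import Counter

print(Counter(train_labels))  # both classes should appear with sensible counts
print(Counter(test_labels))   # e.g. Counter({0: 800, 1: 0}) would be a red flag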