Micro metrics vs macro metrics

To evaluate my multi-label classification model, I measured precision, recall, and F1 scores, and I wanted to compare two different averaging methods, micro and macro. My dataset has few rows, but my label count is around 1,700. Why is the macro score so low even though the micro score is high, and which one is more useful to look at for a multi-class problem?

Accuracy: 0.743999 

Micro Precision: 0.743999
Macro Precision: 0.256570 

Micro Recall: 0.743999
Macro Recall: 0.264402 

Micro F1 score: 0.743999
Macro F1 score: 0.250033 

Cohen's kappa: 0.739876

Answer

Micro-Average

The micro-average precision and recall scores are calculated by pooling the individual classes' true positives (TPs), false positives (FPs), and false negatives (FNs): the per-class counts are summed first, and each metric is then computed once from those global totals. (True negatives do not enter into precision or recall, so they play no role here.)

Macro-Average

The macro-average precision and recall scores are calculated as the arithmetic mean of the individual classes' precision and recall scores, and the macro-average F1-score is the arithmetic mean of the individual classes' F1-scores. Every class counts equally, no matter how many instances it has.
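As a concrete illustration of both definitions, here is a minimal Python sketch; the per-class TP/FP/FN counts are made-up example values, not from any real model:

# Minimal sketch: micro vs. macro precision/recall/F1 from per-class counts.
# The counts below are invented for illustration only.
tp = {"A": 50, "B": 5, "C": 1}   # true positives per class
fp = {"A": 10, "B": 2, "C": 9}   # false positives per class
fn = {"A": 8,  "B": 6, "C": 9}   # false negatives per class
classes = list(tp)

# Micro: pool the counts across classes first, then compute each metric once.
TP, FP, FN = sum(tp.values()), sum(fp.values()), sum(fn.values())
micro_p = TP / (TP + FP)
micro_r = TP / (TP + FN)
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Macro: compute each metric per class, then take the unweighted mean.
p = [tp[c] / (tp[c] + fp[c]) for c in classes]
r = [tp[c] / (tp[c] + fn[c]) for c in classes]
f1 = [2 * pi * ri / (pi + ri) if pi + ri else 0.0 for pi, ri in zip(p, r)]
macro_p, macro_r, macro_f1 = (sum(x) / len(x) for x in (p, r, f1))

print(f"micro P/R/F1: {micro_p:.3f} / {micro_r:.3f} / {micro_f1:.3f}")
print(f"macro P/R/F1: {macro_p:.3f} / {macro_r:.3f} / {macro_f1:.3f}")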

When to use micro-averaging and macro-averaging scores

  • Use the micro-average score when each instance or prediction should be weighted equally.

  • Use the macro-average score when all classes should be treated equally, regardless of their frequency; it evaluates the classifier's overall performance across class labels rather than favoring the most frequent ones.

  • Use a weighted macro-average score when classes are imbalanced (different numbers of instances per class label). The weighted macro-average is calculated by weighting each class label's score by its number of true instances when taking the average (see the sketch after this list).

  • The macro-average tells you how the system performs overall across the full set of classes, but you should not base any class-specific decision on it; the micro-average, on the other hand, is a useful measure when class sizes vary. In your case, with about 1,700 labels and few rows, most classes have very few examples: poor scores on those rare classes drag the macro average down, while the micro average, which weights every prediction equally, is dominated by the frequent classes the model handles well.
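If the scores above come from scikit-learn (an assumption; the question does not name the library), the three averaging modes map directly onto the average parameter. A small sketch with made-up labels for an imbalanced three-class problem:

# Sketch of the three averaging modes in scikit-learn (assumed library).
# y_true / y_pred are invented labels for an imbalanced 3-class toy problem.
from sklearn.metrics import precision_recall_fscore_support

y_true = ["A"] * 8 + ["B"] * 2 + ["C"] * 2
y_pred = ["A"] * 8 + ["A"] * 2 + ["A", "C"]  # rare classes mostly missed

for avg in ("micro", "macro", "weighted"):
    p, r, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average=avg, zero_division=0
    )
    print(f"{avg:>8}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")

The frequent class A dominates the micro and weighted scores, while the poorly handled rare classes B and C pull the macro score down.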

Micro-Average & Macro-Average Precision Scores for Multi-class Classification

For multi-class classification problems, the micro-average precision score is the sum of true positives across all classes divided by the total number of positive predictions, where a positive prediction is either a true positive or a false positive:

Micro Precision = (TP_1 + ... + TP_k) / (TP_1 + ... + TP_k + FP_1 + ... + FP_k)

The macro-average precision score is the unweighted arithmetic mean of the k per-class precision scores.

Micro-Average & Macro-Average Recall Scores for Multi-class Classification

For multi-class classification problems, the micro-average recall score is the sum of true positives across all classes divided by the total number of actual positives (not the predicted positives), i.e. the true positives plus the false negatives:

Micro Recall = (TP_1 + ... + TP_k) / (TP_1 + ... + TP_k + FN_1 + ... + FN_k)

The macro-average recall score is, likewise, the unweighted arithmetic mean of the k per-class recall scores.
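To see why your micro scores are high while the macro scores are low, consider a deliberately extreme, made-up case: one frequent class the model always gets right and many rare classes it always misses:

# Extreme made-up case: 1 frequent class predicted perfectly, 99 rare
# classes (one instance each) always misclassified as the frequent one.
from sklearn.metrics import precision_recall_fscore_support

y_true = [0] * 900 + list(range(1, 100))  # 900 frequent + 99 rare instances
y_pred = [0] * 999                        # everything predicted as class 0

for avg in ("micro", "macro"):
    p, r, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average=avg, zero_division=0
    )
    print(f"{avg}: P={p:.3f} R={r:.3f} F1={f1:.3f}")
# micro: ~0.901 for all three; macro: ~0.009 precision, ~0.010 recall

Also note that when every instance has exactly one true and one predicted label, every false positive for one class is a false negative for another, so micro precision, micro recall, micro F1, and accuracy all coincide; that is why all of your micro scores equal 0.743999.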
