What is the Matthews Correlation Coefficient?
The Matthews correlation coefficient is a single-value statistic derived from the confusion matrix—the 2×2 table of observed versus predicted binary outcomes. It measures how well a classifier discriminates between positive and negative cases.
Ranges and interpretation:
- +1: Perfect classification; all predictions match reality
- 0: Performance indistinguishable from random chance
- −1: Complete disagreement; predictions are perfectly wrong
MCC excels when class distributions are unbalanced (e.g., 95% healthy, 5% diseased). Traditional accuracy can mislead in such scenarios—a classifier labeling everyone as healthy achieves 95% accuracy despite being useless. MCC penalizes this false competence.
Matthews Correlation Coefficient Formula
The MCC calculation combines all four confusion matrix cells:
MCC = (TP × TN − FP × FN) / √[(TP + FP)(TP + FN)(TN + FP)(TN + FN)]
- TP (true positives): positive cases correctly predicted
- TN (true negatives): negative cases correctly predicted
- FP (false positives): negative cases incorrectly predicted as positive
- FN (false negatives): positive cases incorrectly predicted as negative
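The formula translates directly into a few lines of Python. The following is a minimal sketch, not a reference implementation: the function name `mcc` is hypothetical, and returning 0.0 when the denominator is zero is an assumed convention (the formula itself is undefined in that case).

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from the four confusion matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumption: report 0.0 when a row or column of the matrix is empty,
        # because the formula is undefined (division by zero) in that case.
        return 0.0
    return numerator / denominator


# Illustrative counts covering the three reference points of the scale.
print(mcc(tp=50, tn=50, fp=0, fn=0))    # every prediction correct
print(mcc(tp=25, tn=25, fp=25, fn=25))  # predictions unrelated to the truth
print(mcc(tp=0, tn=0, fp=50, fn=50))    # every prediction wrong
```

The three calls correspond to the +1, 0, and −1 ends of the interpretation scale described earlier.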
Related Binary Classification Metrics
Beyond MCC, five companion metrics provide complementary perspectives on classifier behaviour:
- Sensitivity (recall): Of actual positives, what fraction did the model catch?
  TP / (TP + FN)
- Specificity: Of actual negatives, what fraction was correctly rejected?
  TN / (TN + FP)
- Precision: Of predicted positives, how many were correct?
  TP / (TP + FP)
- Accuracy: Overall correctness across both classes.
  (TP + TN) / (TP + TN + FP + FN)
- F1 score: Harmonic mean balancing precision and recall.
  2 × TP / (2 × TP + FP + FN)
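Sensitivity, specificity, precision, and F1 each ignore at least one cell of the confusion matrix, and accuracy, although it uses all four, is dominated by the majority class. MCC uses all four cells and is high only when the classifier performs well on both classes, which makes it a more robust single-number summary.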
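To see the five companion metrics side by side, they can be computed from the same four counts. This is an illustrative sketch: the names `ratio` and `companion_metrics` are hypothetical, the counts are made up, and returning NaN for a zero denominator is an assumption rather than a standard.

```python
import math


def ratio(numerator: float, denominator: float) -> float:
    """Divide, returning NaN instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def companion_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Sensitivity, specificity, precision, accuracy, and F1 from raw counts."""
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Hypothetical imbalanced data set: 50 actual positives, 950 actual negatives.
for name, value in companion_metrics(tp=40, tn=900, fp=50, fn=10).items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, accuracy is 0.94 even though fewer than half of the predicted positives are correct, which is the kind of gap a single metric can hide.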
Common Pitfalls in Binary Classification Evaluation
Avoid these mistakes when assessing model performance with MCC and related statistics.
- Relying on accuracy for imbalanced data: If your positive class represents only 2% of observations, a naive classifier predicting everything as negative will score 98% accuracy. Always examine sensitivity and specificity separately, or use MCC, which penalises both false positives and false negatives.
- Confusing sensitivity with specificity: Sensitivity is the fraction of disease-positive patients the test catches (true positive rate); specificity is the fraction of disease-free patients it correctly clears (true negative rate). High sensitivity with low specificity means many healthy patients are flagged as sick, causing unnecessary treatment and harm.
- Forgetting the denominator can be zero: If your data contains no true positives and no false positives (the model never predicts the positive class), precision becomes 0/0 and the MCC denominator is zero. The same happens to MCC whenever any row or column of the confusion matrix sums to zero. Check for empty rows or columns before computing metrics; a common convention is to report MCC as 0 in that case.
- Misinterpreting negative MCC values: A negative MCC does not mean the model is merely worse than random. It indicates systematic disagreement, as if predictions were inverted. This warrants investigation into labeling conventions, feature scaling, or data leakage rather than dismissal.
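Three of these pitfalls can be reproduced with a short Python sketch. All function names and counts below are hypothetical, and returning `None` for an undefined MCC is one possible convention, chosen here so the undefined case stays visible.

```python
import math
from typing import Optional


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    return (tp + tn) / (tp + tn + fp + fn)


def mcc_or_none(tp: int, tn: int, fp: int, fn: int) -> Optional[float]:
    """MCC, or None when a row or column of the confusion matrix is empty."""
    product = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    if product == 0:
        return None  # assumption: surface the undefined case explicitly
    return (tp * tn - fp * fn) / math.sqrt(product)


def invert_predictions(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Counts obtained by flipping every predicted label."""
    # A flipped TP becomes an FN (and vice versa); a flipped TN becomes an FP.
    return {"tp": fn, "tn": fp, "fp": tn, "fn": tp}


# Pitfall 1 and 3: 2% positives, classifier predicts "negative" for everyone.
naive = {"tp": 0, "tn": 980, "fp": 0, "fn": 20}
print(accuracy(**naive))     # high accuracy despite catching no positives
print(mcc_or_none(**naive))  # undefined: no positive predictions at all

# Pitfall 4: a model with negative MCC, then the same model with labels flipped.
suspicious = {"tp": 5, "tn": 30, "fp": 45, "fn": 20}
print(mcc_or_none(**suspicious))
print(mcc_or_none(**invert_predictions(**suspicious)))
```

Flipping every prediction negates the numerator and leaves the denominator unchanged, so the last two values differ only in sign. That is why a clearly negative MCC often points to swapped labels rather than a model with no signal.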
Practical Example: Quality Control in Manufacturing
A ceramic factory inspects 100 plates for defects. An automated system flags 30 plates as defective, but manual inspection reveals only 25 are actually defective. Of the 25 truly defective plates, the system caught 20.
Confusion matrix:
- TP (correctly flagged as defective): 20
- FP (incorrectly flagged as defective): 10
- TN (correctly passed): 65
- FN (missed defects): 5
MCC = (20 × 65 − 10 × 5) / √[(30)(25)(75)(70)] = 1250 / √3,937,500 ≈ 0.63. This moderately strong positive value indicates the system performs reasonably but has room for improvement in reducing false positives: 10 of the 30 flagged plates are actually fine, which means wasted rework.
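The arithmetic can be checked by deriving the four cells from the figures in the scenario and applying the formula. This is a small verification sketch; the variable names are hypothetical.

```python
import math

# Figures stated in the scenario.
total_plates = 100
flagged = 30             # plates the automated system marked as defective
actually_defective = 25  # plates found defective by manual inspection
caught = 20              # truly defective plates that the system flagged

# Derive the four confusion matrix cells.
tp = caught
fp = flagged - caught
fn = actually_defective - caught
tn = total_plates - tp - fp - fn

numerator = tp * tn - fp * fn
denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
mcc_value = numerator / denominator

print(f"TP={tp}, FP={fp}, TN={tn}, FN={fn}")
print(f"MCC = {numerator} / {denominator:.2f} = {mcc_value:.2f}")
```

Deriving TN by subtraction rather than typing it in guards against a confusion matrix whose cells do not add up to the number of inspected plates.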