Understanding the 2×2 Contingency Table
Diagnostic tests produce four possible outcomes. A 2×2 table organizes them clearly:
- True positive (TP): Disease present and test positive
- False positive (FP): Disease absent but test positive
- True negative (TN): Disease absent and test negative
- False negative (FN): Disease present but test negative
These four counts form the foundation for all downstream calculations. Gathering accurate data from your study population or clinical cohort is essential—any misclassification will propagate through all metrics.
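As a minimal sketch of how the four counts might be tallied from raw study data, the snippet below assumes each subject is recorded as a pair of booleans (reference-standard disease status, test result). The function name `tally_2x2` and the sample data are hypothetical and exist only for illustration.

```python
from collections import Counter


def tally_2x2(has_disease, test_positive):
    """Count TP, FP, TN, FN from two equal-length sequences of booleans."""
    if len(has_disease) != len(test_positive):
        raise ValueError("inputs must have the same length")
    counts = Counter()
    for diseased, positive in zip(has_disease, test_positive):
        if diseased and positive:
            counts["TP"] += 1      # disease present, test positive
        elif not diseased and positive:
            counts["FP"] += 1      # disease absent, test positive
        elif not diseased and not positive:
            counts["TN"] += 1      # disease absent, test negative
        else:
            counts["FN"] += 1      # disease present, test negative
    return {key: counts[key] for key in ("TP", "FP", "TN", "FN")}


# Made-up data for eight subjects (not from any real study).
truth = [True, True, True, False, False, False, False, False]
test = [True, True, False, True, False, False, False, False]

table = tally_2x2(truth, test)
print(table)  # {'TP': 2, 'FP': 1, 'TN': 4, 'FN': 1}
```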
Core Diagnostic Metrics Formulas
Sensitivity and specificity measure intrinsic test properties independent of disease prevalence. Predictive values depend on how common the condition is in your population.
Sensitivity = TP ÷ (TP + FN)
Specificity = TN ÷ (FP + TN)
Accuracy = (TP + TN) ÷ (TP + TN + FP + FN)
PPV = (Sensitivity × Prevalence) ÷ [(Sensitivity × Prevalence) + ((1 − Specificity) × (1 − Prevalence))]
NPV = (Specificity × (1 − Prevalence)) ÷ [((1 − Sensitivity) × Prevalence) + (Specificity × (1 − Prevalence))]
Positive LR = Sensitivity ÷ (1 − Specificity)
Negative LR = (1 − Sensitivity) ÷ Specificity
- TP: True positive count; cases correctly identified as diseased
- FN: False negative count; diseased cases missed by the test
- TN: True negative count; healthy cases correctly identified
- FP: False positive count; healthy individuals incorrectly marked positive
- Prevalence: Proportion of the target population with the disease (as a decimal, 0–1)
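The formulas above can be sketched in a few lines of Python. This is an illustrative implementation, not a validated tool: the function name `diagnostic_metrics` is hypothetical, the counts in the example are made up, and the sketch assumes every denominator is nonzero. When no prevalence is supplied, it falls back to the prevalence observed in the sample, which is only appropriate if the sample reflects the target population.

```python
def diagnostic_metrics(tp, fp, tn, fn, prevalence=None):
    """Compute the core metrics from the four counts of a 2x2 table."""
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)
    specificity = tn / (fp + tn)
    accuracy = (tp + tn) / total

    # Assumption: default to the sample prevalence if none is given.
    if prevalence is None:
        prevalence = (tp + fn) / total

    ppv = (sensitivity * prevalence) / (
        sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    )
    npv = (specificity * (1 - prevalence)) / (
        (1 - sensitivity) * prevalence + specificity * (1 - prevalence)
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "ppv": ppv,
        "npv": npv,
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
    }


# Made-up example: 1,000 subjects, 100 of them diseased.
metrics = diagnostic_metrics(tp=90, fp=45, tn=855, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Passing an explicit `prevalence` (for example `prevalence=0.01`) recomputes PPV and NPV for a different population while leaving sensitivity, specificity, and the likelihood ratios unchanged.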
Sensitivity vs. Specificity: What They Mean Clinically
Sensitivity answers: Of all people with the disease, how many does the test catch? A sensitive test rarely misses cases—it has few false negatives. Sensitive tests are preferred for serious conditions where missing a diagnosis is costly (e.g., cancer screening).
Specificity answers: Of all people without the disease, how many does the test correctly exclude? A specific test rarely over-diagnoses—it has few false positives. Specific tests are preferred when false positives lead to unnecessary treatment or anxiety (e.g., confirmatory tests after initial screening).
No test is perfect. For tests that produce a continuous result, the trade-off between sensitivity and specificity is set by the positivity threshold. When higher values indicate disease, lowering the threshold increases sensitivity but decreases specificity, and raising it does the reverse.
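The threshold trade-off can be illustrated with a small sweep. The sketch below assumes a continuous marker where higher scores indicate disease and a result counts as positive when the score is at or above the threshold. The function name `sens_spec_at` and the scores are hypothetical.

```python
def sens_spec_at(threshold, scores, has_disease):
    """Return (sensitivity, specificity) when score >= threshold is positive."""
    pairs = list(zip(scores, has_disease))
    tp = sum(1 for s, d in pairs if d and s >= threshold)
    fn = sum(1 for s, d in pairs if d and s < threshold)
    tn = sum(1 for s, d in pairs if not d and s < threshold)
    fp = sum(1 for s, d in pairs if not d and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up marker values for four healthy and four diseased subjects.
scores = [10, 20, 30, 40, 50, 60, 70, 80]
has_disease = [False, False, False, True, False, True, True, True]

for threshold in (35, 55, 75):
    sens, spec = sens_spec_at(threshold, scores, has_disease)
    print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
# threshold=35: sensitivity=1.00, specificity=0.75
# threshold=55: sensitivity=0.75, specificity=1.00
# threshold=75: sensitivity=0.25, specificity=1.00
```

In this toy data, the lowest threshold catches every diseased subject at the cost of one false positive, while higher thresholds remove the false positive but start missing cases.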
Predictive Values and Prevalence Dependency
Positive predictive value (PPV) tells you: If a patient tests positive, what is the probability they truly have the disease? This depends critically on disease prevalence. In a rare disease, a positive result may be unreliable even if the test is highly sensitive and specific, because the small false-positive rate applies to a very large healthy group, so false positives can outnumber true positives.
Negative predictive value (NPV) tells you: If a patient tests negative, what is the probability they are truly disease-free? NPV is generally high for rare diseases (since most people are healthy anyway) and declines as the disease becomes more common.
This prevalence dependency explains why a test performing well in one population may perform poorly in another. Always consider your patient population's disease burden when interpreting results.
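To make the prevalence effect concrete, the sketch below applies the PPV and NPV formulas to a hypothetical test with 95% sensitivity and 95% specificity at three assumed prevalence levels. The function name `predictive_values` and all input values are chosen purely for illustration.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# Hypothetical test characteristics, evaluated in three populations.
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.95, 0.95, prevalence)
    print(f"prevalence={prevalence:.2f}: PPV={ppv:.3f}, NPV={npv:.3f}")
# prevalence=0.01: PPV=0.161, NPV=0.999
# prevalence=0.10: PPV=0.679, NPV=0.994
# prevalence=0.50: PPV=0.950, NPV=0.950
```

At 1% prevalence, roughly five out of six positive results from this hypothetical test are false positives, even though its sensitivity and specificity never change.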
Key Pitfalls and Practical Considerations
Avoid these common mistakes when interpreting diagnostic test performance:
- Confusing sensitivity with PPV — Sensitivity is a test property; PPV is population-dependent. A test can have high sensitivity but low PPV in a low-prevalence setting. Always calculate or report prevalence alongside sensitivity to avoid misleading conclusions.
- Ignoring spectrum bias — Test performance varies by patient population. A test validated in hospitalized patients with advanced disease may perform very differently in primary care screening. Check whether published performance metrics match your target population.
- Overweighting accuracy in imbalanced datasets — If one outcome vastly outnumbers the other (e.g., 1% disease prevalence), accuracy can be misleadingly high even if the test is useless. Prioritize sensitivity and specificity instead.
- Forgetting that likelihood ratios shift pre-test probability — Likelihood ratios multiply your pre-test odds of disease to give post-test odds. As a rule of thumb, a positive LR of 10 or more strongly raises the probability of disease, and a negative LR of 0.1 or less strongly lowers it. Values near 1.0 have minimal diagnostic value.
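Two of the pitfalls above lend themselves to a quick numeric check. The sketch below first shows how a test that labels everyone negative still reaches 99% accuracy at 1% prevalence, then converts a pre-test probability to a post-test probability through odds. The function name `post_test_probability`, the 20% pre-test probability, and the counts are assumptions for illustration only.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, apply the LR, and convert back."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Accuracy pitfall: a "test" that calls all 1,000 subjects negative
# when 10 of them (1%) are diseased.
tp, fn, tn, fp = 0, 10, 990, 0
accuracy = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)
print(f"accuracy={accuracy:.2f}, sensitivity={sensitivity:.2f}")
# accuracy=0.99, sensitivity=0.00

# Likelihood ratios applied to an assumed 20% pre-test probability.
for lr in (10, 1, 0.1):
    post = post_test_probability(0.20, lr)
    print(f"LR={lr}: post-test probability={post:.3f}")
# LR=10: post-test probability=0.714
# LR=1: post-test probability=0.200
# LR=0.1: post-test probability=0.024
```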