What is McNemar's Test?
McNemar's test is a non-parametric statistical procedure designed for paired binomial data. It answers a specific question: do the marginal proportions of a 2×2 contingency table differ significantly? In other words, it tests whether the probability of a positive outcome before an intervention equals the probability after intervention.
The test assumes your data form naturally paired observations. Classic examples include:
- Patients receiving a diagnostic test before and after treatment
- Students sitting the same exam before and after tutoring
- Matched subjects in an intervention study where each treated participant is paired with a control
- Sequential testing where individuals are classified into two mutually exclusive categories at two time points
McNemar's test draws all of its information from discordant pairs: cases where the outcome differs between the two measurements. Concordant pairs (where both measurements agree) say nothing about the direction of change, so they drop out of the test statistic, which also keeps the mathematics simple.
Setting Up the 2×2 Contingency Table
Your data structure is crucial. Arrange your paired observations into a 2×2 table where rows represent the first measurement (e.g., before treatment) and columns represent the second (e.g., after treatment). Each cell (a, b, c, d) captures a combination of outcomes:
- Cell a: Both measurements positive
- Cell b: First negative, second positive (discordant)
- Cell c: First positive, second negative (discordant)
- Cell d: Both measurements negative
The test statistic depends entirely on cells b and c, the two cells that capture disagreement. Cells a and d (concordant pairs) do not enter the statistic. Before proceeding, check the table's marginals: the row totals should reproduce the positive and negative counts at the first measurement, the column totals those at the second, and a + b + c + d should equal the number of pairs.
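As a rough illustration of the layout described above, the following Python sketch counts the four cells from two paired sequences. The function name `build_table` and the sample data are hypothetical and exist only for this example; positive outcomes are assumed to be coded as `True`.

```python
from collections import Counter

def build_table(first, second):
    """Count cells a, b, c, d from two equal-length sequences of booleans."""
    if len(first) != len(second):
        raise ValueError("paired data need the same number of observations")
    counts = Counter(zip(first, second))
    a = counts[(True, True)]    # both measurements positive
    b = counts[(False, True)]   # first negative, second positive (discordant)
    c = counts[(True, False)]   # first positive, second negative (discordant)
    d = counts[(False, False)]  # both measurements negative
    return a, b, c, d

# Hypothetical data: 8 subjects measured twice (True = positive)
before = [True, True, False, False, True, False, False, True]
after = [True, False, True, True, True, False, True, True]

a, b, c, d = build_table(before, after)
print(a, b, c, d)

# Marginal check: a + c subjects were positive first, a + b positive second
assert a + c == sum(before)
assert a + b == sum(after)
```

The two assertions at the end are the marginal check in code form: the cell sums have to reproduce the raw positive counts at each measurement.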
McNemar's Test Formula
The standard McNemar's test uses a chi-squared approximation. The test statistic is derived from the discordant cell counts. Several variants exist (Yates correction, Edwards correction) that apply continuity adjustments for smaller samples. For exact inference when sample sizes are limited, use the binomial distribution directly.
χ² = (b − c)² ÷ (b + c)
Where b and c are the off-diagonal cell frequencies.
Under the null hypothesis, χ² approximately follows a chi-squared distribution with 1 degree of freedom. The p-value is the upper-tail probability of that distribution beyond your computed χ². Equivalently, you reject the null hypothesis at α = 0.05 when χ² exceeds the critical value of 3.841.
- b: Count of subjects with negative result first, positive result second
- c: Count of subjects with positive result first, negative result second
- χ²: Test statistic following chi-squared distribution with 1 degree of freedom
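The formula can be sketched in a few lines of Python using only the standard library. The function name `mcnemar_chi2` and the counts in the example are illustrative assumptions. The p-value uses the identity that, for 1 degree of freedom, the upper-tail chi-squared probability equals `erfc(sqrt(χ² / 2))`.

```python
import math

def mcnemar_chi2(b, c):
    """Standard (uncorrected) McNemar statistic and its approximate p-value."""
    if b + c == 0:
        raise ValueError("no discordant pairs: the statistic is undefined")
    stat = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-squared distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical discordant counts
stat, p = mcnemar_chi2(b=15, c=5)
print(f"chi2 = {stat:.3f}, p = {p:.4f}")
```

With these counts the statistic is (15 − 5)² ÷ 20 = 5, which exceeds 3.841, so the null hypothesis would be rejected at α = 0.05.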
Choosing Between Standard and Exact Tests
The standard McNemar's test relies on the chi-squared approximation, which works well when b + c ≥ 25. However, this approximation deteriorates with smaller samples, leading to unreliable p-values.
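For sparse data (b + c < 25), use McNemar's exact binomial test instead. Under the null hypothesis each discordant pair is equally likely to fall in cell b or cell c, so b follows a binomial distribution with b + c trials and success probability 0.5. The exact test computes the p-value directly from that distribution rather than from a large-sample approximation, which makes it valid at any sample size but conservative. The mid-p variant reduces this conservatism by counting only half the probability of the observed outcome.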
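A minimal sketch of the exact and mid-p calculations follows, assuming the usual two-sided definitions: the exact p-value doubles the binomial tail at the smaller discordant count, and the mid-p value subtracts the point probability of that count once. The function name `mcnemar_exact` is hypothetical.

```python
from math import comb

def mcnemar_exact(b, c, mid_p=False):
    """Two-sided exact (or mid-p) McNemar p-value from the discordant counts."""
    n = b + c
    if n == 0:
        raise ValueError("no discordant pairs: the test is undefined")
    k = min(b, c)
    # Under H0 the smaller count is Binomial(n, 0.5); take P(X <= k)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_value = 2 * tail
    if mid_p:
        # Count the observed outcome with half weight in each tail
        p_value -= comb(n, k) / 2 ** n
    return min(p_value, 1.0)

# Hypothetical sparse table: 10 discordant pairs
print(mcnemar_exact(8, 2))              # exact binomial
print(mcnemar_exact(8, 2, mid_p=True))  # mid-p
```

The cap at 1.0 matters when b and c are equal or nearly equal, because doubling the tail can otherwise exceed 1.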
Our calculator offers five options:
- Standard: Classical chi-squared test
- Yates correction: Applies a continuity adjustment to the chi-squared statistic for smaller samples
- Edwards correction: Alternative continuity adjustment
- Exact binomial: Direct probability calculation
- Mid-p binomial: Exact calculation that counts only half the probability of the observed outcome, making it less conservative
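The two continuity-corrected options can be sketched as one function with an adjustable correction term. The formulas are not given in the text above, so the constants here are an assumption based on the commonly cited forms: (|b − c| − 0.5)² ÷ (b + c) for the Yates-style correction and (|b − c| − 1)² ÷ (b + c) for the Edwards correction. The function name `mcnemar_corrected` is hypothetical, and the calculator's own implementation may differ.

```python
import math

def mcnemar_corrected(b, c, correction=0.0):
    """Chi-squared McNemar statistic with an optional continuity correction.

    correction=0.0 -> standard test
    correction=0.5 -> Yates-style correction (assumed form)
    correction=1.0 -> Edwards correction (assumed form)
    """
    n = b + c
    if n == 0:
        raise ValueError("no discordant pairs: the statistic is undefined")
    # Clamp at zero so the correction never pushes the numerator past zero
    stat = max(abs(b - c) - correction, 0.0) ** 2 / n
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-squared upper tail, 1 df
    return stat, p_value

# Hypothetical discordant counts, compared across the three variants
for label, corr in [("standard", 0.0), ("Yates", 0.5), ("Edwards", 1.0)]:
    stat, p = mcnemar_corrected(15, 5, corr)
    print(f"{label:8s} chi2 = {stat:.4f}  p = {p:.4f}")
```

Each correction shrinks the statistic, so the corrected p-values are larger than the standard one for the same table.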
Key Considerations for Correct Use
Avoid common pitfalls when applying McNemar's test to ensure valid inference.
- Verify pairing and independence: McNemar's test requires true pairing, so each observation in the first set must correspond to exactly one in the second. Violations (e.g., multiple controls per case) invalidate the method. Also confirm that the pairs are independent of one another; when pairs are clustered and their outcomes are correlated, the false-positive rate is inflated.
- Check sample size thresholds: When b + c is small (below about 25, and especially below 10–15), p-values from the chi-squared approximation become unreliable, so switch to the exact or mid-p test. Conversely, with very large samples all variants converge, and the standard McNemar's test is the simplest choice.
- Interpret marginals carefully: A significant result means the row and column marginal proportions differ. It says nothing about how well the two measurements agree: a table with many discordant pairs split evenly between the two directions (b ≈ c) gives a non-significant result even though many individual outcomes changed. Always examine the contingency table itself to understand the pattern of change.
- Remember null hypothesis scope: McNemar's test only addresses marginal equality, not the magnitude of the effect. Report not just the p-value but the proportions themselves (e.g., 40% positive before vs. 50% after) to convey practical significance.
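To report the proportions alongside the p-value, the marginals can be read straight off the table. The sketch below assumes the cell labelling used earlier (a: both positive, b: negative then positive, c: positive then negative, d: both negative); the function name `marginal_proportions` and the counts are hypothetical, chosen to reproduce the 40% vs. 50% example.

```python
def marginal_proportions(a, b, c, d):
    """Proportion positive at each measurement, plus the change between them."""
    n = a + b + c + d
    if n == 0:
        raise ValueError("the table is empty")
    p_first = (a + c) / n    # positive at the first measurement
    p_second = (a + b) / n   # positive at the second measurement
    return p_first, p_second, p_second - p_first

# Hypothetical table of 100 pairs
p_first, p_second, change = marginal_proportions(a=30, b=20, c=10, d=40)
print(f"before: {p_first:.0%}, after: {p_second:.0%}, change: {change:+.0%}")
```

The change in proportions equals (b − c) ÷ n, so it depends on the same two discordant cells as the test statistic itself.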