What is McNemar's Test?

McNemar's test is a non-parametric statistical procedure designed for paired binary data. It answers a specific question: do the marginal proportions of a 2×2 contingency table differ significantly? In other words, it tests whether the probability of a positive outcome before an intervention equals the probability after it.

The test assumes your data form naturally paired observations. Classic examples include:

  • Patients receiving a diagnostic test before and after treatment
  • Students sitting the same exam before and after tutoring
  • Matched subjects in an intervention study where each treated participant is paired with a control
  • Sequential testing where individuals are classified into two mutually exclusive categories at two time points

McNemar's test draws all of its information from discordant pairs: cases where the outcome differs between the two measurements. Concordant pairs (where both measurements agree) say nothing about the direction of change, so they drop out of the calculation and the statistic depends on only two counts.

Setting Up the 2×2 Contingency Table

Your data structure is crucial. Arrange your paired observations into a 2×2 table where rows represent the first measurement (e.g., before treatment) and columns represent the second (e.g., after treatment). Each cell (a, b, c, d) captures a combination of outcomes:

  • Cell a: Both measurements positive
  • Cell b: First negative, second positive (discordant)
  • Cell c: First positive, second negative (discordant)
  • Cell d: Both measurements negative

The test statistic depends entirely on cells b and c, the cells capturing disagreement. Cells a and d (concordant pairs) do not affect the outcome. Always verify the table totals before proceeding: the row totals should equal the positive and negative counts at the first measurement, the column totals should equal those at the second, and the four cells should sum to the number of pairs (not the number of individual observations).
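As an illustration of the tabulation described above, here is a minimal Python sketch that counts the four cells from two equal-length sequences of 0/1 outcomes. The function name `paired_table` and the sample data are hypothetical, chosen only for this example; the cell labels follow the definitions in the list above.

```python
from collections import Counter

def paired_table(first, second):
    """Count cells a, b, c, d from paired binary outcomes (hypothetical helper)."""
    if len(first) != len(second):
        raise ValueError("paired data must have the same length")
    counts = Counter(zip(map(bool, first), map(bool, second)))
    a = counts[(True, True)]    # both measurements positive
    b = counts[(False, True)]   # first negative, second positive
    c = counts[(True, False)]   # first positive, second negative
    d = counts[(False, False)]  # both measurements negative
    return a, b, c, d

# Made-up data: one entry per subject, 1 = positive, 0 = negative.
before = [1, 1, 0, 0, 0, 1, 0, 0]
after = [1, 0, 1, 1, 0, 1, 1, 0]
a, b, c, d = paired_table(before, after)

# Sanity checks on the marginals: positives at each time point.
assert a + c == sum(before)
assert a + b == sum(after)
assert a + b + c + d == len(before)
```

With this sample the counts come out as a = 2, b = 3, c = 1, d = 2, and the three checks at the end confirm that the marginals match the raw data.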

McNemar's Test Formula

The standard McNemar's test uses a chi-squared approximation. The test statistic is derived from the discordant cell counts. Several variants exist (Yates correction, Edwards correction) that apply continuity adjustments for smaller samples. For exact inference when sample sizes are limited, use the binomial distribution directly.

χ² = (b − c)² ÷ (b + c)

Where b and c are the off-diagonal cell frequencies.

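Under the null hypothesis, χ² approximately follows a chi-squared distribution with 1 degree of freedom. The p-value is the upper-tail probability of that distribution at the computed χ². Equivalently, compare χ² to the critical value for your chosen significance level (3.841 for α = 0.05) and reject the null hypothesis when the statistic exceeds it.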

  • b — Count of subjects with negative result first, positive result second
  • c — Count of subjects with positive result first, negative result second
  • χ² — Test statistic following chi-squared distribution with 1 degree of freedom
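The formula above can be sketched in a few lines of Python. This is an illustrative sketch, not the calculator's own implementation: the function name `mcnemar_chi2` and the counts are assumptions made for the example. It uses the fact that for 1 degree of freedom the chi-squared upper-tail probability equals erfc(√(χ²/2)), so only the standard library is needed.

```python
import math

def mcnemar_chi2(b, c):
    """Uncorrected McNemar statistic and its chi-squared (1 df) p-value."""
    if b + c == 0:
        raise ValueError("the test is undefined without discordant pairs")
    stat = (b - c) ** 2 / (b + c)
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts: 25 negative-to-positive, 10 positive-to-negative.
stat, p_value = mcnemar_chi2(b=25, c=10)
print(f"chi2 = {stat:.3f}, p = {p_value:.4f}")
```

With these counts the statistic is 225 ÷ 35 ≈ 6.43, which exceeds the 3.841 critical value for α = 0.05.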

Choosing Between Standard and Exact Tests

The standard McNemar's test relies on the chi-squared approximation, which works well when b + c ≥ 25. However, this approximation deteriorates with smaller samples, leading to unreliable p-values.

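For sparse data (b + c < 25), use McNemar's exact binomial test instead. Under the null hypothesis each discordant pair is equally likely to fall in cell b or cell c, so b follows a binomial distribution with b + c trials and success probability 0.5. The p-value can therefore be computed exactly, with no large-sample approximation, which makes the test conservative but valid at any sample size. The mid-p variant offers a compromise: it counts only half the probability of the observed outcome, reducing conservatism while remaining accurate.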
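A minimal sketch of the exact and mid-p calculations follows, assuming the usual two-sided definitions: the exact p-value is twice the binomial tail up to min(b, c), capped at 1, and the mid-p value subtracts the probability of the observed count once. The function names and counts are hypothetical.

```python
import math

def binom_pmf_half(k, n):
    """P(X = k) for X ~ Binomial(n, 0.5)."""
    return math.comb(n, k) * 0.5 ** n

def mcnemar_exact(b, c, mid_p=False):
    """Two-sided exact (or mid-p) McNemar p-value from the discordant counts."""
    n = b + c
    if n == 0:
        raise ValueError("the test is undefined without discordant pairs")
    k = min(b, c)
    tail = sum(binom_pmf_half(i, n) for i in range(k + 1))
    p_value = 2 * tail
    if mid_p:
        # Count the observed outcome with half weight in each tail.
        p_value -= binom_pmf_half(k, n)
    return min(p_value, 1.0)

# Hypothetical sparse table: 7 negative-to-positive, 1 positive-to-negative.
p_exact = mcnemar_exact(7, 1)
p_mid = mcnemar_exact(7, 1, mid_p=True)
print(f"exact p = {p_exact:.4f}, mid-p = {p_mid:.4f}")
```

For b = 7 and c = 1 the exact p-value is 18/256 ≈ 0.0703 and the mid-p value is 10/256 ≈ 0.0391, which shows how the mid-p variant is less conservative on the same data.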

Our calculator offers five options:

  • Standard: Classical chi-squared test
  • Yates correction: Applies continuity adjustment for stability
  • Edwards correction: Alternative continuity adjustment
  • Exact binomial: Direct probability calculation
  • Mid-p binomial: Exact test that counts only half the probability of the observed outcome, making it less conservative

Key Considerations for Correct Use

Avoid common pitfalls when applying McNemar's test to ensure valid inference.

  1. Verify pairing and independence — McNemar's test requires true pairing—each observation in group one must correspond exactly to one in group two. Violations (e.g., multiple controls per case) invalidate the method. Also confirm that pair outcomes are independent; correlated outcomes within clusters inflate false positives.
  2. Check sample size thresholds — When b + c is small (below the 25 discordant pairs recommended above), chi-squared approximation p-values become unreliable; switch to the exact or mid-p test. Conversely, with very large samples, all variants converge; choose standard McNemar's for simplicity.
  3. Interpret marginals carefully — A significant result means the row and column marginal proportions differ. It says nothing about how well the two measurements agree: heavy discordance that moves equally in both directions (b ≈ c) leaves the marginals unchanged and produces a non-significant result, even though many subjects switched category. Always examine the contingency table itself to understand the pattern of change.
  4. Remember null hypothesis scope — McNemar's test only addresses marginal equality, not the size of the effect. Report not just the p-value but the proportions themselves (e.g., 40% positive before vs. 50% after) to convey practical significance.
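To accompany the p-value with the proportions themselves, the marginals can be read straight off the table. The sketch below assumes the cell labels defined earlier (a: both positive, b: negative then positive, c: positive then negative, d: both negative); the function name and the counts are hypothetical, chosen to reproduce the 40% vs. 50% example.

```python
def marginal_proportions(a, b, c, d):
    """Proportion positive at each measurement and their difference."""
    n = a + b + c + d
    if n == 0:
        raise ValueError("the table is empty")
    p_first = (a + c) / n   # positive at the first measurement
    p_second = (a + b) / n  # positive at the second measurement
    # The difference reduces to (b - c) / n, so only discordant pairs move it.
    return p_first, p_second, p_second - p_first

# Hypothetical table of 100 pairs.
p_first, p_second, diff = marginal_proportions(a=30, b=20, c=10, d=40)
print(f"before: {p_first:.0%}, after: {p_second:.0%}, change: {diff:+.0%}")
```

Here 40% are positive before and 50% after, a change of 10 percentage points driven entirely by the difference between b and c.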

Frequently Asked Questions

What is the null hypothesis in McNemar's test?

The null hypothesis states that the marginal distributions of the 2×2 table are equal. Formally, writing P(b) and P(c) for the probabilities that a pair falls in cell b or cell c, the null hypothesis is P(b) = P(c): the proportion changing from negative to positive equals the proportion changing from positive to negative. Rejection suggests an asymmetric shift, which is consistent with a real effect (e.g., treatment impact).

When should I use the exact binomial version instead of the standard test?

Use the exact binomial test whenever the sum of discordant cells (b + c) is less than 25. Below this threshold, the chi-squared approximation becomes unstable and can produce misleading p-values. The exact test calculates probabilities directly from the binomial distribution, avoiding approximation error. For b + c ≥ 25, standard McNemar's test is reliable and simpler.

Does McNemar's test account for concordant pairs?

No, McNemar's test ignores concordant pairs entirely. Cells a and d (both positive and both negative) do not appear in the test statistic formula. The test focuses only on disagreement: how many subjects changed category between measurements. This design makes the test sensitive to real change while being robust to baseline imbalance.

How do I interpret a significant McNemar's test result?

A p-value below your significance level (e.g., p < 0.05) indicates sufficient evidence against marginal homogeneity. In practical terms, this means the outcomes at time one and time two are significantly different in distribution. Examine which cell (b or c) is larger to determine the direction of change. However, statistical significance alone doesn't measure effect size; always report the actual proportions alongside the p-value.

Can I use McNemar's test for unmatched studies?

No. McNemar's test requires paired data where observations are matched or repeated. For independent groups with binary outcomes, use Fisher's exact test or the chi-squared test of independence instead. Misapplying McNemar's to unmatched data violates assumptions and produces invalid results.

What is the difference between Yates and Edwards corrections?

Both apply continuity adjustments to stabilize the test when sample sizes are moderate. Yates correction subtracts 0.5 from the absolute numerator difference before squaring; Edwards correction subtracts 1 instead, so it yields a slightly smaller statistic and a slightly larger p-value. Neither is universally superior, though Yates is more traditional. For b + c < 25, the exact binomial test is preferred over either correction method.
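The two corrections can be compared side by side with a short sketch. It assumes the Yates form described above (subtracting 0.5 from |b − c|) and the commonly cited Edwards form (subtracting 1); the function name `corrected_chi2` and the counts are hypothetical.

```python
import math

def corrected_chi2(b, c, correction):
    """Continuity-corrected McNemar statistic and chi-squared (1 df) p-value."""
    if b + c == 0:
        raise ValueError("the test is undefined without discordant pairs")
    # Clamp at zero so a small |b - c| cannot go negative before squaring.
    adjusted = max(abs(b - c) - correction, 0)
    stat = adjusted ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(stat / 2))  # upper tail for 1 df
    return stat, p_value

# Hypothetical counts: 25 negative-to-positive, 10 positive-to-negative.
yates_stat, yates_p = corrected_chi2(25, 10, correction=0.5)
edwards_stat, edwards_p = corrected_chi2(25, 10, correction=1.0)
print(f"Yates:   chi2 = {yates_stat:.3f}, p = {yates_p:.4f}")
print(f"Edwards: chi2 = {edwards_stat:.3f}, p = {edwards_p:.4f}")
```

For these counts the uncorrected statistic is about 6.43, the Yates-corrected statistic about 6.01, and the Edwards-corrected statistic exactly 5.6, so each step of correction moves the result toward the null.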
