Understanding Fisher's Exact Test

Fisher's exact test assesses whether two binary variables are independent or associated. Rather than relying on a large-sample approximation (as the chi-squared test does), it calculates the exact probability of observing your data, or data more extreme, under the null hypothesis of independence.

The test is particularly valuable when:

  • Sample sizes are small (fewer than 30 observations)
  • One or more cells in your 2×2 table have an expected count below five
  • Marginal totals are heavily skewed
  • You need exact rather than approximate results

Researchers across medicine, epidemiology, and behavioural science rely on it for rare outcomes and small samples where large-sample approximations break down.

Fisher's Exact Test Formula

The test is built on the hypergeometric probability of observing a particular 2×2 table given fixed row and column totals. For a contingency table with cells a, b, c, and d, the probability of that single table is:

P = [(a+b)! × (c+d)! × (a+c)! × (b+d)!] / [a! × b! × c! × d! × n!]

Odds Ratio = (a × d) / (b × c)

  • a — Count in first group, first category
  • b — Count in first group, second category
  • c — Count in second group, first category
  • d — Count in second group, second category
  • n — Total sample size (a+b+c+d)
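The two formulas above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names are hypothetical, and the example table (3, 1, 1, 3) is made up for demonstration. The probability is written with binomial coefficients, which is algebraically the same as the factorial form above.

```python
from math import comb


def table_probability(a: int, b: int, c: int, d: int) -> float:
    """Hypergeometric probability of one 2x2 table with its margins held fixed."""
    n = a + b + c + d
    # C(a+b, a) * C(c+d, c) / C(n, a+c) equals
    # (a+b)! (c+d)! (a+c)! (b+d)! / (a! b! c! d! n!)
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Sample odds ratio (a*d)/(b*c); undefined cases are returned as inf or nan."""
    if b * c == 0:
        return float("inf") if a * d > 0 else float("nan")
    return (a * d) / (b * c)


# Illustrative table: a=3, b=1, c=1, d=3
print(table_probability(3, 1, 1, 3))  # 16/70, roughly 0.2286
print(odds_ratio(3, 1, 1, 3))         # 9.0
```

Note that this is the probability of one table only. A p-value requires summing such probabilities over several tables.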

One-Tailed vs Two-Tailed Tests

The choice between test directions depends on your research hypothesis:

  • One-tailed: Use when you predict a specific direction of association before collecting data. For example, you hypothesise that a treatment reduces adverse events. The p-value includes only probabilities as or more extreme in your predicted direction.
  • Two-tailed: Use when testing for any association without a directional prediction. The p-value sums probabilities from both tails of the distribution, making it more conservative and harder to reach significance.

Two-tailed tests are standard in most research unless your hypothesis explicitly states a direction.
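To make the difference concrete, here is a minimal Python sketch that computes both one-tailed p-values and a two-tailed p-value for the same table. It assumes one common two-tailed convention (summing every table whose probability is no larger than that of the observed table); other conventions exist. The function name and the example table are illustrative only.

```python
from fractions import Fraction
from math import comb


def fisher_p_values(a: int, b: int, c: int, d: int) -> dict:
    """One- and two-tailed Fisher p-values for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> Fraction:
        # Exact probability that the top-left cell equals x, margins fixed.
        return Fraction(comb(row1, x) * comb(row2, col1 - x), total)

    # Range of top-left values compatible with the fixed margins.
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    probs = {x: prob(x) for x in range(lowest, highest + 1)}
    observed = probs[a]

    return {
        # Tail towards smaller values of a (odds ratio below 1).
        "less": float(sum(p for x, p in probs.items() if x <= a)),
        # Tail towards larger values of a (odds ratio above 1).
        "greater": float(sum(p for x, p in probs.items() if x >= a)),
        # Every table as probable as, or less probable than, the observed one.
        "two_sided": float(sum(p for p in probs.values() if p <= observed)),
    }


result = fisher_p_values(3, 1, 1, 3)
print(result["greater"])    # 17/70, roughly 0.243
print(result["two_sided"])  # 34/70, roughly 0.486
```

Exact fractions are used internally so that the comparison against the observed probability is not affected by floating-point rounding.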

Why Fisher Over Chi-Squared?

Although chi-squared is faster to calculate and works well with large samples, it relies on an asymptotic approximation. Fisher's test shines when conditions are unfavourable for chi-squared:

  • Small samples: Chi-squared can give misleading results with n < 30. Fisher remains reliable regardless of size.
  • Cell counts under 5: Chi-squared assumes sufficient expected frequencies in each cell. Fisher makes no such assumption.
  • Rare events: When one outcome is uncommon, Fisher's exact calculation beats approximation.

Modern computing makes Fisher feasible even for larger datasets, though chi-squared remains acceptable when assumptions are met.
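One way to check whether chi-squared's assumptions are met is to compute the expected count in each cell under independence (row total × column total / n) and compare it with a threshold. The sketch below assumes the common rule of thumb that every expected count should be at least 5; the function names and the threshold default are illustrative choices, not a fixed standard.

```python
def expected_counts(a: int, b: int, c: int, d: int) -> list:
    """Expected cell counts under independence for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    if n == 0:
        raise ValueError("table is empty")
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    return [[r * k / n for k in cols] for r in rows]


def chi_squared_rule_of_thumb_ok(a: int, b: int, c: int, d: int,
                                 minimum: float = 5.0) -> bool:
    """True if every expected count reaches the chosen minimum (assumed: 5)."""
    return all(e >= minimum for row in expected_counts(a, b, c, d) for e in row)


print(expected_counts(3, 1, 1, 3))                  # [[2.0, 2.0], [2.0, 2.0]]
print(chi_squared_rule_of_thumb_ok(3, 1, 1, 3))     # False: prefer Fisher
print(chi_squared_rule_of_thumb_ok(20, 30, 25, 25)) # True: either test is fine
```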

Practical Considerations and Common Pitfalls

Avoid these mistakes when applying Fisher's exact test to your data.

  1. Don't ignore assumptions about marginals: Fisher's test conditions on fixed row and column totals. Few study designs fix both in advance. A randomised trial or cohort study typically fixes the group sizes, while a case-control study fixes the numbers of cases and controls. The test is still widely used when only one set of totals is fixed, but it tends to be conservative, so know which totals your design fixes before interpreting the result.
  2. Beware of low power with small samples: Exact tests are conservative with tiny sample sizes, which reduces statistical power. A true effect may fail to reach significance simply because the sample is too small. Report effect sizes (odds ratios with confidence intervals) alongside p-values to convey the practical magnitude.
  3. Choose your tail direction before analysis: Deciding between one- and two-tailed after seeing your data inflates false positives. Specify your hypothesis in your analysis plan. Two-tailed is the safer default unless you have strong pre-registered reasoning for a one-tailed test.
  4. Check cell counts and totals: Ensure the cells of your 2×2 table add up to the correct totals and contain no negative values. A single data entry error (e.g., entering 5 instead of 50) will distort your result. Always cross-check inputs against your raw data.
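The input checks in the last point are easy to automate. The following is a small, hypothetical helper (the name validate_table and its expected_total parameter are assumptions made for this sketch) that rejects negative or non-integer cells and, optionally, compares the table total with the sample size you expect from your raw data.

```python
def validate_table(a, b, c, d, expected_total=None) -> int:
    """Check a 2x2 table of counts and return its total, or raise ValueError."""
    for name, value in {"a": a, "b": b, "c": c, "d": d}.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"cell {name} must be a whole-number count, got {value!r}")
        if value < 0:
            raise ValueError(f"cell {name} is negative: {value}")
    total = a + b + c + d
    if total == 0:
        raise ValueError("table is empty")
    if expected_total is not None and total != expected_total:
        raise ValueError(f"cells sum to {total}, expected {expected_total}")
    return total


print(validate_table(3, 1, 1, 3, expected_total=8))  # 8
```

A check like this catches sign errors and mismatched totals, but not every typo: swapping two cells leaves the total unchanged, so comparing against the raw data is still necessary.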

Frequently Asked Questions

What does a p-value mean in Fisher's exact test?

The p-value is the probability of observing a result as extreme as yours, or more extreme, if there were truly no association between the variables (the null hypothesis). A p-value of 0.03 means that, if the variables were independent, data at least as extreme as yours would arise about 3% of the time. It is not the probability that the null hypothesis is true. Smaller p-values indicate stronger evidence against independence. By convention, p < 0.05 is often considered statistically significant, though this threshold is a convention rather than a rule.

How do I interpret the odds ratio?

The odds ratio quantifies the strength and direction of association between your two variables. An odds ratio of 1 indicates no association. With the formula (a × d) / (b × c), a ratio above 1 means the odds of the first category are higher in the first group than in the second; a ratio below 1 means they are lower. For example, an odds ratio of 2 means the odds of the first category are twice as high in the first group as in the second. Always report confidence intervals around the odds ratio, as point estimates alone can be misleading with small samples.
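As an illustration of reporting an interval with the estimate, the sketch below computes the sample odds ratio with an approximate 95% confidence interval using the log (Woolf) method, where the standard error of log(OR) is sqrt(1/a + 1/b + 1/c + 1/d). This is a large-sample approximation chosen here for simplicity, not part of Fisher's test itself; with small samples an exact interval from statistical software is preferable. Adding 0.5 to every cell when a zero is present (the Haldane-Anscombe correction) is likewise an assumption of this sketch.

```python
from math import exp, log, sqrt


def odds_ratio_with_ci(a, b, c, d, z: float = 1.96):
    """Return (odds ratio, lower, upper) using the approximate log method."""
    if 0 in (a, b, c, d):
        # Assumed continuity correction so the ratio and its SE stay finite.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    estimate = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(estimate) - z * se_log)
    upper = exp(log(estimate) + z * se_log)
    return estimate, lower, upper


estimate, lower, upper = odds_ratio_with_ci(3, 1, 1, 3)
print(f"OR = {estimate:.1f}, approximate 95% CI {lower:.2f} to {upper:.1f}")
```

For this illustrative table the point estimate is 9, yet the interval spans values below 1 and above 200, which shows why a point estimate alone can mislead with small samples.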

When should I use Fisher's exact test instead of chi-squared?

Use Fisher's exact test when you have a 2×2 contingency table with a small sample (typically under 30), or when any cell has an expected count below 5. Chi-squared assumes large samples and adequate expected frequencies in every cell. Fisher's test provides exact p-values without these assumptions. If your data meet the chi-squared conditions, either test is acceptable, but Fisher becomes essential when data are sparse or imbalanced.

Can Fisher's exact test handle tables larger than 2×2?

The classic Fisher's exact test is defined for 2×2 tables. It generalises to larger tables (the Freeman-Halton extension), and some software packages implement this, but the computation grows quickly with table size. For larger contingency tables (e.g., 3×3 or 2×3), you would typically use a chi-squared test or a Monte Carlo approximation of the exact p-value. For categorical data beyond 2×2, consult a statistician about appropriate methods.

What's the difference between one-tailed and two-tailed Fisher's test?

A one-tailed test checks whether the association goes in a specific direction predicted beforehand (e.g., treatment reduces adverse events). A two-tailed test checks for any association without a directional prediction. Two-tailed tests are more stringent because the p-value also counts equally or less likely tables in the opposite direction, so the same data yield a larger p-value. Always decide which you need before analysing data to avoid p-hacking.

How do I calculate Fisher's exact test by hand?

List all possible 2×2 tables with the same row and column totals as your observed data. For each table, calculate the hypergeometric probability using the formula provided. For a one-tailed p-value, sum the probabilities of the observed table and of every table more extreme in your predicted direction. For a two-tailed p-value, sum the probabilities of all tables, in either direction, whose probability is less than or equal to that of the observed table. With even modest samples, hand calculation becomes tedious, so use statistical software instead.
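The enumeration step can be sketched in Python as follows. The helper name and the example table (3, 1, 1, 3, with all margins equal to 4) are illustrative assumptions. Probabilities are kept as exact fractions so they match what you would write down by hand.

```python
from fractions import Fraction
from math import comb


def enumerate_tables(a: int, b: int, c: int, d: int) -> list:
    """List every table sharing the observed margins, with its exact probability."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    tables = []
    # The top-left cell x determines the other three once the margins are fixed.
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        p = Fraction(comb(row1, x) * comb(row2, col1 - x), total)
        tables.append(((x, row1 - x, col1 - x, row2 - col1 + x), p))
    return tables


for cells, p in enumerate_tables(3, 1, 1, 3):
    print(cells, p)
# (0, 4, 4, 0) 1/70
# (1, 3, 3, 1) 8/35
# (2, 2, 2, 2) 18/35
# (3, 1, 1, 3) 8/35
# (4, 0, 0, 4) 1/70
```

With the observed table (3, 1, 1, 3), the one-tailed p-value in the direction of larger a is 8/35 + 1/70 = 17/70, and the two-tailed p-value is 1/70 + 8/35 + 8/35 + 1/70 = 17/35.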
