What is ANOVA?

Analysis of variance (ANOVA) is a statistical technique for testing whether observed differences between group means are larger than random sampling variation alone would plausibly produce. The core principle partitions total variation into two sources: variation among group means and variation within each group.

When between-group variance substantially exceeds within-group variance, it suggests the group means genuinely differ. Conversely, when the population means are equal, the two variance estimates are of similar size and their ratio is close to 1. This ratio is the F-statistic, which is used to judge whether group differences are statistically significant.
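The partition described here can be written as a single identity. The notation is an assumption made for this sketch: x_{ij} denotes observation j in group i, k is the number of groups, n_i is the size of group i, x̄_i is the mean of group i, and x̄ is the grand mean.

```latex
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i}\left(x_{ij}-\bar{x}\right)^{2}}_{\text{total}}
\;=\;
\underbrace{\sum_{i=1}^{k} n_i\left(\bar{x}_i-\bar{x}\right)^{2}}_{\text{between groups}}
\;+\;
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i}\left(x_{ij}-\bar{x}_i\right)^{2}}_{\text{within groups}}
```

Dividing the between-groups and within-groups terms by their degrees of freedom gives the two variance estimates whose ratio is the F-statistic.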

ANOVA assumes:

  • Observations within each group follow a normal distribution
  • Variances are approximately equal across all groups
  • Individual observations are independent
  • The dependent variable is measured on an interval or ratio scale

Types of ANOVA Tests

One-way ANOVA examines how a single factor influences a continuous outcome. An example is comparing sales performance across three different training programmes, with training type as the only grouping variable.

Two-way ANOVA evaluates the effects of two independent factors simultaneously, plus their interaction. A crop yield study might test both fertiliser type and irrigation method, revealing whether their combined effect differs from the sum of their separate effects.
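To make the main effects and the interaction concrete, here is a minimal sketch of a two-way ANOVA in plain Python. It assumes a balanced design (every cell has the same number of replicates, at least two), and the function name `two_way_anova`, the `cells` layout, and the numbers are all made up for illustration.

```python
from statistics import mean

def two_way_anova(cells):
    """cells[i][j] holds the replicate observations for level i of factor A
    and level j of factor B. Assumes a balanced design with n >= 2 per cell."""
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    if any(len(row) != b for row in cells) or \
       any(len(cell) != n for row in cells for cell in row):
        raise ValueError("this sketch only handles balanced designs")
    if n < 2:
        raise ValueError("need at least two replicates per cell")

    grand = mean(x for row in cells for cell in row for x in cell)
    a_means = [mean(x for cell in row for x in cell) for row in cells]
    b_means = [mean(x for row in cells for x in row[j]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    # Main effects: squared deviations of the marginal means from the grand mean.
    ss_a = b * n * sum((m - grand) ** 2 for m in a_means)
    ss_b = a * n * sum((m - grand) ** 2 for m in b_means)
    # Interaction: what is left of each cell mean after removing both main effects.
    ss_ab = n * sum((cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
                    for i in range(a) for j in range(b))
    # Error: variation of replicates around their own cell mean.
    ss_err = sum((x - cell_means[i][j]) ** 2
                 for i in range(a) for j in range(b) for x in cells[i][j])

    ms_err = ss_err / (a * b * (n - 1))
    return {
        "SS_A": ss_a, "SS_B": ss_b, "SS_AB": ss_ab, "SS_error": ss_err,
        "F_A": (ss_a / (a - 1)) / ms_err,
        "F_B": (ss_b / (b - 1)) / ms_err,
        "F_AB": (ss_ab / ((a - 1) * (b - 1))) / ms_err,
    }

# Hypothetical yields: rows = 2 fertiliser types, columns = 2 irrigation methods.
yields = [[[10, 12], [14, 16]],
          [[11, 13], [21, 23]]]
result = two_way_anova(yields)
print(result)
```

A large `F_AB` relative to its F distribution suggests the factors do not act additively. Unbalanced designs need a different decomposition and are outside the scope of this sketch.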

Repeated measures ANOVA applies when the same subjects are measured multiple times under different conditions. Blood pressure readings taken before, during, and after exercise for each participant exemplify this design, where observations are naturally correlated.
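The key difference from one-way ANOVA is that variation between subjects is removed from the error term. The sketch below illustrates this for a single within-subjects factor; the function name `repeated_measures_anova`, the `scores` layout, and the numbers are illustrative assumptions, and it does not check or correct for sphericity.

```python
from statistics import mean

def repeated_measures_anova(scores):
    """scores[s][c] is the measurement for subject s under condition c.
    Assumes every subject is measured once under every condition."""
    n, k = len(scores), len(scores[0])
    if any(len(row) != k for row in scores):
        raise ValueError("every subject needs one value per condition")

    grand = mean(x for row in scores for x in row)
    subject_means = [mean(row) for row in scores]
    condition_means = [mean(row[c] for row in scores) for c in range(k)]

    # Variation explained by the conditions (the effect of interest).
    ss_conditions = n * sum((m - grand) ** 2 for m in condition_means)
    # Variation between subjects, removed from the error term.
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    # Residual variation after removing subject and condition effects.
    ss_error = sum((scores[s][c] - subject_means[s] - condition_means[c] + grand) ** 2
                   for s in range(n) for c in range(k))

    df_conditions, df_error = k - 1, (k - 1) * (n - 1)
    f_stat = (ss_conditions / df_conditions) / (ss_error / df_error)
    return {"SS_conditions": ss_conditions, "SS_subjects": ss_subjects,
            "SS_error": ss_error, "df": (df_conditions, df_error), "F": f_stat}

# Made-up scores: 3 subjects (rows) measured under 3 conditions (columns).
rm = repeated_measures_anova([[2, 4, 6],
                              [3, 6, 6],
                              [4, 5, 9]])
print(rm)
```

Because each subject serves as their own control, consistent differences between subjects do not inflate the denominator of F.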

ANOVA Calculations

The ANOVA table systematically organises sums of squares, degrees of freedom, and mean squares. Key calculations follow:

SSB = Σ n_i (x̄_i − x̄)²

SSW = Σ (n_i − 1) s_i²

MSB = SSB ÷ (k − 1)

MSW = SSW ÷ (N − k)

F = MSB ÷ MSW

  • SSB: Sum of squares between groups; measures variation of group means around the grand mean
  • SSW: Sum of squares within groups; measures variation of observations around their respective group means
  • MSB: Mean square between; SSB divided by degrees of freedom (k − 1)
  • MSW: Mean square within; SSW divided by degrees of freedom (N − k)
  • n_i: Number of observations in group i
  • x̄_i: Mean of group i
  • x̄: Grand mean of all observations
  • s_i²: Sample variance of group i
  • k: Number of groups
  • N: Total number of observations
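The formulas above translate almost line for line into code. The following is a minimal sketch using only the Python standard library; the function name `one_way_anova` and the sample data are made up for illustration.

```python
from statistics import mean, variance

def one_way_anova(groups):
    """Return the one-way ANOVA table entries for a list of samples."""
    k = len(groups)                      # number of groups
    N = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / N

    # SSB = sum of n_i * (group mean - grand mean)^2
    ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # SSW = sum of (n_i - 1) * s_i^2, where s_i^2 is the sample variance
    ssw = sum((len(g) - 1) * variance(g) for g in groups)

    msb = ssb / (k - 1)
    msw = ssw / (N - k)
    return {"SSB": ssb, "SSW": ssw, "MSB": msb, "MSW": msw,
            "F": msb / msw, "df": (k - 1, N - k)}

# Made-up data: three groups of three observations each.
table = one_way_anova([[4, 5, 6], [6, 7, 8], [9, 10, 11]])
print(table)
```

For this data SSB is 38, SSW is 6, and F is 19, up to floating-point rounding. Turning F into a p-value requires the F distribution with (k − 1, N − k) degrees of freedom; if SciPy is available, `scipy.stats.f.sf(F, k - 1, N - k)` is one way to obtain it.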

Critical Considerations for ANOVA Analysis

Ensure your data and design meet these practical requirements before interpreting results.

  1. Verify Sample Size Adequacy: Cells with fewer than 10–15 observations reduce statistical power and increase Type II error risk. Small samples give imprecise variance estimates, making genuine differences harder to detect. Aim for balanced designs where groups have similar sizes.
  2. Check Variance Homogeneity First: Use Levene's test or Bartlett's test before running ANOVA. Severe heterogeneity of variance (e.g., one group with 10× the variance of another) violates the equal-variance assumption and distorts the F-test's error rates, particularly when group sizes are unequal. Consider Welch's ANOVA if variances differ substantially.
  3. Test Normality Within Groups: Shapiro–Wilk tests or Q–Q plots reveal non-normal distributions. ANOVA is robust to moderate non-normality with larger samples, but heavy tails or extreme skew compromise inference. Transform data (log, square root) if warranted.
  4. Interpret Non-Significant Results Cautiously: Failing to reject the null hypothesis doesn't prove groups are identical; it merely indicates insufficient evidence of a difference. Report effect sizes (eta-squared) and confidence intervals alongside p-values for complete inference.
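As a rough illustration of the variance check in item 2, the sketch below computes the largest-to-smallest variance ratio and the statistic for the median-centred (Brown–Forsythe) variant of Levene's test, which is simply a one-way ANOVA F-ratio applied to absolute deviations from each group's median. The function names and data are hypothetical, and no p-value is computed.

```python
from statistics import mean, median, variance

def f_ratio(groups):
    """One-way ANOVA F-ratio: MSB / MSW."""
    k = len(groups)
    N = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / N
    ssb = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ssw = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    return (ssb / (k - 1)) / (ssw / (N - k))

def levene_statistic(groups):
    """Levene's W, median-centred: the F-ratio of the absolute
    deviations of each observation from its group median."""
    deviations = [[abs(x - median(g)) for x in g] for g in groups]
    return f_ratio(deviations)

def variance_ratio(groups):
    """Largest sample variance divided by the smallest."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)

# Made-up data with the same centre but increasing spread.
samples = [[4, 5, 6], [3, 5, 7], [1, 5, 9]]
w = levene_statistic(samples)
ratio = variance_ratio(samples)
print(w, ratio)
```

W is compared against an F distribution with (k − 1, N − k) degrees of freedom. With groups this small the test has little power, so the variance ratio (16 here) is often the more informative warning sign.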

Frequently Asked Questions

How is the F-statistic computed in ANOVA?

The F-statistic is the ratio of mean square between (MSB) to mean square within (MSW). MSB quantifies variability among group means; MSW captures average variability within each group. A larger F-statistic indicates stronger evidence that groups differ. Under the null hypothesis (all group means equal), the F-statistic follows an F distribution with (k−1) and (N−k) degrees of freedom.

What sample size should each group have?

A practical minimum is 10–15 observations per group for adequate statistical power. For a factor with 2–9 levels, 15 per cell provides reasonable sensitivity. Smaller samples (n < 10) substantially reduce power and inflate sampling error. Unbalanced designs (unequal group sizes) reduce efficiency but remain analysable; aim for balance where possible.

What happens if ANOVA assumptions are violated?

Violations can compromise the validity of the F-test. Under non-normality the test statistic may not follow the F distribution closely, especially with small samples; transformation or non-parametric alternatives (Kruskal–Wallis) may help. Unequal variances bias the F-test; Welch's ANOVA provides a robust alternative. Independence violations (repeated measures, clustered data) require specialised models. Always verify assumptions before proceeding.
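For the unequal-variance case, Welch's ANOVA weights each group by n_i ÷ s_i² instead of pooling variances. Here is a minimal sketch of the statistic and its degrees of freedom; the function name `welch_anova` and the data are illustrative, and the code assumes at least two groups, each with a non-zero sample variance.

```python
from statistics import mean, variance

def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Precision weights: larger, less variable groups count for more.
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_weight = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / total_weight

    numerator = sum(w * (m - weighted_mean) ** 2
                    for w, m in zip(weights, means)) / (k - 1)
    # Correction term shared by the denominator and the second df.
    lam = sum((1 - w / total_weight) ** 2 / (n - 1)
              for w, n in zip(weights, sizes))
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam

    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)   # usually not an integer
    return numerator / denominator, df1, df2

# Made-up data with clearly unequal variances (1, 4, and 16).
f_welch, df1, df2 = welch_anova([[4, 5, 6], [3, 5, 7], [6, 10, 14]])
print(f_welch, df1, df2)
```

The resulting F is referred to an F distribution with (df1, df2) degrees of freedom, where df2 shrinks as the variances become more unequal.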

Can ANOVA detect which groups differ after a significant result?

ANOVA tests whether any differences exist but doesn't identify which groups differ. Post-hoc tests such as Tukey's HSD, Bonferroni, or Scheffé control the multiple-comparison error rate and pinpoint differences. Choose the method based on whether comparisons were planned in advance (contrast tests) or are exploratory (all pairwise tests).
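As one possible illustration, the sketch below sets up Bonferroni-style pairwise comparisons: it divides the significance level by the number of pairs and computes a t statistic for each pair using the pooled within-group mean square (MSW). The function name `pairwise_comparisons`, the default `alpha`, and the data are assumptions made for this example; p-values are not computed.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance

def pairwise_comparisons(groups, alpha=0.05):
    """Return the Bonferroni-adjusted alpha and a t statistic per pair."""
    k = len(groups)
    N = sum(len(g) for g in groups)
    # Pooled error variance from the ANOVA table (MSW).
    msw = sum((len(g) - 1) * variance(g) for g in groups) / (N - k)

    pairs = list(combinations(range(k), 2))
    adjusted_alpha = alpha / len(pairs)   # Bonferroni correction

    t_stats = {}
    for i, j in pairs:
        standard_error = sqrt(msw * (1 / len(groups[i]) + 1 / len(groups[j])))
        t_stats[(i, j)] = (mean(groups[i]) - mean(groups[j])) / standard_error
    return adjusted_alpha, t_stats

# Made-up data: three groups, so three pairwise comparisons.
adj_alpha, t_stats = pairwise_comparisons([[4, 5, 6], [6, 7, 8], [9, 10, 11]])
print(adj_alpha, t_stats)
```

Each t statistic is judged against a t distribution with N − k degrees of freedom at the adjusted alpha. Tukey's HSD uses the studentized range distribution instead and is typically less conservative when all pairs are compared.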

How is effect size measured in ANOVA?

Eta-squared (η²) and partial eta-squared are the standard measures. Eta-squared equals SSB ÷ SST, where SST = SSB + SSW is the total sum of squares, and represents the proportion of total variance explained by group membership. Values of 0.01, 0.06, and 0.14 roughly correspond to small, medium, and large effects. Report effect sizes alongside p-values for complete interpretation.
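A short sketch of the calculation for the one-way case follows. The function names are hypothetical, the "negligible" label for values under 0.01 is an assumption of this example, and the thresholds are the rough conventions quoted above rather than strict cut-offs.

```python
def eta_squared(ssb, ssw):
    """Proportion of total variation explained: SSB / (SSB + SSW)."""
    return ssb / (ssb + ssw)

def effect_label(eta2):
    """Map eta-squared onto the conventional rough categories."""
    if eta2 >= 0.14:
        return "large"
    if eta2 >= 0.06:
        return "medium"
    if eta2 >= 0.01:
        return "small"
    return "negligible"   # label assumed for this sketch

# Illustrative sums of squares, not taken from a real study.
eta2 = eta_squared(ssb=38, ssw=6)
print(round(eta2, 3), effect_label(eta2))
```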
