Understanding Variance
Variance measures the average squared distance of each data point from the mean. A small variance means observations cluster tightly around the average; a large variance signals substantial dispersion. Consider test scores of 50, 50, 50 (variance = 0) versus 30, 50, 70 (population variance ≈ 266.7, sample variance = 400): identical means, vastly different variability.
This metric underpins standard deviation, confidence intervals, and statistical hypothesis testing. Unlike the range or interquartile range, variance uses every observation and emphasizes outliers through squaring. Researchers and analysts rely on it to characterise data behaviour before modelling or inference.
One distinction matters: population variance assumes you've measured all members of interest, while sample variance estimates the population parameter from a subset. The formulas differ slightly because a sample's deviations are measured from its own mean, which biases the uncorrected estimate downward.
Variance Formula
Variance is the average of squared deviations from the mean. The population formula treats the data as complete; the sample formula compensates for using the sample mean in place of the unknown population mean.
Population variance: σ² = (1/N) × Σ(xᵢ − μ)²
Sample variance: s² = (1/(N−1)) × Σ(xᵢ − x̄)²
- σ² or s²: Variance (population or sample)
- N: Number of observations
- xᵢ: Individual data point
- μ: Population mean
- x̄: Sample mean
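As a quick check of the two formulas, the sketch below uses Python's standard `statistics` module, where `pvariance` divides by N and `variance` divides by N − 1. The score lists are illustrative data only, and the variable names are made up for this example.

```python
import statistics

# Illustrative test scores (assumed data for this sketch).
scores_flat = [50, 50, 50]
scores_spread = [30, 50, 70]

# Population formula: divide the sum of squared deviations by N.
pop_var_flat = statistics.pvariance(scores_flat)      # no spread at all
pop_var_spread = statistics.pvariance(scores_spread)  # 800 / 3

# Sample formula: divide the same sum by N - 1.
sample_var_spread = statistics.variance(scores_spread)  # 800 / 2

print(pop_var_flat, pop_var_spread, sample_var_spread)
```

Both lists have a mean of 50, yet only the second has non-zero variance, and its sample variance is larger than its population variance because of the smaller divisor.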
Population vs. Sample Variance
When analysing an entire population, use population variance with divisor N. This is exact because no estimation occurs.
In practice, you often work with samples drawn from larger populations. Applying the population formula (dividing by N) to a sample tends to underestimate the true population variability, because deviations are measured from the sample mean, which sits closer to the sampled points than the population mean does. To correct this downward bias, divide by N − 1 instead, an adjustment known as Bessel's correction. It makes the sample variance an unbiased estimator of the population variance.
Example: measuring blood pressure in 50 patients (sample) requires Bessel's correction; measuring weight across all 200 employees (population) does not.
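The effect of Bessel's correction can be verified exactly on a toy case. The sketch below assumes a three-value population and enumerates every possible sample of size 2 drawn with replacement (so the draws are independent), then averages the two candidate estimators over all samples. The helper names are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

population = [30, 50, 70]  # toy population, assumed for illustration
n = 2                      # sample size

def mean(values):
    return Fraction(sum(values), len(values))

def sum_sq_dev(values):
    """Sum of squared deviations from the mean of `values`."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values)

true_variance = sum_sq_dev(population) / len(population)

# Every ordered sample of size n, each equally likely.
samples = list(product(population, repeat=n))

# Average of each estimator across all possible samples.
avg_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)
avg_div_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)

print(true_variance, avg_div_n, avg_div_n_minus_1)
```

Averaged over all samples, dividing by N − 1 recovers the population variance exactly, while dividing by N falls short of it.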
Hand Calculation Method
Computing variance manually involves three steps. First, find the mean by summing all values and dividing by count. Second, calculate each point's deviation from the mean, then square it. Third, average these squared deviations (or divide by N − 1 for samples).
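The three steps can be traced in a few lines of Python. This is a minimal sketch on made-up data, with each intermediate result kept in its own variable so it can be compared against a hand calculation.

```python
data = [4.0, 8.0, 6.0, 2.0]  # illustrative values, not from any real dataset
n = len(data)

# Step 1: mean = sum of values / count.
mean = sum(data) / n  # 20 / 4 = 5.0

# Step 2: deviation of each point from the mean, squared.
squared_deviations = [(x - mean) ** 2 for x in data]  # [1.0, 9.0, 1.0, 9.0]

# Step 3: average the squared deviations (population),
# or divide by n - 1 instead (sample).
population_variance = sum(squared_deviations) / n        # 20 / 4 = 5.0
sample_variance = sum(squared_deviations) / (n - 1)      # 20 / 3

print(mean, squared_deviations, population_variance, sample_variance)
```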
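An alternative computational formula reduces the amount of arithmetic:
σ² = (1/N) × [Σ(xᵢ²) − (1/N) × (Σxᵢ)²]
This approach needs only two running totals, the sum of the values and the sum of their squares, so you never compute individual deviations or round the mean before subtracting. That makes it convenient with calculators. The trade-off is that the final step subtracts two large, nearly equal numbers: when the mean is large relative to the spread, limited-precision arithmetic can lose most of the significant digits, and the deviation-based method is the safer choice.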
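The shortcut formula is algebraically identical to the definition, but in floating-point arithmetic the two can disagree. The sketch below implements both (the function names are hypothetical) and compares them on small values and on the same pattern shifted by one billion. The shifted data have an exact population variance of 22.5.

```python
def variance_two_pass(values):
    """Population variance from deviations about the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) * (x - mean) for x in values) / n

def variance_shortcut(values):
    """Population variance from the sum and the sum of squares."""
    n = len(values)
    total = sum(values)
    total_sq = sum(x * x for x in values)
    return (total_sq - total * total / n) / n

small = [30.0, 50.0, 70.0]
# Same spread pattern around a very large mean (floats, not ints).
large = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]

print(variance_two_pass(small), variance_shortcut(small))
print(variance_two_pass(large), variance_shortcut(large))
```

On the small values the two functions agree. On the shifted values the squared terms are near 10^18, beyond what a 64-bit float can represent exactly, so the shortcut result drifts away from 22.5 while the two-pass result stays accurate.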
Common Pitfalls and Considerations
Avoid these frequent mistakes when interpreting or calculating variance.
- Confusing Population and Sample Formulas — Applying population variance to sample data inflates confidence in your estimates and narrows confidence intervals artificially. Always use Bessel's correction (N − 1 divisor) when working from a sample, unless you explicitly measure the entire population.
- Forgetting Units and Magnitude — Variance is in squared units—if measuring height in centimetres, variance is in cm². This makes raw variance hard to interpret intuitively. Standard deviation (the square root of variance) returns to original units and is often more useful for communication.
- Sensitivity to Outliers — Because deviations are squared, extreme values disproportionately influence variance. A single outlier can double or triple the metric. Always inspect your data visually and consider whether outliers are genuine or measurement errors before finalising analysis.
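The unit and outlier points above can be illustrated together. The sketch below uses invented length measurements in centimetres, computes the sample variance (in cm²) and standard deviation (in cm) with the standard `statistics` module, then repeats the calculation after appending one extreme reading.

```python
import statistics

# Invented measurements in centimetres (assumed data for this sketch).
lengths_cm = [48.0, 50.0, 52.0, 49.0, 51.0]
with_outlier = lengths_cm + [90.0]  # one extreme reading added

var_clean = statistics.variance(lengths_cm)   # units: cm squared
sd_clean = statistics.stdev(lengths_cm)       # units: cm

var_outlier = statistics.variance(with_outlier)
sd_outlier = statistics.stdev(with_outlier)

print(f"clean:        variance={var_clean:.2f} cm^2, sd={sd_clean:.2f} cm")
print(f"with outlier: variance={var_outlier:.2f} cm^2, sd={sd_outlier:.2f} cm")
print(f"variance ratio: {var_outlier / var_clean:.1f}x")
```

In this contrived case a single reading inflates the variance by far more than a factor of two or three, which is why inspecting the data before interpreting the number is worthwhile.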