Understanding the Chi-Square Test
The chi-square test quantifies the discrepancy between what you observe in your data and what theory predicts. It applies to categorical variables—grades, color preferences, survey responses, or defect types—where you have frequency counts rather than continuous measurements.
The test works by comparing each observed count to its expected count under the null hypothesis. Large differences signal that your data may not conform to the proposed distribution. The chi-square value itself is dimensionless and always non-negative, since it involves squared deviations.
Common applications include:
- Testing whether a die is fair (equal probability for each face)
- Validating that product defects occur randomly across production batches
- Assessing whether survey responses match demographic expectations
- Confirming genetic inheritance ratios in controlled breeding
The Chi-Square Formula
For each category, calculate the squared difference between observed and expected counts, then divide by the expected count. The total chi-square statistic sums these individual contributions.
χ² = Σ [(O − E)² ÷ E]
For a single category, the contribution to the total is:
(O − E)² ÷ E
- O: Observed frequency (actual count in your data)
- E: Expected frequency (count predicted by the hypothesis)
- Σ: Sum across all categories in your distribution
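As a minimal sketch of the formula, the statistic can be computed in a few lines of Python. The function name chi_square_statistic and the die-rolling counts are illustrative assumptions, not part of any library or of the original text.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 rolls of a die, 10 expected per face if the die is fair.
observed_rolls = [8, 12, 9, 11, 10, 10]
expected_rolls = [10] * 6

die_chi2 = chi_square_statistic(observed_rolls, expected_rolls)
print(f"chi-square = {die_chi2:.3f}")  # (4 + 4 + 1 + 1 + 0 + 0) / 10 = 1.000
```

Each face contributes its own squared deviation divided by 10, so faces that match expectation exactly add nothing to the total.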
Interpreting Results and Degrees of Freedom
After summing the chi-square components across all categories, you compare the result to a chi-square distribution table using degrees of freedom (df). The degrees of freedom equal the number of categories minus one.
For example, if you have four grade levels, df = 3. If you have six color categories, df = 5.
The chi-square table gives you a critical value for your chosen significance level (typically α = 0.05) and your degrees of freedom. If your calculated chi-square exceeds the critical value, you reject the null hypothesis: your data differ significantly from the expected distribution. If it falls below the critical value, you fail to reject the null hypothesis, meaning the data are consistent with the expected distribution.
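The table lookup and decision rule might be sketched as follows. The dictionary holds the standard α = 0.05 critical values for df 1 through 6. The names CRITICAL_05 and reject_null, and the statistics passed in, are hypothetical and chosen only for illustration.

```python
# Critical values of the chi-square distribution at alpha = 0.05,
# keyed by degrees of freedom (standard table values, 3 decimals).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070, 6: 12.592}


def reject_null(chi2_value, n_categories, table=CRITICAL_05):
    """Return True when the statistic exceeds the alpha = 0.05 critical value."""
    df = n_categories - 1  # goodness-of-fit, no parameters estimated from data
    if df not in table:
        raise KeyError(f"no critical value stored for df = {df}")
    return chi2_value > table[df]


# Illustrative statistics for a six-category test (df = 5, critical value 11.070).
print(reject_null(12.4, 6))  # True: 12.4 > 11.070
print(reject_null(4.2, 6))   # False: 4.2 < 11.070
```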
Note that chi-square is sensitive to sample size: very large samples can yield high chi-square values even for minor deviations, while small samples may fail to detect real differences.
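A small illustration of this sensitivity, using made-up counts: the same 52% versus 50% split is tested at three sample sizes. Because every count is multiplied by the same factor, the statistic grows by that factor too, and only the largest sample crosses the df = 1 critical value of 3.841.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical two-category data per 100 respondents: 52 vs. 48 observed,
# 50 vs. 50 expected. The proportions stay fixed while the sample grows.
base_observed = [52, 48]
base_expected = [50, 50]

results = {}
for scale in (1, 10, 100):
    observed = [o * scale for o in base_observed]
    expected = [e * scale for e in base_expected]
    results[100 * scale] = chi_square_statistic(observed, expected)

for n, value in results.items():
    verdict = "exceeds" if value > 3.841 else "is below"
    print(f"n = {n}: chi-square = {value:.2f} ({verdict} 3.841)")
```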
Common Pitfalls in Chi-Square Testing
Avoid these mistakes when performing or interpreting chi-square tests:
- Low expected frequencies: A common rule of thumb requires every expected count to be at least 5. Below that, the chi-square distribution is a poor approximation and the test becomes unreliable. Combine adjacent categories if possible, or use an exact test (such as Fisher's exact test for contingency tables) for small samples.
- Using percentages instead of counts: Compare actual counts (observed) against theoretical or hypothesized counts (expected), not percentages. If the hypothesis specifies a 40% share, calculate 40% of your total sample size as the expected frequency.
- Forgetting to sum across all categories: The final chi-square statistic is the total of all individual category calculations, not just the largest one. Omitting categories or summing only a subset gives an incorrect result.
- Misapplying degrees of freedom: The df formula is (number of categories − 1), but if you've estimated parameters from your data, you must subtract one additional degree of freedom for each estimated parameter. Ignoring this adjustment can lead to misleading p-values.
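Several of these checks are easy to automate. The helpers below are a sketch under assumed names (expected_counts, low_expected, degrees_of_freedom) with an invented survey. The threshold of 5 is the rule of thumb mentioned above, not a hard limit.

```python
def expected_counts(proportions, total):
    """Convert hypothesized proportions into expected counts."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return [p * total for p in proportions]


def low_expected(expected, minimum=5):
    """Return indices of categories whose expected count is below the minimum."""
    return [i for i, e in enumerate(expected) if e < minimum]


def degrees_of_freedom(n_categories, n_estimated_params=0):
    """Categories minus one, minus one per parameter estimated from the data."""
    return n_categories - 1 - n_estimated_params


# Hypothetical survey: 40 responses, hypothesized shares of 40%, 35%, 15%, 10%.
expected = expected_counts([0.40, 0.35, 0.15, 0.10], 40)
print(expected)                  # roughly 16, 14, 6 and 4
print(low_expected(expected))    # index of the category expected below 5
print(degrees_of_freedom(4))     # 3 when nothing is estimated from the data
print(degrees_of_freedom(4, 1))  # 2 when one parameter is estimated
```

Here the last category has an expected count of about 4, so it would be a candidate for merging with a neighboring category before running the test.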
Practical Example: Grading Distribution
Suppose you expected a class of 60 students to earn grades as follows: 15% grade 5, 40% grade 4, 30% grade 3, and 15% grade 2. Your actual results were: 7 students grade 2, 26 grade 3, 22 grade 4, and 5 grade 5.
First, calculate expected counts:
- Grade 2: 0.15 × 60 = 9 students
- Grade 3: 0.30 × 60 = 18 students
- Grade 4: 0.40 × 60 = 24 students
- Grade 5: 0.15 × 60 = 9 students
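The same arithmetic can be checked in Python. This is only a sketch: the variable names are illustrative, and the final line confirms that every expected count meets the usual minimum of 5.

```python
class_size = 60

# Hypothesized share of each grade, taken from the example.
expected_share = {2: 0.15, 3: 0.30, 4: 0.40, 5: 0.15}

# Expected count = hypothesized share x total number of students.
expected = {grade: share * class_size for grade, share in expected_share.items()}

for grade, count in expected.items():
    print(f"Grade {grade}: {count:.0f} students")

all_large_enough = all(count >= 5 for count in expected.values())
print("All expected counts at least 5:", all_large_enough)
```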
Then compute chi-square for each grade:
- Grade 2: (7 − 9)² ÷ 9 = 0.444
- Grade 3: (26 − 18)² ÷ 18 = 3.556
- Grade 4: (22 − 24)² ÷ 24 = 0.167
- Grade 5: (5 − 9)² ÷ 9 = 1.778
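Total χ² = 0.444 + 3.556 + 0.167 + 1.778 ≈ 5.94. With df = 3, the critical value at α = 0.05 is 7.815. Because 5.94 is below 7.815 (p ≈ 0.11), you fail to reject the null hypothesis: the observed grades do not differ significantly from the intended distribution at the 5% level. Grade 3 contributes most of the total, so that is where the results deviate most from expectations.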
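The whole example can be reproduced with the standard library alone. This is a sketch, not a reference implementation. The helper chi2_sf is a hypothetical name, and it computes the upper-tail probability for integer df from the closed forms Q(1/2, h) = erfc(√h) and Q(1, h) = e^(−h), extended with the recurrence Q(a + 1, h) = Q(a, h) + h^a e^(−h) / Γ(a + 1). The critical value 7.815 is the standard table entry for df = 3 at α = 0.05.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Base cases of the regularized upper incomplete gamma function Q(a, half).
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(half))
    else:
        a, q = 1.0, math.exp(-half)
    # Step a up by 1 until it reaches df / 2.
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


observed = [7, 26, 22, 5]   # grades 2, 3, 4, 5
expected = [9, 18, 24, 9]   # 15%, 30%, 40%, 15% of 60 students

contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2_value = sum(contributions)
df = len(observed) - 1
p_value = chi2_sf(chi2_value, df)
critical_value = 7.815      # table value for df = 3, alpha = 0.05

print("contributions:", [round(c, 3) for c in contributions])
print(f"chi-square = {chi2_value:.3f}, df = {df}, p = {p_value:.3f}")
print("reject null" if chi2_value > critical_value else "fail to reject null")
```

Comparing the statistic with the critical value and comparing the p-value with α are equivalent decisions. The p-value simply conveys how far the result is from the threshold.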