Understanding Hypothesis Testing
Hypothesis testing frames research questions as statistical decisions. You begin with a null hypothesis (H₀), which asserts no effect or no difference—the status quo. The alternative hypothesis (H₁) proposes the opposite: that a meaningful effect or difference exists.
The process relies on sample data to compute a test statistic, which is then compared against a critical threshold. This threshold depends on your chosen significance level (α)—typically 0.05—which represents the probability of rejecting H₀ when it is actually true (a Type I error). If your test statistic falls in the rejection region, you reject H₀ and conclude there is sufficient evidence for H₁. Otherwise, you fail to reject H₀.
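The decision rule described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a two-tailed test whose statistic follows the standard normal distribution under H₀. The function name `decide` is hypothetical, and the example statistics are made up.

```python
from statistics import NormalDist


def decide(test_statistic: float, alpha: float = 0.05) -> str:
    """Two-tailed decision, assuming the statistic is standard normal under H0."""
    # Critical value that leaves alpha / 2 in each tail of the standard normal.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    if abs(test_statistic) > critical:
        return "reject H0"
    return "fail to reject H0"


# Illustrative (made-up) test statistics.
print(decide(2.3))  # falls beyond the critical value of about 1.96
print(decide(1.2))  # stays inside the non-rejection region
```

Lowering `alpha` pushes the critical value further out, so the same statistic becomes harder to reject with.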
The choice of test depends on your data: use a z-test when the population standard deviation is known or the sample is large (n ≥ 30), a t-test for smaller samples with an unknown population standard deviation, and a chi-square test for associations between categorical variables.
One-Tailed vs. Two-Tailed Tests
Hypothesis tests differ in directionality. A two-tailed test checks whether a parameter differs in either direction from the hypothesized value, splitting your significance level equally between both tails of the distribution. Because each tail receives only α/2, a two-tailed test needs a more extreme test statistic to reject H₀ than a one-tailed test at the same α, which makes it the more conservative choice.
A one-tailed test focuses on a single direction. A right-tailed test asks whether the parameter is greater than the hypothesized value, placing the entire rejection region on the right. A left-tailed test asks whether it is less than the hypothesized value, placing the rejection region on the left. One-tailed tests have more power to detect an effect in the predicted direction because their critical value is less extreme, but they are appropriate only when your research hypothesis makes a genuine directional prediction. An effect in the opposite direction can never lead to rejecting H₀.
Choosing the wrong tail can invalidate your conclusions, so decide before analyzing your data based on your research question, not your results.
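To see how the tail choice changes the outcome, the sketch below computes p-values for the same statistic under each kind of test. It assumes a statistic that is standard normal under H₀. The function name `p_value` and the example value z = 1.8 are illustrative, not taken from any dataset.

```python
from statistics import NormalDist


def p_value(z: float, tail: str = "two") -> float:
    """p-value for a z statistic; tail is 'left', 'right', or 'two'."""
    std_normal = NormalDist()
    if tail == "right":
        # Probability of a statistic at least this large.
        return 1 - std_normal.cdf(z)
    if tail == "left":
        # Probability of a statistic at least this small.
        return std_normal.cdf(z)
    if tail == "two":
        # Both tails: double the area beyond |z|.
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


z = 1.8  # illustrative value
for tail in ("right", "two", "left"):
    print(tail, round(p_value(z, tail), 4))
```

With α = 0.05, this statistic is significant in a right-tailed test but not in a two-tailed one, which is exactly why the direction has to be fixed before looking at the data.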
Test Statistic Formulas
The formula you use depends on your sample size and whether you know the population standard deviation.
Z-Test Statistic: Use when n ≥ 30 or the population standard deviation is known.
z = (x̄ − μ₀) ÷ (σ ÷ √n)
- x̄: Sample mean
- μ₀: Hypothesized population mean
- σ: Population standard deviation (or sample standard deviation for large n)
- n: Sample size
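As a quick arithmetic check, the z formula translates directly into Python. The function name `z_statistic` and the input values are hypothetical, chosen so the result is easy to verify by hand.

```python
from math import sqrt


def z_statistic(sample_mean: float, mu0: float, sigma: float, n: int) -> float:
    """z = (sample mean - hypothesized mean) / (sigma / sqrt(n))."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu0) / standard_error


# Made-up inputs: standard error = 6 / sqrt(36) = 1, so z = (52 - 50) / 1.
z = z_statistic(sample_mean=52.0, mu0=50.0, sigma=6.0, n=36)
print(z)  # 2.0
```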
T-Test Statistic for Small Samples
When your sample size is under 30 and the population standard deviation is unknown, use the t-test. The t-distribution has heavier tails than the normal distribution, accounting for extra uncertainty in small samples.
t = (x̄ − μ₀) ÷ (s ÷ √n)
- x̄: Sample mean
- μ₀: Hypothesized population mean
- s: Sample standard deviation
- n: Sample size
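The same calculation can be sketched for the t statistic, this time starting from raw observations. The sample values below are made up for illustration, and `t_statistic` is a hypothetical helper. The critical value 2.262 is the standard t-table entry for 9 degrees of freedom in a two-tailed test at α = 0.05, hard-coded here because Python's standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample: list[float], mu0: float) -> tuple[float, int]:
    """Return the one-sample t statistic and its degrees of freedom (n - 1)."""
    n = len(sample)
    s = stdev(sample)  # sample standard deviation, n - 1 in the denominator
    t = (mean(sample) - mu0) / (s / sqrt(n))
    return t, n - 1


# Made-up sample of 10 measurements, tested against a hypothesized mean of 5.0.
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.2, 5.5, 5.7, 5.3, 5.9]
t, df = t_statistic(sample, mu0=5.0)

T_CRITICAL = 2.262  # t-table value: df = 9, two-tailed, alpha = 0.05
print(f"t = {t:.2f}, df = {df}")
print("reject H0" if abs(t) > T_CRITICAL else "fail to reject H0")
```

For other sample sizes or significance levels, the critical value has to be looked up again, since it depends on the degrees of freedom.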
Common Pitfalls in Hypothesis Testing
Avoid these mistakes when designing and interpreting your tests.
- Confusing p-value with probability of H₀ — A p-value is not the probability that H₀ is true. It is the probability of observing data at least as extreme as yours if H₀ were true. A small p-value means your data is unlikely under H₀, not that H₀ is unlikely to be true.
- Stopping your study early if results look good — Repeatedly checking results and stopping when you see significance inflates your Type I error rate. Decide your sample size and stopping rule before collecting data.
- Choosing your tail direction after seeing results — Selecting a one-tailed test because your sample mean is in that direction amounts to p-hacking. Define your hypothesis direction in advance based on theory, not data.
- Using the wrong test for your data type — T-tests assume roughly normal data; chi-square tests require categorical variables with adequate expected frequencies. Applying the wrong test produces invalid conclusions.
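The early-stopping pitfall above is easy to demonstrate by simulation. The sketch below generates data for which H₀ is true (mean 0, known σ = 1) and compares a single z-test at the planned sample size with a procedure that checks every 10 observations and stops at the first significant result. All names and settings (`false_positive_rate`, 2000 trials, 100 observations, checks every 10) are illustrative assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def false_positive_rate(peek: bool, trials: int = 2000, max_n: int = 100,
                        step: int = 10, alpha: float = 0.05, seed: int = 0) -> float:
    """Share of simulated studies that reject a true H0 (mean 0, sigma 1)."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        data = [rng.gauss(0.0, 1.0) for _ in range(max_n)]  # H0 is true here
        # Peeking tests at n = step, 2 * step, ..., max_n; otherwise only at max_n.
        checkpoints = range(step, max_n + 1, step) if peek else [max_n]
        for n in checkpoints:
            z = mean(data[:n]) / (1.0 / sqrt(n))  # z statistic with mu0 = 0, sigma = 1
            if abs(z) > critical:
                rejections += 1
                break  # stop the study at the first significant result
    return rejections / trials


print("fixed sample size:", false_positive_rate(peek=False))
print("peeking every 10: ", false_positive_rate(peek=True))
```

Under these assumptions the fixed-sample rate should land near the nominal 0.05, while the peeking rate comes out several times higher, even though no real effect exists in any simulated study.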