Understanding the F-Statistic
An F-statistic is a ratio of two variance estimates that follows the F-distribution when the null hypothesis is true. It is used to compare variances or to test the joint significance of multiple regression coefficients. Unlike the t-statistic, which examines a single parameter, the F-test evaluates whether an entire set of restrictions or exclusions significantly worsens model fit.
The F-statistic appears in two main contexts:
- Variance comparison: Testing whether two independent samples have equal population variances—a prerequisite for pooled t-tests.
- Regression analysis: Determining whether a full model (with more variables) provides significantly better fit than a restricted model (with fewer variables).
Because F-values cannot be negative (a squared quantity divided by a squared quantity), the distribution is bounded below by zero and right-skewed, with its shape determined by the degrees of freedom in the numerator and denominator.
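The right skew can be seen in a small simulation. The sketch below assumes the standard construction of an F variate as the ratio of two independent chi-square variables, each divided by its degrees of freedom. The function name `sample_f`, the seed, and the choice of 3 and 20 degrees of freedom are illustrative assumptions only.

```python
import random
import statistics


def sample_f(df_num, df_den, rng):
    """Draw one F-distributed value as a ratio of scaled chi-square draws."""
    # A chi-square variable with k degrees of freedom is Gamma(shape=k/2, scale=2).
    chi2_num = rng.gammavariate(df_num / 2, 2)
    chi2_den = rng.gammavariate(df_den / 2, 2)
    return (chi2_num / df_num) / (chi2_den / df_den)


rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_f(3, 20, rng) for _ in range(20_000)]

print("minimum:", min(draws))
print("median: ", statistics.median(draws))
print("mean:   ", statistics.mean(draws))
```

In a right-skewed distribution the mean sits above the median, which is what this run should show, and no draw can fall below zero.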
F-Statistic Formulas
Two distinct formulas apply depending on your context:
Basic variance-ratio test: Compare two sample variances directly.
F = S₁² ÷ S₂²
Where S₁² and S₂² are the sample variances of the two groups.
Regression model comparison: Test whether the restricted coefficients (excluded variables) contribute joint explanatory power.
F = [(SSRᵣₑₛₜ − SSRfᵤₗₗ) ÷ J] ÷ [SSRfᵤₗₗ ÷ (N − K)]
Where SSRfᵤₗₗ is the sum of squared residuals from the full model, SSRᵣₑₛₜ is the sum of squared residuals from the restricted model, J is the number of restrictions (excluded coefficients), N is the sample size, and K is the total number of coefficients in the full model (including the intercept).
- S₁²: Sample variance of the first group
- S₂²: Sample variance of the second group
- SSRfᵤₗₗ: Sum of squared residuals from the unrestricted regression model
- SSRᵣₑₛₜ: Sum of squared residuals from the restricted regression model
- J: Number of linear restrictions or excluded coefficients
- N: Total number of observations in the sample
- K: Total number of coefficients (parameters) in the full model
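Both formulas translate directly into code. The following is a minimal sketch rather than a reference implementation: the function names and every input value are hypothetical, chosen only to exercise the arithmetic.

```python
import statistics


def variance_ratio_f(sample_1, sample_2):
    """F = S1^2 / S2^2, using unbiased (n - 1) sample variances."""
    return statistics.variance(sample_1) / statistics.variance(sample_2)


def regression_f(ssr_rest, ssr_full, j, n, k):
    """F = [(SSR_rest - SSR_full) / J] / [SSR_full / (N - K)]."""
    numerator = (ssr_rest - ssr_full) / j
    denominator = ssr_full / (n - k)
    return numerator / denominator


# Hypothetical inputs, chosen only to exercise the formulas.
group_a = [4.0, 7.0, 9.0, 12.0]
group_b = [5.0, 6.0, 7.0, 8.0]
f_variance = variance_ratio_f(group_a, group_b)

f_regression = regression_f(ssr_rest=190.0, ssr_full=180.0, j=1, n=100, k=4)

print(f_variance, f_regression)
```

For the variance-ratio test, the degrees of freedom are the two sample sizes minus one each (numerator first); for the regression comparison they are J and N − K.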
F-Test in Regression Analysis
In regression, the F-test addresses a critical question: do the restrictions imposed on the model significantly worsen fit? This arises when testing whether a group of variables jointly influences the dependent variable.
Suppose you estimate a wage regression with education, experience, and gender. To test whether gender matters, you fit two models: one with all three predictors and one excluding gender. The F-statistic captures whether the increase in residual error from dropping gender is statistically significant.
A high F-value suggests the excluded variables do belong in the model. Whether it is high enough depends on the critical value of the F-distribution, which varies with the two degrees of freedom (J and N − K). Researchers reject the null hypothesis (that the restrictions are valid) when the calculated F exceeds the critical value at the chosen significance level (typically 5%).
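To make the decision rule concrete, here is a worked calculation for the wage example. All numbers are hypothetical and do not come from any real dataset: N = 100 observations, K = 4 coefficients in the full model (intercept, education, experience, gender), J = 1 restriction, a full-model SSR of 180, and a restricted-model SSR of 190.

```latex
F = \frac{(\mathrm{SSR}_{\text{rest}} - \mathrm{SSR}_{\text{full}})/J}
         {\mathrm{SSR}_{\text{full}}/(N - K)}
  = \frac{(190 - 180)/1}{180/(100 - 4)}
  = \frac{10}{1.875}
  \approx 5.33
```

With 1 and 96 degrees of freedom, the 5% critical value is about 3.94, so under these assumed numbers the null hypothesis that gender can be dropped would be rejected.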
F-Test vs. T-Test: Key Distinctions
Both F and t statistics are used in hypothesis testing, but they serve different purposes:
- Scope: The t-test examines a single regression coefficient or compares means of two groups. The F-test evaluates multiple coefficients jointly or entire model fit.
- Degrees of freedom: The t-distribution has a single degrees-of-freedom parameter and is symmetric around zero. The F-distribution has two (numerator and denominator) and is right-skewed.
- Relationship: When testing a single restriction, F equals t², so the F-test and the two-sided t-test are equivalent. For multiple restrictions tested jointly, only the F-test applies.
- Practical use: Use t-tests for individual variable significance; use F-tests for overall model adequacy or nested model comparison.
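The equivalence between t² and F for a single restriction can be checked numerically. The sketch below uses made-up data and the simplest possible case: a full model with an intercept and one predictor, against a restricted model with the intercept only. The closed-form least-squares expressions are standard, but the data and variable names are assumptions for illustration.

```python
import math

# Hypothetical data: y regressed on an intercept and a single predictor x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Full model: y = a + b*x (K = 2 coefficients).
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
ssr_full = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))

# Restricted model: slope forced to zero, so the fit is just the mean of y (J = 1).
ssr_rest = sum((yi - y_bar) ** 2 for yi in y)

k, j = 2, 1
f_stat = ((ssr_rest - ssr_full) / j) / (ssr_full / (n - k))

# t-statistic for the slope: estimate divided by its standard error.
slope_se = math.sqrt(ssr_full / (n - k) / s_xx)
t_stat = slope / slope_se

print(f"t = {t_stat:.4f}, t^2 = {t_stat ** 2:.4f}, F = {f_stat:.4f}")
```

The last two printed values should agree up to floating-point rounding, whatever data are supplied.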
Common Pitfalls When Interpreting F-Statistics
Avoid these frequent mistakes when calculating or interpreting F-values.
- Assuming the F-distribution is symmetric: The F-distribution is heavily right-skewed, especially when the degrees of freedom are small, and the rejection region for these tests lies in the upper tail. Always consult an F-table or statistical software rather than guessing critical regions based on a bell curve.
- Forgetting to square the t-statistic: When testing one restriction (J = 1), the relationship F = t² holds exactly. If your one-coefficient t-test yields t = 2.5, then F = 6.25. Comparing an unsquared t-value against an F critical value, or the reverse, leads to wrong inferences.
- Misidentifying degrees of freedom: In regression comparisons, df₁ = J (restrictions) and df₂ = N − K (sample size minus full model coefficients). Swapping these gives incorrect critical values and can flip the reject/fail-to-reject decision.
- Overlooking that F cannot be negative: F-statistics are never negative because variances and sums of squares cannot be negative. A negative "F-value" signals a calculation error. In regression, a restricted least-squares model can never fit better than the full model that nests it, so SSRᵣₑₛₜ should be at least as large as SSRfᵤₗₗ. If you find SSRᵣₑₛₜ < SSRfᵤₗₗ, the two values were probably swapped, the models are not truly nested, or they were estimated on different samples.
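The degrees-of-freedom and sign checks can be folded into a small helper. This is a minimal sketch with a hypothetical function name and hypothetical inputs. It assumes both models are fitted by ordinary least squares on the same observations, in which case the restricted SSR cannot be smaller than the full one.

```python
def checked_regression_f(ssr_rest, ssr_full, j, n, k):
    """Return (F, df1, df2) for a nested-model F-test, rejecting impossible inputs."""
    if j < 1 or n - k < 1:
        raise ValueError("need J >= 1 and N - K >= 1")
    if ssr_full <= 0:
        raise ValueError("SSR of the full model must be positive")
    if ssr_rest < ssr_full:
        # Least-squares fits of nested models cannot produce this ordering.
        raise ValueError(
            "SSR_rest < SSR_full: check that the models are nested "
            "and fitted on the same observations"
        )
    df1, df2 = j, n - k  # numerator df first, denominator df second
    f_stat = ((ssr_rest - ssr_full) / df1) / (ssr_full / df2)
    return f_stat, df1, df2


# Hypothetical inputs for illustration.
f_stat, df1, df2 = checked_regression_f(190.0, 180.0, j=1, n=100, k=4)
print(f_stat, df1, df2)
```

Returning the degrees of freedom alongside F keeps the numerator and denominator values in a fixed order, so they are harder to swap when looking up a critical value.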