Understanding Polynomial Regression

Polynomial regression is a form of statistical modelling that describes the relationship between a dependent variable and one or more independent variables using a polynomial function. Unlike simple linear regression, which assumes a straight-line relationship, polynomial regression accommodates curved, non-linear patterns in data.

The core idea rests on the assumption that your data follows a polynomial equation. For a single independent variable, the model is a weighted sum of powers of that variable, with each power contributing to the overall fit. This flexibility makes polynomial models invaluable across disciplines: engineers use them to model stress-strain curves in materials testing, economists apply them to track diminishing returns in production functions, and environmental scientists employ them to analyse pollutant concentration gradients.

The degree of your polynomial determines its complexity. A degree-1 polynomial is simply a straight line. Degree-2 produces a parabola. Degree-3 creates an S-shaped curve. Higher degrees permit increasingly complex oscillations, though with diminishing practical utility and increased risk of overfitting to noise rather than capturing true underlying patterns.

The Polynomial Regression Equation

A polynomial regression model of degree n is defined by the equation below, where y is the dependent variable, x is the independent variable, and a₀, a₁, ..., aₙ are the coefficients determined from your data:

y = a₀ + a₁x + a₂x² + a₃x³ + ... + aₙxⁿ

  • y — The dependent variable (predicted value)
  • x — The independent variable (input value)
  • a₀, a₁, ..., aₙ — Regression coefficients computed from your dataset
  • n — The degree of the polynomial (1 for linear, 2 for quadratic, 3 for cubic, etc.)
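To make the equation concrete, here is a minimal sketch of evaluating a fitted polynomial at a given x. The function name `poly_eval` and the example coefficients are illustrative assumptions, not part of any particular calculator. It uses Horner's method, which rewrites the sum as nested multiplications so that no explicit powers are needed.

```python
def poly_eval(coeffs, x):
    """Evaluate a0 + a1*x + ... + an*x**n with Horner's method.

    coeffs is ordered [a0, a1, ..., an], matching the equation above.
    """
    result = 0.0
    for a in reversed(coeffs):
        # Fold in one coefficient per step, highest power first.
        result = result * x + a
    return result


# Hypothetical quadratic: y = 1 + 2x + 3x^2, evaluated at x = 2
print(poly_eval([1.0, 2.0, 3.0], 2.0))  # 1 + 4 + 12 = 17.0
```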

The Least-Squares Method

Finding the best polynomial fit requires determining which coefficients minimise the overall prediction error. The least-squares method achieves this by finding coefficients that minimise the sum of squared residuals—the vertical distances between each observed data point and the polynomial curve.

Mathematically, for N data points, the method finds coefficients that minimise:

Σ(yᵢ − (a₀ + a₁xᵢ + a₂xᵢ² + ... + aₙxᵢⁿ))²

This leads to a system of n+1 linear equations in the n+1 unknown coefficients (the normal equations) that can be solved simultaneously. Provided the data contain at least n+1 distinct x values, the result is a unique set of coefficients that provides the optimal polynomial fit according to the least-squares criterion. Modern calculators solve these systems numerically, but the underlying principle remains: minimise the squared errors to obtain the best-fitting curve.
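The procedure can be sketched in plain Python as follows. This is an illustration under stated assumptions, not a description of how any specific calculator works: the function name `fit_polynomial` is hypothetical, the data are made up, and the system is solved by Gaussian elimination with partial pivoting. For high degrees the normal equations become numerically ill-conditioned, so production code usually prefers a library routine based on QR or singular value decomposition.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [a0, ..., an] via the normal equations."""
    m = degree + 1
    # Normal equations: sum_k a_k * S[j + k] = T[j] for j = 0..degree,
    # where S[p] = sum(x**p) and T[j] = sum(y * x**j).
    S = [sum(x ** p for x in xs) for p in range(2 * degree + 1)]
    T = [sum(y * x ** j for x, y in zip(xs, ys)) for j in range(m)]
    # Augmented matrix [A | b] with A[j][k] = S[j + k].
    aug = [[S[j + k] for k in range(m)] + [T[j]] for j in range(m)]

    # Forward elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, m):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, m + 1):
                aug[row][k] -= factor * aug[col][k]

    # Back substitution.
    coeffs = [0.0] * m
    for row in range(m - 1, -1, -1):
        tail = sum(aug[row][k] * coeffs[k] for k in range(row + 1, m))
        coeffs[row] = (aug[row][m] - tail) / aug[row][row]
    return coeffs


# Made-up, noise-free data generated from y = 1 + 2x + 3x^2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x + 3.0 * x ** 2 for x in xs]
print([round(a, 6) for a in fit_polynomial(xs, ys, 2)])
```

Because the example data lie exactly on a quadratic, the fitted coefficients should come back as approximately 1, 2 and 3.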

Linear vs. Polynomial Regression: Clarifying the Terminology

A common source of confusion: why is polynomial regression called "linear" regression when it clearly models curves?

The answer lies in mathematical terminology. Polynomial regression is linear in its coefficients—the equation is a linear combination of the unknown parameters a₀, a₁, ..., aₙ. However, because the equation contains powers of x, the relationship between the input variable and output is non-linear. You can fit parabolas, cubic functions, and complex curves, all while using the mathematical framework of linear regression.

This distinction matters because it allows statisticians to use standard linear algebra techniques, such as matrix factorisations (QR or singular value decomposition), to solve polynomial problems efficiently, despite the non-linear appearance of the final fitted curve.
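Written in matrix form, the linearity in the coefficients becomes explicit. In the sketch below, the design matrix X, the error vector ε and the estimate â are notation introduced here for illustration; x₁, ..., x_N and y₁, ..., y_N are the N data points and n is the degree, as in the sections above.

```latex
\mathbf{y} = X\mathbf{a} + \boldsymbol{\varepsilon},
\qquad
X = \begin{pmatrix}
1 & x_1 & x_1^2 & \cdots & x_1^n \\
1 & x_2 & x_2^2 & \cdots & x_2^n \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_N & x_N^2 & \cdots & x_N^n
\end{pmatrix},
\qquad
X^{\mathsf{T}} X \, \hat{\mathbf{a}} = X^{\mathsf{T}} \mathbf{y}
```

The powers of x appear only inside X, which is fixed once the data are known. The unknown vector a enters linearly, which is why the same machinery used for a straight-line fit applies unchanged.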

Key Considerations for Successful Polynomial Fitting

Avoid these common pitfalls when applying polynomial regression to your data:

  1. Overfitting with high-degree polynomials — Using a polynomial whose degree is at least one less than your number of data points will produce a perfect fit that passes through every point, but it will likely perform poorly on new data. A degree-4 polynomial fitted to exactly 5 points has no freedom to smooth noise or measurement error. Always validate your model on hold-out data.
  2. Insufficient data points — For a degree-n polynomial, you need at least n+1 data points with distinct x values to solve the system of equations. With exactly n+1 points, the fit is mathematically perfect but unvalidated. As a rule of thumb, aim for significantly more points, ideally 10–20 times the degree, to obtain a robust, generalisable model.
  3. Extrapolation beyond your data range — Polynomials can behave wildly outside the range of your input data, particularly high-degree ones. A cubic that fits temperatures across a calendar year will produce nonsensical predictions for years before or after your observation period. Restrict predictions to the domain of your original measurements.
  4. Ignoring residual patterns — After fitting, examine a plot of residuals (observed minus predicted values) against the independent variable. If residuals show a systematic pattern, your chosen polynomial degree may be inappropriate, or the relationship may be governed by omitted variables. A well-fitted model produces randomly scattered residuals.
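The residual check in the last point can be illustrated with a deliberately under-fitted model. The sketch below fits a straight line to made-up data that is truly quadratic and prints the residuals; the helper name `line_fit` is an assumption for this example, and it uses the standard closed-form formulas for a least-squares line.

```python
def line_fit(xs, ys):
    """Closed-form least-squares line: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 for x in xs]  # made-up data that is truly quadratic
intercept, slope = line_fit(xs, ys)

# Residual = observed minus predicted, as defined in the list above.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]
```

The residuals are positive at both ends and negative in the middle. That U-shaped pattern is the signature of a degree that is too low; a quadratic fit to the same data would leave residuals of zero.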

Frequently Asked Questions

What is polynomial regression used for?

Polynomial regression models curved, non-linear relationships between variables when linear fitting proves inadequate. Common applications include modelling enzyme kinetics in biochemistry (where reaction rates saturate), fitting growth curves in ecology, analysing cost functions in economics that exhibit economies of scale, and predicting disease prevalence with age. Any domain where the relationship between cause and effect follows a smooth curve rather than a straight line benefits from this approach.

How do I know what degree polynomial to use?

Start conservatively: try degree 2 (quadratic) first. Visually inspect whether the fitted curve captures the overall trend without erratic oscillations. Examine the coefficient of determination (R²) and adjusted R² values. Plain R² never decreases when you raise the degree, so rely on adjusted R², which penalises extra terms; once it stops improving or starts to fall, further terms are not earning their place. Plot residuals; they should scatter randomly without systematic patterns. Finally, use cross-validation: fit on a subset of data and test predictions on held-out observations. The degree that minimises test error is your target.
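As a sketch of the R² comparison, the two helpers below compute R² from predictions and then apply the usual adjustment for the number of fitted terms. The function names and the R² values in the example are made up for illustration; the adjustment formula assumes the model has `degree` coefficients in addition to the intercept.

```python
def r_squared(ys, predictions):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_points, degree):
    """Adjust R^2 for the `degree` fitted terms beyond the intercept."""
    return 1.0 - (1.0 - r2) * (n_points - 1) / (n_points - degree - 1)


# Made-up R^2 values for two fits to the same 12 points.
print(round(adjusted_r_squared(0.950, 12, 2), 3))  # 0.939
print(round(adjusted_r_squared(0.952, 12, 5), 3))  # 0.912
```

In this hypothetical case, moving from degree 2 to degree 5 raises R² slightly but lowers adjusted R², which suggests the extra terms are fitting noise.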

Why is polynomial regression called 'linear'?

The term refers to linearity in the regression coefficients, not in the input-output relationship. The polynomial equation y = a₀ + a₁x + a₂x² + ... + aₙxⁿ is a linear combination of the unknown coefficients (a₀, a₁, etc.), even though powers of x appear. This mathematical linearity allows statisticians to use linear algebra techniques (matrix methods, least-squares optimisation) to solve the problem, despite the final fitted curve being visibly curved.

What happens if I have fewer data points than the polynomial degree?

Mathematically, the system becomes underdetermined: more unknowns than equations. You cannot obtain a unique solution. With fewer than n+1 points, there are infinitely many polynomials of degree n that fit the data perfectly. Even if a calculator forces a solution, it will overfit dramatically, capturing noise rather than structure, and predictions on new data will be unreliable.
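This condition is easy to check before fitting. The sketch below uses a hypothetical helper, `has_unique_fit`, and relies on the fact that what matters is the number of distinct x values: repeating a measurement at the same x adds no new equation for the shape of the curve.

```python
def has_unique_fit(xs, degree):
    """True when a least-squares fit of this degree is uniquely determined.

    Uniqueness requires at least degree + 1 distinct x values. Values are
    compared with exact equality, so nearly equal floats count as distinct.
    """
    return len(set(xs)) >= degree + 1


print(has_unique_fit([0.0, 1.0, 2.0], 3))        # False: 3 points, 4 unknowns
print(has_unique_fit([0.0, 1.0, 2.0, 3.0], 3))   # True
print(has_unique_fit([1.0, 1.0, 2.0, 2.0], 2))   # False: only 2 distinct x
```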

Can polynomial regression fail?

Yes, but rarely with well-spread real-world data. The system of normal equations fails to have a unique solution when the data contain fewer than n+1 distinct x values, for example when many measurements are taken at only a few repeated x positions. Repeated x values paired with different y values are otherwise harmless: the fitted curve simply passes between them. A more practical risk is numerical. With high degrees or x values of large magnitude, the normal equations become ill-conditioned and rounding error can distort the coefficients; centring and scaling x helps. In practical applications with a reasonable data distribution and a modest degree, polynomial regression fits successfully.

How do I interpret the polynomial equation once I've fitted it?

Each coefficient tells you the contribution of that term. The constant term (a₀) is the predicted y-value when x = 0. The first-order coefficient (a₁) is the slope of the curve at x = 0. Higher-order terms capture curvature and oscillation. However, interpretation becomes difficult with high-degree polynomials, particularly when coefficients are large and opposite in sign, a hallmark of overfitting. Always examine the fitted curve visually alongside the coefficients to build intuition.
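One way to build that intuition is to evaluate the fitted curve and its slope at a few points. The sketch below uses hypothetical helper names and made-up coefficients; it differentiates the polynomial term by term and shows that the value and slope at x = 0 are a₀ and a₁.

```python
def poly_eval(coeffs, x):
    """Evaluate a0 + a1*x + ... + an*x**n (coeffs ordered [a0, ..., an])."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result


def poly_derivative(coeffs):
    """Coefficients of dy/dx: the term a_k * x**k becomes k * a_k * x**(k-1)."""
    return [k * a for k, a in enumerate(coeffs)][1:]


coeffs = [1.0, 2.0, 3.0]              # made-up fit: y = 1 + 2x + 3x^2
slope = poly_derivative(coeffs)       # [2.0, 6.0], i.e. dy/dx = 2 + 6x

print(poly_eval(coeffs, 0.0))  # 1.0, equal to a0
print(poly_eval(slope, 0.0))   # 2.0, equal to a1
print(poly_eval(slope, 1.0))   # 8.0, the slope has changed by x = 1
```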
