
Pearson Correlation Calculator

Pearson's correlation coefficient quantifies how strongly two variables move together in a linear fashion. Researchers, data analysts, and statisticians use this metric to detect relationships in bivariate datasets—from medical studies examining drug efficacy against symptom improvement, to economics tracking inflation against unemployment. A coefficient near +1 or −1 indicates a tight linear bond; values near 0 suggest little to no linear connection. Input paired data points and get both the correlation value and a verbal strength assessment based on established statistical thresholds.

Last updated: April 27, 2026

Creators Anna Szczepanek, PhD Statistics

Reviewers Wojciech Sas and Davide Borchia


Understanding Pearson's Correlation Coefficient

Pearson's correlation coefficient, denoted r, measures the strength and direction of the linear relationship between two continuous variables. In a perfectly linear pairing, increasing one variable by a fixed amount always changes the other by the same amount, whether the increase is from 1 to 2 or from 100 to 101. Classical examples include the link between study hours and exam scores, or ambient temperature and ice cream sales.

Positive correlation: Both variables climb or fall together.
Negative correlation: One rises while the other descends.
No correlation: The variables show no linear trend, although they may still be related in a non-linear way.

The coefficient ranges from −1 to +1. Magnitudes closer to the extremes signal stronger linear relationships, while values near zero indicate weak or absent linear patterns. If r = 1 or −1, every observation sits precisely on the fitted regression line; at r = 0, no linear trend exists.

Pearson Correlation Formula

Pearson's r is formally the covariance between two variables divided by the product of their standard deviations. This captures both how variables co-vary and their respective spreads:

r = Σ(xᵢ − x̄)(yᵢ − ȳ) / ( √[Σ(xᵢ − x̄)²] × √[Σ(yᵢ − ȳ)²] )

xᵢ, yᵢ — Individual paired data points
x̄, ȳ — Mean (average) of x and y values respectively
Σ — Sum across all n observations
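The formula above can be sketched in a few lines of Python. This is a minimal illustration rather than the calculator's actual implementation; the function name pearson_r and the sample data (hours, scores) are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r for two equal-length sequences of numbers."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviations of each point from its mean: (x_i - x̄) and (y_i - ȳ).
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sxy = sum(a * b for a, b in zip(dx, dy))  # numerator
    sxx = sum(a * a for a in dx)              # Σ(x_i - x̄)²
    syy = sum(b * b for b in dy)              # Σ(y_i - ȳ)²
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / (math.sqrt(sxx) * math.sqrt(syy))

# Hypothetical data: study hours and exam scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
print(round(pearson_r(hours, scores), 4))
```

Note that the whole product of square roots sits in the denominator, which is why it is wrapped in parentheses in the return statement.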

Interpreting Your Result

The sign and magnitude of r work together to reveal the relationship's character:

r from 0.80 to 1.00: Very strong positive linear relationship.
r from 0.60 to 0.79: Strong positive linear relationship.
r from 0.40 to 0.59: Moderate positive linear relationship.
r from 0.20 to 0.39: Weak positive linear relationship.
r from 0.00 to 0.19: Very weak or negligible linear relationship.
Negative values: Apply the same thresholds to |r|; the negative sign indicates that the variables move in opposite directions.

These benchmarks follow Evans' convention (1996), though field-specific standards may vary. Always consider your domain context; a correlation of 0.5 might be exceptional in psychology yet routine in engineering.
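As a rough sketch, the thresholds above can be turned into a small lookup. The function name describe_strength is hypothetical, and treating each lower bound as inclusive (so that exactly 0.6 counts as "strong") is an assumption of this example, not something the convention itself settles.

```python
def describe_strength(r):
    """Map a correlation coefficient to a verbal label (assumed cut-offs)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)
    # Assumption: lower bounds are inclusive.
    if size >= 0.8:
        label = "very strong"
    elif size >= 0.6:
        label = "strong"
    elif size >= 0.4:
        label = "moderate"
    elif size >= 0.2:
        label = "weak"
    else:
        label = "very weak"
    if r == 0:
        return "no linear relationship"
    direction = "positive" if r > 0 else "negative"
    return f"{label} {direction}"

print(describe_strength(0.7))
print(describe_strength(-0.85))
```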

Pearson Correlation and Linear Regression

Pearson's r connects directly to the coefficient of determination, denoted R², in simple linear regression. Squaring r yields R², representing the fraction of variance in one variable explained by the other. For example, if r = 0.7, then R² = 0.49, meaning 49% of the target variable's variation is accounted for by the predictor.

The regression slope also incorporates Pearson's coefficient: when y is regressed on x, the slope a equals r multiplied by the ratio of the standard deviations (s_y / s_x). The slope therefore always has the same sign as r, and for fixed spreads it grows in proportion to r.
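Both identities can be checked numerically. The sketch below uses made-up data and only the standard library; it computes the least-squares line directly and compares it with the values obtained from r.

```python
import statistics

# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x, mean_y = statistics.mean(x), statistics.mean(y)
s_x, s_y = statistics.stdev(x), statistics.stdev(y)  # sample standard deviations

# Sample covariance, then r = cov / (s_x * s_y).
cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)
r = cov_xy / (s_x * s_y)

# Slope two ways: from r, and from the least-squares formula cov / var(x).
slope_from_r = r * s_y / s_x
slope_ls = cov_xy / s_x ** 2
intercept = mean_y - slope_ls * mean_x

# R² from the fitted line: 1 - SS_res / SS_tot.
ss_res = sum((b - (intercept + slope_ls * a)) ** 2 for a, b in zip(x, y))
ss_tot = sum((b - mean_y) ** 2 for b in y)
r_squared = 1 - ss_res / ss_tot

print(slope_from_r, slope_ls)
print(r ** 2, r_squared)
```

Each printed pair should agree up to floating-point rounding.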

Common Pitfalls and Key Caveats

Misinterpreting correlation is among the most frequent statistical errors; here are critical safeguards.

Correlation Does Not Imply Causation: A strong correlation between sunglasses sales and drowning rates does not mean eyewear causes drowning. More plausibly, a hidden third variable (hot weather) drives both. Always investigate plausible causal mechanisms rather than inferring causation, or its direction, from correlation alone.
Outliers Distort Results Significantly: A single extreme data point can shift r substantially, especially in small samples. Plot your data before trusting the coefficient. If you suspect outliers, consider reporting both the standard Pearson correlation and a more robust alternative such as Spearman's rank correlation.
Non-Linear Relationships Can Go Undetected: Two variables may have a strong curved relationship, such as a symmetric parabola, yet show r near zero. Pearson's coefficient only captures linear patterns. If your scatter plot reveals curvature or clusters, explore polynomial regression or non-parametric methods.
Sample Size Matters for Reliability: With small samples (a common rule of thumb is fewer than 30 paired observations), the estimate of r is imprecise, and a seemingly strong correlation can arise by chance. Larger datasets provide more stable estimates and greater statistical power for hypothesis testing.

Frequently Asked Questions

What does a Pearson correlation of 0.5 actually mean?

An r of 0.5 indicates a moderate positive linear relationship. In practical terms, roughly 25% of the variance in one variable is explained by the other (since 0.5² = 0.25). The two variables tend to increase together, but the relationship is not tight—substantial scatter remains around a fitted line. Field context determines whether 0.5 is considered acceptable; researchers in social sciences often work with correlations in this range, whereas precision engineering may require tighter associations.

How many data points do I need to calculate a meaningful Pearson correlation?

Technically, Pearson's r can be computed from just two paired observations, but the result is then always +1 or −1 (or undefined if either variable is constant), so it carries no information. A common rule of thumb is to collect at least 30 observations for stable estimates and valid inference. Below 10 points, the correlation becomes sensitive to individual outliers and prone to spurious results. If your dataset is smaller, acknowledge this limitation when reporting findings and check whether the pattern holds when new data arrives.

Can Pearson correlation detect non-linear relationships?

No. Pearson's r is specifically designed to measure linear associations. If two variables follow a curved pattern, r can understate the strength of the relationship, and for a symmetric pattern such as a parabola it may be close to zero. Always visualize your data with a scatter plot. If you observe curvature, polynomial regression may be more appropriate; for curved but monotonic patterns such as exponential growth, a rank-based measure like Spearman's correlation captures the dependency better.

Why is my correlation coefficient negative when I expect a positive relationship?

A negative r means the variables move in opposite directions: as one increases, the other decreases on average. This can occur if you've inadvertently reversed the scale of one variable (e.g., coding high satisfaction as 1 and low as 5 while other measures increase with value). Double-check your data entry and variable coding. Alternatively, the relationship may genuinely be inverse: for instance, over a fixed distance, travel speed and travel time correlate negatively.

What is the difference between Pearson and Spearman correlation?

Pearson's r measures linear relationships between continuous variables and is sensitive to outliers and extreme values. Spearman's correlation ranks the data first, then applies the Pearson formula to the ranks, so it measures monotonic rather than strictly linear association and is non-parametric and more robust to outliers. Use Spearman when your data is ordinal (ranks), heavily skewed, or contains influential outliers. Pearson is preferred for normally distributed, interval-level data without extreme values.

Does a correlation of exactly 0 mean the two variables are completely unrelated?

Not necessarily. A Pearson correlation of zero indicates no linear relationship. The variables could still be strongly dependent in a non-linear way: for example, a symmetric U-shaped pattern would yield r ≈ 0 despite a clear association. Additionally, zero correlation in a sample does not rule out a correlation in the broader population, especially with small sample sizes. Inspect the scatter plot and consider alternative statistical techniques if you suspect hidden dependency.

