Central Tendency: Mean, Median, and Mode

The three primary measures of central tendency describe where most values in a dataset cluster. Each serves a distinct purpose depending on your data's characteristics.

  • Mean is the arithmetic average: the sum of all values divided by how many values exist. It's sensitive to extreme values and works best for symmetric, normally distributed data.
  • Median is the middle value when data is sorted in order. Half the values fall below it and half above. The median resists the pull of outliers, making it ideal for skewed distributions.
  • Mode is the most frequently occurring value. A dataset can have one mode (unimodal), two modes (bimodal), three or more modes (multimodal), or no mode at all if every value appears equally often.

In practice, examining all three together provides a fuller picture. A salary dataset might have a mean pulled upward by high earners, a median closer to typical wages, and a mode reflecting the most common salary bracket.
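
As a rough illustration of the salary scenario, the sketch below computes all three measures with Python's standard statistics module. The salary figures are invented for this example (in thousands) and do not come from any real dataset.

```python
import statistics

# Hypothetical salaries in thousands; made up purely for illustration.
salaries = [30, 30, 30, 35, 40, 45, 50, 250]

mean_salary = statistics.mean(salaries)      # pulled upward by the 250 earner
median_salary = statistics.median(salaries)  # average of the two middle values
modes = statistics.multimode(salaries)       # list of the most frequent value(s)

print("mean:", mean_salary)
print("median:", median_salary)
print("mode(s):", modes)
```

Here the single high earner lifts the mean well above the median, while the mode sits at the most common salary.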

Calculating Mean, Median, and Range

The mean uses a straightforward summation formula. Once you have your dataset sorted, finding the median requires identifying the position of the middle value(s). Range and midrange are computed from the extreme values in your set.

Mean = (x₁ + x₂ + x₃ + ... + xₙ) ÷ n

Median position = (n + 1) ÷ 2

Range = Max − Min

Midrange = (Max + Min) ÷ 2

  • x₁, x₂, ..., xₙ — Individual values in your dataset
  • n — Total count of values
  • Max — Largest value in the dataset
  • Min — Smallest value in the dataset
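
The formulas above translate almost directly into code. The following is a minimal sketch, assuming a non-empty list of numbers; the function name summarize and the sample values are hypothetical and chosen only for illustration.

```python
def summarize(values):
    """Return mean, median position, range, and midrange for a non-empty list."""
    if not values:
        raise ValueError("values must not be empty")
    n = len(values)
    lo, hi = min(values), max(values)
    return {
        "mean": sum(values) / n,
        "median_position": (n + 1) / 2,  # 1-based position in the sorted data
        "range": hi - lo,
        "midrange": (hi + lo) / 2,
    }

# Hypothetical sample data.
stats = summarize([4, 8, 15, 16, 23])
for name, value in stats.items():
    print(name, value)
```

Note that median_position is a 1-based position in the sorted list, not the median itself.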

Finding the Median in Odd and Even Datasets

The median calculation differs slightly depending on whether you have an odd or even number of observations.

Odd-sized datasets: Sort the numbers from lowest to highest, then select the single middle value. For {3, 5, 7}, the median is 5.

Even-sized datasets: Sort the numbers, then calculate the average of the two centermost values. For {2, 4, 6, 8}, the median is (4 + 6) ÷ 2 = 5.

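This approach works because it leaves the same number of observations on each side of the median. For large datasets, use the formula position = (n + 1) ÷ 2 on the sorted list: a whole number points to the single middle value, while a result ending in .5 means you average the two values on either side of that position. With n = 4, for example, the position is 2.5, so you average the 2nd and 3rd values.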
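
The odd and even cases can be sketched in a few lines. This is one possible implementation, not the only one; the function name median is an assumption for this example, and Python's built-in statistics.median does the same job.

```python
def median(values):
    """Median of a non-empty list: middle value, or mean of the two middle values."""
    if not values:
        raise ValueError("median of empty data")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the two centermost values.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([3, 5, 7]))     # the odd-sized example from the text
print(median([2, 4, 6, 8]))  # the even-sized example from the text
```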

Understanding Mode and Multimodal Distributions

The mode is straightforward in principle—it's the value appearing most often—yet distributions can present different scenarios. Count the frequency of each value to identify patterns.

  • Unimodal: One value dominates. In {2, 3, 3, 5, 8}, the mode is 3 (appears twice).
  • Bimodal: Exactly two values share the highest frequency. In {1, 1, 2, 2, 7}, both 1 and 2 are modes.
  • Multimodal: Three or more values tie for highest frequency. In {1, 1, 2, 2, 3, 3, 4}, the values 1, 2, and 3 are all modes.
  • No mode: All values appear with equal frequency. In {4, 5, 6, 7}, no mode exists.
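
The frequency-counting step can be sketched as follows. The function name classify_modes is hypothetical, and the code assumes the convention used in this section: a dataset with more than one distinct value has no mode when every value appears equally often.

```python
from collections import Counter

def classify_modes(values):
    """Return (label, modes) for a non-empty list, using the convention above."""
    if not values:
        raise ValueError("values must not be empty")
    counts = Counter(values)
    top = max(counts.values())
    modes = sorted(v for v, c in counts.items() if c == top)
    # Every distinct value ties for the top frequency: treat as "no mode".
    if len(counts) > 1 and len(modes) == len(counts):
        return "no mode", []
    if len(modes) == 1:
        return "unimodal", modes
    if len(modes) == 2:
        return "bimodal", modes
    return "multimodal", modes

print(classify_modes([2, 3, 3, 5, 8]))
print(classify_modes([1, 1, 2, 2, 7]))
print(classify_modes([4, 5, 6, 7]))
```

Other conventions exist (some texts report every tied value as a mode), so adjust the "no mode" rule if your context defines it differently.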

Mode is particularly useful for categorical data and identifying the most popular category, making it valuable in market research and quality control.

Common Pitfalls When Using Central Tendency Measures

These practical considerations will help you choose and interpret the right statistic for your analysis.

  1. Mean and outliers — The mean can be dramatically skewed by a single extreme value. In {10, 12, 13, 100}, the mean jumps to 33.75, yet three of the four observations fall between 10 and 13. Check for outliers with a plot, or compare the mean against the median, to confirm that the mean represents your typical value.
  2. Median vs. mean for real-world data — Real-world datasets like income, house prices, or medical costs often contain a long tail of high values. The median better represents the 'typical' person or transaction in such cases. Publishing mean salary in a company with one CEO and 99 workers can mislead about typical wages.
  3. Mode limitations in continuous data — Mode works well for categorical or discrete data (favourite colour, number of goals scored) but becomes unhelpful for continuous measurements like height or weight, where values rarely repeat. Summarize continuous data using mean and median instead.
  4. Range as a spread measure — Range only considers the two extreme values and ignores everything in between. A dataset spanning 0 to 100 has range 100, whether values cluster at the ends or spread evenly. Pair range with standard deviation or interquartile range for a fuller picture of variability.
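
To make the first pitfall concrete, the sketch below compares the mean and median of the dataset from that item, then recomputes the mean without the largest value. Dropping the maximum is done here only to show how sensitive the mean is; it is not a recommendation to delete outliers from real data.

```python
import statistics

data = [10, 12, 13, 100]

mean_all = statistics.mean(data)      # affected by the value 100
median_all = statistics.median(data)  # average of 12 and 13

# Remove the largest value to see how far the mean moves.
largest = max(data)
without_largest = [x for x in data if x != largest]
mean_without_largest = statistics.mean(without_largest)

print("mean:", mean_all)
print("median:", median_all)
print("mean without largest value:", round(mean_without_largest, 2))
```

A large gap between the mean and the median is a quick signal that the data deserves a closer look before the mean is reported.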

Frequently Asked Questions

What is the difference between the mean and the median?

The mean is the arithmetic average of all values, computed by summing them and dividing by count. The median is the middle value when data is sorted. For symmetric distributions, they're nearly identical. For skewed data, they diverge: the mean gets pulled toward extreme values while the median remains stable. Example: in {10, 11, 12, 13, 1000}, the mean is 209.2, but the median is only 12. Use the median when outliers are present or data is skewed.

Can a dataset have more than one mode?

Yes. When two or more values share the highest frequency, the dataset has more than one mode. A bimodal distribution has exactly two modes; the term multimodal is usually reserved for three or more. For instance, in {2, 2, 5, 5, 8}, both 2 and 5 appear twice, making the dataset bimodal. If all values appear equally often, there is no mode. Datasets with several modes often signal that your data contains distinct subgroups worth investigating separately.

Why would I choose the median over the mean?

The median is superior when your dataset contains outliers or follows a skewed distribution. Because it depends on position rather than magnitude, extreme values don't distort it. Income data exemplifies this: median household income better reflects what a typical household earns than the mean, which billionaires push upward. The median is also more robust for non-normal data and smaller sample sizes where one unusual observation could mislead.

What does range tell you about your data?

Range is the difference between the maximum and minimum values, showing the spread or span of your data. A small range indicates values cluster tightly; a large range suggests wide dispersion. However, range only considers two data points and ignores the distribution in between. Two datasets with identical ranges can have very different distributions. Always pair range with other measures like standard deviation or interquartile range for a complete picture of variability.
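
The point about identical ranges hiding different distributions can be checked with a small sketch. Both datasets below are invented for illustration, and the population standard deviation (statistics.pstdev) is used as one possible companion measure.

```python
import statistics

# Two hypothetical datasets spanning 0 to 100.
clustered_at_ends = [0, 0, 0, 100, 100, 100]
evenly_spread = [0, 20, 40, 60, 80, 100]

def value_range(values):
    """Range = Max - Min."""
    return max(values) - min(values)

for name, values in [("clustered", clustered_at_ends), ("even", evenly_spread)]:
    spread = statistics.pstdev(values)  # population standard deviation
    print(name, "range:", value_range(values), "std dev:", round(spread, 2))
```

Both datasets report the same range, but the standard deviation is larger when values pile up at the extremes.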

How do I compute these statistics in Excel?

Excel provides built-in functions: use AVERAGE() for the mean, MEDIAN() for the median, and MODE.SNGL() for the mode. For a dataset in cells A1:A10, enter =AVERAGE(A1:A10) in a cell and press Enter. MODE.SNGL() returns a single mode; if several values tie, MODE.MULT() returns all of them. Excel has no dedicated function for range or midrange, so combine MAX() and MIN(): use =MAX(A1:A10)-MIN(A1:A10) for the range and =(MAX(A1:A10)+MIN(A1:A10))/2 for the midrange. These functions handle large datasets efficiently and reduce calculation errors compared to manual computation.

Which statistic is most affected by extreme values?

The mean is highly sensitive to outliers. A single unusually large or small value can substantially shift the mean away from where most data concentrates. In {5, 6, 7, 8, 100}, the mean is 25.2 even though four of the five values lie between 5 and 8. The median (7) better represents the typical observation. The mode offers no help here, since every value appears once and the dataset has no mode. This is why financial analysts often report median figures for income, house prices, and other variables prone to extreme values.
