What Is a Five-Number Summary?

A five-number summary is a set of descriptive statistics that splits a sorted dataset into quarters, showing how values spread across the entire range. Instead of collapsing everything into a mean (which can mislead), these five values (minimum, Q1, median, Q3, and maximum) show where the data sits and how widely it is spread.

Consider a hiring manager who tells candidates their company's average salary is £40,000 annually. That sounds reasonable until you discover one employee earns £15,000 while the CEO takes £200,000. The mean obscures what's really happening. A five-number summary would reveal this immediately: minimum £15,000, Q1 £28,000, median £38,000, Q3 £52,000, maximum £200,000. Suddenly the compensation structure becomes transparent.

This approach works for any quantitative dataset: exam scores, response times, product measurements, or survey ratings. It is often one of the first calculations analysts run when exploring unfamiliar data.

Calculating the Five-Number Summary

To find each component, arrange your data in ascending order, then identify the extremes and quartiles:

Minimum = smallest value in dataset

Maximum = largest value in dataset

Median (Q2) = middle value (or average of two middle values if n is even)

Q1 = median of lower half of data

Q3 = median of upper half of data

  • n — total number of data points
  • Q1 — first quartile; 25th percentile separating the lowest quarter of values
  • Q2 (Median) — second quartile; 50th percentile dividing the dataset in half
  • Q3 — third quartile; 75th percentile separating the highest quarter of values

Understanding the Box-and-Whisker Plot

The five-number summary translates perfectly into a box-and-whisker plot, a visual that makes distribution instantly recognisable. The plot consists of:

  • Whiskers (lines): extend from minimum to Q1 and from Q3 to maximum, showing the outer 50% of data
  • Box: spans from Q1 to Q3, containing the middle 50% (interquartile range, or IQR)
  • Line inside the box: marks the median, often visually distinct

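A box with the median near its centre and whiskers of similar length indicates roughly symmetric data. A median pushed towards one end of the box, or one whisker much longer than the other, signals skew. Many plotting tools cap the whiskers at 1.5 × IQR beyond the box and draw any values past them as isolated points; these may be outliers worth investigating separately. Compare multiple box plots side by side to see how different groups or conditions affect the distribution without getting lost in raw numbers.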

Step-by-Step Calculation Process

Working through a five-number summary by hand reinforces what the calculator automates:

  1. Sort your data in ascending order: this is non-negotiable for accurate results.
  2. Identify the minimum and maximum: the first and last values after sorting.
  3. Find the median: if you have an odd number of values, pick the middle one; if even, average the two middle values.
  4. Split the dataset: divide your data at the median into lower and upper halves (exclude the median itself if n is odd).
  5. Calculate Q1 and Q3: find the median of each half. Q1 marks where 25% of data falls below; Q3 marks where 75% falls below.
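The steps above can be sketched in Python as follows. This is a minimal illustration of the median-of-halves method described here, not the calculator's actual code; the function name `five_number_summary` and the sample values are hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves method."""
    data = sorted(values)                # Step 1: sort ascending
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data points")
    half = n // 2
    lower = data[:half]                  # Step 4: lower half
    # Step 4: upper half, skipping the median itself when n is odd
    upper = data[half + 1:] if n % 2 else data[half:]
    return (
        data[0],                         # Step 2: minimum
        median(lower),                   # Step 5: Q1
        median(data),                    # Step 3: median (Q2)
        median(upper),                   # Step 5: Q3
        data[-1],                        # Step 2: maximum
    )


# Hypothetical sample with an odd number of values
print(five_number_summary([7, 1, 3, 9, 5, 11, 13]))   # (1, 3, 7, 11, 13)
```

Sorting inside the function means callers cannot forget step 1, at the cost of copying the data.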

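For large datasets, software may interpolate between neighbouring values or round the results, so quartiles can differ slightly between tools. The principle remains: these five numbers show where your data starts and ends, where its centre lies, and how tightly the middle half clusters around it.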
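To illustrate the interpolation mentioned above: Python's standard `statistics.quantiles` function interpolates between neighbouring values, and its two methods can give quartiles that differ from each other and from the median-of-halves approach. The sample values below are hypothetical.

```python
from statistics import quantiles

data = [1, 3, 5, 7, 9, 11, 13]   # hypothetical sample, already sorted

# "exclusive" (the default) treats the data as a sample from a larger population
exclusive = quantiles(data, n=4, method="exclusive")

# "inclusive" treats the data as the whole population,
# so the minimum and maximum are the 0th and 100th percentiles
inclusive = quantiles(data, n=4, method="inclusive")

print(exclusive)   # [3.0, 7.0, 11.0]
print(inclusive)   # [4.0, 7.0, 10.0]
```

For this sample the exclusive method agrees with the median-of-halves result (Q1 = 3, Q3 = 11), while the inclusive method does not. When comparing summaries from different tools, check which quartile convention each one uses.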

Common Pitfalls and Practical Advice

Avoid these mistakes when interpreting or calculating five-number summaries.

  1. Mixing quartile conventions: the method described here excludes the median value itself when splitting an odd-sized dataset into halves for Q1 and Q3. Some textbooks and software include it instead, which shifts the quartiles slightly. Pick one convention, apply it consistently, and treat the two halves as separate datasets.
  2. Confusing the summary with outlier detection: the five-number summary shows you the range and quartiles, but doesn't automatically label outliers. Use the interquartile range (IQR = Q3 − Q1) and multiply by 1.5 to identify extreme values: anything below Q1 − 1.5×IQR or above Q3 + 1.5×IQR warrants closer inspection.
  3. Assuming symmetry means normal distribution: a symmetric five-number summary (equal gaps on either side of the median) suggests balanced spread, but doesn't guarantee a bell curve. Uniform, bimodal, and heavy-tailed distributions can all produce a symmetric summary. Always visualise the data and check distributional assumptions if your analysis depends on normality.
  4. Overlooking the importance of sorting: skipping the sort step invalidates every result. The first and last values are no longer the minimum and maximum, and the quartile positions no longer correspond to ranks. Even for a small dataset, always arrange the values first.
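As a rough sketch of the 1.5 × IQR rule from pitfall 2, the snippet below computes the two fences for the salary example given earlier in this article. The function name `outlier_fences` and its `k` parameter are hypothetical choices for illustration.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles and extremes from the salary example earlier in this article
minimum, q1, q3, maximum = 15_000, 28_000, 52_000, 200_000

lower, upper = outlier_fences(q1, q3)
print(lower, upper)        # -8000.0 88000.0
print(minimum < lower)     # False: the lowest salary is inside the fences
print(maximum > upper)     # True: the top salary is flagged for inspection
```

A flagged value is only a candidate for review; the rule does not say whether it is an error or a legitimate extreme.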

Frequently Asked Questions

What's the difference between the five-number summary and the mean and standard deviation?

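The mean averages all values into one number, which can hide extreme values or asymmetry. Standard deviation measures spread around the mean; it does not require normal data, but it is sensitive to outliers and is easiest to interpret when the distribution is roughly bell-shaped. The five-number summary makes no assumption about shape: it reports actual positions in the sorted data, and its median and quartiles are barely affected by a few extreme values (the minimum and maximum, by definition, are not robust). For skewed or unknown distributions, the five-number summary usually describes the data more faithfully than the mean and standard deviation alone.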

How do I calculate quartiles if my dataset has an even number of values?

When you have an even number of observations, the median lies between the two middle values, so average them to find Q2. The dataset then splits cleanly into two equal halves with nothing excluded. For Q1, take the median of the lower half; for Q3, take the median of the upper half. This places Q1 and Q3 at approximately the 25th and 75th percentiles, respectively.
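A short worked example for the even case, using a hypothetical sample of eight values:

```python
from statistics import median

data = sorted([16, 2, 10, 4, 14, 6, 12, 8])   # hypothetical sample, n = 8
half = len(data) // 2

q2 = median(data)           # average of the two middle values, 8 and 10
q1 = median(data[:half])    # median of the lower half [2, 4, 6, 8]
q3 = median(data[half:])    # median of the upper half [10, 12, 14, 16]

print(q1, q2, q3)           # 5.0 9.0 13.0
```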

Can the five-number summary detect outliers?

The summary itself doesn't label outliers automatically, but it provides the foundation for detection. Calculate the interquartile range (IQR) as Q3 minus Q1. Any value falling below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR is typically flagged as a potential outlier. Review these values in context: they may be data entry errors, legitimate extremes, or genuinely interesting cases.

Why is the five-number summary better than just looking at a histogram?

A histogram shows frequency distribution visually, which is excellent for seeing the overall shape, but the five-number summary gives you precise position markers. You can communicate the summary in a single sentence or table, compare multiple datasets instantly, and perform statistical tests based on quartiles. Use both together: the summary for precision and communication, the histogram for intuitive understanding of shape.

How do I interpret a skewed five-number summary?

If the gap between minimum and Q1 is much larger than the gap between Q3 and maximum, your data is left-skewed (tail points left). The reverse indicates right-skew. In a skewed dataset most values cluster on one side while a long tail stretches out on the other, pulling the mean towards the tail. In salary data, right-skew is common because salaries cluster at lower values but have no upper bound for executives. Check for skew before relying on the mean as a typical value.
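The gap comparison described above can be sketched as a small helper. This is a crude heuristic for illustration only: the function name `tail_direction` is hypothetical, and it treats any difference between the gaps as skew, whereas in practice you would want the difference to be substantial.

```python
def tail_direction(summary):
    """Compare the two outer gaps of a (min, Q1, median, Q3, max) tuple."""
    minimum, q1, _median, q3, maximum = summary
    lower_gap = q1 - minimum      # length of the left tail
    upper_gap = maximum - q3      # length of the right tail
    if upper_gap > lower_gap:
        return "right-skewed"
    if lower_gap > upper_gap:
        return "left-skewed"
    return "no clear skew"


# Salary summary from the example earlier in this article
salaries = (15_000, 28_000, 38_000, 52_000, 200_000)
print(tail_direction(salaries))   # right-skewed
```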

What happens if I have duplicate values in my dataset?

Duplicates are treated as separate data points during sorting and quartile calculation. If you have ten identical values, they occupy ten positions in the ordered list. This affects the quartile positions proportionally but doesn't break the calculation. The five-number summary remains valid and reflects the true nature of your data, including any clustering.
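As a small illustration with hypothetical values, ten identical entries in a dataset of twelve pull the median and both quartiles onto the same number:

```python
from statistics import median

data = sorted([5] * 10 + [8, 9])   # hypothetical: ten identical values plus two others
half = len(data) // 2              # n = 12, so the halves split cleanly

summary = (
    data[0],
    median(data[:half]),   # lower half is six 5s
    median(data),
    median(data[half:]),   # upper half is [5, 5, 5, 5, 8, 9]
    data[-1],
)
print(summary)   # (5, 5.0, 5.0, 5.0, 9)
```

The calculation still works; the collapsed quartiles simply reflect how heavily the data clusters at one value.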
