A boxplot turns a column of numbers into a single picture that shows center, spread, skew, and outliers at a glance. The picture is built from the five-number summary: minimum, first quartile, median, third quartile, maximum. This walkthrough takes one small data set and produces the summary, the outlier check, and the boxplot — and then reads the plot back into a description of the distribution.

The Five-Number Summary

The five-number summary is exactly what its name says: five numbers that together describe a data set's distribution.

  • Minimum: the smallest value.
  • Q1 (first quartile): the median of the lower half of the data. About 25% of the values fall at or below Q1.
  • Median (Q2): the middle value when the data are sorted. About 50% fall at or below the median.
  • Q3 (third quartile): the median of the upper half. About 75% fall at or below Q3.
  • Maximum: the largest value.

From these five numbers you also get the interquartile range IQR = Q3 − Q1, which measures the spread of the middle 50% of the data. Because it ignores the top and bottom quarters of the values, the IQR is far less sensitive to outliers than the range or the standard deviation.

The reason this particular five works: it covers the location (median), the spread (IQR), and the tails (min and max) all at once, and every one of them is a position in the sorted data rather than an average of values. That makes the median and quartiles robust to a few extreme observations, unlike the mean and standard deviation. The minimum and maximum are the exception: they are the extreme observations, which is exactly why they are reported.
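One way to see that robustness is to change a single value and recompute. The sketch below is an illustration only. It reuses the eleven sleep values from the worked example later in this walkthrough and swaps the largest one for a made-up 30 hours. The helper name `center_and_spread` is hypothetical, and the quartiles come from Python's `statistics.quantiles` with `method="exclusive"`, which happens to agree with the hand method on these particular values.

```python
import statistics

def center_and_spread(values):
    """Return (median, IQR, mean, standard deviation) for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return (
        statistics.median(values),
        q3 - q1,
        statistics.mean(values),
        statistics.stdev(values),
    )

sleep = [5, 6, 7, 7, 7, 8, 8, 8, 9, 9, 11]
# Hypothetical variant: the largest value is replaced by an extreme 30 hours.
extreme = sleep[:-1] + [30]

for label, values in (("original", sleep), ("one extreme value", extreme)):
    med, iqr, mean, sd = center_and_spread(values)
    print(f"{label:>18}: median={med}, IQR={iqr}, mean={mean:.2f}, sd={sd:.2f}")
```

The median and IQR come out the same for both lists, while the mean and standard deviation both increase.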

Finding Quartiles by Hand

There are several conventions for computing quartiles when n is not a multiple of 4 (textbooks differ, software packages differ). The most common in introductory courses:

  1. Sort the data in increasing order.
  2. Find the median. If n is odd, the median is the middle value. If n is even, the median is the average of the two middle values.
  3. Find Q1 as the median of the lower half of the data (excluding the overall median if n is odd). Find Q3 as the median of the upper half (also excluding the overall median if n is odd).
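The three steps above can be sketched in a few lines of Python. This is a minimal version of the median-of-halves convention described here, not the rule any particular package uses. The function name `five_number_summary` and the two sample lists are hypothetical, chosen only to show one even and one odd n.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)              # step 1: sort
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = data[:half]                # values below the middle position
    upper = data[half + (n % 2):]      # skips the middle value when n is odd
    # steps 2 and 3: overall median, then the median of each half
    return data[0], median(lower), median(data), median(upper), data[-1]

# Hypothetical values, for illustration only.
print(five_number_summary([2, 4, 4, 5, 7, 9]))      # n = 6 -> (2, 4, 4.5, 7, 9)
print(five_number_summary([2, 4, 4, 5, 7, 9, 12]))  # n = 7 -> (2, 4, 5, 9, 12)
```

Slicing with `half + (n % 2)` is what drops the overall median from the upper half when n is odd; the lower half never includes it because `data[:half]` stops just before the middle position.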

Different software (Excel, R, TI calculators) uses slightly different rules for interpolating quartiles, so your answer might disagree with a calculator by a fraction. As long as you state the method, you are fine.
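To make the disagreement concrete, here is a small sketch using Python's standard library (`statistics.quantiles`, available from Python 3.8). It applies two of the library's interpolation rules to the eleven sleep values used in the worked example below. Other tools have their own rules, so treat this as one example of the effect rather than a survey.

```python
import statistics

sleep = [5, 6, 7, 7, 7, 8, 8, 8, 9, 9, 11]

# Two interpolation rules offered by the standard library.
exclusive = statistics.quantiles(sleep, n=4, method="exclusive")
inclusive = statistics.quantiles(sleep, n=4, method="inclusive")

print("exclusive:", exclusive)  # [7.0, 8.0, 9.0]
print("inclusive:", inclusive)  # [7.0, 8.0, 8.5]
```

Both rules agree on Q1 and the median here, but the inclusive rule interpolates between the 8th and 9th values and reports Q3 = 8.5 instead of 9.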

The Worked Example

Eleven students record their hours of sleep last night: 5, 6, 7, 7, 7, 8, 8, 8, 9, 9, 11.

The data are already sorted, and n = 11.

Figure: A horizontal boxplot showing a box from Q1 to Q3 with the median line, whiskers to the min and max, and one outlier marked with a dot.

Median (Q2). With n = 11 (odd), the median is the 6th value: 8 hours.

Q1. The lower half (excluding the median) is 5, 6, 7, 7, 7 — five values. The median of those is the 3rd value: Q1 = 7 hours.

Q3. The upper half (excluding the median) is 8, 8, 9, 9, 11 — five values. The median of those is the 3rd value: Q3 = 9 hours.

Min = 5, Max = 11.

The five-number summary is 5, 7, 8, 9, 11.

IQR = Q3 − Q1 = 9 − 7 = 2 hours.

The 1.5 × IQR Outlier Rule

A standard rule flags any value outside

[Q1 − 1.5 × IQR, Q3 + 1.5 × IQR]

as a potential outlier.

For the sleep data, 1.5 × IQR = 1.5 × 2 = 3. So the "fences" are 7 − 3 = 4 (lower) and 9 + 3 = 12 (upper). Any value below 4 or above 12 is flagged as a potential outlier.

The minimum 5 is above the lower fence (no outlier on the low side). The maximum 11 is below the upper fence (no outlier on the high side, even though 11 is well above the rest). So this data set has no outliers by the 1.5 × IQR rule.

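If a 12th student reported 14 hours, the quartiles would have to be recomputed with n = 12. Here they stay at Q1 = 7 and Q3 = 9, so the fences stay at 4 and 12. The value 14 exceeds the upper fence and would be flagged as an outlier. The boxplot would mark it with a dot beyond the upper whisker, which would still end at 11.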
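The fence arithmetic is easy to sketch in code. The function names `iqr_fences` and `flag_outliers` are hypothetical, and the quartiles are passed in by hand (Q1 = 7, Q3 = 9) rather than recomputed. That shortcut is safe for this example because adding the value 14 leaves both quartiles unchanged under the hand method. In general, recompute them first.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences for quartiles q1 and q3."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def flag_outliers(values, q1, q3):
    """Return the values that fall strictly outside the 1.5 x IQR fences."""
    low, high = iqr_fences(q1, q3)
    return [v for v in values if v < low or v > high]

sleep = [5, 6, 7, 7, 7, 8, 8, 8, 9, 9, 11]

print(iqr_fences(7, 9))                   # (4.0, 12.0)
print(flag_outliers(sleep, 7, 9))         # []
print(flag_outliers(sleep + [14], 7, 9))  # [14]
```

A value exactly on a fence is not flagged, matching the "below 4 or above 12" wording.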

Reading the Boxplot

A boxplot draws a rectangle from Q1 to Q3 with a line at the median, and "whiskers" extending out to the most extreme non-outlier values. Outliers are plotted as individual dots beyond the whiskers.

For the sleep data, the box runs from 7 to 9 with a median line at 8. The lower whisker runs from 7 down to the minimum, 5. The upper whisker runs from 9 up to the maximum, 11.
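Those drawing rules can be sketched as a text-only plot. Everything here is an illustrative assumption: the function name `ascii_boxplot`, the characters chosen for each part, and the scale of two columns per hour. It takes the whisker ends and quartiles as inputs rather than computing them.

```python
def ascii_boxplot(whisker_lo, q1, med, q3, whisker_hi, outliers=(),
                  axis_min=None, axis_max=None, cols_per_unit=2):
    """Draw a one-line boxplot: '-' whiskers, '[=|=]' box and median, 'o' outliers."""
    points = [whisker_lo, whisker_hi, *outliers]
    axis_min = min(points) if axis_min is None else axis_min
    axis_max = max(points) if axis_max is None else axis_max

    def col(value):
        # Map a data value to a character column.
        return round((value - axis_min) * cols_per_unit)

    row = [" "] * (col(axis_max) + 1)
    for c in range(col(whisker_lo), col(whisker_hi) + 1):
        row[c] = "-"                      # whiskers span the non-outlier range
    for c in range(col(q1), col(q3) + 1):
        row[c] = "="                      # box covers Q1 to Q3
    row[col(q1)] = "["
    row[col(q3)] = "]"
    row[col(med)] = "|"                   # median line inside the box
    for value in outliers:
        row[col(value)] = "o"             # outliers are separate dots
    return "".join(row)

# Sleep data: whiskers at 5 and 11, box from 7 to 9, median 8.
print(ascii_boxplot(5, 7, 8, 9, 11))
# Same summary with the hypothetical 14-hour value drawn as an outlier.
print(ascii_boxplot(5, 7, 8, 9, 11, outliers=[14]))
```

In the second line the upper whisker still stops at 11 and the 14 appears as a lone `o` to its right.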

What the boxplot tells you at a glance:

  • Center: the median line, 8 hours.
  • Spread of the middle 50%: the box width, 2 hours.
  • Total range: from end of whisker to end of whisker, 5 to 11 (6 hours). When outliers are present, the range runs to the outermost dot instead.
  • Skew: compare the whisker lengths and the median's position inside the box. Here the two whiskers are the same length (5 to 7 and 9 to 11, 2 hours each), and the median is centered in the box. The distribution is roughly symmetric.
  • Outliers: none in this data set.

A telltale sign of right skew on a boxplot is a long upper whisker, several outlier dots on the upper side, or a median sitting in the lower part of the box. Left skew is the mirror image.
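As a rough sketch, the two comparisons (whisker lengths and the median's position in the box) can be written as a small function. The name `skew_hint` and its labels are hypothetical, it ignores the count of outlier dots, and it is a reading aid rather than a formal test of skewness.

```python
def skew_hint(whisker_lo, q1, med, q3, whisker_hi):
    """Rough reading of skew from boxplot positions; a heuristic, not a test."""
    def sign(x):
        return (x > 0) - (x < 0)

    # Positive when the upper whisker is the longer one.
    whisker_sign = sign((whisker_hi - q3) - (q1 - whisker_lo))
    # Positive when the median sits in the lower part of the box.
    box_sign = sign((q3 - med) - (med - q1))

    total = whisker_sign + box_sign
    if total > 0:
        return "right-skewed"
    if total < 0:
        return "left-skewed"
    if whisker_sign == 0:
        return "roughly symmetric"   # both comparisons are tied
    return "mixed signals"           # the two comparisons disagree

print(skew_hint(5, 7, 8, 9, 11))   # sleep data: roughly symmetric
print(skew_hint(1, 2, 3, 6, 15))   # hypothetical summary: right-skewed
```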

Boxplots Side by Side

The most useful thing a boxplot does is compare two or more groups in one picture. Plot the boxplots side by side on a common scale — different test sections, different treatments, different years. You can read which group has the higher median, which is more spread out, and which has more extreme observations without ever looking at the underlying numbers.
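The numbers behind a side-by-side comparison can be laid out as a small table. The sketch below uses two made-up test sections, so the scores are hypothetical and only illustrate the layout. Quartiles come from `statistics.quantiles` with `method="exclusive"`, which matches the hand method for these n = 11 lists but can differ from it for other sample sizes.

```python
import statistics

def summarize(values):
    """Return the five-number summary and IQR as a dict."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return {"min": min(values), "Q1": q1, "median": med,
            "Q3": q3, "max": max(values), "IQR": q3 - q1}

# Hypothetical scores for two test sections.
groups = {
    "Section A": [62, 68, 70, 71, 74, 75, 78, 80, 85, 88, 91],
    "Section B": [55, 60, 72, 73, 74, 76, 77, 79, 81, 83, 99],
}

columns = ("min", "Q1", "median", "Q3", "max", "IQR")
print(f"{'group':<10}" + "".join(f"{name:>8}" for name in columns))
for label, values in groups.items():
    stats = summarize(values)
    print(f"{label:<10}" + "".join(f"{stats[name]:>8g}" for name in columns))
```

In this made-up example Section B has the slightly higher median and the narrower box, while its minimum and maximum sit farther out than Section A's.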

This is why boxplots are the default exploratory plot for comparing groups: they compress a lot of information into a small space and are robust to a few weird values.

Common Mistakes

The first is mixing up the median with the mean. The median sits inside the box; the mean does not appear on a standard boxplot. If a problem asks for the mean, the boxplot will not give it to you.

The second is miscounting which values go into the lower and upper halves. With an odd n, exclude the median from both halves before finding Q1 and Q3. Software sometimes uses other conventions; for hand computation, the exclude-the-median rule described above is the usual one in introductory courses.

The third is forgetting to apply the 1.5 × IQR rule before drawing whiskers. Whiskers end at the most extreme non-outlier values, not at the absolute min and max — outliers are drawn as separate dots, not at the end of a whisker.

Getting Help

For a tighter look at when the median or mean is the right center to report — and how each behaves with outliers — work through mean, median, mode. Boxplots also pair naturally with the question of variability: standard deviation vs. variance covers the spread statistics the boxplot does not show.

Conclusion

The five-number summary — minimum, Q1, median, Q3, maximum — is a robust description of a distribution that you can compute by hand and turn directly into a boxplot. The 1.5 × IQR rule flags outliers; whiskers stop at the most extreme non-outliers. Read the boxplot for center (median), spread (IQR), skew (whisker lengths and median position), and outliers (dots beyond the whiskers). The sleep example produced the summary 5, 7, 8, 9, 11 with no outliers and a roughly symmetric shape — the kind of answer you should be able to write in under two minutes.