Standard deviation and variance measure the same thing — how spread out a data set is — and one is just the square root of the other. So why do textbooks insist on teaching both? The short answer is units: variance is mathematically natural but reads in squared units, while standard deviation is back in the original units and is the one humans actually report. Here is the full picture, the formulas side by side, and a clear rule for which to use when.
The Definitions, Side by Side
For a sample of n values x₁, x₂, ..., xₙ with mean x̄:
- Sample variance s² = Σ (xᵢ − x̄)² / (n − 1)
- Sample standard deviation s = √s²
For a population of size N with mean μ:
- Population variance σ² = Σ (xᵢ − μ)² / N
- Population standard deviation σ = √σ²
The two pairs differ in which mean the deviations are measured from (x̄ or μ) and in whether you divide by n − 1 or by N (more on that below). Within each pair, the relationship is the same: variance is the squared one, standard deviation is the un-squared one.
Both quantities are computed from the same sum of squared deviations, Σ (xᵢ − x̄)² for a sample or Σ (xᵢ − μ)² for a population. Squaring makes every contribution non-negative, so outcomes above and below the mean both push the spread up instead of cancelling out, and it is also what creates the "squared units" problem.
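The four definitions can be sketched directly in Python. This is a minimal illustration, not a reference implementation: the function names and the data set are made up for the example, and the results are cross-checked against the standard library's `statistics` module.

```python
import math
import statistics


def sum_sq_dev(values):
    """Sum of squared deviations from the mean of `values`."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)


def sample_variance(values):
    """s^2: divide by n - 1 (needs at least two values)."""
    return sum_sq_dev(values) / (len(values) - 1)


def population_variance(values):
    """sigma^2: divide by N, treating `values` as the whole population."""
    return sum_sq_dev(values) / len(values)


# Made-up illustrative data, not taken from the article.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

s2 = sample_variance(data)
sigma2 = population_variance(data)
print("sample variance:", s2, "sample SD:", math.sqrt(s2))
print("population variance:", sigma2, "population SD:", math.sqrt(sigma2))

# Cross-check against the standard library.
assert math.isclose(s2, statistics.variance(data))
assert math.isclose(sigma2, statistics.pvariance(data))
```

Both variances share `sum_sq_dev`; only the divisor changes, and each standard deviation is one `math.sqrt` away.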
The Units Issue
Suppose you measure five students' heights in inches: 64, 66, 67, 70, 73. The mean is 68 inches.
The squared deviations from the mean are (−4)², (−2)², (−1)², 2², 5² = 16, 4, 1, 4, 25, summing to 50.
- Sample variance s² = 50 / (5 − 1) = 12.5 inches²
- Sample standard deviation s = √12.5 ≈ 3.54 inches
The variance is in square inches, which is not a natural unit for height. The standard deviation is back in inches, the same unit as the data, and reads naturally: "students' heights vary by about 3.5 inches around the mean."
This is the headline reason both exist. Variance comes out of the math; standard deviation comes out of the math followed by a square root, and it is the one with sensible units.
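The height example can be reproduced step by step. The sketch below follows the arithmetic above and notes the unit at each stage in the comments; the variable names are chosen for illustration.

```python
import math

heights_in = [64, 66, 67, 70, 73]  # inches, from the example above

n = len(heights_in)
mean = sum(heights_in) / n                    # inches
deviations = [x - mean for x in heights_in]   # inches
squared = [d ** 2 for d in deviations]        # inches squared
ss = sum(squared)                             # inches squared

variance = ss / (n - 1)                       # inches squared
std_dev = math.sqrt(variance)                 # back to inches

print(f"mean = {mean} in")
print(f"sum of squared deviations = {ss} in^2")
print(f"variance = {variance} in^2")
print(f"standard deviation = {std_dev:.2f} in")
```

The only step that changes the unit back to inches is the final square root.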
Why Variance Is the Mathematically Natural One
If standard deviation is what humans want to report, why is variance the one in every theoretical formula? Because variance has properties that standard deviation does not:
- Variances of independent random variables add. If X and Y are independent, Var(X + Y) = Var(X) + Var(Y). Standard deviations do not add like this; only their squares do.
- Variance scales by a² under a constant multiplier: Var(aX) = a² Var(X). The square is built in.
- The variance of a sample mean of n independent observations is σ²/n, which is what gives the standard error its √n in the denominator.
Almost every formula in inferential statistics — confidence intervals, ANOVA, regression — is naturally written in terms of variance, then takes a square root at the end to report a standard deviation. The square root is a presentation step; the math underneath uses variance.
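The additivity and scaling properties can be checked exactly on small discrete distributions. The sketch below assumes a fair die and an independent fair coin (both chosen here for illustration) and uses `fractions.Fraction` so the comparisons are exact, not approximate.

```python
from fractions import Fraction
from itertools import product


def pvar(values):
    """Population variance of equally likely outcomes, as an exact Fraction."""
    vals = [Fraction(v) for v in values]
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)


die = [1, 2, 3, 4, 5, 6]   # X: a fair six-sided die (illustrative)
coin = [0, 1]              # Y: a fair coin, independent of X (illustrative)

var_x = pvar(die)
var_y = pvar(coin)

# Independence means every (x, y) pair is equally likely,
# so the distribution of X + Y can be enumerated directly.
var_sum = pvar([x + y for x, y in product(die, coin)])

# Scaling: multiply every outcome of X by a = 3.
var_scaled = pvar([3 * x for x in die])

print("Var(X) =", var_x, " Var(Y) =", var_y)
print("Var(X + Y) == Var(X) + Var(Y):", var_sum == var_x + var_y)
print("Var(3X) == 9 * Var(X):", var_scaled == 9 * var_x)
```

Enumerating the product space works only because the outcomes are equally likely and the two variables are independent; with dependence, a covariance term would appear.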
Why n − 1 (Bessel's Correction)
The sample variance divides by n − 1, not n. This is Bessel's correction, and it exists because using x̄ in place of the unknown μ slightly biases the variance estimate downward — the sample mean is closer to the sample values than the true μ is, so squared deviations from x̄ underestimate squared deviations from μ. Dividing by n − 1 instead of n nudges the estimate back up to compensate.
When your data are the entire population, μ is computed exactly instead of being estimated, so the bias does not arise and you divide by N. For samples used to estimate a population variance, divide by n − 1. Almost every textbook problem you will see uses the sample formula.
The difference matters most for small n. With n = 5, dividing by 4 instead of 5 inflates the variance by 25%. With n = 100, the inflation is only 1%. For large n the two formulas are practically identical.
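The bias can be seen exactly, without simulation, by enumerating every possible sample from a tiny population. The sketch below assumes a hypothetical population (the six faces of a fair die) and samples of size 3 drawn with replacement, then averages each estimator over all 216 equally likely samples.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]   # hypothetical population: faces of a fair die
n = 3                             # sample size (illustrative)


def mean(values):
    return Fraction(sum(values), len(values))


def sum_sq_dev(sample):
    """Sum of squared deviations from the sample's own mean."""
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample)


# True population variance: deviations from mu, divided by N.
mu = mean(population)
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

# Every possible sample of size n drawn with replacement, all equally likely.
samples = list(product(population, repeat=n))

avg_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)
avg_div_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)

print("population variance:      ", sigma2)
print("average of divide-by-n:   ", avg_div_n)
print("average of divide-by-n-1: ", avg_div_n_minus_1)
```

Averaged over all samples, the divide-by-n estimator comes out at (n − 1)/n of the population variance, while the divide-by-(n − 1) estimator matches it exactly.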
Which One to Report
A useful rule:
- Report standard deviation when you describe data to a reader — it is in the original units and is directly interpretable. "Test scores had a standard deviation of 8 points."
- Use variance internally when you compute things — adding independent variances, deriving standard errors, running ANOVA. Then convert to a standard deviation at the end if you need to communicate the result.
Some fields buck this. Finance uses variance directly to talk about risk (and a related "volatility," which is a standard deviation). Quality control reports both. But for an intro statistics audience, default to standard deviation.
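The "compute in variance, report in standard deviation" rule can be sketched with the height data from earlier. The example estimates the standard error of the mean: the arithmetic stays in squared units until the last step. Variable names are illustrative.

```python
import math

heights_in = [64, 66, 67, 70, 73]  # inches, from the height example
n = len(heights_in)
mean = sum(heights_in) / n

# Work in variance units internally.
sample_var = sum((x - mean) ** 2 for x in heights_in) / (n - 1)  # inches squared
var_of_mean = sample_var / n   # estimated variance of the sample mean

# Convert to the original units only when reporting.
standard_error = math.sqrt(var_of_mean)
print(f"mean height: {mean:.1f} in, standard error: {standard_error:.2f} in")
```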
The Empirical Rule Lives in Standard Deviations
The 68–95–99.7 rule for an approximately normal distribution is stated in standard deviations, not variances. About 68% of values fall within one standard deviation of the mean, 95% within two, and 99.7% within three. This is one more reason standard deviation is the everyday number — it has a direct visual meaning on a bell curve.
For the height example with mean 68 and s ≈ 3.5 inches, the empirical rule predicts that, if heights are roughly normal, about 68% of students fall between 64.5 and 71.5 inches (1 SD) and nearly all (about 99.7%) fall between 57.5 and 78.5 inches (3 SDs). Try to do that with variance and the units get in the way immediately.
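The intervals and the 68–95–99.7 percentages can be reproduced with the standard library's `statistics.NormalDist`. This is a sketch that assumes heights are approximately normal and uses the rounded s = 3.5 from the text; five students are far too few to verify that assumption.

```python
from statistics import NormalDist

mean_in = 68.0   # inches, from the height example
sd_in = 3.5      # inches, rounded as in the text

std_normal = NormalDist()  # mean 0, standard deviation 1


def interval(k):
    """Range covering k standard deviations either side of the mean."""
    return (mean_in - k * sd_in, mean_in + k * sd_in)


def normal_share(k):
    """Share of a normal distribution within k SDs of its mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    low, high = interval(k)
    print(f"within {k} SD: {low} to {high} in, "
          f"about {normal_share(k):.1%} of a normal distribution")
```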
Quick Comparison
A side-by-side checklist for the two:
- Variance. Squared units. Adds for independent random variables. Used in derivations. Sample form: Σ(xᵢ − x̄)² / (n − 1).
- Standard deviation. Original units. Does not add. Used to report and compare to data values directly. Equals √variance.
You will rarely compute variance and not also need its square root, and you will rarely compute standard deviation without having gone through variance first. Treat them as the two faces of the same calculation.
Common Mistakes
The first is reporting variance with the wrong units. "The variance is 12.5 inches" is wrong — variance is in inches squared. Either write the units carefully or convert to standard deviation.
The second is forgetting Bessel's correction on sample data. Dividing by n gives an estimate that is biased low. Spreadsheets such as Excel and Google Sheets have separate functions: VAR.S and STDEV.S divide by n − 1 (sample); VAR.P and STDEV.P divide by N (population). Most statistical calculators offer the same choice. Use the sample versions unless you have the whole population.
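Python's standard `statistics` module makes the same sample/population split as the spreadsheet functions. The sketch below applies both versions to the height data so the difference in divisor is visible.

```python
import statistics

heights_in = [64, 66, 67, 70, 73]  # inches, from the height example

# Sample versions divide by n - 1 (the counterpart of VAR.S / STDEV.S).
sample_var = statistics.variance(heights_in)    # 50 / 4
sample_sd = statistics.stdev(heights_in)

# Population versions divide by N (the counterpart of VAR.P / STDEV.P).
pop_var = statistics.pvariance(heights_in)      # 50 / 5
pop_sd = statistics.pstdev(heights_in)

print("sample:    ", sample_var, round(sample_sd, 2))
print("population:", pop_var, round(pop_sd, 2))
```

Defaults differ between libraries. NumPy's `var` and `std`, for example, divide by N unless `ddof=1` is passed, so it is worth checking which divisor a given function uses.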
The third is adding standard deviations. The standard deviation of a sum of independent random variables is not the sum of their standard deviations — it is the square root of the sum of the variances. SD(X + Y) = √(Var(X) + Var(Y)), not SD(X) + SD(Y).
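A quick numeric check makes the difference concrete. The standard deviations below are hypothetical values chosen so the arithmetic is easy to follow, and X and Y are assumed independent.

```python
import math

sd_x = 3.0  # hypothetical SD of X
sd_y = 4.0  # hypothetical SD of Y, with X and Y assumed independent

var_sum = sd_x ** 2 + sd_y ** 2   # variances add: 9 + 16
sd_sum = math.sqrt(var_sum)       # correct SD of X + Y

wrong = sd_x + sd_y               # adding SDs directly overstates the spread

print("correct SD(X + Y):", sd_sum)
print("incorrect sum of SDs:", wrong)
```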
Getting Help
If you are running a hypothesis test or building a confidence interval, the standard deviation feeds into the standard error of the statistic. The mechanics of how that propagates are covered in the article on sampling distributions. For a different view of spread that does not rely on the mean at all, the article on the five-number summary and boxplots covers the IQR and outlier-resistant measures.
Conclusion
Standard deviation and variance describe the same spread; they differ in units and in role. Variance is the mathematically natural quantity — it adds across independent random variables, scales by the square, and shows up inside every theoretical formula. Standard deviation is the square root of variance and is what you report to a reader because it is in the original units. Compute variance, take the square root, and report the standard deviation — that workflow handles almost every situation in an intro course.