R-squared is the most quoted number from any regression output, and the most misread. It looks like a percentage and is easy to summarize as "how good the model is," but that summary hides a lot. Interpreting R-squared correctly takes a sharp definition and a clear sense of what it does not measure. Here is what R² actually is, three traps students fall into, and how to use it without overclaiming.

What R-Squared Actually Measures

R-squared, written R², is the proportion of the variance in the response variable that is explained by the regression model. In a regression of Y on one or more predictors, the total variation in Y can be split into two pieces:

  • SSR (regression sum of squares): variation the model captures, i.e. variation in the predicted values Ŷ around the mean of Y.
  • SSE (error sum of squares): variation the model misses — the squared residuals.

Their sum is the total sum of squares SST. R² is

R² = SSR / SST = 1 − SSE / SST

For a least-squares fit that includes an intercept, R² ranges from 0 to 1 on the data used to fit the model. R² = 0 means the model explains none of the variation (the predictors give you no improvement over guessing Y = mean of Y). R² = 1 means the model explains all of it (every prediction is exact).

In simple linear regression, R² is exactly the square of the correlation coefficient r between X and Y. In multiple regression, it generalizes: it is the square of the correlation between Y and the fitted Ŷ.
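The identities above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the hours and score values are made up for illustration, and the helper name sums_of_squares is hypothetical.

```python
import numpy as np

def sums_of_squares(x, y):
    """Fit y = b0 + b1*x by least squares and return (SSR, SSE, SST)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b1, b0 = np.polyfit(x, y, 1)           # polyfit returns slope first, then intercept
    y_hat = b0 + b1 * x
    ssr = np.sum((y_hat - y.mean()) ** 2)  # variation captured by the fitted values
    sse = np.sum((y - y_hat) ** 2)         # squared residuals
    sst = np.sum((y - y.mean()) ** 2)      # total variation in y
    return ssr, sse, sst

# Made-up data, for illustration only
hours = [1, 2, 3, 4, 5, 6, 7, 8]
score = [52, 55, 61, 58, 70, 68, 75, 80]

ssr, sse, sst = sums_of_squares(hours, score)
r2_from_ssr = ssr / sst
r2_from_sse = 1 - sse / sst
r = np.corrcoef(hours, score)[0, 1]

print(f"SSR/SST     = {r2_from_ssr:.4f}")
print(f"1 - SSE/SST = {r2_from_sse:.4f}")
print(f"r squared   = {r ** 2:.4f}")
```

All three printed values should agree up to floating-point rounding, because the fit is a least-squares line with an intercept.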

A common interpretation: "Hours studied explained 65% of the variation in exam score." That is the right shape of sentence — it ties R² to variation in the response, not to the response itself.

Why "Variation Explained" Is the Right Wording

A student-friendly alternative — "the model is 65% accurate" — is wrong in a useful way. R² is not an accuracy. It is not the percentage of cases the model gets exactly right. Two regressions with the same R² can have very different prediction errors in original units, depending on the variance of Y.

Walk through a small example. Two studies regress weight on height.

  • Study A: SD of weight is 30 pounds. The model gives R² = 0.50, so 50% of weight's variance is explained. The unexplained variance is (1 − R²) × SD² = 0.50 × 30² = 450; the residual standard error is √450 ≈ 21 pounds (ignoring the small degrees-of-freedom correction). A typical residual is about ±21 pounds.
  • Study B: SD of weight is 8 pounds. R² = 0.50 again, so the residual standard error is √(0.50 × 8²) = √32 ≈ 5.7 pounds.

Same R², very different prediction precision. Always report the residual standard error (or root mean squared error) alongside R² if predictions matter.
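The arithmetic for the two studies can be wrapped in a small function. This is a sketch under the same simplification used in the example: it ignores the degrees-of-freedom correction, so it is an approximation to the residual standard error a regression package would report. The function name residual_sd is hypothetical.

```python
import math

def residual_sd(r2, sd_y):
    """Approximate residual SD implied by R² and the SD of the response.

    Uses unexplained variance = (1 - R²) * SD(y)², with no
    degrees-of-freedom correction.
    """
    return math.sqrt((1.0 - r2) * sd_y ** 2)

study_a = residual_sd(0.50, 30)   # SD of weight = 30 pounds
study_b = residual_sd(0.50, 8)    # SD of weight = 8 pounds

print(f"Study A: about {study_a:.1f} pounds")
print(f"Study B: about {study_b:.1f} pounds")
```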

A scatter plot with a fitted regression line and the residuals shown as vertical segments between points and the line

Trap 1: A High R² Does Not Mean the Model Is Right

A regression with R² = 0.95 looks impressive. But R² says nothing about whether the model's form is correct, whether the residuals behave well, or whether the predictors cause changes in Y.

Three specific ways a high R² can be misleading:

  • Wrong functional form. Fit a straight line to clearly curved data and R² can still be high (say 0.80) because the line catches the broad trend — but the residuals will show a U-shape that exposes the fit's failure. Always plot residuals against the predictors.
  • Outliers driving the fit. A single extreme point can both inflate R² and pull the fitted line off the bulk of the data. Without inspecting the scatter plot you cannot tell.
  • Spurious correlation. Two variables that move together in a sample (ice cream sales and drowning deaths, both rising in summer) can produce a high R² with no causal connection. R² describes the sample; it does not establish a mechanism.
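The first item in the list, wrong functional form, is easy to reproduce. The sketch below assumes NumPy and uses a noise-free quadratic chosen purely for illustration: a straight line fit to it still earns an R² above 0.9, while the residuals are positive at both ends and negative in the middle.

```python
import numpy as np

# Noise-free curved data, chosen only to make the pattern obvious
x = np.linspace(0, 10, 50)
y = x ** 2

# Fit a straight line anyway
slope, intercept = np.polyfit(x, y, 1)
resid = y - (intercept + slope * x)

r2 = 1 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
print(f"R² of the straight-line fit: {r2:.3f}")

# The U-shape: the line underpredicts at the ends and overpredicts in the middle
print(f"residual at left end:  {resid[0]:+.2f}")
print(f"residual in the middle: {resid[len(x) // 2]:+.2f}")
print(f"residual at right end: {resid[-1]:+.2f}")
```

A residual plot would show the same pattern at a glance; the printed signs are just the cheapest way to see it without a plotting library.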

The rule: treat R² as a first check, not a verdict. A model with low R² leaves most of the variation in Y unexplained, but a model with high R² is not automatically sound.

Trap 2: Adding Predictors Always Raises R² in Multiple Regression

In multiple regression, R² can only go up — or stay the same — as you add predictors. Even a column of random noise will explain a little variation in any finite sample, just by chance. So comparing models by R² rewards bloat.
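A quick simulation makes the point concrete. This is a sketch assuming NumPy; the data are simulated, the seed is arbitrary, and the helper r_squared is hypothetical. It fits a model with one real predictor, then adds up to ten columns of pure noise one at a time and records the in-sample R² at each step.

```python
import numpy as np

def r_squared(X, y):
    """In-sample R² of a least-squares fit of y on X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(0)
n = 30
x = rng.normal(size=n)                 # the one real predictor
y = 2.0 * x + rng.normal(size=n)
noise = rng.normal(size=(n, 10))       # ten columns unrelated to y

# R² with the real predictor plus the first k noise columns, k = 0..10
r2_path = [r_squared(np.column_stack([x, noise[:, :k]]), y) for k in range(11)]

for k, r2 in enumerate(r2_path):
    print(f"{k:2d} noise columns: R² = {r2:.4f}")
```

The printed sequence never decreases, even though none of the added columns carries any information about y.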

The standard fix is adjusted R², which penalizes each added predictor by scaling up the unexplained share (1 − R²) according to the degrees of freedom the model uses. Its formula:

Adjusted R² = 1 − (1 − R²) × (n − 1) / (n − p − 1)

where n is the sample size and p is the number of predictors (not counting the intercept). Adjusted R² goes down when you add a predictor that does not improve the fit enough to justify the extra parameter. It is a better summary statistic than raw R² for comparing nested models or models of different sizes.

A worked feel: with n = 30 and one predictor, R² = 0.65 gives adjusted R² = 1 − 0.35 × 29/28 ≈ 0.638. With ten predictors and the same R² = 0.65, adjusted R² = 1 − 0.35 × 29/19 ≈ 0.466: a sizable share of the apparent fit is credited to the extra parameters rather than to real signal.
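The adjusted R² formula is short enough to evaluate directly. The following is a minimal sketch; the function name adjusted_r2 is hypothetical, and p counts predictors without the intercept.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - p - 1).

    n is the sample size; p is the number of predictors (intercept excluded).
    """
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1 for adjusted R² to be defined")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

one_predictor = adjusted_r2(0.65, n=30, p=1)
ten_predictors = adjusted_r2(0.65, n=30, p=10)

print(f"n = 30, p = 1:  adjusted R² = {one_predictor:.4f}")
print(f"n = 30, p = 10: adjusted R² = {ten_predictors:.4f}")
```

Holding R² and n fixed, the adjusted value can only fall as p grows, which is the penalty at work.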

Trap 3: R² Says Nothing About Causation or Prediction Outside the Range

Two final misreadings to retire.

R² is not causation. A regression fit to observational data describes a pattern in that sample; it does not show that changing X would change Y. For more on the conceptual divide, correlation vs. causation covers the specific traps. The same caveat applies to multiple regression: even after controlling for the other predictors in the model, you cannot rule out an omitted confounder driving the relationship.

R² is in-sample by default. It tells you how well the model fits the data it was trained on, not how well it will predict new data. A model with R² = 0.99 on the training set may have R² near zero, or even negative, on a holdout; a gap that large is the signature of overfitting. For prediction quality, hold back a test set or use cross-validation, and report out-of-sample R² or root mean squared error.
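A holdout comparison can be sketched in a few lines. This assumes NumPy and simulated data: only the first of 25 predictors actually affects y, the seed is arbitrary, and the helpers fit_ols and r_squared are hypothetical. Out-of-sample R² is computed here as 1 − SSE/SST around the holdout mean, which is one common convention.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y on X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def r_squared(coef, X, y):
    """1 - SSE/SST for given coefficients, with SST taken around the mean of y."""
    design = np.column_stack([np.ones(len(y)), X])
    resid = y - design @ coef
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(1)
n_train, n_test, p = 30, 200, 25
X = rng.normal(size=(n_train + n_test, p))
y = X[:, 0] + rng.normal(size=n_train + n_test)   # only column 0 matters

X_train, y_train = X[:n_train], y[:n_train]
X_test, y_test = X[n_train:], y[n_train:]

coef = fit_ols(X_train, y_train)                  # 26 parameters from 30 rows
train_r2 = r_squared(coef, X_train, y_train)
test_r2 = r_squared(coef, X_test, y_test)

print(f"training R²: {train_r2:.3f}")
print(f"holdout R²:  {test_r2:.3f}")
```

With nearly as many parameters as training rows, the training R² is high while the holdout R² is far lower, and can fall below zero when the model predicts worse than the holdout mean.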

A Useful Mental Model

A clean way to think about R²: the model has a budget of variance to explain — namely, the variance of Y in the data set. R² is the fraction of that budget the model spends successfully. It does not tell you how big the budget was, whether the spending was on the right things, or how much money is left for predicting tomorrow's Y. Always pair R² with the residual standard error, residual plots, and (if prediction matters) a holdout evaluation.

Getting Help

Before you can interpret R² in context you have to be reading the rest of the regression output cleanly. Reading regression output covers coefficients, t-statistics, p-values, and the F-statistic side by side. For the broader question of when a relationship in data implies a cause, correlation vs. causation is the right companion piece.

Conclusion

R-squared is the proportion of variance in Y the model explains — a number between 0 and 1, often quoted as a percentage. It is useful, but easy to over-interpret. A high R² does not prove the model is correctly specified, R² rises mechanically as you add predictors in multiple regression (use adjusted R² for cross-model comparisons), and R² is silent on causation and out-of-sample prediction. Quote it; pair it with residual plots and a residual standard error; do not let it be the only number you report.