Type I and Type II errors are easy to define and almost impossible to keep straight under exam pressure. The names give no hint, and "false positive" versus "false negative" blur together fast. This guide gives you a definition for each, a courtroom analogy that makes the difference stick, and the link to alpha and power that exams test.

The Two Errors, Defined

Every hypothesis test starts with a null hypothesis (H₀, the "no effect" claim) and ends with one of two decisions: reject H₀ or fail to reject it. Because you are deciding based on a sample, you can be wrong in two distinct ways.

A Type I error is rejecting the null hypothesis when it is actually true. You declared an effect that is not there — a false positive.

A Type II error is failing to reject the null hypothesis when it is actually false. You missed an effect that really exists — a false negative.

Notice the symmetry. Type I is a false alarm; Type II is a missed detection. There is no error when you reject a false null (correct) or fail to reject a true null (also correct). The two errors are the only two wrong cells in a 2×2 grid of decision against truth.

Figure: decision against truth. Only two of the four cells are errors.
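If it helps to see the grid as logic rather than a picture, here is a minimal sketch that names each of the four cells. The function name `classify` and its two boolean arguments are made up for illustration; they are not part of any statistics library.

```python
def classify(reject_null: bool, null_is_true: bool) -> str:
    """Name the cell of the decision-against-truth grid."""
    if reject_null and null_is_true:
        # Rejected a true null: false alarm.
        return "Type I error (false positive)"
    if not reject_null and not null_is_true:
        # Failed to reject a false null: missed detection.
        return "Type II error (false negative)"
    # The remaining two cells are both correct decisions.
    return "correct decision"


# Walk through all four combinations of truth and decision.
for null_is_true in (True, False):
    for reject_null in (True, False):
        outcome = classify(reject_null, null_is_true)
        print(f"H0 true: {null_is_true}, reject H0: {reject_null} -> {outcome}")
```

Only two of the four printed lines name an error, which matches the two wrong cells of the grid.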

The Courtroom Analogy That Makes It Stick

A criminal trial is a hypothesis test. The null hypothesis is the legal presumption: the defendant is innocent. The court does not try to prove innocence; it asks whether the evidence is strong enough to reject that presumption.

  • Type I error: the jury convicts an innocent person. The null (innocent) was true, but the jury rejected it. An innocent person goes to prison — a false positive.
  • Type II error: the jury acquits a guilty person. The null (innocent) was false, but the jury failed to reject it. A guilty person walks free — a false negative.

The analogy also explains why courts set the bar at "beyond a reasonable doubt." Society has decided that convicting the innocent (Type I) is the worse error, so the system is deliberately tuned to make Type I errors rare — at the cost of letting some guilty defendants go (more Type II errors). That trade-off is the whole point of the next section.

One memory hook: Type I = I convict the innocent — the "I" in Type I lines up with the false accusation.

How Alpha and Beta Control the Errors

Each error has a probability, and you control them when you design the test.

The probability of a Type I error is alpha (α), the significance level. When you choose α = 0.05, you accept a 5% chance of a false positive whenever the null is true. You set alpha yourself, before collecting data.
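One way to convince yourself that alpha really is the false positive rate is to simulate many experiments in which the null is true and count how often the test rejects anyway. The sketch below assumes a two-sided z-test on a mean with known standard deviation 1; the function name `type_i_rate`, the sample size, the trial count, and the seed are all illustrative choices, not something this guide prescribes.

```python
import random
from statistics import NormalDist


def type_i_rate(alpha=0.05, n=25, trials=10_000, seed=0):
    """Estimate how often a two-sided z-test rejects a TRUE null (mean = 0)."""
    rng = random.Random(seed)
    # Critical value for a two-sided test at level alpha.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Data generated with mean 0, so H0 is true in every trial.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = sample_mean * n ** 0.5  # sigma is assumed known and equal to 1
        if abs(z) > z_crit:
            rejections += 1  # every rejection here is a Type I error
    return rejections / trials


print(type_i_rate(alpha=0.05))
```

The printed fraction should land close to 0.05, up to simulation noise. Rerunning with `alpha=0.01` should pull it down toward 0.01.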

The probability of a Type II error is beta (β). Unlike alpha, you do not set beta directly; it depends on the true effect size, the sample size, and your chosen alpha.

The crucial relationship: lowering alpha raises beta, all else equal. If you make it harder to reject the null — say, dropping α from 0.05 to 0.01 — you cut false positives but make false negatives more likely. You demand stronger evidence, so you also miss more real effects. You cannot shrink both error rates at once without changing something else.
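The trade-off can be put in numbers. The sketch below assumes a one-sided z-test on a mean with known standard deviation, where beta has the closed form Φ(z₁₋α − effect·√n / σ). The function name `beta_one_sided_z` and the example values (effect 0.5, σ = 1, n = 25) are assumptions picked for illustration.

```python
from statistics import NormalDist


def beta_one_sided_z(alpha, effect, sigma, n):
    """Type II error probability of a one-sided z-test for a given true effect."""
    std_normal = NormalDist()
    # Rejection threshold on the z scale for the chosen alpha.
    z_crit = std_normal.inv_cdf(1 - alpha)
    # How far the true effect shifts the test statistic, in standard errors.
    shift = effect * n ** 0.5 / sigma
    # Beta is the chance the statistic still falls below the threshold.
    return std_normal.cdf(z_crit - shift)


# Same effect, same sample size; only alpha changes.
for alpha in (0.10, 0.05, 0.01):
    beta = beta_one_sided_z(alpha, effect=0.5, sigma=1.0, n=25)
    print(f"alpha = {alpha:.2f} -> beta = {beta:.3f}")
```

Under these assumptions beta is roughly 0.20 at α = 0.05 and roughly 0.43 at α = 0.01: tightening alpha more than doubles the chance of missing the effect.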

That "something else" is usually sample size. A larger sample lowers beta while alpha stays exactly where you set it, which is why collecting more data is the standard way to miss fewer real effects without accepting more false alarms.
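You can watch sample size do this in a simulation. The sketch below holds alpha at 0.05 and generates data where the null is false (the true mean is 0.5 rather than 0), then counts how often a one-sided z-test fails to reject. The function name `simulated_beta` and every default value are illustrative assumptions.

```python
import random
from statistics import NormalDist


def simulated_beta(n, effect=0.5, sigma=1.0, alpha=0.05, trials=5_000, seed=1):
    """Estimate the Type II error rate of a one-sided z-test by simulation."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)  # alpha stays fixed throughout
    misses = 0
    for _ in range(trials):
        # Data generated with a real effect, so H0 (mean = 0) is false.
        sample_mean = sum(rng.gauss(effect, sigma) for _ in range(n)) / n
        z = sample_mean * n ** 0.5 / sigma
        if z <= z_crit:
            misses += 1  # failing to reject here is a Type II error
    return misses / trials


for n in (10, 25, 50):
    print(f"n = {n:2d} -> estimated beta = {simulated_beta(n):.3f}")
```

The estimated beta should fall steadily as n grows, even though the rejection threshold never moves.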

Power: The Flip Side of Type II Error

The power of a test is the probability of correctly rejecting a false null hypothesis — catching an effect that is genuinely there. Power and beta are two sides of one coin:

Power = 1 − β

A test with β = 0.20 has power 0.80, meaning an 80% chance of detecting a real effect of the assumed size. Power 0.80 is a common target in study design.
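In study design the question usually runs backwards: given a target power, how large a sample do you need? For a one-sided z-test with known standard deviation, a common textbook approximation is n = ((z₁₋α + z_power)·σ / effect)². The sketch below applies it; `required_n` and its example inputs are hypothetical, and other test types need other formulas.

```python
import math
from statistics import NormalDist


def required_n(effect, sigma, alpha=0.05, power=0.80):
    """Smallest n giving at least the target power in a one-sided z-test."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)  # threshold set by alpha
    z_power = std_normal.inv_cdf(power)      # quantile matching power = 1 - beta
    n_exact = ((z_alpha + z_power) * sigma / effect) ** 2
    return math.ceil(n_exact)  # round up: a fractional observation is not possible


# Assumed scenario: detect a shift of half a standard deviation.
print(required_n(effect=0.5, sigma=1.0))
```

With these inputs the function returns 25. Halving the effect to 0.25 roughly quadruples the requirement, because the effect enters the formula squared.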

Power rises when the sample size grows, when the true effect is larger and easier to spot, when the data is less variable, and when alpha is more lenient. The single most controllable lever is sample size — which is why "the study was underpowered" is a standard criticism of a test that found nothing: it may simply have been too small to detect a real effect, producing a Type II error.

Getting Help

These errors are consequences of the decision rule in a full test, so they make the most sense once you have walked through setting up a hypothesis test. The decision itself turns on a p-value, and p-values explained covers how that number drives the reject-or-not call.

Conclusion

Type I vs. Type II error comes down to false positive versus false negative: Type I rejects a true null (convicting the innocent), Type II fails to reject a false null (acquitting the guilty). Their probabilities are alpha and beta, and the two trade off — tightening alpha raises beta unless you also increase the sample size. Power, equal to 1 − β, is the test's ability to catch a real effect. Anchor the whole picture on the courtroom and the names stop being arbitrary.