Measurement & testing

False positive

Also known as: Type I error

An A/B test result declared significant when the apparent effect was actually due to chance. The risk increases with each additional metric compared, with early peeking at results, and with multiple variants tested against a single control.

A false positive is an A/B test that appears to show a real effect when the observed difference was actually noise. At a standard 95% confidence level, 5% of A/B tests with no real effect will still produce a "significant" result by chance alone. The false-positive rate inflates further when experimenters:

- peek at results mid-test and stop early (peeking can inflate the false-positive rate by up to 3x);
- compare multiple metrics and report the winning one (the multiple-comparisons problem: testing 5 metrics at p < 0.05 yields an effective false-positive rate closer to 23%);
- run many variants against one control without a Bonferroni adjustment.

Every falsely declared winner rolls out a change that does not work, and over time the cumulative drag on program performance is substantial. The discipline: pre-declare a primary metric, run to a pre-computed sample size, and declare significance once.
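The multiple-comparisons arithmetic above can be sketched in a few lines. This is a minimal illustration, assuming the tests are independent; the function names are hypothetical, not part of any library.

```python
# Hypothetical sketch: family-wise false-positive rate across m independent
# tests, and the Bonferroni-corrected per-test threshold mentioned above.

def familywise_rate(alpha: float, m: int) -> float:
    """Probability of at least one false positive across m independent tests."""
    return 1 - (1 - alpha) ** m

def bonferroni_alpha(alpha: float, m: int) -> float:
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / m

# 5 metrics at p < 0.05: effective false-positive rate ~23%, as in the text.
print(round(familywise_rate(0.05, 5), 3))  # 0.226
# Bonferroni correction: test each metric at 0.05 / 5 = 0.01 instead.
print(bonferroni_alpha(0.05, 5))
```

The same formula shows why running many variants against one control needs the same correction: each variant-vs-control comparison is another draw from the 5% chance pool.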
