Measurement & testing

A/B testing

Also known as: split testing · bucket testing

A randomised experiment in which two (or more) variants are shown to different user groups and compared on a primary metric. Trustworthy results require sufficient sample size, a pre-declared hypothesis, and a statistical significance test before a winner is declared.

A/B testing is the lifecycle discipline for attributing causation to a change. A variant (B) is compared against a control (A) by randomly assigning users to each and measuring a single primary metric such as open rate, click rate, or conversion rate. For results to be trustworthy, the sample size must meet pre-computed statistical power requirements (usually 80% power at 95% confidence), the hypothesis must be declared before the test runs rather than post-hoc, and a winner is declared only once the primary metric reaches significance (not multiple metrics, not mid-test peeks).

Most A/B tests in lifecycle email fail one of these conditions: they are undersampled (run on 500 users when 15,000 are needed), they make multiple comparisons (declaring a winner on whichever metric happened to move), or they stop early on an interim peek at the data. The A/B test sample-size calculator gives the required volume per arm for any combination of baseline rate and minimum detectable effect (MDE).
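As a minimal sketch of the arithmetic behind such a calculator, the snippet below assumes a standard two-sided, two-proportion z-test; the function name `sample_size_per_arm` and the example rates are illustrative, not taken from the tool itself:

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_per_arm(baseline_rate: float, mde: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed per arm for a two-sided two-proportion z-test.

    baseline_rate: control conversion rate, e.g. 0.04 for 4%
    mde: minimum detectable effect as an absolute lift, e.g. 0.01 for +1 pp
    """
    p1 = baseline_rate
    p2 = baseline_rate + mde
    p_bar = (p1 + p2) / 2

    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for 95% confidence
    z_beta = z.inv_cdf(power)           # critical value for 80% power

    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)

# 4% baseline conversion rate, detect an absolute lift of 1 percentage point:
print(sample_size_per_arm(0.04, 0.01))  # roughly 6,700 users per arm
```

Different calculators use slightly different approximations (pooled versus unpooled variance, continuity corrections), so the exact figure can vary by a few percent, but the order of magnitude is what matters when judging whether a list is large enough to test on.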
