Multi-armed bandit

Also known as: bandit test · Thompson sampling

An experimentation approach that dynamically shifts traffic toward the winning variant as results emerge — higher total conversion than traditional A/B tests when lift is real, but weaker statistical certainty about the final result.

Multi-armed bandits (MAB) are an alternative to fixed-allocation A/B testing in which traffic is progressively reweighted toward the currently winning variant. Thompson sampling is the most common flavour: each variant's conversion rate is modelled as a probability distribution, and each new user is randomly assigned in proportion to each variant's probability of being the best. Bandits suit continuous optimisation where you care about total revenue during the test (ramping up the winner sooner means more revenue), and are weaker for decision-making where you need statistical certainty about which variant is best. Use bandits for high-volume ongoing optimisation (homepage hero rotation, send-time tuning); use traditional A/B tests for decision-quality launches (a new product page, a pricing change) where you need confidence in the final call.
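A minimal sketch of Thompson sampling for conversion rates, assuming a Bernoulli reward per user and a uniform Beta prior per variant (the variant names and tallies below are hypothetical):

```python
import random

random.seed(0)  # for a reproducible simulation

def thompson_sample(stats):
    """Pick a variant by Thompson sampling: draw one value from each
    variant's Beta posterior and serve the variant with the largest draw."""
    best, best_draw = None, -1.0
    for name, (conversions, non_conversions) in stats.items():
        # Beta(1 + conversions, 1 + non_conversions): posterior under a uniform prior
        draw = random.betavariate(1 + conversions, 1 + non_conversions)
        if draw > best_draw:
            best, best_draw = name, draw
    return best

# Hypothetical running tallies: (conversions, non-conversions) per variant.
# "B" converts at ~4.5% vs ~3.0% for "A".
stats = {"A": (30, 970), "B": (45, 955)}

# Assign 10,000 simulated users: traffic concentrates on the stronger variant,
# while the weaker one still gets occasional exploratory traffic.
counts = {"A": 0, "B": 0}
for _ in range(10_000):
    counts[thompson_sample(stats)] += 1
```

Because assignment is proportional to each variant's probability of being best, the losing arm is never fully starved: if its posterior later overtakes, traffic shifts back automatically.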
