Guides
Experimentation
Measurement is where most lifecycle programs fool themselves. Running tests without sample-size math. Declaring winners from noise. Confusing last-click revenue with incremental revenue. These guides cover the discipline that separates real learning from confirmation theatre.
A lifecycle team that runs 20 A/B tests a year at p = 0.05 should expect, on average, one false-positive winner from pure noise. Most teams don't track how many tests they've run, so the false winners become 'learnings', propagate through the playbook, and quietly underperform. The gap between the claimed lifts and the aggregate program improvement is the tax of undisciplined experimentation.
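The arithmetic is simple enough to sanity-check in a few lines (this sketch assumes every tested change is truly null, i.e. pure noise):

```python
# Expected number of false-positive "winners" from 20 tests at alpha = 0.05,
# assuming none of the tested changes has a real effect.
alpha, n_tests = 0.05, 20

expected_false_winners = n_tests * alpha        # 1.0 winner from pure noise
p_at_least_one = 1 - (1 - alpha) ** n_tests     # ~0.64 chance of at least one

print(f"Expected false winners: {expected_false_winners:.1f}")
print(f"P(at least one false winner): {p_at_least_one:.2f}")
```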
The guides in this category cover the full testing stack. Sample size calculation — the 5-minute math that tells you whether a test can detect the effect you're looking for before you run it. The holdout group pattern — randomly suppressing a small population from a program so you can see its real incremental lift, not just its last-click attributed revenue. A/B testing structure — one primary metric, pre-registered, sized for a realistic effect, read at the end, not during.
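For the sample-size piece, here is a minimal sketch using statsmodels; the 3% baseline rate and 10% relative lift are illustrative placeholders, not recommendations:

```python
# Sample size for a two-arm email test on a conversion rate.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline = 0.03                    # control-arm conversion rate (illustrative)
target = baseline * 1.10           # the smallest lift worth detecting
effect_size = abs(proportion_effectsize(baseline, target))

n_per_arm = NormalIndPower().solve_power(
    effect_size=effect_size,
    alpha=0.05,                    # significance threshold
    power=0.8,                     # chance of detecting the lift if it's real
    alternative="two-sided",
)
print(f"Recipients needed per arm: {n_per_arm:,.0f}")
```

If the number it prints is larger than the audience you can actually send to, the test cannot answer the question you're asking, and that is worth knowing before the send.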
Then the measurement stack. Cohort retention analysis — the one chart that tells you if retention is actually improving, stratified by cohort week or signup channel. Attribution models and which one to use for which question (first-touch for acquisition, last-click for transactional, multi-touch for anything in between, holdout for the honest incrementality answer). Send-time optimisation and the gap between vendor-claimed and measured lift. False-positive prevention and how to spot a 'winning' test that will not replicate.
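As a rough illustration of the cohort chart, the retention matrix is a few lines of pandas. The `events` frame and its column names are stand-ins for your own activity log, one row per user action:

```python
# Cohort retention matrix: distinct active users per cohort week, per week
# since signup, normalised by week-0 cohort size. Add a signup_channel column
# and include it in the groupby to stratify by channel.
import pandas as pd

events = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 2, 3],
    "event_date": pd.to_datetime([
        "2024-01-02", "2024-01-10", "2024-01-03",
        "2024-01-17", "2024-02-01", "2024-01-20",
    ]),
})

first_seen = events.groupby("user_id")["event_date"].transform("min")
events["cohort_week"] = first_seen.dt.to_period("W")
events["weeks_since_signup"] = (events["event_date"] - first_seen).dt.days // 7

active = (events.groupby(["cohort_week", "weeks_since_signup"])["user_id"]
                .nunique()
                .unstack(fill_value=0))
retention = active.div(active[0], axis=0)   # column 0 = cohort size at week 0
print(retention.round(2))
```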
Read these before you run the next test. Running an underpowered test isn't neutral — it spends the audience and produces conclusions that range from useless to actively wrong.
Most email A/B tests produce winners that don't reproduce. This guide covers the three reasons — underpowered samples, the novelty effect, and weak readout discipline — and how to design tests that actually drive decisions.
10 min read
Email is often the first place teams try to price-test, and frequently the place where the wrong lesson gets learned. This guide covers what can genuinely be tested in email, what can't, and the measurement traps that make most email price tests unreliable.
9 min read
Without a holdout, lifecycle ROI is attribution-model guesswork. With one, you get a defensible number you can put in front of finance. Here's how to size, run, and read a holdout group — and the three mistakes that invalidate the result.
9 min read
Attribution debates are half epistemology, half politics. Last-touch is wrong but defensible; multi-touch is more accurate but less defensible; incrementality is most correct but slowest. Here's which model to use for which question, and which one is the table-stakes default for each.
10 min read
A cohort retention curve is the single most useful analytical artifact in lifecycle marketing. It isolates the effect of program changes from compounding base effects, and it's the one view that survives every other metric's limitations. Here's how to build one and read it.
9 min read
Most email A/B tests are powered to detect effects far larger than they can actually produce. Here's the sample size calculation that tells you whether your test will find what you're looking for — before you run it.
8 min read
Every ESP now markets a send-time optimisation feature. They all show flattering internal case studies. The honest version: STO moves open rate 3–8%, not revenue, and only works for certain program types. Here's when it's worth turning on.
7 min read
Run enough A/B tests and some will show 'significant' lift from random noise. Programs that ship every significant winner end up with a collection of imaginary improvements. Here's how to tell real lift from noise and avoid the false-positive trap.
8 min read
Last-click attribution makes lifecycle programs look bigger than they are. Incrementality tests strip out the effect of users who would have converted anyway and reveal the real lift. Here's how to design one that produces a defensible number.
9 min read
A winning A/B test with 4% lift overall might be a 20% win in one segment and a 10% loss in another. Segment-based analysis reveals the real story — and lets you ship winners to the segments that benefit while avoiding users who would be hurt.
8 min read