Send-time optimisation: what it really moves, and what it doesn't
Send-time optimisation promises per-user delivery at the hour each person is most likely to engage. Every major ESP has shipped a version; every vendor deck shows lift. The real-world numbers are meaningfully smaller than the vendor decks, and the effect doesn't always land where programs want it to. Here's what STO actually does, when it helps, and when it's theatre.
Justin Williames
Founder, Orbit · 10+ years in lifecycle marketing
What STO actually does
Send-time optimisation looks at a user's historical open and click behaviour to predict the hour they're most likely to engage, then schedules the send for that hour (or the nearest available window). Different ESPs use different models — some simple (most recent open hour), some more sophisticated (time-of-day × day-of-week cohorts), a few using ML with other features.
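The simplest model described above — "send at the hour this user has opened most often" — can be sketched in a few lines. This is an illustrative assumption, not any particular ESP's implementation; `open_times` and the default hour are made up for the example.

```python
from collections import Counter
from datetime import datetime

DEFAULT_SEND_HOUR = 10  # fallback for users with no open history (an assumption)

def best_send_hour(open_times: list[datetime], default: int = DEFAULT_SEND_HOUR) -> int:
    """Return the user's modal open hour, or a default when there is no history."""
    if not open_times:
        return default
    hour_counts = Counter(t.hour for t in open_times)
    # most_common(1) returns [(hour, count)]; ties break by first-seen order
    return hour_counts.most_common(1)[0][0]

opens = [datetime(2024, 5, d, 9, 30) for d in (1, 2, 3)] + [datetime(2024, 5, 4, 14, 0)]
print(best_send_hour(opens))  # 9 — the user's concentrated morning window
print(best_send_hour([]))     # 10 — no history, fall back to the default
```

The more sophisticated variants mentioned above extend the same idea: bucket by hour × day-of-week instead of hour alone, or replace the counting with a learned model.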
STO moves the time the email lands. It doesn't change the email, the audience, or the offer. The lift is bounded by how much the time of delivery actually affects the user's engagement.
The maximum theoretical effect of STO is bounded by how much worse "random time" is vs "best time". For most users, the answer is: not much. Email sits in the inbox; users check at intervals. A 9am vs 2pm send is read around the same time by a user who checks at lunch. STO adds most value for users with predictable, concentrated engagement windows.
The measured effects
Vendor case studies show 20–40% open rate lift; independent benchmarks (Mailchimp, Litmus, academic studies) put the real lift at 3–8% on open rate, 1–4% on click rate, and typically no significant lift on revenue when measured against a proper holdout.
The gap between vendor claims and measured lift comes down to three things:
Apple MPP inflation. STO algorithms see machine-opens as real opens. Apple devices open all mail immediately after delivery; STO "learning" becomes "send at the time the user's Apple Mail pre-fetches", which is unrelated to real engagement. The reported open lift is often this inflation compounding.
Confounded comparisons. Many STO case studies compare STO-optimised sends to a control group at a different send time, not to a proper random-time holdout from the same population.
Small effect, noisy metric. Email opens are noisy; a 5% real lift is within the natural variance of many single sends.
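One common mitigation for the MPP problem is to discard opens that occur almost immediately after delivery, since those are likely Apple Mail pre-fetches rather than a human reading. A minimal sketch, assuming a 2-minute threshold and `(delivered_at, opened_at)` event tuples, both of which are illustrative choices:

```python
from datetime import datetime, timedelta

MACHINE_OPEN_WINDOW = timedelta(minutes=2)  # assumed threshold, tune per program

def human_opens(events: list[tuple[datetime, datetime]]) -> list[datetime]:
    """Keep only opens that happen well after delivery (likely human)."""
    return [
        opened for delivered, opened in events
        if opened - delivered > MACHINE_OPEN_WINDOW
    ]

events = [
    (datetime(2024, 5, 1, 9, 0), datetime(2024, 5, 1, 9, 0, 15)),  # 15s later: pre-fetch
    (datetime(2024, 5, 1, 9, 0), datetime(2024, 5, 1, 12, 30)),    # 3.5h later: real open
]
print(human_opens(events))  # only the 12:30 open survives
```

Feeding an STO model filtered opens instead of raw opens removes the "learn the pre-fetch time" failure mode, though it also shrinks the training data.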
When STO is worth turning on
Global audiences across time zones. Users in Sydney opening at 9am local and users in New York opening at 9am local require different send times. STO (or simple time-zone-aware sending) prevents one group from getting emails at 3am. Always worth doing.
Broadcast campaigns with no time-sensitivity. Newsletters, content emails, non-promotional broadcasts can spread over several hours without cost. STO gives modest lift at no downside.
Large, diverse audiences. With 500K+ users, individual send-time differences aggregate into measurable effects. STO works better at scale because the algorithm has more data per user.
When STO is worthless or harmful
Time-sensitive sends. A flash sale ending in 4 hours can't wait for each user's preferred time. Send now; STO is the wrong lever.
Triggered sends. Welcome emails, order confirmations, password resets — the trigger is the user action. STO would delay these, which is the opposite of what the user wants.
New users with no history. STO has no data to optimise on. Default to a sensible window and let the data accumulate. Many programs unthinkingly apply STO to welcome emails, producing worse performance than a fixed send time.
Small audiences (<50K users). Per-user optimisation relies on history. For small programs where most users have few data points, STO defaults to the category average, which is roughly the same as just picking a good time.
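The gating rules in this section can be summarised as a small decision function. This is a sketch of the logic the text implies; the field names and the minimum-opens threshold are assumptions, not a standard.

```python
MIN_OPENS_FOR_STO = 5  # assumed minimum history before per-user optimisation

def sto_applies(is_triggered: bool, is_time_sensitive: bool, open_count: int) -> bool:
    """Return True only when per-user send-time optimisation makes sense."""
    if is_triggered or is_time_sensitive:
        return False  # send now; timing is not a free choice here
    # Without enough history, fall back to a fixed send window instead
    return open_count >= MIN_OPENS_FOR_STO

print(sto_applies(is_triggered=True, is_time_sensitive=False, open_count=50))   # False
print(sto_applies(is_triggered=False, is_time_sensitive=False, open_count=2))   # False
print(sto_applies(is_triggered=False, is_time_sensitive=False, open_count=12))  # True
```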
The alternative: simple time-zone-aware sending
For most programs, time-zone-aware sending captures 80% of STO's value without the complexity. Send to every recipient at 10am local time (or whatever default time the program uses). No individual-user optimisation; just respect for the time zone.
It's easy to implement (most ESPs support it natively), it has no dependency on machine-open-inflated engagement data, and the lift vs "everyone at 10am UTC" is usually comparable to what STO achieves.
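If the ESP doesn't support it natively, the scheduling maths is straightforward with the standard library. A minimal sketch, assuming a 10am local default and IANA time-zone names on the recipient record:

```python
from datetime import date, datetime, time
from zoneinfo import ZoneInfo

LOCAL_SEND_TIME = time(10, 0)  # 10am in each recipient's own time zone

def local_send_at(send_date: date, tz_name: str) -> datetime:
    """Return the UTC instant corresponding to 10am local on send_date."""
    local = datetime.combine(send_date, LOCAL_SEND_TIME, tzinfo=ZoneInfo(tz_name))
    return local.astimezone(ZoneInfo("UTC"))

d = date(2024, 5, 1)
print(local_send_at(d, "Australia/Sydney"))  # 10am Sydney = 00:00 UTC (AEST, UTC+10)
print(local_send_at(d, "America/New_York"))  # 10am New York = 14:00 UTC (EDT, UTC-4)
```

Using `zoneinfo` rather than a fixed UTC offset means daylight-saving transitions are handled for free.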
The A/B testing playbook covers how to validate STO vs time-zone sending vs fixed send time for your specific audience.
What to do if your ESP pushes STO
Most modern ESPs charge for STO as a premium feature. The sales pitch uses vendor case studies with the methodology issues above. Before agreeing:
1. Ask for independent validation, not just vendor cases.
2. Ask how they measure lift — against what control?
3. Run a proper random-time holdout test for 30 days before deciding.
4. Compare STO's measured lift to simple time-zone-aware sending (free in most ESPs).
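The holdout readout in step 3 reduces to a two-proportion comparison: open rate in the STO arm vs a random-time holdout drawn from the same population. A sketch using a standard two-proportion z-test; the counts below are illustrative, not real data.

```python
from math import sqrt, erf

def two_prop_z(opens_a: int, n_a: int, opens_b: int, n_b: int) -> tuple[float, float]:
    """Return (relative lift, two-sided p-value) for arm A vs arm B open rates."""
    p_a, p_b = opens_a / n_a, opens_b / n_b
    pooled = (opens_a + opens_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Normal CDF via erf: Phi(z) = 0.5 * (1 + erf(z / sqrt(2)))
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return (p_a / p_b - 1, p_value)

lift, p = two_prop_z(opens_a=21_000, n_a=100_000, opens_b=20_000, n_b=100_000)
print(f"lift={lift:.1%}, p={p:.4f}")  # a 5% relative lift, detectable at this scale
```

Note what the example implies: a 5% relative lift needs six-figure arm sizes to separate cleanly from noise, which is why single-send comparisons on small lists rarely settle the question.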
Often the conclusion: time-zone-aware sending captures most of the lift; STO's marginal addition isn't worth the premium cost. For some programs, the premium is worth it. The discipline is to measure before deciding.
A related guide covers the holdout methodology for validating vendor-marketed features — STO is one of several where the vendor claim diverges meaningfully from the measured effect.
Frequently asked questions
- Does send-time optimisation actually work?
- Yes, modestly — 3–8% open rate lift, 1–4% click rate lift, negligible revenue lift for most programs. Vendor-claimed 20–40% lifts are inflated by machine opens (Apple MPP) and comparison-design issues. Worth using for broadcasts, but don't expect transformative numbers.
- What's the best time to send email?
- Depends on the audience and content. General patterns: 10am local time on Tue/Wed/Thu for B2B; 7pm local time on weekdays for consumer. But the 'best' time for broadcasts is usually within a 2-hour window of 'any reasonable work-hour time' — don't optimise further than that without data from your own audience.
- Should I A/B test send times?
- Yes, but test wider than you think. Testing 9am vs 10am usually shows no significant difference (too similar). Testing 10am vs 6pm or weekday vs weekend can reveal real patterns. Once you've identified the best window (morning vs evening, weekday vs weekend), fine-tuning within that window rarely moves the needle.
- Does STO work for triggered emails?
- No. Triggered emails (welcome, order confirmation, password reset) should send immediately — the trigger is the user action. Delaying them for 'optimal time' is actively worse for the user experience. STO applies to broadcast/campaign sends where timing is a choice.
- What's the difference between STO and time-zone-aware sending?
- Time-zone-aware sending hits every user at the same local time (e.g., 10am local). STO hits each user at their individually-optimised time based on history. Time-zone-aware is free in most ESPs and captures ~80% of the lift; STO adds marginal improvement at premium cost.
- If STO's lift is small, why is it marketed so heavily?
- Three reasons: it's a clearly measurable feature that sounds sophisticated, it's easy to show case studies with selection-biased numbers, and it justifies enterprise-tier pricing. The feature is real and useful; the marketed impact is overstated. Enable it where sensible; don't make strategic decisions based on vendor-claimed lift.
Related guides
Sample size: the calculation everyone gets wrong in email A/B tests
Most email A/B tests are powered to detect effects far larger than they can actually produce. Here's the sample size calculation that tells you whether your test will find what you're looking for — before you run it.
False positives in email A/B tests: why half of winning tests don't actually win
Run enough A/B tests and some will show 'significant' lift from random noise. Programs that ship every significant winner end up with a collection of imaginary improvements. Here's how to tell real lift from noise and avoid the false-positive trap.
Incrementality testing: the measurement that tells you if a program actually works
Last-click attribution makes lifecycle programs look bigger than they are. Incrementality tests strip out the effect of users who would have converted anyway and reveal the real lift. Here's how to design one that produces a defensible number.
Segment-based testing: when your average lift is hiding opposing effects
A winning A/B test with 4% lift overall might be a 20% win in one segment and a 10% loss in another. Segment-based analysis reveals the real story — and lets you ship winners to the segments that benefit while avoiding users who would be hurt.
A/B testing in email: sample size, novelty, and what to report
Most email A/B tests produce winners that don't reproduce. This guide covers the three reasons — under-powered samples, the novelty effect, and weak readout discipline — and how to design tests that actually drive decisions.
Price-testing through email: what's testable, what isn't
Email is often the first place teams try to price-test, and it's often where the wrong lesson gets learned. This guide covers what can genuinely be tested in email, what can't, and the measurement traps that make most email price tests unreliable.