Updated · 11 min read
Building a personal chief-of-staff AI on Claude Routines
Most operators end the week unsure what they actually shipped, what slipped, and which threads went stale. The idea of a chief of staff — someone who reads the room, sequences the day, surfaces what's at risk — is reserved for executives with budgets to fund one. Anthropic's Routines feature changed the math. A working chief-of-staff AI now runs on a single GitHub repo and a handful of Routine entries in claude.ai, with no infrastructure to maintain. This guide is the architecture, the trade-offs, and the rollout sequence.
By Justin Williames
Founder, Orbit · 10+ years in lifecycle marketing
The category, and the gap nothing else fills
A chief of staff sequences your work, surfaces what's at risk before you ask, and holds you to standards you set when you were sharper than you are right now. The personal AI version does the same job, scoped to one operator, run on a schedule.
The category fills a gap nothing else fills cleanly:
| Approach | What it does | What's missing |
|---|---|---|
| Daily digest tool | Lists what happened — emails, calendar items, GitHub commits | No view on what mattered |
| Chat assistant | Answers when you ask | Produces nothing on its own |
| Productivity dashboard | Surfaces metrics continuously | No prose, no recommendation |
| Personal chief-of-staff | Briefs, replies, acts, debriefs with daily coaching | — |
The system you actually want, if you're trying to operate at a senior level without a human one, briefs you before the day starts with the day's through-line and biggest risk; checks in mid-day when something has shifted; debriefs you at the end with a retrospective, a coaching rating, and one specific note for tomorrow; and responds within the hour when you post a question or ask for an action.
Built right, that costs roughly zero infrastructure and lives on a model plan you may already have.
The architecture: brain in git, runtime in claude.ai
The system splits cleanly in two.
The brain is a GitHub repo with markdown files that describe persona, voice, format conventions, and the operator's working context. Editable. Version-controlled. Diffable. The same repo is referenced by every routine, so a single edit to chief-of-staff.md propagates to every brief on the next fire.
The runtime is Anthropic's Routines feature inside claude.ai. Each routine is a Routine entry: a schedule (or API trigger), a model choice, a connector list (Slack, Calendar, Gmail, Notion, GitHub, whatever you have authenticated), and a small Instructions textarea that points Claude at the spec file in the repo. No server, no deployment, no CI pipeline to maintain.
The Instructions textarea is intentionally thin — it says, in effect, "read these files from the repo and execute the spec." Claude reads them fresh on every fire. The implication: you maintain the brain in git, not in the textarea. Voice updates, format changes, new section conventions — all of it ships through commits, not through clicking through Routine entries one by one.
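As a concrete sketch of the split, the brain repo described above can be scaffolded in a few lines. The file names follow the guide's own layout; the scaffold itself is illustrative, not part of the system — the bootstrap prompt later in the guide has Claude Code generate these files with real content:

```python
from pathlib import Path

# Brain-repo layout from the guide; content is written later by Claude Code.
FILES = [
    "persona/voice.md",
    "persona/about-me.md",
    "persona/chief-of-staff.md",
    "persona/context.md",
    "routines/morning_brief.md",
    "routines/midday_pulse.md",
    "routines/evening_debrief.md",
    "routines/weekly_review.md",
    "routines/channel_pulse.md",
    "README.md",
]

def scaffold(root: str) -> list[str]:
    """Create the empty file tree under `root` and return the paths made."""
    created = []
    for rel in FILES:
        path = Path(root) / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
        created.append(rel)
    return created
```

Every routine points at the same tree, which is what makes a single commit propagate everywhere.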
The persona stack — four files, four lifecycles
Four files inside persona/, each with a different lifecycle:
Voice — the canonical voice the assistant speaks in: a blend of postures encoded as rules. The postures in this build: operator-first practicality, POV-first directness, declarative confidence, calm technical confidence over hedge-everything reporting, dry observational humour. Pick the ones you want and give each a named "sounds like / doesn't sound like" example. Never changes.
Identity — the operator (you). Career, expertise, working style, side projects. Survives company moves. Edited only when something material in your career or domain shifts.
Chassis — format rules, the day's rhythm, multi-day continuity, coaching stance, stoicism allowance, what to never do (sycophancy, AI tells, padded prose). Almost never changes.
Context — current employer, output channel, and self-populating sections: standing priorities, people whose mention always surfaces, recurring projects, open threads. Reset on company moves.
The portability win is real: when you change jobs, you edit one file (Context), wipe the self-populating sections, and the system survives. Voice, identity, and chassis don't move. The architecture isn't coupled to one company.
The day's rhythm — five routines, distinct vantages
The system runs on five Routine entries in claude.ai. Each one fires on a schedule and reads the same persona files, but the spec it executes is different.
Morning brief — daily, weekday early. The forward read of the day. Opening paragraph carries the through-line and biggest risk. Sections: The day (calendar with one-line prep notes per meeting), Overnight (signals worth flagging from email and Slack), Pending (what's in flight and at risk), Looking ahead (forward calendar friction), Questions (where the assistant needs your input). Closing call when one imperative dominates the day.
Channel pulse — hourly, 24/7. The interactive layer. Fires every hour, reads only your messages newer than its last post in the output channel, and either replies, takes an action, briefs an upcoming meeting, or stays silent. The single most differentiated piece of the architecture — covered in detail below.
Midday pulse — daily, weekday early afternoon. Drift check against the morning brief. Stays silent on most days; fires when something has materially shifted (cancelled meeting, urgent inbound, a slipping deadline now visible).
Evening debrief — daily, weekday early evening. The retrospective plus a coaching layer: a 1–10 score on a dimension you pick, grounded in real examples from today, with a calibration paragraph, a colour-coded bar chart, and one specific coaching note for tomorrow. The dimension is yours to choose — communication is the example used in this build, but the mechanic supports any dimension where you suspect a blind spot (decisiveness, hill-choice, follow-up integrity, focus discipline, energy management). Reads what got done, what slipped, your meeting notes, your sent emails, and the relevant Slack signal.
Weekly review — Sunday evening. Strategic frame for the week. Communication-rating trend across the five days. Priority-alignment audit (green/amber/red on each standing priority). Week ahead: big rocks, calendar conflicts, deadlines on the horizon.
The five vantages — forward, interactive, course-correction, retrospective, strategic — don't overlap. If a section appears in two routines, one of them is doing the wrong job. Duplication is a design failure; the discipline of keeping each routine's vantage distinct is what makes the system feel coherent rather than noisy.
The interactive layer — channel pulse
A digest tool broadcasts at you on a schedule. A chief of staff that responds within the hour when you ask, takes an action on request, and briefs you before the next meeting starts is a different category.
Channel pulse fires every hour, every day. Three things happen on each fire, in order:
1. Reply or take action on your most recent unanswered message in the output channel. Authorisation is intentionally full-power for a personal channel — the assistant executes whatever you explicitly ask, using whichever connector applies. "Create a Notion page titled X with content from Anna's email," "email the team that I'm running 15 min late," "summarise yesterday's 1:1 notes" — all execute directly. After any action, the assistant posts a short confirmation reply describing what was done with a link or identifier.
2. Brief upcoming meetings starting in the next 60 minutes that haven't been briefed yet. Format: a single header (📋 MEETING PREP · 14:30 Anna 1:1) followed by last touch with the attendee, what they'll likely raise, recommended position, and any open threads touching the topic. 6–12 lines. Fires independently of the reply bucket — even on a quiet day, an upcoming 1:1 still gets prep ~30–60 min before it starts.
3. Update open threads in the brain (the Context file). Closes resolved threads, appends new ones from observed signal.
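The 60-minute briefing window in step 2 is easy to pin down precisely. A sketch, where the event fields (`id`, `start`) are illustrative rather than a real connector schema:

```python
from datetime import datetime, timedelta

def meetings_to_brief(events, already_briefed, now=None, window_min=60):
    """Return events starting within the next `window_min` minutes
    that haven't been briefed yet (step 2 of each pulse fire)."""
    now = now or datetime.now()
    horizon = now + timedelta(minutes=window_min)
    return [
        e for e in events
        if now <= e["start"] <= horizon and e["id"] not in already_briefed
    ]
```

Because the check runs hourly and tracks what's already briefed, a 14:30 meeting gets exactly one prep post, somewhere in the 13:30–14:30 window.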
The non-obvious correctness loop: the read scope is "your messages newer than my most recent post in the channel." Once the assistant has acted and posted a confirmation, that confirmation becomes the new "most recent post," and the next fire doesn't re-execute the same request. The reply itself is the dedup mechanism. Without it, the polling architecture would re-fire the same Notion page creation every hour.
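The read-scope rule is worth making precise, since it carries the whole dedup guarantee. A sketch under assumed message fields (`author`, `text`):

```python
def unanswered_messages(channel_history, assistant_id):
    """Read scope for each fire: operator messages newer than the
    assistant's most recent post. The confirmation reply posted after
    any action becomes the new high-water mark, so the next fire never
    re-executes the same request."""
    last_bot_index = -1  # index of the assistant's most recent post
    for i, msg in enumerate(channel_history):
        if msg["author"] == assistant_id:
            last_bot_index = i
    return [
        m for m in channel_history[last_bot_index + 1:]
        if m["author"] != assistant_id
    ]
```

The design choice to note: the confirmation reply isn't just courtesy output — it's the state that makes hourly polling idempotent.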
Skip mechanism. Post "skip pulse" in the channel. The assistant appends an active open thread to the Context file ("channel_pulse skipped through <date>"). Every subsequent fire reads the thread first; if active, exits silently. Resumes when you close the thread or the date passes. The mute is durable across fires without requiring any external state.
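The skip mechanism needs no external state because the Context file is the state. A sketch of the check each fire runs first — the `skip_pulse` thread shape here is an assumption about how the open-thread entry would be parsed:

```python
from datetime import date

def should_stay_silent(open_threads, today=None):
    """Durable mute: an active 'channel_pulse skipped through <date>'
    open thread in the Context file silences every fire until the date
    passes or the thread is closed."""
    today = today or date.today()
    for t in open_threads:
        if t.get("kind") == "skip_pulse" and t.get("open", True):
            if today <= t["through"]:
                return True
    return False
```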
Cost, model choice, and the hybrid strategy
- 5 routines — scheduled, distinct vantages, no overlap.
- 0 servers — runtime lives in claude.ai; nothing to deploy, nothing to maintain.
- 2–3 hrs — end-to-end setup, with Claude Code doing the file-writing via the bootstrap prompt.
The five routines fire at very different rates. The model choice doesn't need to be uniform.
Sonnet 4.6 handles four of the five — morning brief, midday pulse, weekly review, channel pulse. Fast, cost-conscious, indistinguishable from the bigger model on format-heavy work and simple replies.
Opus 4.7 for evening debrief only. The coaching rating and the calibration paragraph are the most voice-fidelity-sensitive output the system produces — distinguishing 6/10 from 7/10 reliably across days, calling out a real failure with confidence rather than hedging. Worth the premium on the one routine where calibration matters most.
On a metered Claude API plan: roughly $20–40/month for the daily routines, with channel pulse the variable line item depending on how much you ask of it. On a Claude Max subscription: bundled. The cost gate isn't the variable; the rate-limit gate matters more if the routines are firing constantly and you're also using claude.ai chat heavily.
Setup, end-to-end
The full rollout takes a few hours, max. Claude Code does the heavy file-writing via the bootstrap prompt below; the rest is review, commit, and clicking through Routine entries in claude.ai. Don't hand-write the persona and routine specs — the architecture has enough surface area that a manual first pass loses to an interview-driven one.
1. Create the GitHub repo. Private, empty, named however you want. The folder structure (persona/, routines/, README.md) gets generated by Claude Code in the next step.
2. Open Claude Code at the repo root in a terminal. Paste the bootstrap prompt from the section below. Answer the five customisation questions thoughtfully — Claude Code is told to push back on vague answers, but specific answers are what produce the version you'll actually use. Voice, identity, output channel, coaching dimension, connectors. About 20 minutes if you're ready with the answers.
3. Review what Claude Code generated. Walk the persona files first — voice, identity, chassis, context. These are the brain; everything else reads them. Edit anything that doesn't sound like you. The routine specs and .txt setup files come from the chassis, so getting the chassis right pays off everywhere else.
4. Commit and push to GitHub. Claude Code can run the git commands directly from the same session if you ask it to.
5. Authorise the GitHub App on the repo. claude.ai needs read access; the assistant also writes back to the Context file, so write access is needed too — scoped to this repo only. Set the GitHub App's repository access to include it.
6. Create the Routine entries in claude.ai → Agents → Routines → Create. One per routine. Open the corresponding .txt file from the repo and copy field-by-field: name, repository, trigger, cron expression in UTC, model, environment, connectors. Paste the Instructions textarea content exactly. Five Routines = roughly 25 minutes of clicking.
7. Smoke-test each one with a manual run from the Routine UI. Read the output in your channel. If voice or format drifts, edit the chassis file, push, manual-run again. The textarea doesn't need re-pasting — Claude reads the chassis file fresh on every fire.
8. Configure your output channel. Mobile notifications on so the morning brief lands somewhere you'll see it. The first scheduled fire tomorrow morning is the real validation.
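The "cron expression in UTC" field in step 6 is the one value people get wrong, because the local-to-UTC mapping shifts with daylight saving. The generated .txt files already contain the mapped values; a sketch like this is only useful for sanity-checking them (weekday-only schedule assumed):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def local_to_utc_cron(hour, minute, tz_name, on=None):
    """Map a local fire time to the UTC minute/hour fields a cron
    schedule expects. `on` is any date in the period the cron will run;
    DST means the mapping can differ between summer and winter."""
    on = on or datetime.now(ZoneInfo(tz_name)).date()
    local = datetime(on.year, on.month, on.day, hour, minute,
                     tzinfo=ZoneInfo(tz_name))
    utc = local.astimezone(ZoneInfo("UTC"))
    return f"{utc.minute} {utc.hour} * * 1-5"  # Mon–Fri cron fields
```

For a 06:30 morning brief in London, the UTC hour is 6 in winter and 5 in summer — pick the mapping for the season you're in and expect to revisit it twice a year.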
Outcomes after two weeks
Four outcomes that show up consistently after two weeks of running the full system:
- Less context-switching. The morning brief is the first thing you read. Walking into a 1:1 at 14:30, the meeting prep post landed at 14:00 with everything you need — last touch, likely raise, recommended position. You stop opening seventeen tabs to remember where you left things.
- Faster decision-making in async. Channel pulse means you can post a question in the channel — "does the deploy plan hold? Worried about the migration sequencing" — and have a stress-tested response within an hour. Not real-time, not chat-fast, but fast enough that decisions stop accumulating.
- Honest retrospectives. The coaching layer is the unexpected emergent benefit. Pick the dimension where you have a blind spot — communication, decisiveness, hill-choice, follow-up integrity, focus discipline, whatever — and the evening debrief grades it daily with citations from real Slack and meeting signal, never generic. After a week of being graded on whether you defended a hill that didn't need defending, status-positioned instead of problem-solving, or hedged when a verdict was earned — the patterns become visible. The 6/10 days look different from the 8/10 days, and you start to know which is which in the moment.
- Reduced thread drift. Things you said you'd get back to people on stop falling off the radar — the open threads file tracks them, and the brief surfaces ones overdue past 48 hours. Trust integrity holds without requiring you to maintain a separate task system.
The thing it doesn't do: replace your judgement. The system surfaces, sequences, and drafts. You still decide. The chief-of-staff metaphor is exactly right — they don't do your job; they make it possible for you to do it well.
A copyable prompt to bootstrap your own
If you want Claude Code to walk you through the customisation and generate the repo structure for you, paste the prompt below into a fresh Claude Code session. It asks you the questions that matter — voice, identity, output channel, coaching dimension, available connectors — then produces the file tree with everything wired together.
```
I want you to help me build a personal chief-of-staff AI using Anthropic's
Routines feature, following the architecture documented at
https://get.yourorbit.team/guides/personal-chief-of-staff-claude-routines.
Read that guide first.

The architecture is: brain in a GitHub repo (markdown specs for persona,
chassis, context, and per-routine specs), runtime in claude.ai → Routines
(one Routine entry per spec, each pointing at the repo).

Before generating any files, ask me five questions:

1. Identity — my name, role, employer, timezone, and the domains where I
   have deep expertise. This populates persona/about-me.md.
2. Voice — five reference authors or styles I want the assistant to blend
   (tone only, not their content). Suggest a default mix if I'm unsure,
   and let me swap any of them.
3. Output channel — where the assistant posts (a Slack channel, a Teams
   channel, an email thread, a Notion daily-notes page). The name lives in
   persona/context.md and every routine reads it from there.
4. Coaching dimension — the blind spot the evening debrief grades me on
   daily. Examples: communication, decisiveness, hill-choice, follow-up
   integrity, focus discipline, energy management. Pick one. Don't run
   multiple at once.
5. Connectors — which integrations I have authenticated in claude.ai
   (Slack, Calendar, Gmail, Notion, GitHub, Drive, others). The
   channel_pulse routine uses all of them; the briefs use a narrower
   subset.

Then generate the repo:

- persona/voice.md — blended voice rules with sounds-like /
  doesn't-sound-like examples
- persona/about-me.md — durable identity (career, expertise, working style)
- persona/chief-of-staff.md — the chassis: format conventions, day's
  rhythm, multi-day continuity, coaching stance, the coaching dimension I
  picked, header conventions per routine
- persona/context.md — current employer, output channel, empty
  self-populating sections
- routines/morning_brief.md + .txt — 06:30 weekdays in my timezone
- routines/midday_pulse.md + .txt — 13:05 weekdays
- routines/evening_debrief.md + .txt — 19:05 weekdays
- routines/weekly_review.md + .txt — 18:00 Sunday
- routines/channel_pulse.md + .txt — hourly 24/7
- README.md

Each .txt file holds paste-in-ready Routine settings: Name, Repository,
Trigger, Cron expression in UTC mapped from my timezone, Model,
Environment, Connectors, full Instructions textarea content.

After the files exist, give me an ordered checklist of what to do in
claude.ai: create the Routine entries one by one, copy values from the
.txt files, smoke-test each with a manual run, watch the first morning
brief tomorrow.

Push back on me if my answers are vague. Generic answers produce a generic
system; specific answers produce the version I'll actually use.
```
The prompt is structured around customisation on purpose. The architecture only earns its keep when the voice, the identity, the coaching dimension, and the working context are tuned to one specific operator. A bootstrap that skips the questions produces a working system you won't use; a bootstrap that asks the questions produces one you will.
Trade-offs that are real
This isn't a free lunch. Three trade-offs worth landing before you commit:
It only works in companies that allow third-party AI integrations. The whole architecture depends on Anthropic's connectors authenticating against your work tools — Slack, Calendar, Gmail, Notion. Many enterprises restrict this. If yours does, you have two choices: deploy in your personal account with personal-workspace tools (loses the work signal), or wait for IT approval (timeline unknown). Worth checking before you build.
The interactive layer requires trust. Channel pulse's full-power authorisation means it can email people on your behalf, create Notion pages, edit calendar events. The dedup loop and the post-action confirmation reply prevent duplicates and surface mistakes fast — but a misinterpreted instruction can still produce a wrong email or a wrong page. The system is recoverable but the actions are real.
Voice drift is on you. No automated voice regression test catches subtle drift the way reading tomorrow's morning brief does. The discipline is: when a brief lands wrong — wrong tone, wrong shape, wrong omission — fix the chassis file the same day. The system is only as sharp as the most recent commit.
None of these is a deal-breaker; all three are worth knowing about before you start building.
Frequently asked questions
- Can I run this on a non-Slack output channel?
- Yes — the architecture doesn't depend on Slack specifically. The 'output channel' is whatever the routine posts to via a connector. Microsoft Teams, Discord, Telegram (with a webhook bridge), or even an email thread you reply to all work. The Slack connector is the smoothest because it's first-party, but the persona stack is channel-agnostic. The output channel name lives in one file (Context) and every routine reads it from there.
- How long does it take to set up?
- A few hours, max — and most of that is reviewing what Claude Code generated rather than writing anything yourself. Claude Code chews through the file generation in 10–20 minutes via the bootstrap prompt; the meaningful work is reading the persona files (voice, identity, chassis) and tuning anything that doesn't sound like you. Routine setup in claude.ai is another 25–30 minutes of clicking. After two weeks of running it, you'll edit the chassis a handful of times to tune format and voice; after a month, edits become rare.
- Does the assistant remember anything across days?
- Across days, two mechanisms keep memory live. The Context file's Open Threads section — anything in flight, any question outstanding, any followup owed — persists across runs and is read on every fire. The output channel itself is the persistent log; routines read recent channel history at the start of every run, so the conversation is continuous across days. Replies you send in the channel get folded into the next morning's brief.
- What happens when I change jobs?
- Edit one file (Context) — replace current employer, output channel name, and clear the self-populating sections. The voice, identity, and chassis files don't change. Re-authorise the connectors in your new workspace. The whole migration is roughly an hour. The architecture deliberately doesn't couple your personal AI to a specific employer — that's the point of the four-file split.
- Is the daily coaching rating actually accurate?
- First few days, the calibration drifts — every day reads as a 7. After a week of you correcting it in the channel ('that 8 was generous,' 'the 5 was harsh'), it self-tunes. The rating is grounded in real examples from the day's Slack and meeting notes — never generic — so when it scores you a 4 it cites the specific behaviour. Honest 'not enough signal today to rate' beats a fabricated number on days when input is sparse. This applies to any coaching dimension, not just communication; the calibration loop is the mechanic.
- Should I add multiple coaching dimensions at once?
- Possible but a bad idea early on. Pick one dimension, run for a month, watch the patterns surface. Adding three at once produces three diluted ratings instead of one sharp one — and the daily brief starts feeling like a performance review rather than a useful read. Once one dimension has stabilised and you've internalised the patterns, swap or add. The chassis file is the only edit needed; the routine spec doesn't change.
- Can I run this without a coding background?
- With caveats, yes. The persona files and routine specs are markdown — plain prose. The .txt files for routine setup are plain text with table-style settings. No code is involved. The one prerequisite that trips up non-technical operators is comfort with GitHub: creating a repo, committing files, granting an app access. If those words mean nothing, partner with someone for the initial setup; once the repo exists, day-to-day editing is just text in markdown.