Capstone: Design and Ship a Roster
Pick a real recurring task in your work. Design a roster. Run it. Measure it. Ship it.
The Course 3 capstone is the proof. Pick a recurring task in your team's work that's currently human-driven and tedious — triage, dependency bumps, doc updates, dead-code sweeps. Design the roster, build it, run it for a week, measure the result.
Pick the task
Real, recurring, currently human. Not a new green-field idea — something you'd pay an intern to do.
I'm rostering: <one sentence>. Current pain: <one sentence>. Done when: <one observable outcome>.
Verify: Three-sentence brief. If it doesn't fit, the task is too vague.
Design the roles
Use the four-role default first. Add specialists only if the task obviously demands one.
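A roster spec can be checked against the four-role default mechanically before anyone reviews it. The following is a minimal sketch, assuming a spec shaped like the roles.json in this step (each role name mapped to a `model` and a `timeout_s`). The function name `validate_roster` and the specific checks are illustrative assumptions, not part of any tool.

```python
import json

# The four default roles named in this lesson.
DEFAULT_ROLES = ("planner", "builder", "reviewer", "deployer")


def validate_roster(spec: dict) -> list[str]:
    """Return a list of problems found in a roster spec (empty if none)."""
    problems = []
    for role in DEFAULT_ROLES:
        if role not in spec:
            problems.append(f"missing default role: {role}")
    for role, cfg in spec.items():
        if role not in DEFAULT_ROLES:
            # Specialists are allowed, but should be a deliberate choice.
            problems.append(f"extra role (justify or remove): {role}")
        if not isinstance(cfg, dict):
            problems.append(f"{role}: config must be an object")
            continue
        if not cfg.get("model"):
            problems.append(f"{role}: no model set")
        timeout = cfg.get("timeout_s")
        if not isinstance(timeout, int) or timeout <= 0:
            problems.append(f"{role}: timeout_s must be a positive integer")
    return problems


if __name__ == "__main__":
    # Inline copy of a spec for demonstration; in practice, read the file.
    spec = json.loads("""
    {
      "planner":  { "model": "gemini-pro",   "timeout_s": 300 },
      "builder":  { "model": "gemini-flash", "timeout_s": 900 },
      "reviewer": { "model": "gemini-pro",   "timeout_s": 300 },
      "deployer": { "model": "gemini-flash", "timeout_s": 600 }
    }
    """)
    for problem in validate_roster(spec):
        print(problem)
```

An extra role is reported rather than rejected outright, so adding a specialist stays possible but has to be a visible decision.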
// roles.json
{
  "planner":  { "model": "gemini-pro",   "timeout_s": 300 },
  "builder":  { "model": "gemini-flash", "timeout_s": 900 },
  "reviewer": { "model": "gemini-pro",   "timeout_s": 300 },
  "deployer": { "model": "gemini-flash", "timeout_s": 600 }
}
Verify: Roster spec checked into your repo so it's reviewable like any other config.
Run for a week
Don't tweak mid-week. Let one week of runs accumulate, then evaluate the data.
antigravity run schedule --cron '0 2 * * *' pipeline.json
Verify: Nightly runs land for seven consecutive days; logs and traces preserved.
Measure
Three numbers: tasks completed, human rework rate, total cost. Compare to the human baseline.
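The three numbers and the baseline comparison reduce to a few lines of arithmetic. Here is a minimal sketch, using the example figures from the postmortem template in this step. The function names and the returned keys are illustrative assumptions, and the verdict is deliberately left to a human.

```python
def rework_rate(reworked: int, completed: int) -> float:
    """Fraction of completed tasks that needed human rework after merge."""
    return reworked / completed if completed else 0.0


def summarize_week(
    completed: int,
    reworked: int,
    cost_usd: float,
    baseline_completed: int,
    baseline_rework_rate: float,
) -> dict:
    """Collect the three headline numbers plus deltas against the baseline."""
    rate = rework_rate(reworked, completed)
    return {
        "tasks_completed": completed,
        "rework_rate": rate,
        "total_cost_usd": cost_usd,
        "cost_per_task_usd": cost_usd / completed if completed else 0.0,
        # Positive means the roster completed more tasks than the humans did.
        "tasks_vs_baseline": completed - baseline_completed,
        # Negative means less rework than the human baseline.
        "rework_vs_baseline": rate - baseline_rework_rate,
    }


if __name__ == "__main__":
    week = summarize_week(
        completed=47,
        reworked=6,
        cost_usd=38.20,
        baseline_completed=41,
        baseline_rework_rate=0.18,
    )
    print(f"rework rate:   {week['rework_rate']:.1%}")
    print(f"cost per task: ${week['cost_per_task_usd']:.2f}")
    print(f"tasks vs baseline:  {week['tasks_vs_baseline']:+d}")
    print(f"rework vs baseline: {week['rework_vs_baseline']:+.1%}")
```

The human baseline is recorded in hours rather than dollars, so a direct cost comparison needs an hourly rate that only your team can supply.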
# Postmortem template
# - Tasks completed: 47
# - Human rework after merge: 6 (12.8%)
# - Total cost: $38.20
# - Human baseline (1 week prior): 41 tasks, 18% rework, ~14 hours
# - Verdict: keep / tune / kill
Verify: Postmortem written. Verdict reached. If keep, document the role configs; if tune, list the specific changes; if kill, document why.
Ship it
If the verdict is keep, the roster joins your team's permanent fleet. Add it to your runbook and your on-call rotation.
# In your team's runbook:
# - Roster: nightly-triage
# - Owner: @alice
# - Failure escalation: page on 3 consecutive cancelled runs
# - Cost cap: $80/week
Verify: Runbook entry exists. The roster is now infrastructure, not an experiment.
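The escalation rule and cost cap in a runbook entry like this are simple enough to encode as checks. This is a minimal sketch under stated assumptions: the status string "cancelled", the function names, and the idea that run statuses and per-run costs are available as plain lists are all hypothetical. The rule is read as "the most recent three runs were all cancelled".

```python
def should_page(statuses: list[str], threshold: int = 3) -> bool:
    """True if the most recent `threshold` runs were all cancelled.

    `statuses` is ordered oldest to newest; the "cancelled" label is an
    assumed status string, not one taken from any specific tool.
    """
    if len(statuses) < threshold:
        return False
    return all(status == "cancelled" for status in statuses[-threshold:])


def over_cost_cap(run_costs_usd: list[float], cap_usd: float = 80.0) -> bool:
    """True if the week's summed run costs exceed the weekly cap."""
    return sum(run_costs_usd) > cap_usd


if __name__ == "__main__":
    recent = ["ok", "cancelled", "cancelled", "cancelled"]
    if should_page(recent):
        print("page the owner: 3 consecutive cancelled runs")
    if over_cost_cap([12.0, 15.5, 9.0]):
        print("weekly cost cap exceeded")
```

A single successful run resets the streak, so an isolated cancellation never pages anyone.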