Phase agents
The phase-agent pipeline decomposes a worker run from one long claude -p oneshot session
into nine short-lived skills the orchestrator dispatches one at a time. Each phase is its own
claude --bg job: it starts with empty context, reads the prior phase’s artifact, does one thing,
emits a phase.<name>.complete.<ticket> event, and exits. The orchestrator wakes on the event and
dispatches the next phase.
The decomposition is the answer to the 2026-06-15 Agent-SDK-Credit billing change. After that
date every claude -p call bills against the Agent-SDK-Credit budget; claude --bg jobs continue
to draw from the Max 20x subscription pool. Moving worker dispatch off -p to --bg keeps a
typical orchestrator run inside the $200/month Max 20x envelope.
:::note[Default vs phase-agents] The orchestrator still ships with dispatchMode: "oneshot-legacy"
as the default — a single long claude -p worker per ticket running
/catalyst-dev:oneshot. Opt into the phase-agent pipeline by
setting catalyst.orchestration.dispatchMode to "phase-agents" in .catalyst/config.json. The
legacy oneshot path is preserved indefinitely for single-shot interactive use and as a fallback. :::
The 9 phases
Section titled “The 9 phases”| # | Skill | Model | Turn cap | Goal artifact | Linear state | Delegates to |
|---|---|---|---|---|---|---|
| 1 | phase-triage | Opus | 10 | triage.json | label triaged | — |
| 2 | phase-research | Opus | 35 | thoughts/shared/research/<date>-<ticket>.md | researching | /catalyst-dev:research-codebase |
| 3 | phase-plan | Opus | 25 | thoughts/shared/plans/<date>-<ticket>.md | planning | /catalyst-dev:create-plan |
| 4 | phase-implement | Opus (see Cost) | 75 | commits + phase-implement.json | inProgress | /catalyst-dev:implement-plan |
| 5 | phase-verify | Opus | 20 | verify.json | verifying | gates + adversarial sub-agents |
| 6 | phase-review | Opus | 25 | review.json + remediation commit | reviewing | /review (gstack) |
| 7 | phase-pr | Opus | 12 | open PR + phase-pr.json | inReview | /catalyst-dev:create-pr |
| 8 | phase-monitor-merge | Opus | 50 | merged PR + phase-monitor-merge.json | done | (lifts oneshot Phase 5 loop) |
| 9 | phase-monitor-deploy | Haiku | 30 | phase-monitor-deploy.json | — | /canary (gstack) |
Default models come from phase-agent-dispatch:51 (Opus) plus the per-phase override in
phase-agent-dispatch:55-66. Resolution order is --model CLI flag >
catalyst.orchestration.phaseAgents.modelOverrides[phase][ticket] >
catalyst.orchestration.phaseAgents.models[phase] > default Opus.
Turn caps follow the same precedence: --turn-cap CLI flag >
catalyst.orchestration.phaseAgents.turnCaps[phase] > the per-phase default above.
1. phase-triage
Section titled “1. phase-triage”Entry phase. Reads the Linear ticket, expands acronyms, classifies it
(feature/bug/docs/refactor/chore), identifies dependencies, estimates scope, writes
${ORCH_DIR}/workers/${TICKET}/triage.json, posts a triaged comment to Linear, and applies the
triaged label. Emits phase.triage.complete.<TICKET>.
The bash body does the deterministic work; Opus is used to refine ambiguous fields.
2. phase-research
Section titled “2. phase-research”Reads triage.json from the prior phase, delegates to
/catalyst-dev:research-codebase, and emits
phase.research.complete.<TICKET> once thoughts/shared/research/<date>-<ticket>.md exists with
the standard frontmatter, Summary, Findings (≥10 file:line references), and References sections.
3. phase-plan
Section titled “3. phase-plan”Reads the research document by glob (thoughts/shared/research/*-<ticket>.md), delegates to
/catalyst-dev:create-plan, and emits phase.plan.complete.<TICKET> once
thoughts/shared/plans/<date>-<ticket>.md exists with Overview plus phased Tests First (Red) →
Implementation (Green) → Refactor → Success Criteria sections.
4. phase-implement
Section titled “4. phase-implement”Reads the plan, delegates to /catalyst-dev:implement-plan via the Task
tool, commits each plan phase as it lands. /goal succeeds when git diff <base>..HEAD is
non-empty AND the targeted tests pass. Emits phase.implement.complete.<TICKET>.
The cost projection assumes Sonnet on this phase — that’s a config flip
(catalyst.orchestration.phaseAgents.models.implement = "sonnet"), not a code change. The shipped
default is Opus; switch it once you have a baseline to compare against. Per-ticket overrides via
modelOverrides.implement.<TICKET> give you an escape hatch when a particularly ambiguous plan
needs Opus.
5. phase-verify
Section titled “5. phase-verify”Read-only adversarial verification. Runs tsc, tests, lint, security scan, reward-hacking scan, plus
the code-reviewer, pr-test-analyzer, and silent-failure-hunter sub-agents. Writes
${ORCH_DIR}/workers/${TICKET}/verify.json with a regression_risk score and findings list. Never
writes application code; test files are the only writable target. Emits
phase.verify.complete.<TICKET>.
This is the independent verification layer that replaces the orchestrator’s adversarial recheck — see Verification and reward-hacking defense.
6. phase-review
Section titled “6. phase-review”Reads verify.json, runs the /review skill (gstack) against the diff, writes
${ORCH_DIR}/workers/${TICKET}/review.json, and creates a remediation commit for any HIGH-severity
finding with a deterministic fix. Emits phase.review.complete.<TICKET>.
Explicitly skips /ultrareview (per the source plan); operator can still run it manually after the
PR opens.
7. phase-pr
Section titled “7. phase-pr”Delegates to /catalyst-dev:create-pr (which already runs describe-pr
and transitions Linear to inReview), then writes PR number + URL into phase-pr.json so the
downstream phase-monitor-merge skips a redundant gh query. Emits phase.pr.complete.<TICKET>.
8. phase-monitor-merge
Section titled “8. phase-monitor-merge”The active listen loop. Lifts the body of oneshot Phase 5 Step 2 verbatim: event-driven wait
on catalyst-events wait-for, inline resolution of CI failures, bot review threads, and BEHIND
rebases, then gh pr merge --squash --delete-branch when the PR reaches CLEAN. Transitions Linear
to done and emits phase.monitor-merge.complete.<TICKET>.
Stalls (CI red after 3 fix attempts, unresolvable conflicts, human changes-requested) write
status: "stalled" and post attention to the comms channel — the orchestrator’s monitor loop
handles dispatch from there.
9. phase-monitor-deploy
Section titled “9. phase-monitor-deploy”Optional. Subscribes via catalyst-events wait-for to deployment_status events on the merge SHA
matching $PHASE_DEPLOY_ENV (default production), then runs the /canary skill (gstack) to
verify the live deployment. Writes phase-monitor-deploy.json and emits
phase.monitor-deploy.complete.<TICKET>, .failed.<TICKET>, or .skipped.<TICKET>. Uses Haiku by
default — most of the work is polling and reading deployment status events.
Skipped automatically when no deploy event arrives within the wall-clock timeout, so projects without deployment hooks don’t block on this phase forever.
Broker phase_lifecycle interest
Section titled “Broker phase_lifecycle interest”The orchestrator subscribes once per ticket via
catalyst-broker’s deterministic phase_lifecycle interest
type:
| Field | Value |
|---|---|
interest_type | phase_lifecycle |
ticket | the Linear ticket ID, e.g. CTL-123 |
phase_names | ["triage","research","plan","implement","verify","review","pr","monitor-merge","monitor-deploy"] |
notify_event | filter.wake.<ORCH_NAME> |
Event pattern matched: phase.<name>.(complete|failed).<TICKET>. Routing is purely deterministic —
no Groq call, no prose evaluation. The broker auto-cleans the interest when the orchestrator emits
agent.checkout.
Dispatch flow
Section titled “Dispatch flow”Orchestrator phase-agent-dispatch Phase agent (claude --bg) │ │ │ ├─ creates worker signal ──────>│ │ │ workers/CTL-N.json │ │ │ ├─ writes phase signal: │ │ │ workers/CTL-N/phase-triage.json │ │ {status: "dispatched"} │ │ ├─ launches claude --bg ───────>│ │ │ /catalyst-dev:phase-triage │ reads triage.json │ │ --orch-dir … │ does its one job │ │ │ writes artifact │ │ │ emits phase.triage.complete.CTL-N │<─── filter.wake.<ORCH_NAME> ──┴──────────────────────────────│ → broker matches, fires wake │ │ exits ├─ orchestrate-phase-advance --completed-phase triage │ │ --ticket CTL-N │ │ → phase_next(triage) = research │ │ → orchestrate-dispatch-next --phase research --ticket CTL-N │ → (loops back to phase-agent-dispatch for next phase) │ ▼Each phase agent is idempotent: if phase-agent-dispatch finds an existing signal with status
of dispatched, running, or done, it exits 0 without re-spawning. Only failed signals are
overwritten.
Configuration
Section titled “Configuration”All phase-agent config lives under catalyst.orchestration in .catalyst/config.json:
{ "catalyst": { "orchestration": { "dispatchMode": "phase-agents", "phaseAgents": { "models": { "implement": "sonnet", "pr": "sonnet", "monitor-deploy": "haiku" }, "modelOverrides": { "implement": { "CTL-501": "opus" } }, "turnCaps": { "implement": 100 } } } }}| Key | Default | Purpose |
|---|---|---|
dispatchMode | "oneshot-legacy" | "phase-agents" enables the pipeline; "oneshot-legacy" keeps a single long claude -p worker per ticket |
phaseAgents.models[phase] | "opus" | Per-phase default model |
phaseAgents.modelOverrides[phase][ticket] | none | Per-phase, per-ticket override (highest precedence after CLI) |
phaseAgents.turnCaps[phase] | see table above | Hard cap on Claude turns per phase |
Resolution order — both for model and turn-cap — is CLI flag > modelOverrides[phase][ticket] >
models[phase] > built-in default.
Cost economics
Section titled “Cost economics”The pipeline is a billing change first and an architecture change second. Two things matter:
- Subscription-pool dispatch. Every phase agent runs as
claude --bg, which draws from the Max 20x subscription pool. The orchestrator itself is the only-pcall on the dispatch path, and it’s a short coordination loop. Per-run Agent-SDK-Credit spend on the worker path: $0. - Per-phase model assignment. The fresh context per phase costs ~$2.25 of “rehydration tax” per run (each phase re-reads prior artifacts). That tax is recouped by assigning Sonnet to Phase 4 (implement) and Phase 7 (pr) and Haiku to Phase 9 (monitor-deploy).
Cost projection (Opus 4.7 = $5/$25 per Mtok in/out; Sonnet 4.6 ≈ $3/$15; Haiku 4.5 ≈ $0.80/$4, sourced from the planning research; per-phase rows are still projection-only — see the validation status below):
| Phase | Model | Est. turns | Cost (mixed) | Cost (all-Opus) |
|---|---|---|---|---|
| 1 Triage | Opus | 5–8 | $0.30 | $0.30 |
| 2 Research | Opus | 25–30 | $2.50 | $2.50 |
| 3 Plan | Opus | 15–20 | $1.50 | $1.50 |
| 4 Implement | Sonnet | 40–60 | $2.70 | $4.50 |
| 5 Verify | Opus | 10–15 | $0.80 | $0.80 |
| 6 Review | Opus | 15–20 | $1.20 | $1.20 |
| 7 PR | Sonnet | 5–8 | $0.18 | $0.30 |
| 8 Monitor-merge | Opus | 10–30 | $0.50 | $0.50 |
| 9 Monitor-deploy | Haiku | 5–15 | $0.03 | $0.20 |
| Total | mixed | ~150 | ~$9.71 | $11.80 |
Measured oneshot-legacy baseline (the calibration point for the table above —
CTL-485, orchestrator
o-ctl-476-...-486, 2026-05-17, n=7, all-Opus 4.7 with the 1M-context flag):
| Statistic | Value (USD) |
|---|---|
| Min / max | $4.48 / $15.13 |
| Median | $8.08 |
| Mean | $7.95 |
| Mean cache-read | 4.78M tokens |
| Mean duration | 555 s (~9 min) |
| Mean turns | 46 |
The all-Opus projection ($11.80) is 48% higher than the measured oneshot-legacy mean. The
documented per-run rehydration tax ($2.25) plus coordinator overhead accounts for most of the
gap. The order-of-magnitude is right; the per-phase split still needs a phase-agents run to
validate row-by-row.
:::note[Validation status (2026-05-18)] End-to-end cost capture for phase-agents mode is
landed (CTL-496). update-dashboard.sh --roll-usage sweeps both workers/<TICKET>.json
(legacy) and workers/<TICKET>/phase-<NAME>.json (phase-agents) on every monitor wake-up via
the CTL-487 wiring, populating signal.cost, state.workers[T].usage (aggregated across phases
with +=), state.usage, and per-phase session_metrics rows.
Phase-mode USAGE is sourced from the bg session conversation JSONL
(~/.claude/projects/<wt>/<sid>.jsonl) by
extract-cost-from-jsonl.sh
because claude --bg does not emit the stream-json result event that legacy roll-usage
parses. The extractor reads per-message usage blocks, splits cache_creation by 5m/1h TTL, and
applies per-model pricing from
plugins/dev/scripts/claude-pricing.json
(version-pinned, manually updated when Anthropic publishes rate changes).
session_metrics attribution is by session_id. Each phase agent starts its own
catalyst-session.sh row with skill_name = phase-<name>, and the prelude persists
catalystSessionId into the phase signal so roll-usage mirrors cost into the right row without
DB heuristics. A ticket_key + skill_name fallback handles in-flight runs that predate the
prelude change.
Open follow-up:
- CTL-488 — per-phase rows in the projection
table still need an end-to-end
dispatchMode: "phase-agents"run to validate row-by-row now that the data path is reliable. This is now a data-collection task, not a wiring gap. :::
Observing a run
Section titled “Observing a run”Real-time:
- HUD —
catalyst-hudshows per-ticket phase, signal status, and (oncesession_metricsis populated) cumulative cost. - Events —
catalyst-events tailstreamsphase.<name>.complete.<TICKET>as each phase lands. - Signal files —
${ORCH_DIR}/workers/<TICKET>/phase-<name>.jsonrecords dispatch state, artifact path, and timing for every phase.
After the run:
# Aggregate cost per workflow (once session_metrics has data)sqlite3 ~/catalyst/catalyst.db -header -column <<'SQL'SELECT s.workflow_id, s.ticket_key, s.skill_name, printf('$%.4f', sm.cost_usd) AS cost, sm.input_tokens, sm.output_tokens, sm.duration_ms / 1000 AS dur_sFROM sessions sLEFT JOIN session_metrics sm ON s.session_id = sm.session_idWHERE s.skill_name LIKE 'phase-%' AND s.workflow_id = '<your-orchestrator-session-id>'ORDER BY s.started_at;SQL# Count completion events emitted by a ticket's full runcatalyst-events tail --since 24h \ | jq -c 'select(.attributes."event.name" | startswith("phase.") and (.attributes."event.name" | endswith(".CTL-N")))' \ | wc -l# Expect 9 (one per phase) for a successful run.End-to-end runbook
Section titled “End-to-end runbook”For the first real run against a low-risk ticket:
- Pick a small ticket (docs fix, small refactor, no external deps).
- Set dispatch mode in
.catalyst/config.json:{ "catalyst": { "orchestration": { "dispatchMode": "phase-agents" } } } - Verify
session_metricspopulates with a single test session before committing to a full run. The orchestrator’s monitor loop callsupdate-dashboard.sh --roll-usageon every wake-up, which sweeps bothworkers/<TICKET>.json(legacy) andworkers/<TICKET>/phase-<NAME>.json(phase-agents) and writes through to signal + state + DB. Ifcost_usdis still0after a phase reachesstatus: "done", check${ORCH_DIR}/.roll-usage.log—bg-state-missingorjsonl-missingaction codes point at a stale or relocated~/.claude/jobs/<bg>/state.json;extract-failedpoints at theextract-cost-from-jsonl.shscript. You can manually flush a single phase withorchestrate-roll-usage.sh --orch <orch-id> --ticket <TICKET> --phase <NAME> -vto validate the data path independently. - Dispatch the orchestrator:
Terminal window /catalyst-dev:orchestrate <TICKET> --auto-merge --max-parallel 1 - Watch the HUD through to merge. Expect 9
phase.<name>.complete.<TICKET>events incatalyst-events tail. - Pull cost data via the SQL query above. Compare per-phase numbers to the projection table; deltas >1.5× warrant tightening the relevant phase’s turn cap or trimming its system prompt.
Related
Section titled “Related”- Workers and signal files — signal-file schema (shared between oneshot and phase-agent workers)
- Verification and reward-hacking defense — the
phase-verifyandphase-reviewadversarial layers - Semantic event routing (catalyst-broker) — protocol for
phase_lifecycleand related interest types - Orchestration overview — Level 2/3 orchestration model and full config reference