# Event architecture
Catalyst’s observability layer has three tiers of durability, and frontends (web UI, terminal UI, scripts) consume a fused stream built from all three.
## The three sources of truth

| Source | Kind | Writer | Lifetime |
|---|---|---|---|
| Worker signal file (`<orch-dir>/workers/<ticket>.json`) | Mutable snapshot | Worker, orchestrator | Archived with the orchestrator dir |
| Global state (`~/catalyst/state.json`) | Mutable snapshot | `catalyst-state.sh worker` / `orchestrator` | Persists across all orchestrations |
| Global event log (`~/catalyst/events.jsonl`) | Append-only | `catalyst-state.sh event` | Never truncated automatically |
Snapshots answer “what is the state right now?” — the event log answers “what happened and when?”. Both matter.
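The distinction is easy to see with a couple of `jq` queries. A minimal sketch with made-up data: the layout mirrors the table above, but the field names are illustrative rather than the real schema (assumes `jq` is installed):

```shell
# Illustrative data only; real signal files and events carry more fields.
dir=$(mktemp -d)

# Snapshot: one mutable document per worker, overwritten in place.
echo '{"ticket":"CTL-48","status":"implementing","phase":3}' > "$dir/CTL-48.json"

# Event log: every transition appended, nothing overwritten.
cat >> "$dir/events.jsonl" <<'EOF'
{"worker":"CTL-48","status":"researching","ts":"2026-04-14T18:00:00Z"}
{"worker":"CTL-48","status":"planning","ts":"2026-04-14T18:20:00Z"}
{"worker":"CTL-48","status":"implementing","ts":"2026-04-14T19:03:01Z"}
EOF

# "What is the state right now?" -> read the snapshot.
jq -r '.status' "$dir/CTL-48.json"        # implementing

# "What happened and when?" -> replay the log, one line per transition.
jq -r '.status' "$dir/events.jsonl"
```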
## Why three, not one

An earlier design used only the signal file. It was insufficient because:
- Workers exit before merge — the subprocess running `/oneshot` reliably terminates at its final tool use, which happens before the PR merge completes. If `pr.mergedAt` lived only in the signal file, it would never be written by the worker itself.
- Multi-orchestrator aggregation — a single signal file describes one worker. The dashboard needs to query across all orchestrators to show “how many waves are active right now?”
- Audit trails — status snapshots overwrite each other. The event log keeps the full history (`researching → planning → implementing → ...`) even when the worker is long gone.
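The aggregation case is the easiest to make concrete. A sketch against a hypothetical `state.json` shape (the real schema may differ; `jq` assumed):

```shell
# Hypothetical fleet snapshot; the real state.json schema may differ.
state=$(mktemp)
cat > "$state" <<'EOF'
{"orchestrators": {
  "orch-a": {"wave": {"status": "active"}},
  "orch-b": {"wave": {"status": "completed"}},
  "orch-c": {"wave": {"status": "active"}}
}}
EOF

# "How many waves are active right now?" answered from one file,
# instead of globbing every orchestrator's signal directory.
jq '[.orchestrators[] | select(.wave.status == "active")] | length' "$state"   # 2
```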
So signal files handle the first layer (per-worker snapshot), global state handles the second (fleet snapshot), and the event log handles the third (append-only history).
## How writes propagate

Using a status transition as the example:
```text
Worker writes ───> Signal file (local snapshot, atomic via tmp+mv)
      │
      └───> catalyst-state.sh worker ───> Global state (atomic via jq+flock)
                                     └──> Event (appended to events.jsonl)

orch-monitor ──> fs.watch on signal files ───> Recomputes snapshot
            └──> fs.watch on state.json   ───> Fan out via SSE
            └──> tail -f on events.jsonl  ───> SSE event stream
```

The monitor never writes — it only reads. Remediation (advancing a ticket, re-dispatching a worker) always goes through the skill layer, which in turn goes through `catalyst-state.sh` to maintain the write-ordering invariants.
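Both atomicity mechanisms in the diagram can be sketched in a few lines of shell. This is illustrative, not the actual `catalyst-state.sh` implementation; `flock(1)` comes from util-linux and may be absent on stock macOS:

```shell
dir=$(mktemp -d)
signal="$dir/CTL-48.json"
state="$dir/state.json"
echo '{}' > "$state"

# tmp+mv: write the new snapshot to a temp file, then rename(2) it into
# place. Readers never observe a half-written signal file.
printf '%s\n' '{"ticket":"CTL-48","status":"implementing"}' > "$signal.tmp"
mv "$signal.tmp" "$signal"

# jq+flock: serialize the read-modify-write on shared global state so two
# concurrent writers cannot clobber each other's updates.
(
  flock 9
  jq '.workers["CTL-48"].status = "implementing"' "$state" > "$state.tmp" \
    && mv "$state.tmp" "$state"
) 9> "$state.lock"

jq -r '.workers["CTL-48"].status' "$state"   # implementing
```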
## SSE event stream

The orch-monitor exposes `GET /events` as a Server-Sent Events stream. Events are JSON objects following the same schema as `events.jsonl`:
```text
event: worker-update
data: {"orchestrator":"orch-...","worker":"CTL-48","status":"implementing","phase":3,"ts":"2026-04-14T19:03:01Z"}

event: pr-update
data: {"orchestrator":"orch-...","worker":"CTL-48","pr":123,"ciStatus":"passing","ts":"2026-04-14T19:20:44Z"}

event: liveness-change
data: {"orchestrator":"orch-...","worker":"CTL-48","alive":false,"pid":63709,"ts":"2026-04-14T19:22:00Z"}

event: snapshot
data: {"orchestrators":[...],"generatedAt":"2026-04-14T19:22:00Z"}
```

## Event types the monitor emits
| Event | Source | When |
|---|---|---|
| `snapshot` | Generated by monitor | On connect, every 60s, and on any state change |
| `worker-update` | Signal file change | Worker writes a new status or phase |
| `pr-update` | GitHub poll (every 30s) | PR state or CI status changed |
| `liveness-change` | PID check (every 5s) | A worker’s PID stopped responding |
| `attention-raised` | Global state change | Orchestrator added an attention item |
| `wave-completed` | Event log tail | Orchestrator emitted `wave-completed` |
Clients (web UI, terminal UI, custom dashboards) subscribe once and render incrementally. No polling.
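For clients without an SSE library, the framing is simple enough to parse by hand. A simplified sketch: it assumes one `data:` line per event, which matches what the monitor emits above, whereas a spec-complete parser would also handle `id:`, `retry:`, and multi-line data:

```shell
# Parse an SSE stream on stdin into "type<TAB>json" lines.
parse_sse() {
  local type=""
  while IFS= read -r line; do
    case "$line" in
      "event: "*) type="${line#event: }" ;;
      "data: "*)  printf '%s\t%s\n' "$type" "${line#data: }" ;;
      "")         type="" ;;   # blank line ends the event
    esac
  done
}

# Example: feed one framed event through the parser.
printf 'event: worker-update\ndata: {"worker":"CTL-48","status":"implementing"}\n\n' \
  | parse_sse
```

Piping `curl -N http://localhost:7400/events` into `parse_sse` yields one line per event, ready for `cut` or `jq`.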
## Connecting your own frontend

Any SSE-capable client works. Example in Node:
```js
import { EventSource } from 'eventsource';

const es = new EventSource('http://localhost:7400/events');

es.addEventListener('worker-update', (e) => {
  const { worker, status } = JSON.parse(e.data);
  console.log(`${worker}: ${status}`);
});

es.addEventListener('pr-update', (e) => {
  const { worker, pr, ciStatus } = JSON.parse(e.data);
  if (ciStatus === 'failing') notifySlack(`${worker} PR #${pr} CI failed`);
});
```

Or in Bash, for quick ad-hoc piping:
```sh
curl -N http://localhost:7400/events \
  | grep -E '^event: (worker-update|pr-update)' -A 1 \
  | grep ^data:
```

## Backpressure and reconnection
If a client disconnects (network blip, process restart), reconnection is automatic: the browser or client library retries per the SSE spec. On reconnect the monitor immediately sends a fresh `snapshot` event so the client can reconcile any missed updates.
The event log (`events.jsonl`) is the durable fallback: if a client missed events while disconnected, it can replay from the log by its last-seen timestamp:
```sh
awk -F'"ts":' '$2 > "\"2026-04-14T19:00:00Z\"" {print}' ~/catalyst/events.jsonl
```

## Why not a real event bus?
File-based append-only logs and filesystem watches are intentionally boring. They:
- Require no additional process (no Redis, no Kafka)
- Survive monitor restarts (`events.jsonl` is the source of truth)
- Are debuggable with `cat`, `tail`, and `jq`
- Work offline
The cost is that you’re limited to one machine — if you need multi-host aggregation, pipe `events.jsonl` into your regular log-shipping stack (Vector, Fluent Bit, whatever you already run). The schema is stable.