clawdie/colibri

Sam & Claude 04370dd869

docs: post-Phase-3 wiki accuracy + task-dispatch-flow page

- model-selection-and-eval: status Design → Phases 1–3 shipped (#264/#280/#285);
  mark Phase 2/3 deliverables, add 3a scope note, fix stale routing-gap row.
- hive-routing: status → partially shipped; scheduler row reflects pick_agent +
  select_model.
- README + index: model-selection row reflects shipped, not "design".
- New task-dispatch-flow.md: the verified queued→claim→spawn→register→dispatch→
  cost chain with code anchors + "why a task stalls" (stale build, not RPC mode,
  registration linkage). Indexed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-28 18:53:09 +02:00

5.6 KiB

Task dispatch flow (queued → processed → cost)

What this is

The end-to-end path a task takes from submission to a running agent and back. This page exists because the chain spans three modules (task-board scheduling, agent-harness spawning, and the daemon poll loop), and it's a recurring source of "why won't the agent pick up my task?" confusion. Every stage below is in main today.

The chain

operator submits        intake-task (socket)            cmd_intake_task
   → task row (queued) ───────────────────────────────► store.create_task
                                                              │
scheduler tick (~30s)   pick best-fit agent              Scheduler::tick
   → claim_task ──────────────────────────────────────► pick_agent + claim_task
                                                              │
autospawn (once)        spawn `zot rpc` (stdin piped)    autospawn_agent_if_configured
   → register agent ───────────────────────────────────► register_agent (name = spawn id)
                                                              │
daemon poll loop        task text → agent stdin          poll_tasks
   → send_prompt ──────────────────────────────────────► rpc_sender().send_prompt(task)
   → status: Started                                          │
                                                              ▼
agent works, emits JSONL → glasspane state → completion → set_task_cost
   → write_task_eval (self-report) + background local eval → push_cost_to_mother → dashboard
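
The status changes along this chain can be read as a small state machine. The following is a minimal sketch, not colibri's actual types: `TaskStatus`, `TaskEvent`, and `advance` are hypothetical names, and `Completed` is an assumed label for the completion step, since this page only names `queued`, `claimed`, and `Started`.

```rust
/// Hypothetical status type. Only `queued`, `claimed`, and `Started`
/// are named on this page; `Completed` is an assumed name.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskStatus {
    Queued,
    Claimed,
    Started,
    Completed,
}

/// Illustrative events that move a task along the chain.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskEvent {
    /// A scheduler tick claimed the task for an agent.
    Claim,
    /// The poll loop wrote the task text to the agent's stdin.
    Dispatch,
    /// The agent finished; cost and eval are recorded after this.
    Complete,
}

/// Returns the next status, or `None` when the event is not valid
/// for the current status (for example, dispatching an unclaimed task).
pub fn advance(status: TaskStatus, event: TaskEvent) -> Option<TaskStatus> {
    match (status, event) {
        (TaskStatus::Queued, TaskEvent::Claim) => Some(TaskStatus::Claimed),
        (TaskStatus::Claimed, TaskEvent::Dispatch) => Some(TaskStatus::Started),
        (TaskStatus::Started, TaskEvent::Complete) => Some(TaskStatus::Completed),
        _ => None,
    }
}
```

Under this reading, a task that stays `Claimed` was picked up by the scheduler but never reached the dispatch step, which is the situation the stall section further down addresses.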

Stages

| Stage | What happens | Code |
| --- | --- | --- |
| Submit | A task row is created in the store with status `queued`. | `cmd_intake_task` |
| Claim | Each tick, the scheduler picks the best-fit agent by capability and claims the task (`queued → claimed`). | `Scheduler::tick` → `pick_agent`, `claim_task` |
| Spawn | Autospawn starts the harness as `zot rpc` (provider `local`), so stdin is piped and an `RpcSender` is available. | `autospawn_agent_if_configured`, `default_agent_args` |
| Register | The spawned agent is registered in the store; its `name` column holds the live spawn-handle id used in `state.agents`. | `register_agent` (store row `name` = spawn id) |
| Dispatch | The poll loop resolves the spawn handle from the claimed task's `agent_id`, gets `rpc_sender()`, and writes the task text to the agent's stdin, transitioning the task to `Started`. | `poll_tasks` → `send_prompt` |
| Process + cost | The agent works and emits JSONL (glasspane). On completion, cost and an eval record are written, then pushed to mother for the dashboard. | `set_task_cost`, `write_task_eval`, `push_cost_to_mother` |
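
The Dispatch stage depends on two lookups that can each fail: finding the live spawn handle by name, and getting a sender from it. The sketch below illustrates that shape with stand-in types. `SpawnHandle`, `DispatchError`, and `dispatch` are hypothetical, the `RpcSender` here only records prompts instead of writing to a real stdin, and the real `poll_tasks` may be structured differently.

```rust
use std::collections::HashMap;

/// Stand-in for the real sender: records prompts instead of writing to stdin.
#[derive(Debug, Default)]
pub struct RpcSender {
    pub sent: Vec<String>,
}

impl RpcSender {
    pub fn send_prompt(&mut self, task: &str) {
        self.sent.push(task.to_string());
    }
}

/// Stand-in for a live spawn handle. `sender` is `None` when the
/// process was started without piped stdin (assumed model).
#[derive(Debug, Default)]
pub struct SpawnHandle {
    pub sender: Option<RpcSender>,
}

impl SpawnHandle {
    pub fn rpc_sender(&mut self) -> Option<&mut RpcSender> {
        self.sender.as_mut()
    }
}

/// Hypothetical error type naming the two ways the lookup can fail.
#[derive(Debug, PartialEq, Eq)]
pub enum DispatchError {
    /// No live handle is stored under the agent row's name.
    UnknownAgent,
    /// The handle exists but has no sender (not in RPC mode).
    NoRpcSender,
}

/// Resolve the handle by the agent row's name, then send the task text.
pub fn dispatch(
    agents: &mut HashMap<String, SpawnHandle>,
    agent_name: &str,
    task_text: &str,
) -> Result<(), DispatchError> {
    let handle = agents.get_mut(agent_name).ok_or(DispatchError::UnknownAgent)?;
    let sender = handle.rpc_sender().ok_or(DispatchError::NoRpcSender)?;
    sender.send_prompt(task_text);
    Ok(())
}
```

Keying the map by the same string stored in the agent row's `name` column is what makes the Register stage matter: if the two values differ, the first lookup fails before any prompt is sent.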

Why a task can stall (and what it is not)

The dispatch logic above is all in main, so a stalled task is almost never caused by missing code. The usual causes, in order:

1. Stale deployed build. The host is running a colibri binary older than the current `poll_tasks` dispatch or agent-registration fixes. Check `git rev-parse HEAD` on the host against `origin/main`; if they differ, reset, rebuild, and restart the daemon.
2. Agent not in RPC mode. If the process is `zot --mode json` (not `zot rpc`), stdin isn't piped, `rpc_sender()` is `None`, and no dispatch happens. Confirm with `ps`.
3. Registration linkage broken. Dispatch needs the store agent row's `name` to equal the live spawn id in `state.agents`. A mismatch (older build) means `poll_tasks` can't find the sender.
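
The checklist above can be expressed as a function that tests the three causes in the same order and reports the first one that applies. This is only a sketch for reasoning about the order of checks: `HostFacts`, `StallCause`, and `first_stall_cause` are hypothetical, the inputs are values an operator would collect by hand, and two simplifications are assumed (any revision difference counts as a stale build, and RPC mode is detected by the second word of the command line being `rpc`).

```rust
/// Hypothetical labels for the three causes listed above.
#[derive(Debug, PartialEq, Eq)]
pub enum StallCause {
    StaleBuild,
    NotRpcMode,
    RegistrationMismatch,
}

/// Facts gathered by hand on the host (all field names are illustrative).
#[derive(Debug, Clone, Copy)]
pub struct HostFacts<'a> {
    /// Output of `git rev-parse HEAD` on the host.
    pub host_head: &'a str,
    /// Revision that `origin/main` points to.
    pub origin_main: &'a str,
    /// Agent command line as shown by `ps`.
    pub agent_cmdline: &'a str,
    /// `name` column of the agent row in the store.
    pub store_agent_name: &'a str,
    /// Spawn ids currently live in the daemon's agent map.
    pub live_spawn_ids: &'a [&'a str],
}

/// Returns the first cause that applies, in the order given on this page,
/// or `None` when none of the three checks fails.
pub fn first_stall_cause(f: &HostFacts<'_>) -> Option<StallCause> {
    // Simplification: any difference from origin/main is treated as stale.
    if f.host_head.trim() != f.origin_main.trim() {
        return Some(StallCause::StaleBuild);
    }
    // Simplification: RPC mode means the second word is exactly "rpc".
    if f.agent_cmdline.split_whitespace().nth(1) != Some("rpc") {
        return Some(StallCause::NotRpcMode);
    }
    // The store row's name must match a live spawn id.
    if !f.live_spawn_ids.contains(&f.store_agent_name) {
        return Some(StallCause::RegistrationMismatch);
    }
    None
}
```

Checking the build first reflects the ordering in the list: a stale binary can also produce the registration mismatch, so ruling it out early avoids chasing a symptom.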

If you're told "merge branch X to enable dispatch," verify X against main first: the chain is already merged, and re-pushing an auto-deleted branch hits the branch recreation hazard.
