Capability-Based Task Routing
LIVE VS PLANNED. Colibri's capability matcher exists (Colibri daemon) and works for a single-host daemon/agent pool. Cross-host transport is now LIVE (2026-06-19): a socat bridge over Tailscale exposes osa's daemon, and a poller/worker loop runs assigned tasks across hosts — validated on the debby↔osa lane (colibri PR #83). Full capability-scored routing across all three hosts is maturing on top of this transport. Sections below are labelled [LIVE] or [PLANNED].
Principle: a tool that one OS can't support is not a loss — it's a routing constraint. In a multi-agent, multi-OS fleet we don't force every capability onto every host. We let each host advertise what it can do, let each task declare what it needs, and let the scheduler send the task to a host that qualifies. FreeBSD stays lean; the capability simply lives where it's cheap.
[LIVE] What Colibri already provides (single host)
The matching engine exists today in colibri-daemon — this is working, per-host:
- Agents carry capability tags: `agents.capabilities` (JSON array) in the store (`colibri-store` schema); registered via `colibri` client / `--capabilities`.
- Tasks declare requirements: jobs and intake requests carry `required_capabilities` (`colibri intake-task --capabilities <csv>`).
- The scheduler matches: `pick_agent(required, agents)` scores each idle/active agent with `capability_match_score` and picks the best fit.
- Unmatched = parked, not failed: if requirements are non-empty and no online agent matches, `pick_agent` returns `None`, so the task is created but left unassigned until a capable agent appears.
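The matching flow above can be sketched in a few lines of Python. This is a minimal illustration, not the daemon's code: the `Agent` class, the status strings, and the scoring rule (every required tag must be present, and agents with fewer surplus tags score higher) are assumptions. Only the names `pick_agent` and `capability_match_score` and the park-on-no-match behaviour come from the description above.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass
class Agent:
    # Hypothetical record; the real store keeps capabilities as a JSON array.
    name: str
    status: str  # assumed values: "idle", "active", "offline"
    capabilities: frozenset = field(default_factory=frozenset)


def capability_match_score(required: Iterable[str],
                           capabilities: Iterable[str]) -> Optional[int]:
    """Return None if any required tag is missing, else a score (higher is better)."""
    required, capabilities = set(required), set(capabilities)
    if not required <= capabilities:  # exact string comparison, no hierarchy
        return None
    # Assumed tie-break: fewer unused tags means a tighter fit.
    return -len(capabilities - required)


def pick_agent(required: Iterable[str],
               agents: Iterable[Agent]) -> Optional[Agent]:
    """Pick the best-scoring idle/active agent, or None so the task stays parked."""
    required = list(required)
    best, best_score = None, None
    for agent in agents:
        if agent.status not in ("idle", "active"):
            continue  # offline agents are never candidates
        score = capability_match_score(required, agent.capabilities)
        if score is not None and (best_score is None or score > best_score):
            best, best_score = agent, score
    return best


# Illustrative fleet; the tag sets are abbreviated.
fleet = [
    Agent("debby", "idle", frozenset({"linux", "docker", "gui", "screenshot"})),
    Agent("osa", "idle", frozenset({"freebsd", "freebsd-jail", "zfs", "rust"})),
]
```

With this fleet, `pick_agent(["screenshot"], fleet)` selects debby, while `pick_agent(["gpu"], fleet)` returns `None` and the task would stay parked.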
Note: the daemon itself listens on a local Unix socket only. Cross-host reach is provided by the bridge below, not by the daemon binding a network port directly.
[LIVE] Cross-host topology
Implemented 2026-06-19 (colibri PR #83), using the socat-over-Tailscale approach:
- `socat` bridge (`colibri_bridge` rc.d, daemon(8)-supervised) maps osa's daemon Unix socket to a TCP port on the Tailscale interface only (`100.72.229.63:9190`, never `0.0.0.0`), with a `pf` rule on `tailscale0`. The debby orchestrator reaches it over the tailnet.
- Poller/worker loop: `colibri_poll.py` (filters by agent UUID) and `colibri_task_done.py` (transition-task), driven on the live 2 min / 5 min cadence by Hermes' internal scheduler (see `packaging/freebsd/colibri-agent-loop.md`), not OS cron.
- Validated on the debby↔osa lane (real tasks completed end-to-end). domedog joins via the same bridge pattern.
- Alternative (heavier, not pursued): daemon-to-daemon federation.
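One cycle of the poller/worker loop described above might look like the following. This is a sketch only: the task fields (`id`, `assigned_to`, `state`), the state names, and the `poll_once` helper are hypothetical and do not reflect the actual interfaces of `colibri_poll.py` or `colibri_task_done.py`. A plain list stands in for the board that the real scripts reach through the bridge.

```python
from typing import Callable


def poll_once(board: list[dict], agent_uuid: str,
              run_task: Callable[[dict], None]) -> list[str]:
    """Run every task assigned to this agent and mark it done; return their ids."""
    completed = []
    for task in board:
        # Filter by agent UUID, and skip anything not waiting to be run.
        if task["assigned_to"] != agent_uuid or task["state"] != "assigned":
            continue
        run_task(task)          # do the actual work
        task["state"] = "done"  # stands in for the transition-task call
        completed.append(task["id"])
    return completed


# Hypothetical board contents for illustration.
board = [
    {"id": "t1", "assigned_to": "uuid-osa", "state": "assigned"},
    {"id": "t2", "assigned_to": "uuid-debby", "state": "assigned"},
    {"id": "t3", "assigned_to": "uuid-osa", "state": "done"},
]
```

A scheduler would call `poll_once` on each tick; a second call with nothing newly assigned returns an empty list, so repeated polling is harmless.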
[LIVE] Capability vocabulary (initial)
| Piece | Status | Action |
|---|---|---|
| Capability vocabulary | tags are free-form (rust, python, linux) | Agree a shared tag set (below) |
Flat, explicit tags — the matcher does exact string comparison, no implied hierarchy.
Sourced from the probe and recorded per host in HOST-MATRIX.md.
| Category | Tags |
|---|---|
| OS | linux, freebsd |
| Isolation | docker, freebsd-jail |
| Display | gui, screenshot, wayland |
| Hardware | gpu, zfs |
| Runtime | python3.12, node24, rust, go |
| Media | ffmpeg, pillow/image-render |
Hosts advertise only what they truly have. Example from the current fleet:
- domedog / debby (Linux): `linux`, `docker`, `gui`, `screenshot`, `image-render`, …
- osa (FreeBSD): `freebsd`, `freebsd-jail`, `zfs`, `rust`, … (no `screenshot` / `image-render`)
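Because the matcher compares tags as exact strings, a typo or a variant spelling silently makes a host ineligible. A small check against the shared vocabulary can catch that before a host registers. The sketch below is illustrative: `VOCABULARY` and `unknown_tags` are hypothetical names, and it assumes the Media entry `pillow/image-render` is advertised as `image-render`, as in the host examples.

```python
# Shared tag set from the vocabulary table.
# Assumption: "pillow/image-render" is spelled "image-render" when advertised.
VOCABULARY = frozenset({
    "linux", "freebsd",                     # OS
    "docker", "freebsd-jail",               # Isolation
    "gui", "screenshot", "wayland",         # Display
    "gpu", "zfs",                           # Hardware
    "python3.12", "node24", "rust", "go",   # Runtime
    "ffmpeg", "image-render",               # Media
})


def unknown_tags(advertised):
    """Return advertised tags that are not in the shared vocabulary, sorted."""
    return sorted(set(advertised) - VOCABULARY)
```

For example, `unknown_tags(["Linux", "python3", "zfs"])` flags `Linux` and `python3`: neither would ever match a task that requires `linux` or `python3.12`.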
[DESIGN] Worked example: the tmux-screenshot skill
This illustrates the routing flow (now runnable over the [LIVE] cross-host topology above):
- FreeBSD image drops Pillow and stays lean (`pkg-list` carries only `python312`).
- The skill manifest declares `required_capabilities: ["screenshot"]` (or `image-render`).
- Only Linux hosts advertise `screenshot` (Pillow is trivial there).
- Colibri routes any screenshot task to debby/domedog automatically; if both are offline the task parks until one returns.
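The steps above can be traced with a toy router. This is a sketch under stated assumptions, not Colibri's scheduler: the `route` function, the task dictionary shape, and the alphabetical tie-break are hypothetical, and the host tag sets include only the tags listed explicitly in the fleet example (the source lists are truncated).

```python
# Tags taken from the fleet example; the original lists end in "…".
HOSTS = {
    "debby": {"linux", "docker", "gui", "screenshot", "image-render"},
    "domedog": {"linux", "docker", "gui", "screenshot", "image-render"},
    "osa": {"freebsd", "freebsd-jail", "zfs", "rust"},
}


def route(task: dict, online: set[str]) -> dict:
    """Assign the task to a qualifying online host, or park it (assigned_to=None)."""
    required = set(task["required_capabilities"])
    candidates = sorted(h for h in online if required <= HOSTS[h])
    # Assumed tie-break: first host alphabetically. None means "parked".
    task["assigned_to"] = candidates[0] if candidates else None
    return task


# Hypothetical manifest for the tmux-screenshot skill.
task = {"skill": "tmux-screenshot", "required_capabilities": ["screenshot"]}
```

With only osa online, `route(task, {"osa"})` leaves `assigned_to` as `None`; once domedog returns, `route(task, {"osa", "domedog"})` assigns the task there.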
The capability moved hosts. It was never lost.
See MCP-INTEGRATION.md for connecting agents to the board over
MCP (Colibri as the coordination hub), AGENTS.md for the agent matrix,
HOST-MATRIX.md for per-host facts, and
TOOLCHAIN.md for runtime versions.