# Capability-Based Task Routing
**LIVE VS PLANNED.** Colibri's capability matcher exists (Colibri daemon) and works for a single-host daemon/agent pool. **Cross-host transport is now LIVE** (2026-06-19): a `socat` bridge over Tailscale exposes osa's daemon, and a poller/worker loop runs assigned tasks across hosts — validated on the debby↔osa lane (colibri PR #83). Full capability-scored routing across all three hosts is maturing on top of this transport. Sections below are labelled `[LIVE]` or `[PLANNED]`.

**Principle: a tool that one OS can't support is not a loss — it's a routing
constraint.** In a multi-agent, multi-OS fleet we don't force every capability onto
every host. We let each host advertise what it can do, let each task declare what it
needs, and let the scheduler send the task to a host that qualifies. FreeBSD stays lean;
the capability simply lives where it's cheap.
## [LIVE] What Colibri already provides (single host)
The matching engine exists today in `colibri-daemon` — this is working, per-host:

- **Agents carry capability tags** — `agents.capabilities` (JSON array) in the store
  (`colibri-store` schema); registered via `colibri` client / `--capabilities`.
- **Tasks declare requirements** — jobs and intake requests carry `required_capabilities`
  (`colibri intake-task --capabilities <csv>`).
- **The scheduler matches** — `pick_agent(required, agents)` scores each idle/active agent
  with `capability_match_score` and picks the best fit.
- **Unmatched = parked, not failed** — if requirements are non-empty and no online agent
  matches, `pick_agent` returns `None`: the task is created but left **unassigned until a
  capable agent appears**.
> **Note:** the daemon itself listens on a **local Unix socket only**. Cross-host reach is
> provided by the bridge below, not by the daemon binding a network port directly.
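
The matching behaviour described above can be sketched in a few lines of Python. This is an illustration only, not the daemon's code: the `Agent` class, the status strings, the "all required tags must be present" rule, and the ranking (idle before active, then fewer surplus tags) are assumptions made for the example. Only the names `pick_agent` and `capability_match_score` and the `None`-means-parked outcome come from the text.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Agent:
    name: str
    status: str  # assumed values: "idle", "active", "offline"
    capabilities: set = field(default_factory=set)


def capability_match_score(required: set, agent: Agent) -> Optional[int]:
    """Return a score, or None when the agent does not qualify.

    Assumed rule: every required tag must be present (exact string match).
    Assumed ranking: idle beats active, then fewer surplus tags wins.
    """
    if agent.status not in ("idle", "active"):
        return None
    if not required <= agent.capabilities:
        return None
    surplus = len(agent.capabilities - required)
    idle_bonus = 1000 if agent.status == "idle" else 0
    return idle_bonus - surplus


def pick_agent(required: set, agents: list) -> Optional[Agent]:
    """Pick the best-scoring qualifying agent; None means the task stays parked."""
    scored = [(capability_match_score(required, a), a) for a in agents]
    qualified = [(s, a) for s, a in scored if s is not None]
    if not qualified:
        return None
    return max(qualified, key=lambda pair: pair[0])[1]


agents = [
    Agent("domedog", "idle", {"linux", "python3.12", "ffmpeg", "image-render"}),
    Agent("osa", "idle", {"freebsd", "shell", "pf"}),
]
match = pick_agent({"image-render"}, agents)
parked = pick_agent({"screenshot"}, agents)
```

Here `match` is the domedog agent and `parked` is `None`, which mirrors the "parked, not failed" behaviour: the caller creates the task anyway and leaves it unassigned.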
## [LIVE] Cross-host topology
Implemented 2026-06-19 (colibri PR #83), using the `socat`-over-Tailscale approach:
- **`socat` bridge** (`colibri_bridge` rc.d, daemon(8)-supervised) maps osa's daemon Unix
  socket to a TCP port on the **Tailscale interface only** (`${OSA_TS_IP}:9190`, never
  `0.0.0.0`), with a `pf` rule on `tailscale0`. **osa is the always-on VPS** and hosts the
  board + orchestrator (hermes-osa); agents on debby/domedog reach it over the tailnet. (debby
  is an intermittent laptop — a client, never the hub.)
- **Poller/worker loop** — `colibri_poll.py` (filters by agent UUID) and
  `colibri_task_done.py` (transition-task), driven on the live 2 min / 5 min cadence by
  Hermes' internal scheduler (see `packaging/freebsd/colibri-agent-loop.md`), not OS cron.
- **Validated** on the debby↔osa lane (real tasks completed end-to-end). **domedog joined
  2026-06-19** via the same pattern — a client-side `socat` shim → osa `${OSA_TS_IP}:9190`.
- Alternative (heavier, not pursued): daemon-to-daemon federation.
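
The shape of one poll cycle can be sketched as follows. This is not the content of `colibri_poll.py` or `colibri_task_done.py`: the real scripts talk to the daemon through the bridge, while this sketch works on an in-memory list so it stays runnable. The field names (`id`, `assigned_agent`, `status`), the status strings, and the helper names are all hypothetical.

```python
def tasks_for_agent(tasks, agent_uuid):
    """Keep only tasks assigned to this agent that are still waiting to run."""
    return [
        t for t in tasks
        if t.get("assigned_agent") == agent_uuid and t.get("status") == "assigned"
    ]


def run_once(tasks, agent_uuid, handler):
    """One poll cycle: run each matching task and collect its transition.

    `handler` is a caller-supplied function returning True on success.
    The returned (task_id, new_status) pairs stand in for the
    transition-task calls the real worker would make.
    """
    transitions = []
    for task in tasks_for_agent(tasks, agent_uuid):
        ok = handler(task)
        transitions.append((task["id"], "done" if ok else "failed"))
    return transitions


board = [
    {"id": "t1", "assigned_agent": "uuid-domedog", "status": "assigned"},
    {"id": "t2", "assigned_agent": "uuid-debby", "status": "assigned"},
    {"id": "t3", "assigned_agent": "uuid-domedog", "status": "done"},
]
result = run_once(board, "uuid-domedog", handler=lambda task: True)
```

Filtering by agent UUID on the client side means each host only ever runs work the scheduler assigned to it, so the board on osa stays the single source of truth.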
## [LIVE] Capability vocabulary (initial)
| Piece | Status | Action |
| ----- | ------ | ------ |
| Capability vocabulary | tags are free-form (`rust`, `python`, `linux`) | Agree a shared tag set (below) |

Flat, explicit tags — the matcher does exact string comparison, no implied hierarchy.
Sourced from the probe and recorded per host in [`HOST-MATRIX.md`](./HOST-MATRIX.md).

| Category | Tags |
| -------- | ---- |
| OS | `linux`, `freebsd` |
| Isolation | `docker`, `freebsd-jail` |
| Display | `gui`, `screenshot`, `wayland` |
| Hardware | `gpu`, `zfs` |
| Runtime | `python3.12`, `node24`, `rust`, `go` |
| Media | `ffmpeg`, `pillow`/`image-render` |

Hosts advertise only what they truly have. Actual registered agents (2026-06-19):

- **domedog (Linux, headless):** `linux`, `python3.12`, `rust`, `go`, `node`, `ffmpeg`,
  `image-render` — the media/compute lane. **No** `screenshot`/`gui` (headless VM), no `docker`.
- **debby / hermes-debby (Linux):** `linux`, `docker`, `shell`, `gateway`, `hermes`, `tailscale`.
- **osa / hermes-osa (FreeBSD):** `freebsd`, `shell`, `gateway`, `tailscale`, `rc.d`, `pf`,
  `nginx`, `acme`, `hermes` — no `image-render` (Pillow dropped on FreeBSD).
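
Because matching is an exact string comparison, a tag only routes if the task and the host spell it identically. The sketch below loads the tags listed above into a dictionary and checks a few requirements. The helper `qualifying_hosts` and the subset rule (every required tag must be advertised) are assumptions for illustration, and the tag sets are copied from this page rather than read from the store.

```python
# Tags as listed for the registered agents on 2026-06-19.
HOST_TAGS = {
    "domedog": {"linux", "python3.12", "rust", "go", "node", "ffmpeg", "image-render"},
    "debby": {"linux", "docker", "shell", "gateway", "hermes", "tailscale"},
    "osa": {"freebsd", "shell", "gateway", "tailscale", "rc.d", "pf", "nginx",
            "acme", "hermes"},
}


def qualifying_hosts(required):
    """Hosts whose advertised tags contain every required tag (exact match)."""
    wanted = set(required)
    return sorted(host for host, tags in HOST_TAGS.items() if wanted <= tags)


print(qualifying_hosts(["image-render"]))  # ['domedog']
print(qualifying_hosts(["linux"]))         # ['debby', 'domedog']
print(qualifying_hosts(["node24"]))        # []
print(qualifying_hosts(["screenshot"]))    # []
```

The `node24` case is worth noting: the vocabulary table lists `node24`, while domedog is shown advertising `node`. If those are the literal registered strings, a task requiring `node24` would find no host under exact matching, so the tag set and the registrations need to agree character for character.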
## [DESIGN] Worked example: the tmux-screenshot skill
This illustrates the routing flow (now runnable over the [LIVE] cross-host topology above):
1. FreeBSD image drops Pillow — stays lean (`pkg-list` carries only `python312`).
2. The skill manifest declares `required_capabilities: ["image-render"]` (or `screenshot`).
3. Only a Linux host advertises these — today **domedog** carries `image-render`/`ffmpeg`
   (osa dropped Pillow). `screenshot` additionally needs a display, so a *headless* host
   does not qualify for it.
4. Colibri routes the task to a matching host automatically — **proven 2026-06-19: an
   `image-render` task routed to domedog**; with no match it parks until a capable agent appears.

The capability moved hosts. It was never lost.
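
The park-then-route flow in the steps above can be simulated with a toy board. This is a sketch under stated assumptions, not Colibri's scheduler: the `Board` class, its method names, the first-match selection (instead of scoring), and the idea of re-checking parked tasks whenever an agent registers are all hypothetical. The text only says a parked task waits until a capable agent appears.

```python
from collections import deque


class Board:
    """Toy task board: assigns a task if some agent qualifies, else parks it."""

    def __init__(self):
        self.agents = {}       # agent name -> set of advertised tags
        self.parked = deque()  # (task, required tags) waiting for a capable agent
        self.assigned = {}     # task -> agent name

    def _match(self, required):
        # First qualifying agent wins; the real scheduler scores candidates.
        for name, tags in self.agents.items():
            if required <= tags:
                return name
        return None

    def submit(self, task, required):
        required = set(required)
        agent = self._match(required)
        if agent is None:
            self.parked.append((task, required))  # parked, not failed
        else:
            self.assigned[task] = agent
        return agent

    def register(self, name, tags):
        """Add an agent, then retry every parked task (assumed behaviour)."""
        self.agents[name] = set(tags)
        still_parked = deque()
        while self.parked:
            task, required = self.parked.popleft()
            agent = self._match(required)
            if agent is None:
                still_parked.append((task, required))
            else:
                self.assigned[task] = agent
        self.parked = still_parked


board = Board()
board.register("osa", ["freebsd", "shell", "pf"])           # no Pillow here
first = board.submit("tmux-screenshot", ["image-render"])   # None: parked
board.register("domedog", ["linux", "ffmpeg", "image-render"])
second = board.submit("desktop-capture", ["screenshot"])    # None: no display anywhere
```

After the second `register` call, `board.assigned` maps `tmux-screenshot` to `domedog`, while `desktop-capture` stays in `board.parked` because no host advertises `screenshot`.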

_See [`HIVE-ONBOARDING.md`](./HIVE-ONBOARDING.md) for the hive-onboarding vision built on
this routing layer, [`MCP-INTEGRATION.md`](./MCP-INTEGRATION.md) for connecting agents to the
board over MCP, [`AGENTS.md`](../AGENTS.md) for the agent matrix,
[`HOST-MATRIX.md`](./HOST-MATRIX.md) for per-host facts, and
[`TOOLCHAIN.md`](./TOOLCHAIN.md) for runtime versions._