layered-soul/docs/MCP-INTEGRATION.md
Sam & Claude a5139b5f7f security(docs): mask Tailscale IPs + bot handles behind fleet.env
Real tailnet IPs and Telegram bot handles were being committed in docs/
memories/skills. Scrubbed all tracked markdown to ${VAR} placeholders; real
values now live in fleet.env (gitignored) and stay live via 'tailscale status'.

- add fleet.env.example (committed) + fleet.env (gitignored); .gitignore *.env
- AGENTS.md + HOST-MATRIX: masking convention so it can't recur
- also: domedog registered as Colibri agent (image-render/ffmpeg/build lane);
  correct CAPABILITY-ROUTING example to real registered caps (domedog headless)

Past commits not rewritten (history moves to Codeberg at v1.0); this fixes HEAD.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 18:19:32 +02:00

# MCP Integration — Colibri as the Agent Coordination Hub
**LIVE VS PLANNED.** The building blocks are all real and in the repos today
(Hermes speaks MCP both directions; `colibri-mcp` exists; the board + poller/worker
loop and the cross-host bridge are live). What is **not yet wired** is the one setup
step this document describes: pointing each Hermes at `colibri-mcp` so the two
instances coordinate through the shared board. Sections are tagged `[LIVE]`,
`[SETUP]` (the work to do), or `[PLANNED]`.

---
## 1. What MCP is — and what "connect two Hermes" actually means
MCP (Model Context Protocol) is a **client → server tool-calling** protocol over
JSON-RPC. A client (an agent's LLM loop) connects to a server that advertises
**tools** and **resources**; the client calls them and gets results. It is **not** a
peer-to-peer chat bus and **not** a message queue.
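To make the client → server shape concrete, here is a minimal sketch of the JSON-RPC 2.0 message a client sends to invoke a tool. The `tools/call` method and its `name`/`arguments` parameters come from the MCP specification, and the tool name is taken from the table in §2. The `status` argument and the helper name `make_tool_call` are illustrative assumptions, not confirmed parts of the Colibri schema.

```python
import json


def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,        # lets the client match the response to this request
        "method": "tools/call",  # MCP method for invoking a server-side tool
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical arguments: the real input schema is whatever the server advertises.
request = make_tool_call(1, "colibri_list_tasks", {"status": "pending"})
print(request)
```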
So "connect two Hermes instances" has two distinct meanings:
- **(a) Tool sharing**: Hermes A invokes Hermes B's *own* tools (B's browser, B's
  files) by treating B as an MCP server. Point-to-point.
- **(b) Coordination**: the two instances hand work back and forth and share state.

Our fleet already does **(b)** through the Colibri board (register-agent → poll →
execute → done; see [`CAPABILITY-ROUTING.md`](./CAPABILITY-ROUTING.md)). The simplest,
highest-leverage MCP move is therefore to make MCP the **in-conversation interface to
that board**, not to wire the two Hermes mouth-to-mouth.
> **Expectation-set:** this gives each Hermes's LLM conversational read/write access to
> the shared board. It is **not** a live two-way chat between the instances. The
> **board is the shared state they meet at** — durable, inspectable, restart-safe.
---
## 2. [LIVE] What already exists
**Hermes is both an MCP server and an MCP client.**

- **Server**: `hermes mcp serve` (`mcp_serve.py`, FastMCP over stdio) exposes Hermes
  tools/conversations to any MCP client. Client config shape:
  ```json
  { "mcpServers": { "hermes": { "command": "hermes", "args": ["mcp", "serve"] } } }
  ```
- **Client**: Hermes consumes external MCP servers from its `mcp_servers` config
  (`hermes_cli/mcp_config.py`, managed via the `hermes mcp` subcommand; loaded by
  `tui_gateway/server.py`; refreshable in-session with `reload-mcp`). Each entry is
  the standard `command` / `args` / `env` (or `url`) shape; presets exist (e.g. Codex).

**Colibri ships a ready-made MCP server fronting the board: `colibri-mcp`.**
- Crate `crates/colibri-mcp` (binary `colibri-mcp`), a stdio JSON-RPC MCP server that
  wraps `colibri-client` and talks to the daemon over its Unix socket.
- Tool surface (`crates/colibri-mcp/src/lib.rs`):

| Tool | Access | Description |
|------|--------|-------------|
| `colibri_status` | read | Daemon status (agents, sessions) |
| `colibri_snapshot` | read | Glasspane snapshot (pane states) |
| `colibri_list_tasks` | read | Tasks by status |
| `colibri_list_skills` | read | Registered skills catalog |
| `colibri_create_task` | write-gated | Create a task |
| `colibri_intake_task` | write-gated | Submit intake task with `required_capabilities` |
| `colibri_set_cost_mode` | write-gated | Switch cost mode (fast/smart/max) |

- Environment (`crates/colibri-mcp/src/main.rs`):
  - `COLIBRI_MCP_SOCKET`: daemon socket path (override)
  - `COLIBRI_DAEMON_SOCKET`: fallback socket path
  - `COLIBRI_MCP_WRITE=1`: enable the write-gated tools
  - `COLIBRI_MCP_EXTERNAL_CONFIG` / `COLIBRI_MCP_EXTERNAL_CALL=1`: proxy external MCP
    servers (see §6)
- Default daemon socket on FreeBSD: `/var/run/colibri/colibri.sock` (from the
  `colibri_daemon` rc.d). `colibri-mcp socket-path` prints the resolved path.
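The socket lookup implied by the environment list above can be sketched as follows. This illustrates the documented precedence only (`COLIBRI_MCP_SOCKET` overrides, `COLIBRI_DAEMON_SOCKET` is the fallback, then the FreeBSD default) and assumes an empty value counts as unset. It is not the code in `main.rs`, and `resolve_socket_path` is a hypothetical name; `colibri-mcp socket-path` remains the authoritative way to see the resolved path.

```python
from typing import Mapping

# Default from the FreeBSD rc.d setup, as documented above.
DEFAULT_SOCKET = "/var/run/colibri/colibri.sock"


def resolve_socket_path(env: Mapping[str, str]) -> str:
    """Return the daemon socket path using the documented precedence."""
    for name in ("COLIBRI_MCP_SOCKET", "COLIBRI_DAEMON_SOCKET"):
        value = env.get(name, "").strip()
        if value:  # assumption: empty or whitespace-only means "not set"
            return value
    return DEFAULT_SOCKET


print(resolve_socket_path({}))
print(resolve_socket_path({"COLIBRI_DAEMON_SOCKET": "/tmp/daemon.sock"}))
print(resolve_socket_path({
    "COLIBRI_MCP_SOCKET": "/tmp/override.sock",
    "COLIBRI_DAEMON_SOCKET": "/tmp/daemon.sock",
}))
```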
**Cross-host reach is already solved**: `colibri-mcp` connects to a daemon socket; a
*remote* daemon is reached via the `socat` bridge on `${OSA_TS_IP}:9190` (Tailscale-only;
see CAPABILITY-ROUTING `[LIVE] Cross-host topology`). osa-local instances just use the
local socket.

---
## 3. Architecture — hub-and-spoke, not mesh
```
Hermes-osa-cli ──MCP──┐                                           ┌──MCP── Hermes-osa-web
                      ▼                                           ▼
                      colibri-mcp (stdio JSON-RPC, one per Hermes)
                      │                                           │
                      └────► colibri-daemon / board (SQLite) ◄────┘
                                            ▲ poller (2 min) / worker (5 min) loop
                                            │ executes tasks assigned by agent UUID
```
Each Hermes configures Colibri **once**. Adding a third agent is one more spoke — no
N×N wiring. The instances never connect to each other directly; they meet at the board.
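The scaling claim is simple arithmetic. Assuming a full mesh in which every instance is configured as a client of every other instance, n instances need n·(n−1) client entries, while the hub needs one `colibri-mcp` entry per instance. The function names below are illustrative.

```python
def mesh_entries(n: int) -> int:
    """Client entries for a full mesh: each instance points at every other one."""
    return n * (n - 1)


def hub_entries(n: int) -> int:
    """Client entries for hub-and-spoke: each instance points at the hub once."""
    return n


for n in (2, 3, 5, 10):
    print(f"{n} instances: mesh={mesh_entries(n)} hub={hub_entries(n)}")
```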
**Flow:** Hermes A's LLM calls `colibri_create_task {required_capabilities:["freebsd"]}`
→ the daemon's scheduler assigns it to a matching agent's UUID → that agent's poll loop
(`scripts/colibri_poll.py`) picks up its own tasks, executes, and marks done
(`scripts/colibri_task_done.py`) → A reads the result with `colibri_list_tasks`.

This layers cleanly on the coordination model already built:

| Layer | Role | Source of truth |
|-------|------|-----------------|
| `colibri-mcp` tools | conversational read/write to the board (this doc) | — |
| poller / worker loop | autonomous execution of assigned tasks | scripts (PR #83) |
| board (SQLite) | shared state: agents, tasks, lifecycle | `colibri-store` |
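The flow in this section can be modeled in a few lines. The sketch below is a toy in-memory board, not the Colibri daemon: the `Board` class, its method names, and the first-match assignment rule are all assumptions made for illustration. It only shows the shape of the lifecycle: submit with `required_capabilities`, assign to an agent UUID, poll for own tasks, mark done, read back.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Task:
    task_id: int
    required_capabilities: List[str]
    assignee: Optional[str] = None  # agent UUID, set by the scheduler
    status: str = "pending"


@dataclass
class Board:
    agents: Dict[str, Set[str]] = field(default_factory=dict)  # UUID -> capabilities
    tasks: List[Task] = field(default_factory=list)

    def register_agent(self, capabilities: Set[str]) -> str:
        agent_id = str(uuid.uuid4())
        self.agents[agent_id] = capabilities
        return agent_id

    def create_task(self, required_capabilities: List[str]) -> Task:
        task = Task(len(self.tasks) + 1, required_capabilities)
        # Assumed scheduling rule: first agent whose capabilities cover the request.
        for agent_id, caps in self.agents.items():
            if set(required_capabilities) <= caps:
                task.assignee, task.status = agent_id, "assigned"
                break
        self.tasks.append(task)
        return task

    def poll(self, agent_id: str) -> List[Task]:
        """An agent only sees open tasks assigned to its own UUID."""
        return [t for t in self.tasks
                if t.assignee == agent_id and t.status == "assigned"]

    def list_tasks(self, status: str) -> List[Task]:
        return [t for t in self.tasks if t.status == status]


board = Board()
cli = board.register_agent({"cli"})
web = board.register_agent({"web-ui", "freebsd"})

task = board.create_task(["freebsd"])  # Hermes A submits work
for mine in board.poll(web):           # the matching agent's poll loop picks it up
    mine.status = "done"               # ...executes, then marks it done

print([t.task_id for t in board.list_tasks("done")])  # prints [1]
print(board.poll(cli))                                 # prints []
```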
---
## 4. [SETUP] Wiring it up (config, not code)
Per Hermes instance on osa:
1. **Provide the binary.** Build or stage `colibri-mcp`:
   ```sh
   cargo build --release -p colibri-mcp   # target/release/colibri-mcp
   ```
   Confirm it reaches the daemon: `colibri-mcp socket-path`.
2. **Register the server** in each Hermes's `mcp_servers` config (via `hermes mcp add`
   or the config file), giving the two instances distinct agent identities:
   ```yaml
   mcp_servers:
     colibri:
       command: /usr/local/bin/colibri-mcp
       env:
         COLIBRI_MCP_SOCKET: /var/run/colibri/colibri.sock
         COLIBRI_MCP_WRITE: "1"   # enable create/intake
   ```
3. **Reload tools**: `reload-mcp` in each Hermes; confirm the `colibri_*` tools appear.
4. **Validate end-to-end**: from cli-Hermes, create a `freebsd` task; confirm
   web-Hermes's loop runs it and the task flips to `done`.
> Keep the two instances on **separate `HERMES_HOME` directories** (a shared `.env` is
> fine, a shared state home is not: single-writer rule). Give them distinguishing
> capability tags if a task must land on a specific one (e.g. `web-ui` vs `cli`).
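Because the two instances differ only in a few values, the `mcp_servers` entry from step 2 can be generated rather than hand-edited. The sketch below is a hypothetical helper (`colibri_entry` is not part of Hermes or Colibri) that emits the same `command`/`env` shape shown in step 2 and adds `COLIBRI_MCP_WRITE` only where write access is wanted. Which instance gets write access is an example choice, and the output is JSON for readability; translating it into the YAML config is left to the operator.

```python
import json


def colibri_entry(socket: str = "/var/run/colibri/colibri.sock",
                  write: bool = False,
                  command: str = "/usr/local/bin/colibri-mcp") -> dict:
    """Build one `mcp_servers` entry for colibri-mcp."""
    env = {"COLIBRI_MCP_SOCKET": socket}
    if write:
        env["COLIBRI_MCP_WRITE"] = "1"  # omitted entirely for read-only instances
    return {"colibri": {"command": command, "env": env}}


# Example choice: cli-Hermes may create tasks, web-Hermes only reads the board.
configs = {
    "cli": {"mcp_servers": colibri_entry(write=True)},
    "web": {"mcp_servers": colibri_entry(write=False)},
}
print(json.dumps(configs, indent=2))
```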
---
## 5. [LIVE] Security
- **Write tools are gated** by `COLIBRI_MCP_WRITE=1`. Leave it unset for read-only
agents; set it only where an instance should create/assign work.
- **Socket, not network.** `colibri-mcp` talks to the daemon's Unix socket; the only
network surface is the bridge, bound to the Tailscale IP with a `pf` rule — never
`0.0.0.0`.
- **License:** `colibri-mcp` is AGPL-3.0-only; keep that in mind for any redistribution.
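The write gate in the first bullet can be pictured as a filter over the tool table from §2. This is a sketch under assumptions: it hides the write-gated tools unless `COLIBRI_MCP_WRITE` is exactly `1`, whereas the real server may instead list them and reject the call. `available_tools` is a hypothetical name.

```python
from typing import List, Mapping

READ_TOOLS = [
    "colibri_status",
    "colibri_snapshot",
    "colibri_list_tasks",
    "colibri_list_skills",
]
WRITE_TOOLS = [
    "colibri_create_task",
    "colibri_intake_task",
    "colibri_set_cost_mode",
]


def available_tools(env: Mapping[str, str]) -> List[str]:
    """List the tools an instance can use under the assumed gating rule."""
    tools = list(READ_TOOLS)
    if env.get("COLIBRI_MCP_WRITE") == "1":  # any other value leaves the gate closed
        tools += WRITE_TOOLS
    return tools


print(available_tools({}))
print(available_tools({"COLIBRI_MCP_WRITE": "1"}))
```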
---
## 6. [PLANNED] Beyond coordination
- **External MCP proxying.** `colibri-mcp` can host *third-party* MCP servers
(`COLIBRI_MCP_EXTERNAL_CONFIG` + `COLIBRI_MCP_EXTERNAL_CALL=1`), jail-wrapped on
FreeBSD (`colibri-mcp` `external.rs` → `colibri_daemon::spawner::jail_wrap`). This lets
the hub aggregate outside tools behind one MCP endpoint, confined per the
capability/isolation model.
- **Tool-sharing mode, i.e. meaning (a) from §1.** If a real need arises for one Hermes
  to call another's *own* tools, expose the target with `hermes mcp serve` and add it as
  a spoke. Prefer the board for coordination; reserve direct tool-sharing for genuine
  capability borrowing, and accept the point-to-point cost.
---
## 7. Rejected alternative: direct Hermes ↔ Hermes mesh
Connecting A's client straight to B's `hermes mcp serve` was considered and **not
chosen** for coordination: it is a mesh (N×N config), stdio transport would have A
*spawn a new* B rather than reach the running one, and it bypasses the board that
already gives us durable, inspectable shared state. The hub, meaning (b) from §1 as laid
out in §3, reuses everything already in place and scales by adding spokes.

---
_See [`CAPABILITY-ROUTING.md`](./CAPABILITY-ROUTING.md) for the routing engine and
cross-host transport, and [`../AGENTS.md`](../AGENTS.md) for the agent matrix._