layered-soul/docs/MCP-INTEGRATION.md

MCP Integration — Colibri as the Agent Coordination Hub

LIVE VS PLANNED. The building blocks are all real and in the repos today (Hermes speaks MCP both directions; colibri-mcp exists; the board + poller/worker loop and the cross-host bridge are live). What is not yet wired is the one setup step this document describes: pointing each Hermes at colibri-mcp so the two instances coordinate through the shared board. Sections are tagged [LIVE], [SETUP] (the work to do), or [PLANNED].


1. What MCP is — and what "connect two Hermes" actually means

MCP (Model Context Protocol) is a client → server tool-calling protocol over JSON-RPC. A client (an agent's LLM loop) connects to a server that advertises tools and resources; the client calls them and gets results. It is not a peer-to-peer chat bus and not a message queue.

So "connect two Hermes instances" has two distinct meanings:

  • (a) Tool sharing — Hermes A invokes Hermes B's own tools (B's browser, B's files) by treating B as an MCP server. Point-to-point.
  • (b) Coordination — the two instances hand work back and forth and share state.

Our fleet already does (b) through the Colibri board (register-agent → poll → execute → done; see CAPABILITY-ROUTING.md). The simplest, highest-leverage MCP move is therefore to make MCP the in-conversation interface to that board, not to wire the two Hermes mouth-to-mouth.

Expectation-set: this gives each Hermes's LLM conversational read/write access to the shared board. It is not a live two-way chat between the instances. The board is the shared state they meet at — durable, inspectable, restart-safe.
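To make the client → server model concrete, here is a minimal sketch of the JSON-RPC 2.0 messages an MCP client sends to list and call tools. The `tools/list` and `tools/call` method names come from the MCP specification; the `status` argument passed to `colibri_list_tasks` is an assumption for illustration, since this document does not give the tool's argument schema.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids should be unique per connection


def tools_list() -> dict:
    """Ask the server which tools it advertises."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": "tools/list"}


def tools_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request for one tool invocation."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


listing = tools_list()
# The {"status": "pending"} argument shape is hypothetical.
request = tools_call("colibri_list_tasks", {"status": "pending"})

# Over the stdio transport each message is one line of JSON.
wire = json.dumps(request) + "\n"
```

Every exchange is a request from the client and a response from the server; nothing here lets one server push work to another, which is why coordination goes through the board rather than through MCP itself.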


2. [LIVE] What already exists

Hermes is both an MCP server and an MCP client.

  • Server: hermes mcp serve (mcp_serve.py, FastMCP over stdio) exposes Hermes tools/conversations to any MCP client. Client config shape:
    { "mcpServers": { "hermes": { "command": "hermes", "args": ["mcp", "serve"] } } }
    
  • Client — Hermes consumes external MCP servers from its mcp_servers config (hermes_cli/mcp_config.py, managed via the hermes mcp subcommand; loaded by tui_gateway/server.py; refreshable in-session with reload-mcp). Each entry is the standard command / args / env (or url) shape; presets exist (e.g. Codex).

Colibri ships a ready-made MCP server fronting the board: colibri-mcp.

  • Crate crates/colibri-mcp (binary colibri-mcp), a stdio JSON-RPC MCP server that wraps colibri-client and talks to the daemon over its Unix socket.

  • Tool surface (crates/colibri-mcp/src/lib.rs):

    | Tool | Access | Description |
    |---|---|---|
    | colibri_status | read | Daemon status (agents, sessions) |
    | colibri_snapshot | read | Glasspane snapshot (pane states) |
    | colibri_list_tasks | read | Tasks by status |
    | colibri_list_skills | read | Registered skills catalog |
    | colibri_create_task | write-gated | Create a task |
    | colibri_intake_task | write-gated | Submit intake task with required_capabilities |
    | colibri_set_cost_mode | write-gated | Switch cost mode (fast/smart/max) |

  • Environment (crates/colibri-mcp/src/main.rs):

    • COLIBRI_MCP_SOCKET — daemon socket path (override)
    • COLIBRI_DAEMON_SOCKET — fallback socket path
    • COLIBRI_MCP_WRITE=1 — enable the write-gated tools
    • COLIBRI_MCP_EXTERNAL_CONFIG / COLIBRI_MCP_EXTERNAL_CALL=1 — proxy external MCP servers (see §6)
  • Default daemon socket on FreeBSD: /var/run/colibri/colibri.sock (from the colibri_daemon rc.d). colibri-mcp socket-path prints the resolved path.
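The socket variables above suggest a lookup order. The following is a sketch of that order under stated assumptions, not a port of main.rs: it assumes COLIBRI_MCP_SOCKET wins over COLIBRI_DAEMON_SOCKET, that the FreeBSD default applies when neither is set, and that an empty value counts as unset. `colibri-mcp socket-path` remains the authoritative answer.

```python
import os

# FreeBSD rc.d default, as stated in the text above.
DEFAULT_SOCKET = "/var/run/colibri/colibri.sock"


def resolve_socket(env=None) -> str:
    """Return the daemon socket path using the assumed precedence."""
    env = os.environ if env is None else env
    # Assumed order: explicit override first, then the fallback variable.
    for var in ("COLIBRI_MCP_SOCKET", "COLIBRI_DAEMON_SOCKET"):
        value = env.get(var)
        if value:  # assumption: empty string is treated as unset
            return value
    return DEFAULT_SOCKET


both = {"COLIBRI_MCP_SOCKET": "/tmp/a.sock", "COLIBRI_DAEMON_SOCKET": "/tmp/b.sock"}
print(resolve_socket(both))
print(resolve_socket({"COLIBRI_DAEMON_SOCKET": "/tmp/b.sock"}))
print(resolve_socket({}))
```

Passing a plain dict instead of the real environment keeps the sketch easy to check without touching the host.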

Cross-host reach is already solved. colibri-mcp connects to a daemon socket; a remote daemon is reached via the socat bridge on ${OSA_TS_IP}:9190 (Tailscale-only; see CAPABILITY-ROUTING [LIVE] Cross-host topology). osa-local instances just use the local socket.


3. Architecture — hub-and-spoke, not mesh

   Hermes-osa-cli ──MCP──┐                  ┌──MCP── Hermes-osa-web
                         ▼                  ▼
                     colibri-mcp   (stdio JSON-RPC, one per Hermes)
                         │                  │
                         └────► colibri-daemon / board (SQLite) ◄────┘
                                  ▲  poller (2 min) / worker (5 min) loop
                                  │  executes tasks assigned by agent UUID

Each Hermes configures Colibri once. Adding a third agent is one more spoke — no N×N wiring. The instances never connect to each other directly; they meet at the board.

Flow: Hermes A's LLM calls colibri_create_task {required_capabilities:["freebsd"]} → the daemon's scheduler assigns it to a matching agent's UUID → that agent's poll loop (scripts/colibri_poll.py) picks up its own tasks, executes, and marks done (scripts/colibri_task_done.py) → A reads the result with colibri_list_tasks.

This layers cleanly on the coordination model already built:

| Layer | Role | Source of truth |
|---|---|---|
| colibri-mcp tools | conversational read/write to the board | (this doc) |
| poller / worker loop | autonomous execution of assigned tasks | scripts (PR #83) |
| board (SQLite) | shared state: agents, tasks, lifecycle | colibri-store |
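The flow in this section (create → assign by capability → poll → done → read back) can be sketched as a toy in-memory model. This is an illustration only: the `Board` class, its method names, and the "pending"/"assigned" status strings are hypothetical, and the real scheduler lives in the daemon with state in SQLite.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    title: str
    required_capabilities: frozenset
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    assignee: Optional[str] = None   # agent UUID once scheduled
    status: str = "pending"          # hypothetical status names, except "done"
    result: Optional[str] = None


class Board:
    """Toy stand-in for the shared board; not the daemon's implementation."""

    def __init__(self):
        self.agents = {}   # agent UUID -> set of capability tags
        self.tasks = []

    def register_agent(self, capabilities) -> str:
        agent_id = str(uuid.uuid4())
        self.agents[agent_id] = set(capabilities)
        return agent_id

    def create_task(self, title, required_capabilities) -> Task:
        task = Task(title, frozenset(required_capabilities))
        # Scheduler stand-in: first agent whose capabilities cover the requirement.
        for agent_id, caps in self.agents.items():
            if task.required_capabilities <= caps:
                task.assignee, task.status = agent_id, "assigned"
                break
        self.tasks.append(task)
        return task

    def poll(self, agent_id):
        """What an agent's poll loop sees: only its own open tasks."""
        return [t for t in self.tasks
                if t.assignee == agent_id and t.status == "assigned"]

    def mark_done(self, task, result):
        task.status, task.result = "done", result

    def list_tasks(self, status):
        return [t for t in self.tasks if t.status == status]


board = Board()
cli = board.register_agent({"cli"})
web = board.register_agent({"freebsd", "web-ui"})

task = board.create_task("pkg audit", {"freebsd"})   # Hermes A creates work
for t in board.poll(web):                            # matching agent picks it up
    board.mark_done(t, "ok")
done = board.list_tasks("done")                      # Hermes A reads the result
```

The two agents never reference each other; both only talk to `board`, which is the hub-and-spoke property the diagram shows.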

4. [SETUP] Wiring it up (config, not code)

Per Hermes instance on osa:

  1. Provide the binary. Build or stage colibri-mcp:
    cargo build --release -p colibri-mcp     # target/release/colibri-mcp
    
    Confirm it reaches the daemon: colibri-mcp socket-path.
  2. Register the server in each Hermes's mcp_servers config (via hermes mcp add or the config file), giving the two instances distinct agent identities:
    mcp_servers:
      colibri:
        command: /usr/local/bin/colibri-mcp
        env:
          COLIBRI_MCP_SOCKET: /var/run/colibri/colibri.sock
          COLIBRI_MCP_WRITE: "1"          # enable create/intake
    
  3. Reload tools. Run reload-mcp in each Hermes; confirm the colibri_* tools appear.
  4. Validate end-to-end — from cli-Hermes, create a freebsd task; confirm web-Hermes's loop runs it and the task flips to done.

Keep the two instances on separate HERMES_HOME (shared .env is fine, shared state home is not — single-writer rule). Give them distinguishing capability tags if a task must land on a specific one (e.g. web-ui vs cli).
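The end-to-end check in step 4 amounts to polling the board until the task reads done. A minimal sketch of such a wait loop follows; `wait_for_done` and the `fetch_status` callable are hypothetical helpers (in practice `fetch_status` would wrap a colibri_list_tasks call), and the default timeout is an assumption chosen to exceed the 2-minute poller plus 5-minute worker cadence.

```python
import time


def wait_for_done(fetch_status, task_id, timeout=600.0, interval=15.0,
                  sleep=time.sleep, clock=time.monotonic) -> bool:
    """Poll until the task is 'done' or the timeout elapses.

    fetch_status(task_id) must return the task's current status string.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        if fetch_status(task_id) == "done":
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Scripted stand-in for the board: the task finishes on the third check.
statuses = iter(["pending", "assigned", "done"])
ok = wait_for_done(lambda _id: next(statuses), "task-1", interval=0)
print(ok)
```

A False return means the task did not reach done in time, which usually points at the poll loop or the capability tags rather than at the MCP wiring.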


5. [LIVE] Security

  • Write tools are gated by COLIBRI_MCP_WRITE=1. Leave it unset for read-only agents; set it only where an instance should create/assign work.
  • Socket, not network. colibri-mcp talks to the daemon's Unix socket; the only network surface is the bridge, bound to the Tailscale IP with a pf rule — never 0.0.0.0.
  • License: colibri-mcp is AGPL-3.0-only; keep that in mind for any redistribution.
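The write gate in the first bullet can be sketched as a simple allow-check over the tool names listed in §2. This is an assumed model, not the crate's code: it treats only the exact value "1" as enabling writes, and it does not settle whether the real server hides gated tools or rejects calls to them.

```python
import os

# Tool names and access levels as listed in the §2 tool surface.
READ_TOOLS = {
    "colibri_status", "colibri_snapshot",
    "colibri_list_tasks", "colibri_list_skills",
}
WRITE_TOOLS = {
    "colibri_create_task", "colibri_intake_task", "colibri_set_cost_mode",
}


def write_enabled(env=None) -> bool:
    env = os.environ if env is None else env
    # Assumption: only the literal "1" turns the gate on.
    return env.get("COLIBRI_MCP_WRITE") == "1"


def is_allowed(tool: str, env=None) -> bool:
    """Read tools are always allowed; write tools need the gate; unknown tools never."""
    if tool in READ_TOOLS:
        return True
    if tool in WRITE_TOOLS:
        return write_enabled(env)
    return False


read_only = {}                            # gate unset: read-only agent
writer = {"COLIBRI_MCP_WRITE": "1"}       # gate set: may create/assign work
print(is_allowed("colibri_create_task", read_only))
print(is_allowed("colibri_create_task", writer))
```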

6. [PLANNED] Beyond coordination

  • External MCP proxying. colibri-mcp can host third-party MCP servers (COLIBRI_MCP_EXTERNAL_CONFIG + COLIBRI_MCP_EXTERNAL_CALL=1), jail-wrapped on FreeBSD (colibri-mcp external.rs → colibri_daemon::spawner::jail_wrap). This lets the hub aggregate outside tools behind one MCP endpoint, confined per the capability/isolation model.
  • Tool-sharing mode (meaning (a) in §1). If a real need arises for one Hermes to call another's own tools, expose the target with hermes mcp serve and add it as a spoke. Prefer the board for coordination; reserve direct tool-sharing for genuine capability borrowing, and accept the point-to-point cost.

7. Rejected alternative: direct Hermes ↔ Hermes mesh

Connecting A's client straight to B's hermes mcp serve was considered and not chosen for coordination, for three reasons: it is a mesh (N×N config); stdio transport would have A spawn a new B rather than reach the running one; and it bypasses the board that already gives us durable, inspectable shared state. The hub (meaning (b) in §1, drawn in §3) reuses everything and scales by adding spokes.
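The N×N cost is easy to put numbers on. The sketch below assumes a full mesh in which every instance lists every other instance as an MCP server, against a hub in which every instance lists only colibri-mcp; the function names are illustrative.

```python
def mesh_entries(n: int) -> int:
    """Config entries in a full mesh: each of n instances lists the other n - 1."""
    return n * (n - 1)


def hub_entries(n: int) -> int:
    """Config entries with a hub: each instance lists the hub once."""
    return n


for n in (2, 3, 5):
    print(f"{n} instances: mesh={mesh_entries(n)} hub={hub_entries(n)}")
```

With two instances the counts tie at 2; at three the mesh needs 6 entries against 3, and at five it needs 20 against 5, so the gap opens as soon as a third agent joins.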


See CAPABILITY-ROUTING.md for the routing engine and cross-host transport, and ../AGENTS.md for the agent matrix.