diff --git a/docs/wiki/a2a-complexity-audit.md b/docs/wiki/a2a-complexity-audit.md
new file mode 100644
index 0000000..e81b62f
--- /dev/null
+++ b/docs/wiki/a2a-complexity-audit.md
@@ -0,0 +1,166 @@
+# A2A Complexity Audit
+
+**Question:** Does A2A reduce Colibri's code complexity, or is it additive?
+**Date:** 27.jun.2026
+**Referenced from:** [hive-pane.md](./hive-pane.md), [hive-routing.md](./hive-routing.md)
+
+## Current protocol surface area
+
+Colibri speaks 5 protocols today:
+
+| Protocol | Where | Lines | Purpose |
+|---|---|---|---|
+| **Custom JSON wire** | `crates/colibri-daemon/src/socket.rs` + `crates/colibri-client/src/lib.rs` | 1,981 | Local daemon control (spawn, status, snapshot, tasks, skills) |
+| **MCP JSON-RPC** | `crates/colibri-mcp/src/lib.rs` | 570 | Editor integration + external MCP host |
+| **MCP-over-SSH** | `packaging/mother/` (3 files) | 437 | Mother hive entrypoint (forced-command allowlist + node register) |
+| **JSONL** | `crates/colibri-glasspane/src/lib.rs` | 1,186 | Agent subprocess stdout events |
+| **SQL** | `crates/colibri-store/src/lib.rs` + `crates/colibri-store/src/schema.rs` | 1,150 | Local coordination (tasks, agents, skills, tenants) |
+
+**Total protocol surface: ~5,324 lines.**
+
+---
+
+## What A2A would replace
+
+### 1. Mother MCP-over-SSH bridge → A2A HTTP endpoint
+
+Today's mother entrypoint:
+
+```
+USB node → SSH (authorized_keys forced-command) → colibri-mcp-ssh → colibri-mcp → PostgreSQL
+                                                  └─ node-register-mcp (embedded psql)
+```
+
+With A2A:
+
+```
+USB node → HTTPS → mother A2A endpoint → PostgreSQL
+                   └─ /a2a (task exchange)
+                   └─ /.well-known/agent.json (discovery)
+```
+
+**Removed:**
+- `colibri-mcp-ssh` (32 lines) — SSH forced-command allowlist wrapper
+- `node-register-mcp` (88 lines) — Custom MCP tool with embedded psql
+- SSH key management in `setup-mother.sh` (~40 lines of key distribution logic)
+
+**Removed total: ~160 lines.**
+
+**Added:**
+- A2A HTTP endpoint on mother (~200 lines)
+- A2A client library integration on USB node (~150 lines)
+- mTLS/TLS termination for auth (~30 lines)
+
+**Added total: ~380 lines.**
+
+**Net delta: +220 lines.** Not a code reduction. But operational complexity drops significantly:
+- No SSH key distribution to USB nodes (key lives on seed partition → no longer needed on mother)
+- No forced-command allowlist to maintain
+- Standard HTTPS is easier to firewall, audit, and monitor than SSH forced-command
+- Agent Card URL is discoverable without manual external MCP registry entries
+
+### 2. External MCP server discovery → Agent Card
+
+Today: external MCP registry config — manual JSON listing third-party MCP servers:
+
+```json
+{
+  "servers": [
+    {
+      "name": "filesystem",
+      "command": "npx",
+      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
+      "env": {}
+    }
+  ]
+}
+```
+
+With A2A: third-party tools that speak A2A (not MCP) publish an Agent Card. Colibri discovers them via the well-known Agent Card URL instead of manual JSON config files.
+
+**Reality check:** No third-party tools speak A2A yet. The protocol was just announced (April 2025). MCP has ~2 years of ecosystem maturity. This is a *future* replacement, not a *current* one.
+
+**Verdict:** A2A discovery doesn't reduce code today. External MCP stays for tool access.
+
+### 3. Ad-hoc cost data format → Typed A2A part
+
+Today: cost data is embedded in the daemon's heartbeat logic — unstructured:
+
+```rust
+info!(task_id = %task_id, cost = u.cost(), "task cost captured");
+```
+
+With A2A: cost data is a typed message part (`application/json+cost`). The format is standardized, not ad-hoc.
+
+**Code delta:** ~10 lines (the `info!` log stays; the A2A part is new code).
+
+**Verdict:** Negligible code impact. The value is *interop*, not complexity reduction.
+
+---
+
+## What A2A does NOT replace
+
+| Component | Why A2A doesn't touch it | Lines saved |
+|---|---|---|
+| **Unix socket wire protocol** (`crates/colibri-daemon/src/socket.rs`) | A2A is cross-node HTTP. Local daemon control needs IPC — Unix socket is faster, auth-free (filesystem permissions), and doesn't need a network stack. | 0 |
+| **Spawner** (`crates/colibri-daemon/src/spawner.rs`) | A2A routes tasks to existing agents. Colibri *creates* agents by spawning subprocesses. A2A has no process lifecycle concept. | 0 |
+| **Glasspane** (`crates/colibri-glasspane/src/lib.rs`) | A2A doesn't watch subprocess stdout. Glasspane is a PTY observer — it reads JSONL from child processes. A2A operates one layer above. | 0 |
+| **Store** (`crates/colibri-store/src/lib.rs`) | A2A doesn't replace local SQLite coordination. Each node needs local persistence for task board, agents, skills — A2A is the *transport*, not the *database*. | 0 |
+| **MCP editor bridge** | A2A is agent-to-agent. MCP is human-to-tool. Different protocols for different directions. They coexist. | 0 |
+| **Contracts schemas** (`crates/colibri-contracts/src/lib.rs`) | A2A uses JSON Schema for input validation. Colibri's contracts are already compatible — no change needed. | 0 |
+
+**Total irreplaceable: ~5,000 lines.** A2A doesn't reduce this at all.
+
+---
+
+## Net complexity analysis
+
+```
+                        BEFORE   AFTER A2A
+                        ──────   ─────────
+Unix socket protocol     1,981       1,981  (unchanged)
+MCP bridge                 570         570  (unchanged)
+Mother MCP-over-SSH        437           0  (REMOVED)
+A2A endpoint                 0         380  (NEW)
+Glasspane JSONL          1,186       1,186  (unchanged)
+SQLite store             1,150       1,150  (unchanged)
+Contracts schemas          200         200  (unchanged)
+                        ──────      ──────
+TOTAL                    5,524       5,467
+                        ──────      ──────
+```
+
+**Net delta: −57 lines**, assuming all 437 lines under `packaging/mother/` are deleted. If only the ~160 lines itemized in section 1 go away, the delta is +220. Either way, the code moves around; it doesn't shrink.
+
+---
+
+## The real trade-off
+
+A2A is not a complexity reduction play. It's an **interoperability and operational simplicity** play:
+
+| Metric | MCP-over-SSH (current) | A2A (proposed) |
+|---|---|---|
+| **Lines of code** | ~5,524 (spread across 6 crates + 3 shell scripts) | ~5,467 (SSH scripts gone, A2A handler added) |
+| **Protocol count** | 5 | 6 (A2A adds one) |
+| **Operational complexity** | SSH keys × N nodes, forced-command allowlists, peer auth setup | One HTTPS endpoint, mTLS certs, well-known URL |
+| **Discoverability** | Manual external MCP registry entries | Agent Card at well-known URL |
+| **Interoperability** | Colibri-only | Any A2A client |
+| **Debuggability** | `ssh -v`, `psql`, `jq` | `curl`, browser devtools, standard HTTP tooling |
+| **Ecosystem maturity** | N/A (Colibri-specific) | Protocol < 3 months old, zero adoption |
+| **When it pays off** | Works today for 4 nodes | Pays off at 10+ nodes, or when 3rd-party tools ship A2A |
+
+---
+
+## Recommendation: Later, not now
+
+The right window for A2A is when one of these becomes true:
+
+1. **We have >10 hive nodes** — SSH key distribution becomes painful
+2. **A third-party tool ships A2A support** — interop value materializes
+3. **We want federation** — multiple hives discovering each other
+
+Until then: the current MCP-over-SSH bridge is 437 lines of boring, working code. A2A would add 380 lines for a protocol that has zero adopters. The code savings (~57 lines at best) don't justify the protocol risk.
+
+**Phase 2 (next sprint) should not include A2A.** Build the routing engine on the existing MCP bridge. Add A2A as Phase 3 — when the protocol has real-world adoption and Colibri has enough nodes to benefit from discovery.
+
+The A2A section of `hive-pane.md` is a good north-star design doc. It stays in the wiki as "planned." But it shouldn't drive implementation priority.
diff --git a/docs/wiki/hive-pane.md b/docs/wiki/hive-pane.md
index 2e6ccb0..00d608a 100644
--- a/docs/wiki/hive-pane.md
+++ b/docs/wiki/hive-pane.md
@@ -102,6 +102,12 @@ the hive view, this data needs to flow to the mother. Two paths:
 
 ## A2A integration (planned)
 
+> 📋 **Complexity audit:** [a2a-complexity-audit](./a2a-complexity-audit.md) —
+> A2A doesn't reduce Colibri's code complexity today (5 protocols → 6,
+> ~0 net lines). It pays off at 10+ nodes or when third-party tools ship A2A
+> support. The Agent Card design below is a north star, not an implementation
+> priority for 0.12.
+
 Google's Agent-to-Agent protocol standardizes three things Colibri already does
 ad-hoc. Adopting it makes the hive discoverable and interoperable beyond our
 own tooling.
diff --git a/docs/wiki/index.md b/docs/wiki/index.md
index b0060ae..fe8c530 100644
--- a/docs/wiki/index.md
+++ b/docs/wiki/index.md
@@ -55,6 +55,7 @@ warning.
 | [mother-hive](./mother-hive.md) | Mother MCP architecture — forced-command SSH, single-home-in-colibri, peer auth, key-on-seed |
 | [hive-routing](./hive-routing.md) | Hive member identity (machine UUID), capability matrix + local LLM probes, cost-aware task routing |
 | [hive-pane](./hive-pane.md) | Glasspane for the hive — multi-node cost observability, A2A discovery, and operator board |
+| [a2a-complexity-audit](./a2a-complexity-audit.md) | A2A code complexity impact — 5-protocol surface audit, when A2A pays off |
 | [naming-decisions](./naming-decisions.md) | Ledger of harness-neutral / architecture renames — shipped and in-flight |
 | [daemon-not-demon](./daemon-not-demon.md) | Why we say daemon (helper spirit) not demon (bad spirit) — English + Slovenian |
 | [layered-soul](./layered-soul.md) | How Colibri consumes the layered-soul reviewed-context repo today vs planned |