# Hive Onboarding — `colibri-vault` and the "join the hive" primitive

**LIVE VS PLANNED.** This is a **design/vision** doc. The building blocks are real and proven (Bastille jails on osa, capability routing, `register-agent`, and the `clawdie-vault-fetch` flow validated end-to-end on domedog 2026-06-19). The _platform_ described here — `colibri-vault` as a crate, multi-tenant buckets, the mother skill — is `[PLANNED]`. The thesis: it is mostly **composition of pieces we already have**, not new invention. Sections are tagged `[LIVE]` / `[PLANNED]`.

---

## Status — 2026-06-20

The four MVP steps (§8) are **code-complete on colibri `main`**:

| MVP step                    | Status                    | Landed via                                  |
| --------------------------- | ------------------------- | ------------------------------------------- |
| 1. `colibri-vault` crate    | done; hardening in flight | #85 → #94 → #100 (server-match + serialize) |
| 2. `tenants` table          | on `main`                 | (PR #90 closed as superseded)               |
| 3. spawner → provision hook | done                      | #91 (root-verify) → #94 (wired)             |
| 4. `mother` skill           | done (draft)              | layered-soul                                |

Supporting pieces merged: `agent-jail-bootstrap.sh` (#96 → #97 version-pin → #104 cold-cache guard), `provider.env` staging (#69/#99), vault-fetch shell helper server-match (#67/#68/#69), and the first-proof runbook (#103).

**First proof is _not_ code-blocked** — the chain works today via the interim manual path in [`../docs/VAULT-PROVISION-FIRST-PROOF.md`](https://code.smilepowered.org/clawdie/colibri) (colibri). Critical path now: operator runs the runbook (scratch jail + test collection, manual SQLite tenant insert, raw-socket jailed spawn) → verify `.env` at `0600` + tenant `active`.

Open work, categorized:

- **Hardening:** #92 (path canonicalization/containment).
- **CLI-driveability (post-proof ergonomics, not proof blockers):** #101 (`register-tenant` command), #102 (`--jail` on `spawn-agent`) — these replace the runbook's manual steps.
- **Source-of-truth/naming:** #98 (`npm-node24` vs `npm`), clawdie-iso #70 (agent-jail section in `pkg-list-jails.txt`).
- **Cost/source-of-truth:** fill `docs/HOST-MATRIX.md` cost provenance rows before buying or retiring build capacity; compare OVH quotes/invoices against measured self-host power.

**One-line plan:** run the first-proof runbook → then land #101/#102 for CLI driveability, #92 before promoting past scratch, and fill verified OVH/self-host cost data before buying or depending on a new mother/build host.

---

## 1. The core idea

The Vaultwarden→`.env` fetch we proved is not a utility — it is the **onboarding primitive**. Promote it from the `clawdie-vault-fetch` shell helper to a first-class crate, **`colibri-vault`**, sitting beside `colibri-spawner` / `colibri-store`:

- **in:** a tenant id (→ a bucket) + a target jail/home
- **out:** a `0600` `.env` materialized _inside the jail_, owned by the jail user
- wraps the `bw` CLI for now (do **not** reimplement the Bitwarden protocol), fail-closed, idempotent, no-op when there is no bucket

It stops being "a thing you run" and becomes "a thing the hive does to you when you join."

## 2. [PLANNED] "Join the hive" = one composed step

```
spawn jail      →  colibri-vault provision  →  register-agent
(spawner,LIVE)     (new crate, PLANNED)        (LIVE)
```

The first and third primitives already exist. **Vault-provision is the missing limb** between an empty Bastille jail and a participating hive member. Once secrets land and the agent registers its capabilities, everything else — capability routing, poll/worker loop, the cross-host bridge — is already live (see [`CAPABILITY-ROUTING.md`](./CAPABILITY-ROUTING.md)).

## 3. The mapping (decided)

**`tenant_id` == Bastille jail name == Vaultwarden bucket**, 1:1:1. One row in `colibri-store`: `(tenant_id, jail, collection_id, status, created_at)`. No more indirection than that.
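
The 1:1:1 mapping above can be sketched as a row type plus an insert check. This is a minimal illustration, assuming an in-memory table rather than the real `colibri-store` schema: `Tenant`, `TenantStatus`, `TenantTable`, and `TenantError` are hypothetical names, and only the five column names and the `active` status come from this doc.

```rust
use std::collections::HashMap;

/// Row lifecycle. Only `active` is named in this doc; the other variants are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TenantStatus {
    Pending,
    Active,
    Suspended,
}

/// One row: (tenant_id, jail, collection_id, status, created_at).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Tenant {
    pub tenant_id: String,
    pub jail: String,
    pub collection_id: String,
    pub status: TenantStatus,
    pub created_at: u64, // Unix seconds (assumed representation)
}

/// Ways an insert can violate the mapping (hypothetical error type).
#[derive(Debug, PartialEq, Eq)]
pub enum TenantError {
    EmptyField(&'static str),
    JailMismatch { tenant_id: String, jail: String },
    DuplicateTenant(String),
    CollectionInUse(String),
}

/// In-memory stand-in for the `tenants` table, keyed by `tenant_id`.
#[derive(Debug, Default)]
pub struct TenantTable {
    rows: HashMap<String, Tenant>,
}

impl TenantTable {
    /// Insert a row only if it keeps tenant, jail, and collection 1:1:1.
    pub fn insert(&mut self, t: Tenant) -> Result<(), TenantError> {
        if t.tenant_id.is_empty() {
            return Err(TenantError::EmptyField("tenant_id"));
        }
        if t.collection_id.is_empty() {
            return Err(TenantError::EmptyField("collection_id"));
        }
        // tenant_id == Bastille jail name.
        if t.jail != t.tenant_id {
            return Err(TenantError::JailMismatch { tenant_id: t.tenant_id, jail: t.jail });
        }
        // One row per tenant.
        if self.rows.contains_key(&t.tenant_id) {
            return Err(TenantError::DuplicateTenant(t.tenant_id));
        }
        // One collection per tenant: no two rows may share a collection.
        if self.rows.values().any(|r| r.collection_id == t.collection_id) {
            return Err(TenantError::CollectionInUse(t.collection_id));
        }
        self.rows.insert(t.tenant_id.clone(), t);
        Ok(())
    }

    /// Look up a tenant row by id.
    pub fn get(&self, tenant_id: &str) -> Option<&Tenant> {
        self.rows.get(tenant_id)
    }
}
```

A persistent table could enforce the same checks with database constraints instead (for example a primary key on `tenant_id` and a uniqueness constraint on `collection_id`); the sketch only makes the invariant explicit.
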

On "folder vs bucket":

- **Folders** are personal-vault organization → fine for _Clawdie's own internal_ agents.
- **Organization + Collections** give _access-scoped isolation_ → the multi-tenant primitive. One customer = one Collection; a scoped credential reads only that collection.
- **Do not** run a separate Vaultwarden instance per customer — Collections are exactly this feature.

## 4. The "one key" ideal — actually two ones

- **Customer's one key:** a single provider key in their bucket. **OpenRouter** is the exemplar (one key → every model), but a single direct-provider key works too — DeepSeek alone is the currently validated single-key case. The point is **one secret per tenant**.
- **Operator's one key:** the Vaultwarden **org service-account** credential, held only on the orchestrator, that can read any tenant collection to provision jails.

Everything non-secret — harness, base config, model-routing prefs — **ships in the clawdie-iso image**. The image is the _body_; the bucket is the _one private nerve_.

## 5. [PLANNED] The mother skill

The genesis routine every image carries — the one skill that turns a jail into an agent:

```
mother := resolve-identity (layered-soul)
        ∘ acquire-secrets  (colibri-vault)
        ∘ register         (colibri capabilities)
        ∘ heartbeat / poll
```

- **Narrow:** onboarding — births one working agent from a bare jail.
- **Wide:** self-replication. An agent that _holds_ the mother skill can spawn and provision more jails (a queen births workers, each inheriting the mother skill), gated by capability/policy so it cannot run away. That is "agent swarms with a mother skill," and `colibri-vault` is how each birth gets its one nerve.

osa/FreeBSD/Bastille is the natural womb — cheap, dense, isolated jails.

## 6. The product, and the moat

> A customer pastes **one key** → gets a **private agent in an isolated jail** → that lean
> agent transparently borrows the **whole multi-OS swarm's capabilities** via the routing
> already shipped.

A one-key agent on osa needs `image-render`? It routes to a Linux lane (domedog). Needs a build? Routes to a capable host. The customer pays for _one agent_ but stands on a survivable, multi-OS hive. Anyone can run an LLM in a container; few hand you a swarm behind one key — **capability routing is the differentiator.**

- **osa** = the tenant-jail host (the hive body, dense Bastille jails)
- **debby / domedog** = capability lanes (specialized organs)
- **Vaultwarden** = per-tenant nerve store
- **clawdie-iso** = the shared body every jail boots from

## 7. The security invariant (non-negotiable)

**Bootstraps live on the host; jails hold only their resolved secrets.**

- The orchestrator holds the org service-account credential. It fetches a tenant's collection, writes the resolved `.env` _into_ the jail, and the **bootstrap never enters the jail**. A compromised jail cannot re-fetch and cannot reach another tenant.
- Per-tenant blast radius = one collection. Scoped credential, never a master.
- This is the same shape the domedog smoke test validated (bootstrap on host, `.env` is the output) — just made multi-tenant.

## 8. [PLANNED] Lean MVP — and what NOT to build yet

Smallest path that is real:

1. **`colibri-vault` crate** — lift `clawdie-vault-fetch` into Rust (lib + CLI), fetch a named collection → jail `.env`. Retire the shell helper.
2. **`tenants` row in `colibri-store`** — the 1:1:1 map.
3. **Spawner hook** — call vault-provision right after jail create.
4. **`mother` skill in layered-soul** — the genesis sequence above.

**First-proof policy.** The first end-to-end proof runs against a **scratch jail + a throwaway test collection only** — no real tenant data until the path hardening lands (canonicalize + allowed-root containment, colibri issue #92). The former first-proof blockers — colibri **#88** (resolve the collection by name) and **#89** (per-call unlock) — are resolved on `main`; the remaining first-proof step is the operator-run scratch runbook. #92 is hardening that follows before real tenant data.

**Overengineering traps to avoid for now:** a custom Bitwarden web UI (Vaultwarden's own UI plus a Collection is enough to start), billing/metering, a native Bitwarden protocol in Rust, multi-region control plane, and recursive auto-spawn (gate it off until policy exists). Those are product layers; the four steps above are the engine.

---

_See [`CAPABILITY-ROUTING.md`](./CAPABILITY-ROUTING.md) for the routing layer the moat rests on, [`MCP-INTEGRATION.md`](./MCP-INTEGRATION.md) for the board interface, and [`../AGENTS.md`](../AGENTS.md) for the agent matrix._