2026-06-19 21:03:50 +02:00
# Hive Onboarding — `colibri-vault` and the "join the hive" primitive

**LIVE VS PLANNED.** This is a **design/vision** doc. The building blocks are real and
proven (Bastille jails on osa, capability routing, `register-agent`, and the
`clawdie-vault-fetch` flow validated end-to-end on domedog 2026-06-19). The _platform_
described here — `colibri-vault` as a crate, multi-tenant buckets, the mother skill — is
`[PLANNED]`. The thesis: it is mostly **composition of pieces we already have**, not new
invention. Sections are tagged `[LIVE]` / `[PLANNED]`.

---

## Status — 2026-06-20

The four MVP steps (§8) are **code-complete on colibri `main`**:

| MVP step                    | Status                    | Landed via                                  |
| --------------------------- | ------------------------- | ------------------------------------------- |
| 1. `colibri-vault` crate    | done; hardening in flight | #85 → #94 → #100 (server-match + serialize) |
| 2. `tenants` table          | on `main`                 | (PR #90 closed as superseded)               |
| 3. spawner → provision hook | done                      | #91 (root-verify) → #94 (wired)             |
| 4. `mother` skill           | done (draft)              | layered-soul                                |

Supporting pieces merged: `agent-jail-bootstrap.sh` (#96 → #97 version-pin → #104
cold-cache guard), `provider.env` staging (#69/#99), vault-fetch shell helper
server-match (#67/#68/#69), and the first-proof runbook (#103).

**First proof is _not_ code-blocked** — the chain works today via the interim manual
path in [`../docs/VAULT-PROVISION-FIRST-PROOF.md`](https://code.smilepowered.org/clawdie/colibri)
(colibri). Critical path now: the operator runs the runbook (scratch jail + test collection,
tenant insert, jailed spawn), then verifies that `.env` has mode `0600` and that the tenant
is `active`. With #101/#102 merged, the runbook's manual SQLite insert and raw-socket spawn
are replaced by `colibri register-tenant …` and
`colibri spawn-agent … --jail-name … --jail-root …`.

Open work, categorized:

- **Hardening:** #92 (path canonicalization/containment).
- **CLI-driveability — DONE, merged:** #101 (`register-tenant` + `list-tenants`) and #102
  (`--jail-name`/`--jail-root` on `spawn-agent`/`spawn-local`) are merged to colibri `main`
  (PR #107); they replace the runbook's manual SQLite insert and raw-socket spawn.
- **Source-of-truth/naming:** #98 (`npm-node24` vs `npm`), clawdie-iso #70 (agent-jail
  section in `pkg-list-jails.txt`).
- **Cost/source-of-truth:** fill `docs/HOST-MATRIX.md` cost provenance rows before buying
  or retiring build capacity; compare OVH quotes/invoices against measured self-host power.
- **Trusted supply chain (new, §10):** stand up first-party repos — `pkg.clawdie.si`
  (poudriere) and a signed skill repo — so paid tenants run first-party-only skills and
  packages instead of external marketplaces.

**One-line plan:** run the first-proof runbook (now CLI-driveable via merged #101/#102),
then land #92 before promoting past scratch, and fill in verified OVH/self-host cost data
before buying or depending on a new mother/build host.

---

## 1. The core idea

The Vaultwarden→`.env` fetch we proved is not a utility — it is the **onboarding
primitive**. Promote it from the `clawdie-vault-fetch` shell helper to a first-class
crate, **`colibri-vault`**, sitting beside `colibri-spawner` / `colibri-store`:

- **in:** a tenant id (→ a bucket) + a target jail/home
- **out:** a `0600` `.env` materialized _inside the jail_, owned by the jail user
- **how:** wraps the `bw` CLI for now (do **not** reimplement the Bitwarden protocol); it is
  fail-closed, idempotent, and a no-op when there is no bucket

It stops being "a thing you run" and becomes "a thing the hive does to you when you join."
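
The output contract in the list above (a `0600` `.env`, fail-closed, idempotent, no-op without a bucket) can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the `colibri-vault` crate: `materialize_env` and the `OPENROUTER_API_KEY` name are hypothetical, the secrets are assumed to have been fetched already (the `bw` call is left out), an empty mapping stands in for "no bucket", and the `chown` to the jail user is omitted because it needs root.

```python
import os
import tempfile


def materialize_env(secrets: dict[str, str], jail_home: str) -> bool:
    """Write secrets to <jail_home>/.env at mode 0600; return True if the file changed."""
    if not secrets:
        return False  # no bucket content: no-op, the jail is left untouched

    body = "".join(f"{key}={value}\n" for key, value in sorted(secrets.items()))
    target = os.path.join(jail_home, ".env")

    # Idempotent: skip the write when content and mode already match.
    try:
        with open(target, encoding="utf-8") as fh:
            unchanged = fh.read() == body
        if unchanged and (os.stat(target).st_mode & 0o777) == 0o600:
            return False
    except FileNotFoundError:
        pass

    # mkstemp creates the temp file with mode 0600, and os.replace is atomic
    # within one filesystem, so a failure never leaves a partial .env behind.
    fd, tmp = tempfile.mkstemp(dir=jail_home, prefix=".env.")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(body)
        os.replace(tmp, target)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return True
```

Calling `materialize_env({"OPENROUTER_API_KEY": "..."}, jail_home)` twice with the same secrets writes once; the second call returns `False`. Any other error (for example an unreadable target) propagates to the caller, which is the fail-closed behavior.
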

## 2. [PLANNED] "Join the hive" = one composed step

```
spawn jail       →  colibri-vault provision  →  register-agent
(spawner, LIVE)     (new crate, PLANNED)        (LIVE)
```

The first and third primitives already exist. **Vault-provision is the missing limb**
between an empty Bastille jail and a participating hive member. Once secrets land and the
agent registers its capabilities, everything else — capability routing, poll/worker loop,
the cross-host bridge — is already live (see [`CAPABILITY-ROUTING.md`](./CAPABILITY-ROUTING.md)).
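
The three-step composition can be sketched as an ordered pipeline that stops at the first failure. This is a Python illustration only: `join_hive`, the step names, and the stand-in lambdas are hypothetical and do not reflect the real spawner or registry interfaces.

```python
from typing import Callable

Step = Callable[[str], None]


def join_hive(tenant_id: str, steps: list[tuple[str, Step]]) -> list[str]:
    """Run each step in order; abort at the first failure (fail-closed)."""
    completed: list[str] = []
    for name, step in steps:
        try:
            step(tenant_id)
        except Exception as exc:
            # Later steps never run, so an unprovisioned jail is never registered.
            raise RuntimeError(f"join-hive aborted at {name!r} for {tenant_id!r}") from exc
        completed.append(name)
    return completed


# Stand-in steps that only record the call; real ones would talk to the
# spawner, the vault crate, and the agent registry.
calls: list[str] = []
STEPS: list[tuple[str, Step]] = [
    ("spawn-jail", lambda t: calls.append(f"spawn:{t}")),
    ("vault-provision", lambda t: calls.append(f"provision:{t}")),
    ("register-agent", lambda t: calls.append(f"register:{t}")),
]
```

`join_hive("tenant-a", STEPS)` returns the names of the completed steps in order. If `vault-provision` raises, `register-agent` is never called, which matches the idea that secrets must land before the agent registers.
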

## 3. The mapping (decided)

**`tenant_id` == Bastille jail name == Vaultwarden bucket**, 1:1:1. One row in
`colibri-store`: `(tenant_id, jail, collection_id, status, created_at)`. No more
indirection than that.

On "folder vs bucket":

- **Folders** are personal-vault organization → fine for _Clawdie's own internal_ agents.
- **Organization + Collections** give _access-scoped isolation_ → the multi-tenant
  primitive. One customer = one Collection; a scoped credential reads only that collection.
- **Do not** run a separate Vaultwarden instance per customer — per-customer isolation is
  exactly what Collections provide.
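
The 1:1:1 mapping can be sketched as a single table with uniqueness constraints. This uses Python's standard `sqlite3` module for illustration; the column names come from the row shape given in this section, while the `'pending'` default, the constraints, and the helper names `open_store` / `add_tenant` are assumptions and not the actual `colibri-store` schema.

```python
import sqlite3

# Column names follow the row shape described in this section; the 'pending'
# default and the constraints are illustrative assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tenants (
    tenant_id     TEXT PRIMARY KEY,
    jail          TEXT NOT NULL UNIQUE,
    collection_id TEXT NOT NULL UNIQUE,
    status        TEXT NOT NULL DEFAULT 'pending',
    created_at    TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    CHECK (jail = tenant_id)
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store and make sure the tenants table exists."""
    db = sqlite3.connect(path)
    db.execute(SCHEMA)
    return db


def add_tenant(db: sqlite3.Connection, tenant_id: str, collection_id: str) -> None:
    """Insert one tenant; the jail name is the tenant id by construction."""
    with db:  # commit on success, roll back on error
        db.execute(
            "INSERT INTO tenants (tenant_id, jail, collection_id) VALUES (?, ?, ?)",
            (tenant_id, tenant_id, collection_id),
        )
```

The `UNIQUE` constraints make a second tenant pointing at the same collection fail with `sqlite3.IntegrityError`, so the one-customer-one-Collection rule is enforced by the store rather than by convention.
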

## 4. The "one key" ideal — actually two ones

- **Customer's one key:** a single provider key in their bucket. **OpenRouter** is the
  exemplar (one key → every model), but a single direct-provider key works too — DeepSeek
  alone is the currently validated single-key case. The point is **one secret per tenant**.
- **Operator's one key:** the Vaultwarden **org service-account** credential, held only on
  the orchestrator, that can read any tenant collection to provision jails.

Everything non-secret — harness, base config, model-routing prefs — **ships in the
clawdie-iso image**. The image is the _body_; the bucket is the _one private nerve_.

## 5. [PLANNED] The mother skill

The genesis routine every image carries — the one skill that turns a jail into an agent:

```
mother := resolve-identity   (layered-soul)
        ∘ acquire-secrets    (colibri-vault)
        ∘ register           (colibri capabilities)
        ∘ heartbeat / poll
```

- **Narrow:** onboarding — births one working agent from a bare jail.
- **Wide:** self-replication. An agent that _holds_ the mother skill can spawn and
  provision more jails (a queen births workers, each inheriting the mother skill), gated
  by capability/policy so it cannot run away. That is "agent swarms with a mother skill,"
  and `colibri-vault` is how each birth gets its one nerve.

osa/FreeBSD/Bastille is the natural womb — cheap, dense, isolated jails.

**Paid tier — the product surface.** `mother` is what a paying customer actually
buys. Paste one key → `mother` births a private agent in an isolated jail and
provisions it **exclusively from Clawdie's first-party supply chain**: curated
skills from our own skill repository and packages from `pkg.clawdie.si`
(see [§10](#10-planned-the-trusted-supply-chain--first-party-skills--packages)).
A free/community path may opt into public skill marketplaces at the operator's own
risk; the **paid** path is hardened to first-party-only sources. That hardening is
both the safety guarantee and the thing worth paying for — not a separate feature
bolted on, but the difference between "an LLM in a box" and "an agent whose every
skill and package we curate and sign."
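
The "gated by capability/policy so it cannot run away" rule for the wide, self-replicating use of `mother` can be sketched as a simple spawn gate. This is a hypothetical Python illustration: `SpawnPolicy`, `Agent`, `spawn_child`, and the two limits (depth and per-agent birth budget) are assumptions chosen for the example, since the actual policy does not exist yet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpawnPolicy:
    max_depth: int      # generations allowed below the first queen
    max_children: int   # births allowed per agent


@dataclass
class Agent:
    name: str
    depth: int = 0
    can_spawn: bool = False  # capability gate, off unless granted
    children: int = 0


def spawn_child(parent: Agent, name: str, policy: SpawnPolicy) -> Agent:
    """Birth one child if capability and policy allow it, else refuse."""
    if not parent.can_spawn:
        raise PermissionError(f"{parent.name} lacks the spawn capability")
    if parent.depth >= policy.max_depth:
        raise PermissionError(f"{parent.name} is at the depth limit")
    if parent.children >= policy.max_children:
        raise PermissionError(f"{parent.name} has used its birth budget")
    parent.children += 1
    # The child inherits the mother skill, one generation deeper.
    return Agent(name=name, depth=parent.depth + 1, can_spawn=parent.can_spawn)
```

With `SpawnPolicy(max_depth=1, max_children=2)`, a queen can birth two workers and is then refused, and the workers themselves cannot spawn further because they already sit at the depth limit.
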

## 6. The product, and the moat

> A customer pastes **one key** → gets a **private agent in an isolated jail** → that lean
> agent transparently borrows the **whole multi-OS swarm's capabilities** via the routing
> already shipped.

A one-key agent on osa needs `image-render`? It routes to a Linux lane (domedog). Needs a
build? Routes to a capable host. The customer pays for _one agent_ but stands on a
survivable, multi-OS hive. Anyone can run an LLM in a container; few hand you a swarm
behind one key — **capability routing is the differentiator.**

- **osa** = the tenant-jail host (the hive body, dense Bastille jails)
- **debby / domedog** = capability lanes (specialized organs)
- **Vaultwarden** = per-tenant nerve store
- **clawdie-iso** = the shared body every jail boots from
- **first-party supply chain** = curated skills + packages the agent feeds on
  ([§10](#10-planned-the-trusted-supply-chain--first-party-skills--packages)), not
  arbitrary public marketplaces — a second differentiator alongside routing
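
The routing behavior described in this section (run locally when possible, otherwise borrow a lane) can be sketched as a lookup over host capabilities. This is a toy Python illustration, not the live routing layer documented in `CAPABILITY-ROUTING.md`: the host names come from the list above, but the capability sets (including placing `build` on debby and `chat` on osa) and the `route` function are assumptions made for the example.

```python
# Host names come from the list in this section; the capability sets are
# illustrative assumptions, not the real lane configuration.
LANES: dict[str, set[str]] = {
    "osa": {"chat"},
    "domedog": {"image-render"},
    "debby": {"build"},
}


def route(capability: str, origin: str, lanes: dict[str, set[str]] = LANES) -> str:
    """Return the host that should run a task needing `capability`."""
    if capability in lanes.get(origin, set()):
        return origin  # prefer running where the agent already lives
    for host in sorted(lanes):  # sorted for a deterministic choice
        if capability in lanes[host]:
            return host
    raise LookupError(f"no lane offers {capability!r}")
```

Here `route("image-render", "osa")` selects `domedog`, while a capability that no lane offers raises `LookupError` instead of silently falling back.
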

## 7. The security invariant (non-negotiable)

**Bootstraps live on the host; jails hold only their resolved secrets.**

- The orchestrator holds the org service-account credential. It fetches a tenant's
  collection, writes the resolved `.env` _into_ the jail, and the **bootstrap never enters
  the jail**. A compromised jail cannot re-fetch and cannot reach another tenant.
- Per-tenant blast radius = one collection. Scoped credential, never a master.
- This is the same shape the domedog smoke test validated (bootstrap on host, `.env` is the
  output) — just made multi-tenant.
- **Supply-chain trust is part of the invariant.** Secrets are not the only thing that
  enters a jail — so do **code and instructions** (packages, and `SKILL.md` bundles the
  agent will follow). Both come only from sources we control; external skill marketplaces
  are untrusted input, vetted before they ever reach a tenant
  (see [§10](#10-planned-the-trusted-supply-chain--first-party-skills--packages)).

## 8. [PLANNED] Lean MVP — and what NOT to build yet

Smallest path that is real:

1. **`colibri-vault` crate** — lift `clawdie-vault-fetch` into Rust (lib + CLI), fetch a
   named collection → jail `.env`. Retire the shell helper.
2. **`tenants` row in `colibri-store`** — the 1:1:1 map.
3. **Spawner hook** — call vault-provision right after jail create.
4. **`mother` skill in layered-soul** — the genesis sequence from §5.

**First-proof policy.** The first end-to-end proof runs against a **scratch jail + a
throwaway test collection only** — no real tenant data until the path hardening lands
(canonicalize + allowed-root containment, colibri issue #92). The former first-proof
blockers — colibri **#88** (resolve the collection by name) and **#89** (per-call unlock)
— are resolved on `main`; the remaining first-proof step is the operator-run scratch
runbook. #92 is hardening that must land after the first proof and before any real
tenant data is used.

**Overengineering traps to avoid for now:** a custom Bitwarden web UI (Vaultwarden's own UI
plus a Collection is enough to start), billing/metering, a native Bitwarden protocol in
Rust, multi-region control plane, and recursive auto-spawn (gate it off until policy
exists). Those are product layers; the four steps above are the engine.
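
The path hardening named in the first-proof policy (canonicalize + allowed-root containment) can be sketched as follows. This is a minimal Python illustration of the general technique, not the fix tracked in #92; `contained_path` is a hypothetical helper, and it assumes POSIX paths as on the FreeBSD jail host.

```python
import os


def contained_path(allowed_root: str, candidate: str) -> str:
    """Resolve `candidate` under `allowed_root`, or raise if it escapes."""
    root = os.path.realpath(allowed_root)
    # realpath collapses ".." and follows symlinks; an absolute candidate
    # replaces the root in os.path.join and is then rejected below.
    resolved = os.path.realpath(os.path.join(root, candidate))
    if os.path.commonpath([root, resolved]) != root:
        raise ValueError(f"{candidate!r} escapes {allowed_root!r}")
    return resolved
```

A relative target such as `tenant-a/.env` resolves inside the root, while `../escape`, an absolute path, or a symlink pointing outside the root is rejected. A check like this is still subject to races if the tree changes between the check and the write, so it narrows the problem rather than closing it completely.
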

## 9. Multi-tenant GDPR + OVH GTS gates

Before the hive serves paying customers, these administrative items must be completed.
None are technical blockers; all are paper-and-process:

- [ ] GDPR controller documentation package (privacy notice, legal basis, ROPA)
- [ ] Data Protection Impact Assessment for AI auto-decisions (GDPR Art. 35)
- [ ] Customer-facing Data Processing Agreement (controller→processor chain)
- [ ] Professional indemnity / third-party insurance (§10.6)
- [ ] Customer sanctions screening (denied parties / export controls)
- [ ] OVH GTS §10.6 — pass terms down to sub-licensees

See [`HOST-MATRIX.md §4`](./HOST-MATRIX.md#%C2%A74-compliance-standing-constraints) for the
four standing constraints that apply now (internal use).

## 10. [PLANNED] The trusted supply chain — first-party skills + packages

Two layers feed every agent, and **both are attack surface**:

- **OS packages** — what the jail is built and patched from (`pkg` / poudriere).
- **Skills** — `SKILL.md` instruction bundles ingested into the agent's context and
  **followed by the model at runtime**.

External skill marketplaces are community-sourced, unvetted, and mutable. Hermes already
integrates several as read-only community sources — `clawhub.ai`, `skills.sh`, `lobehub`,
`browse.sh`, `claude-marketplace` (see hermes-bsd `tools/skills_hub.py`). **A skill is
literally instructions an LLM will execute**, so a hostile or compromised skill is a
prompt-injection / instruction-smuggling vector: it can attempt to exfiltrate the tenant's
one key, redirect tasks, or escalate through routed capabilities. That is the same class of
risk as `pkg install` from a random public mirror — one layer up, and arguably worse,
because the payload targets the agent's reasoning directly.

> **Note on `clawhub.ai`:** despite the name, it is a **third-party** marketplace Hermes
> only consumes (read/download via `https://clawhub.ai/api/v1`). It is **not** Clawdie
> infrastructure and is **unrelated to `pkg.clawdie.si`** — different layer (skills vs OS
> packages), different ownership (upstream we pull vs server we run).

**Decision: Clawdie runs its own first-party repository for _both_ layers.** The poudriere
plan (`pkg.clawdie.si`) is the package half; it gets a sibling for skills.

| Layer       | First-party repo                                      | Curates / replaces           | Status                                   |
| ----------- | ----------------------------------------------------- | ---------------------------- | ---------------------------------------- |
| OS packages | `pkg.clawdie.si` (poudriere)                          | public FreeBSD `pkg` mirrors | [PLANNED] — `mother-build` host (matrix) |
| Skills      | first-party skill repo (proposed `skills.clawdie.si`) | external skill marketplaces  | [PLANNED]                                |

The pattern is identical for both: **an external source is a staging/review input, never a
direct runtime dependency for a tenant.**

```
external source          review + pin + sign      first-party repo      agent pulls
(clawhub, skills.sh,  →  (human / CI gate)     →  (we control)      →   (paid tenant:
 public pkg mirror)                                                      first-party only)
```

- **Curate, don't proxy.** We pull a candidate skill/package, review it, pin a version,
  sign it, and publish into our repo. Upstream edits never silently reach tenants.
- **Pinned + signed.** Tenants resolve a fixed version from a repo whose signing key they
  trust — the same trust-anchor idea as the SSH key planned for `pkg.clawdie.si`.
- **Paid = first-party-only.** `mother` ([§5](#5-planned-the-mother-skill)) provisions paid
  tenants exclusively from these repos. This is the concrete extension of the §7 invariant:
  jails hold only resolved secrets **and** run only first-party code/instructions.
- **Free/community** may opt into external marketplaces directly, at the operator's own
  risk — that risk boundary is exactly what the paid tier removes.
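
The "pinned" half of "pinned + signed" can be sketched as a digest check against a curated index. This Python illustration is an assumption-laden sketch: `load_pinned`, `UntrustedBundle`, the index shape, and the `demo-skill` name are hypothetical, and verifying the signature on the index itself is deliberately left out because it needs an asymmetric-crypto library outside the standard library.

```python
import hashlib
import hmac
from typing import Callable

# (name, version) -> pinned SHA-256 hex digest; assumed to come from a
# curated index whose signature was already verified elsewhere.
Pins = dict[tuple[str, str], str]


class UntrustedBundle(Exception):
    """Raised when a bundle is not curated or does not match its pin."""


def load_pinned(
    name: str, version: str, pins: Pins, fetch: Callable[[str, str], bytes]
) -> bytes:
    """Fetch a bundle and return it only if it matches the pinned digest."""
    key = (name, version)
    if key not in pins:
        raise UntrustedBundle(f"{name}@{version} is not in the curated index")
    bundle = fetch(name, version)
    digest = hashlib.sha256(bundle).hexdigest()
    # Constant-time comparison; the pin is normalized to lowercase hex.
    if not hmac.compare_digest(digest, pins[key].lower()):
        raise UntrustedBundle(f"{name}@{version} does not match its pinned digest")
    return bundle
```

An upstream edit changes the digest, so a silently modified `SKILL.md` is refused rather than loaded, and a version that was never curated is refused before any fetch happens.
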

**What NOT to build yet** (same restraint as [§8](#8-planned-lean-mvp--and-what-not-to-build-yet)):
no custom marketplace UI, no automated upstream-sync bot, no public skill-publishing for
third parties. Smallest real path: stand up `pkg.clawdie.si`, then a **minimal signed skill
repo `mother` can read**. Curation starts manual; automate only once it hurts.

---

_See [`CAPABILITY-ROUTING.md`](./CAPABILITY-ROUTING.md) for the routing layer the moat rests
on, [`MCP-INTEGRATION.md`](./MCP-INTEGRATION.md) for the board interface, and
[`../AGENTS.md`](../AGENTS.md) for the agent matrix._