docs(wiki): add 9 subsystem pages (rebuilt on current main)
Brings the wiki-expansion pages onto current main WITHOUT the stale baggage the original feature/wiki-expansion branch carried (it predated the rename + date PRs and would have reverted them). Cherry-picked only the 9 genuinely-new pages: contracts, store-schema, external-mcp, operator-cli, tui, runtime-inventory, skills-catalog, vault-provision, deployment. Added them to index.md. Fixed on the way in: vault-provision referenced the pre-rename VAULT-PROVISION-FIRST-PROOF → repointed to VAULT-PROVISION-RUNBOOK. (No US dates in these pages.) Gates: wiki-lint --strict clean (131 pass); markdown format clean.
parent
5d646b1f2c
commit
f581433b29
10 changed files with 1161 additions and 10 deletions
49
docs/wiki/contracts.md
Normal file
|
|
@ -0,0 +1,49 @@
|
|||
# Stable JSON contracts
|
||||
|
||||
← [index](./index.md)
|
||||
|
||||
`colibri-contracts` holds the stable, language-agnostic wire shapes shared
|
||||
between Colibri (Rust) and Clawdie agents (TypeScript). It owns _schemas and
|
||||
(de)serialization_, not business logic.
|
||||
|
||||
## Why a separate contracts crate
|
||||
|
||||
- Prevent duplicated definitions between Rust and TypeScript lanes.
|
||||
- Keep committed manifests in `manifests/` parseable by both sides.
|
||||
- Centralize schema strings, field renaming aliases, and backward-compat
|
||||
defaults.
|
||||
|
||||
## Active schemas
|
||||
|
||||
| Schema                                 | Rust struct           | Purpose                                                        |
| -------------------------------------- | --------------------- | -------------------------------------------------------------- |
| `clawdie.interagent.run-manifest.v1`   | `RunManifest`         | Records a build/test run: role, agent, artifacts, summary.     |
| `clawdie.runtime-version-inventory.v1` | `RuntimeInventory`    | Host runtime snapshot: OS, package versions, npm/node/zot/pi.  |
| `clawdie.provider-smoke.result.v1`     | `ProviderSmokeResult` | DeepSeek cache-hit probe result and token accounting.          |
|
||||
|
||||
Schema constants and structs live in `crates/colibri-contracts/src/lib.rs`.
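
As an illustration of how a consumer might route a manifest by its schema string, here is a minimal sketch. It assumes each manifest carries its schema string in some field; `ContractKind` and `contract_for_schema` are hypothetical names that are not taken from `colibri-contracts`, and only the three schema strings come from the table above.

```rust
/// Which contract a manifest claims to follow.
/// Hypothetical enum; the variant names mirror the Rust structs in the table.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum ContractKind {
    RunManifest,
    RuntimeInventory,
    ProviderSmokeResult,
}

/// Map a schema string to the contract that should parse it.
/// Unknown strings return `None` so the caller can reject them explicitly.
pub fn contract_for_schema(schema: &str) -> Option<ContractKind> {
    match schema {
        "clawdie.interagent.run-manifest.v1" => Some(ContractKind::RunManifest),
        "clawdie.runtime-version-inventory.v1" => Some(ContractKind::RuntimeInventory),
        "clawdie.provider-smoke.result.v1" => Some(ContractKind::ProviderSmokeResult),
        _ => None,
    }
}
```

Matching on the full string, version suffix included, means a future `.v2` schema is rejected until a reader is added for it.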
|
||||
|
||||
## Evolution rules
|
||||
|
||||
- The crate carries **no logic** — only `serde` structs and schema constants.
|
||||
- New fields are normally optional with `#[serde(default)]` so old manifests
|
||||
still parse.
|
||||
- `RuntimeInventory.pi` is optional because not every host installs `pi` or
|
||||
`zot`.
|
||||
- `HostStatus.raw` is a catch-all `serde_json::Value` so hostile collector
|
||||
output can be captured without forcing a schema bump.
|
||||
|
||||
## Golden tests
|
||||
|
||||
`crates/colibri-contracts/tests/golden.rs` parses every committed manifest in
|
||||
`manifests/` and asserts round-trip equality. The fixtures are intended to be
|
||||
**cross-platform** — if a manifest produced on Linux differs from one produced
|
||||
on FreeBSD 15, the difference must be understood and documented before it is
|
||||
merged.
|
||||
|
||||
## See also
|
||||
|
||||
- [cost-model](./cost-model.md) — how the provider-smoke result feeds cache-hit
|
||||
metering.
|
||||
- [runtime-inventory](./runtime-inventory.md) — where the runtime inventory is
|
||||
produced.
|
||||
175
docs/wiki/deployment.md
Normal file
|
|
@ -0,0 +1,175 @@
|
|||
# Deployment
|
||||
|
||||
← [index](./index.md)
|
||||
|
||||
The `clawdie` crate is Colibri's host installer. It discovers a machine's ZFS
|
||||
layout and provisions the `clawdie` service. On FreeBSD this means an rc.d
|
||||
service, ZFS datasets, and an unprivileged user. On Linux it can use systemd
|
||||
and either ZFS or plain directories.
|
||||
|
||||
→ `crates/clawdie/src/main.rs`
|
||||
|
||||
→ `crates/clawdie/src/plan.rs`
|
||||
|
||||
→ `docs/ISO-SERVICE-LAYOUT.md`
|
||||
|
||||
→ `docs/CLAWDIE-INSTALLER-HANDOFF.md`
|
||||
|
||||
## Decisions
|
||||
|
||||
### ZFS is required on FreeBSD, preferred on Linux
|
||||
|
||||
FreeBSD does not support a plain-directory layout. If ZFS userland is missing,
|
||||
the plan errors immediately. Linux can fall back to plain directories if no
|
||||
pool is named and ZFS is unavailable, and it can create a fresh pool on a spare
|
||||
disk when asked.
|
||||
|
||||
This matches the production target: bare-metal FreeBSD on a ZFS RAID1 mirror.
|
||||
Linux support makes development and CI possible without a ZFS host.
|
||||
|
||||
### Storage is resolved, not configured
|
||||
|
||||
`clawdie plan` resolves storage in this order:
|
||||
|
||||
1. If `--pool NAME --create-pool DEVICE` is given, create that pool.
|
||||
2. If `--pool NAME` is given, use that existing pool.
|
||||
3. If no pool is given and exactly one pool exists, use it.
|
||||
4. If multiple pools exist and none is named, error.
|
||||
5. On Linux with no ZFS, fall back to plain directories.
|
||||
|
||||
This removes the need for a hand-written topology file on typical single-pool
|
||||
hosts, while still allowing explicit control when needed.
|
||||
|
||||
→ `crates/clawdie/src/main.rs` (`pick_pool`, `validate_storage`)
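
The resolution order above can be sketched as a pure function. This is an illustration only, not the code in `pick_pool` or `validate_storage`: the names `Os`, `Storage`, and `resolve_storage` are hypothetical, and two cases the page does not specify (a `--create-pool` without `--pool`, and a ZFS host with zero pools) are treated as errors by assumption.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Os {
    FreeBsd,
    Linux,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Storage {
    CreatePool { name: String, device: String },
    ExistingPool(String),
    PlainDirs,
}

/// Resolve storage from CLI arguments and discovered host facts.
pub fn resolve_storage(
    os: Os,
    zfs_available: bool,
    existing_pools: &[&str],
    pool: Option<&str>,
    create_device: Option<&str>,
) -> Result<Storage, String> {
    if !zfs_available {
        // FreeBSD has no plain-directory layout; Linux may fall back
        // only when no pool was named.
        return match (os, pool) {
            (Os::Linux, None) => Ok(Storage::PlainDirs),
            (Os::Linux, Some(_)) => Err("a pool was named but ZFS is unavailable".to_string()),
            (Os::FreeBsd, _) => Err("ZFS userland is required on FreeBSD".to_string()),
        };
    }
    match (pool, create_device) {
        (Some(name), Some(device)) => Ok(Storage::CreatePool {
            name: name.to_string(),
            device: device.to_string(),
        }),
        (Some(name), None) => Ok(Storage::ExistingPool(name.to_string())),
        // Assumption: creating a pool without naming it is rejected.
        (None, Some(_)) => Err("--create-pool requires --pool NAME".to_string()),
        (None, None) => match existing_pools {
            [only] => Ok(Storage::ExistingPool((*only).to_string())),
            // Assumption: zero pools on a ZFS host is an error, not a fallback.
            [] => Err("no pool found; name one with --pool".to_string()),
            _ => Err("multiple pools exist; name one with --pool".to_string()),
        },
    }
}
```

Keeping the decision free of I/O lets every branch be exercised in unit tests without a real ZFS host.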
|
||||
|
||||
### Datasets separate state from logs
|
||||
|
||||
When ZFS is used, the installer creates:
|
||||
|
||||
- `<pool>/clawdie` as a container dataset with `canmount=off`
|
||||
- `<pool>/clawdie/db` mounted at `/var/db/clawdie`
|
||||
- `<pool>/clawdie/log` mounted at `/var/log/clawdie`
|
||||
|
||||
Keeping database and logs in separate datasets lets snapshots, quotas, and
|
||||
log-rotation policies apply independently.
|
||||
|
||||
→ `crates/clawdie/src/plan.rs` (`zfs_dataset_steps`)
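
For orientation, the dataset layout above corresponds roughly to the following `zfs create` invocations. This is a sketch: `zfs_create_commands` is a hypothetical helper, and the exact properties that `zfs_dataset_steps` sets are not stated on this page, so only `canmount=off` and the two mountpoints are taken from the text.

```rust
/// Build the three `zfs create` command lines for a given pool.
/// Hypothetical helper; the real step builder may set more properties.
pub fn zfs_create_commands(pool: &str) -> Vec<String> {
    vec![
        // Container dataset that is never mounted itself.
        format!("zfs create -o canmount=off {}/clawdie", pool),
        // State and logs live in separate datasets with fixed mountpoints.
        format!("zfs create -o mountpoint=/var/db/clawdie {}/clawdie/db", pool),
        format!("zfs create -o mountpoint=/var/log/clawdie {}/clawdie/log", pool),
    ]
}
```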
|
||||
|
||||
### Dry-run by default
|
||||
|
||||
`clawdie apply` prints the plan and exits unless `--yes` is given. `discover`
|
||||
and `plan` are read-only. This protects production hosts from accidental
|
||||
provisioning.
|
||||
|
||||
→ `crates/clawdie/src/main.rs` (`Cmd::Apply`)
|
||||
|
||||
### Pool creation is guarded against busy disks
|
||||
|
||||
`--create-pool` on a non-empty disk is refused unless `--force` is also given.
|
||||
The installer uses `lsblk` on Linux to detect partitions, filesystems, mount
|
||||
points, and the root disk. The guard is conservative: if a disk is ambiguous,
|
||||
it must be explicitly forced.
|
||||
|
||||
→ `crates/clawdie/src/disk.rs`
|
||||
|
||||
→ `crates/clawdie/src/main.rs` (`validate_create_device`)
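
A conservative guard of this kind might look like the sketch below. `DiskFacts` and `check_create_device` are hypothetical names, not the types in `disk.rs`, and the assumption that `--force` overrides every refusal (including the root disk) is mine; the page only says that non-empty or ambiguous disks require `--force`.

```rust
/// Facts gathered about a candidate disk (for example from `lsblk` on Linux).
/// Hypothetical struct for illustration.
#[derive(Debug, Default, Clone)]
pub struct DiskFacts {
    pub has_partitions: bool,
    pub has_filesystem: bool,
    pub is_mounted: bool,
    pub is_root_disk: bool,
    /// Set when the probe could not classify the disk with confidence.
    pub ambiguous: bool,
}

/// Refuse pool creation on a busy or ambiguous disk unless forced.
pub fn check_create_device(disk: &DiskFacts, force: bool) -> Result<(), String> {
    let busy = disk.has_partitions
        || disk.has_filesystem
        || disk.is_mounted
        || disk.is_root_disk
        || disk.ambiguous;
    if busy && !force {
        Err("disk is not empty or could not be classified; pass --force to override".to_string())
    } else {
        Ok(())
    }
}
```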
|
||||
|
||||
### Single unprivileged service user
|
||||
|
||||
The service runs as `_clawdie` on both platforms. On FreeBSD the user is created
|
||||
with `pw useradd -s /usr/sbin/nologin -d /var/db/clawdie` and exit code `65`
|
||||
(already exists) is treated as a skip. On Linux `useradd --system` is used. The
|
||||
state directories are then chowned to that user.
|
||||
|
||||
→ `crates/clawdie/src/platform.rs`
|
||||
|
||||
### Platform-specific service managers, same spec
|
||||
|
||||
`Platform` is an internal trait. The two implementations differ only in how
|
||||
they install and enable the unit:
|
||||
|
||||
- FreeBSD: writes `/usr/local/etc/rc.d/clawdie`, uses `sysrc clawdie_enable=YES`.
|
||||
- Linux: writes `/etc/systemd/system/clawdie.service`, runs `systemctl enable --now
|
||||
clawdie`.
|
||||
|
||||
Both use the same `ServiceSpec` (binary, user, data dir, service name).
|
||||
Running `apply` across platforms therefore produces the same filesystem layout
|
||||
and differs only in the service-manager wrapper.
|
||||
|
||||
→ `crates/clawdie/src/platform.rs` (`FreeBsd`, `Linux`)
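
The split between a shared spec and a per-platform wrapper can be sketched as follows. The trait methods `unit_path` and `enable_command` are hypothetical and chosen for illustration; the real `Platform` trait in `platform.rs` may expose a different surface. The paths and commands are the ones listed above.

```rust
/// Shared description of the service (fields named after the text above).
pub struct ServiceSpec {
    pub binary: String,
    pub user: String,
    pub data_dir: String,
    pub service_name: String,
}

/// Hypothetical trait: only the service-manager wrapper differs per platform.
pub trait Platform {
    fn unit_path(&self, spec: &ServiceSpec) -> String;
    fn enable_command(&self, spec: &ServiceSpec) -> Vec<String>;
}

pub struct FreeBsd;
pub struct Linux;

impl Platform for FreeBsd {
    fn unit_path(&self, spec: &ServiceSpec) -> String {
        format!("/usr/local/etc/rc.d/{}", spec.service_name)
    }
    fn enable_command(&self, spec: &ServiceSpec) -> Vec<String> {
        vec!["sysrc".to_string(), format!("{}_enable=YES", spec.service_name)]
    }
}

impl Platform for Linux {
    fn unit_path(&self, spec: &ServiceSpec) -> String {
        format!("/etc/systemd/system/{}.service", spec.service_name)
    }
    fn enable_command(&self, spec: &ServiceSpec) -> Vec<String> {
        vec![
            "systemctl".to_string(),
            "enable".to_string(),
            "--now".to_string(),
            spec.service_name.clone(),
        ]
    }
}
```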
|
||||
|
||||
### Daemon runs through the platform supervisor
|
||||
|
||||
The generated FreeBSD rc.d script execs `/usr/local/bin/colibri-daemon` through
|
||||
`/usr/sbin/daemon -u _clawdie` so the supervisor restarts on crash and the
|
||||
process drops to the unprivileged user. The systemd unit is a simple service
|
||||
with `Restart=on-failure`.
|
||||
|
||||
The installer itself does not start the daemon or stage the binary; it only
|
||||
creates the environment. The operator or package build stages
|
||||
`colibri-daemon` and then `service clawdie start`.
|
||||
|
||||
→ `docs/ISO-SERVICE-LAYOUT.md` (rc.d through daemon(8))
|
||||
|
||||
### Secrets are not written by the installer
|
||||
|
||||
The installer does not touch provider API keys. A separate provider environment
file, conventionally kept under `/usr/local/etc/colibri/`, holds secrets and is sourced by rc.d
|
||||
before the daemon starts. This keeps the installer's blast radius limited to
|
||||
ZFS, directories, users, and service files.
|
||||
|
||||
→ [vault-provision](./vault-provision.md)
|
||||
|
||||
### Steps are executed sequentially and stop on failure
|
||||
|
||||
`deploy::apply` runs each `Step` in order. `Run` steps shell out and fail on a
|
||||
non-zero exit unless the step declares allowed exit codes. `WriteFile` steps
|
||||
create parent directories, write the file, and chmod it. If any step fails,
|
||||
apply stops immediately and reports the failing command and stderr.
|
||||
|
||||
→ `crates/clawdie/src/deploy.rs`
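
A stop-on-first-failure executor with allowed exit codes can be sketched like this. It is an illustration under stated assumptions, not the code in `deploy.rs`: the field names on `Step` are hypothetical, and the error is a plain `String` for brevity. The allowed-codes field is what would let `pw useradd` exiting with `65` count as a skip.

```rust
use std::fs;
use std::os::unix::fs::PermissionsExt;
use std::path::PathBuf;
use std::process::Command;

/// Hypothetical step shape; the variant names follow the text above.
pub enum Step {
    Run { program: String, args: Vec<String>, allowed_exit_codes: Vec<i32> },
    WriteFile { path: PathBuf, contents: String, mode: u32 },
}

/// Run steps in order and stop at the first failure.
pub fn apply(steps: &[Step]) -> Result<(), String> {
    for step in steps {
        match step {
            Step::Run { program, args, allowed_exit_codes } => {
                let out = Command::new(program)
                    .args(args)
                    .output()
                    .map_err(|e| format!("{}: {}", program, e))?;
                let code = out.status.code();
                // Success, or a non-zero code the step explicitly tolerates.
                let ok = out.status.success()
                    || code.map_or(false, |c| allowed_exit_codes.contains(&c));
                if !ok {
                    return Err(format!(
                        "`{} {}` failed with {:?}: {}",
                        program,
                        args.join(" "),
                        code,
                        String::from_utf8_lossy(&out.stderr).trim()
                    ));
                }
            }
            Step::WriteFile { path, contents, mode } => {
                if let Some(parent) = path.parent() {
                    fs::create_dir_all(parent).map_err(|e| e.to_string())?;
                }
                fs::write(path, contents).map_err(|e| e.to_string())?;
                fs::set_permissions(path, fs::Permissions::from_mode(*mode))
                    .map_err(|e| e.to_string())?;
            }
        }
    }
    Ok(())
}
```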
|
||||
|
||||
## Plan shape
|
||||
|
||||
```text
clawdie plan
├── ZFS layout (or plain dirs)
│   ├── create <pool>/clawdie container
│   ├── create <pool>/clawdie/db  -> /var/db/clawdie
│   └── create <pool>/clawdie/log -> /var/log/clawdie
└── service install
    ├── create user _clawdie
    ├── chown state dirs
    ├── write service unit (rc.d / systemd)
    ├── enable service (sysrc / systemctl)
    └── [systemd] daemon-reload + start
```
|
||||
|
||||
## Typical FreeBSD install
|
||||
|
||||
```sh
# discover
clawdie discover

# preview
clawdie plan

# provision datasets, user, and rc.d service
sudo clawdie apply --yes

# start once the colibri-daemon binary is staged
sudo service clawdie start
```
|
||||
|
||||
## Cross-link to runtime paths
|
||||
|
||||
After deployment, the service owns these paths:
|
||||
|
||||
- `/var/db/clawdie/colibri.sqlite` — SQLite coordination store
|
||||
- `/var/run/clawdie/clawdie.sock` — daemon Unix socket
|
||||
- `/var/log/clawdie/daemon.log` — stdout/stderr log
|
||||
- `/usr/local/etc/colibri/` — configuration and provider secrets
|
||||
|
||||
→ [store-schema](./store-schema.md)
|
||||
|
||||
→ [operator-cli](./operator-cli.md)
|
||||
138
docs/wiki/external-mcp.md
Normal file
|
|
@ -0,0 +1,138 @@
|
|||
# External MCP bridge
|
||||
|
||||
← [index](./index.md)
|
||||
|
||||
`colibri-mcp` is the Model Context Protocol bridge between Colibri and
|
||||
MCP-capable editors (Zed, Cursor, Windsurf, Claude Code). It exposes the
|
||||
current daemon state as MCP tools today and acts as a small MCP host for
|
||||
arbitrary external stdio MCP servers as a prototype.
|
||||
|
||||
## Why MCP?
|
||||
|
||||
The daemon already exposes a typed Unix-socket API through
|
||||
`crates/colibri-client`. MCP wraps that API into the standard JSON-RPC tool
|
||||
protocol that editors already speak. This avoids the maintenance cost and
|
||||
political risk of forking or embedding an editor, keeps Colibri headless-safe,
|
||||
and lets any MCP-compatible client access the same surface.
|
||||
|
||||
For the longer-term product framing, see `../CLAWDIE-STUDIO-PROPOSAL.md`.
|
||||
|
||||
## Two roles in one binary
|
||||
|
||||
`colibri-mcp` serves as both:
|
||||
|
||||
1. **MCP server for Colibri** — presents tools such as `colibri_status`,
|
||||
`colibri_snapshot`, `colibri_list_tasks`, `colibri_create_task`, etc.
|
||||
2. **MCP host for external servers** — reads a registry file, spawns configured
|
||||
   server processes, and proxies `tools/list` and `tools/call` to them.
|
||||
|
||||
Separating these roles would create a second binary for little gain; hosting
|
||||
external servers is gated so the default surface stays read-only.
|
||||
|
||||
## Daemon socket resolution
|
||||
|
||||
The MCP server must reach the daemon. The socket path is resolved in order:
|
||||
|
||||
1. `--socket` CLI flag
|
||||
2. `COLIBRI_MCP_SOCKET`
|
||||
3. `COLIBRI_DAEMON_SOCKET`
|
||||
4. `DaemonConfig::from_env().socket_path` (env-driven defaults)
|
||||
|
||||
This mirrors how the operator CLI and TUI resolve the same socket.
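
The precedence above can be sketched as a small pure function. `resolve_mcp_socket` is a hypothetical name, the environment lookup is injected as a closure so the order can be tested without touching the process environment, and `config_default` stands in for whatever `DaemonConfig::from_env().socket_path` yields. How the real code treats empty variables is not stated here, so this sketch accepts any set value.

```rust
/// Resolve the daemon socket path: flag, then the two environment
/// variables, then the config default. Hypothetical helper.
pub fn resolve_mcp_socket(
    cli_flag: Option<&str>,
    env: impl Fn(&str) -> Option<String>,
    config_default: &str,
) -> String {
    cli_flag
        .map(str::to_string)
        .or_else(|| env("COLIBRI_MCP_SOCKET"))
        .or_else(|| env("COLIBRI_DAEMON_SOCKET"))
        .unwrap_or_else(|| config_default.to_string())
}
```

In a binary the closure would typically be `|key| std::env::var(key).ok()`.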
|
||||
|
||||
## Colibri tools and gates
|
||||
|
||||
| Tool                    | Default     | Gate                              |
| ----------------------- | ----------- | --------------------------------- |
| `colibri_status`        | read-only   | none                              |
| `colibri_snapshot`      | read-only   | none                              |
| `colibri_list_tasks`    | read-only   | none                              |
| `colibri_list_skills`   | read-only   | none                              |
| `colibri_create_task`   | write-gated | `COLIBRI_MCP_WRITE=1` / `--write` |
| `colibri_intake_task`   | write-gated | `COLIBRI_MCP_WRITE=1` / `--write` |
| `colibri_set_cost_mode` | write-gated | `COLIBRI_MCP_WRITE=1` / `--write` |
|
||||
|
||||
The default ISO posture is read-only. Mutating commands require the operator to
|
||||
opt in explicitly, which prevents an assistant from creating tasks or switching
|
||||
cost mode by accident.
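
The gate table reduces to a small check, sketched below. `tool_allowed` is a hypothetical function, and denying unknown tool names is an assumption; the tool names and the single write switch come from the table.

```rust
/// Decide whether a Colibri MCP tool may run.
/// `write_enabled` reflects `COLIBRI_MCP_WRITE=1` or `--write`.
pub fn tool_allowed(tool: &str, write_enabled: bool) -> bool {
    match tool {
        // Read-only tools are always available.
        "colibri_status" | "colibri_snapshot" | "colibri_list_tasks" | "colibri_list_skills" => {
            true
        }
        // Mutating tools need the explicit opt-in.
        "colibri_create_task" | "colibri_intake_task" | "colibri_set_cost_mode" => write_enabled,
        // Assumption: anything not in the table is denied.
        _ => false,
    }
}
```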
|
||||
|
||||
## External MCP host
|
||||
|
||||
The prototype external-host tools are always exposed but only allow calling an
|
||||
external tool when the separate `COLIBRI_MCP_EXTERNAL_CALL=1` / `--external-call`
|
||||
flag is set.
|
||||
|
||||
### Registry
|
||||
|
||||
External servers are configured from a JSON registry. Default path:
|
||||
`/usr/local/etc/colibri/external-mcp.json`. Override with
|
||||
`COLIBRI_MCP_EXTERNAL_CONFIG` or `--external-config`.
|
||||
|
||||
Each entry declares a command, args, optional env, and optional jail
|
||||
confinement:
|
||||
|
||||
```json
{
  "servers": {
    "demo": {
      "command": "/usr/local/bin/demo-mcp-server",
      "args": ["--stdio"],
      "env": { "DEMO_MODE": "1" },
      "jail": { "name": "mcp0", "root_path": "/usr/local/bastille/jails/mcp0/root" }
    }
  }
}
```
|
||||
|
||||
### Confinement
|
||||
|
||||
External MCP servers execute arbitrary code on the operator machine, so they
|
||||
reuse the same jail primitive as agent spawning:
|
||||
`colibri_daemon::spawner::{prepare_spawn_command, jail_wrap, JailConfig, PrivMode}`.
|
||||
|
||||
- `jail.name` enters an existing persistent jail via `jexec`.
|
||||
- `jail.root_path` creates an ephemeral jail for the duration of the call.
|
||||
- Omitting `jail` runs the server on the host, but stdin/stdout framing is the
|
||||
same either way.
|
||||
|
||||
The root-only jail step honors the shared `COLIBRI_JAIL_PRIV_MODE` policy (`mdo`
|
||||
on the operator USB, `helper` on deployed hosts). See [jail-confinement](./jail-confinement.md).
|
||||
|
||||
### Request lifecycle
|
||||
|
||||
Every external `tools/list` or `tools/call` request:
|
||||
|
||||
1. Spawns a fresh process (`ExternalMcpSession::start`) using the shared spawner.
|
||||
2. Runs the MCP `initialize` handshake with protocol version `2024-11-05`.
|
||||
3. Sends `tools/list` or `tools/call`, reads the response over newline-delimited
|
||||
JSON, and returns the result.
|
||||
4. Kills the child and removes the staged cleanup directory.
|
||||
|
||||
This is intentionally simple: one process per request, no connection pool, no
|
||||
streaming, no long-lived state. It is good enough for prototyping; a production
|
||||
host should add policy, audit logging, secret management, and per-tool
|
||||
permissions.
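
To make the framing in steps 2 and 3 concrete, the sketch below builds the newline-delimited JSON-RPC frames a host would write to the child's stdin. The function names and the `clientInfo` values are hypothetical. The `notifications/initialized` message is part of the MCP handshake as specified for protocol version `2024-11-05`; whether `ExternalMcpSession` sends it is not stated on this page.

```rust
pub const PROTOCOL_VERSION: &str = "2024-11-05";

/// Terminate one JSON message with a single newline (the stdio framing).
fn frame(mut body: String) -> String {
    body.push('\n');
    body
}

/// `initialize` request; the clientInfo values are placeholders.
pub fn initialize_frame(id: u64) -> String {
    frame(format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"initialize","params":{{"protocolVersion":"{}","capabilities":{{}},"clientInfo":{{"name":"colibri-mcp","version":"0.0.0"}}}}}}"#,
        id, PROTOCOL_VERSION
    ))
}

/// Notification sent by the client after the `initialize` response arrives.
pub fn initialized_notification() -> String {
    frame(r#"{"jsonrpc":"2.0","method":"notifications/initialized"}"#.to_string())
}

/// `tools/list` request with no parameters.
pub fn tools_list_frame(id: u64) -> String {
    frame(format!(r#"{{"jsonrpc":"2.0","id":{},"method":"tools/list"}}"#, id))
}
```

Because each message is a single line, the reader side only needs to read up to the next newline and parse that line as JSON.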
|
||||
|
||||
## Why separate `COLIBRI_MCP_WRITE` and `COLIBRI_MCP_EXTERNAL_CALL`
|
||||
|
||||
`COLIBRI_MCP_WRITE` gates mutations against the local Colibri daemon. External
|
||||
tool calls execute arbitrary third-party binaries and therefore live on a
|
||||
different trust surface. Requiring two separate opt-ins makes accidental
|
||||
privilege escalation harder.
|
||||
|
||||
## Limits and open questions
|
||||
|
||||
- stdio transport only
|
||||
- one external process per request
|
||||
- no server/tool allowlist beyond the registry file
|
||||
- no streaming tool results
|
||||
- no production secret manager integration
|
||||
|
||||
Those limits are recorded as explicitly accepted for now; if the prototype is
|
||||
promoted to default ISO behavior, each limit should be addressed.
|
||||
|
||||
## See also
|
||||
|
||||
- [jail-confinement](./jail-confinement.md) — jail policy reused for external MCP servers
|
||||
- [cost-model](./cost-model.md) — cost mode and the write-gated `colibri_set_cost_mode`
|
||||
- [skills-catalog](./skills-catalog.md) — read-only skill catalog exposed via `colibri_list_skills`
|
||||
|
|
@ -43,13 +43,22 @@ warning.
|
|||
|
||||
## Pages
|
||||
|
||||
| Page | What it covers |
|
||||
| ----------------------------------------- | --------------------------------------------------------------------------------------------- |
|
||||
| [agent-harness](./agent-harness.md) | The zot (agent) + Colibri (control plane) split; autospawn + RPC driver |
|
||||
| [cost-model](./cost-model.md) | Byte-stable prefixes, cache-hit metering, auto-escalation, T14 compaction |
|
||||
| [glasspane](./glasspane.md) | Agent state machine, JSONL streaming, AgentRuntime taxonomy, snapshot API |
|
||||
| [jail-confinement](./jail-confinement.md) | Persistent vs ephemeral jails, priv-mode policy, reuse of spawner confinement for MCP servers |
|
||||
| [mother-hive](./mother-hive.md) | Mother MCP architecture — forced-command SSH, single-home-in-colibri, peer auth, key-on-seed |
|
||||
| [naming-decisions](./naming-decisions.md) | Ledger of harness-neutral / architecture renames — shipped and in-flight |
|
||||
| [task-board](./task-board.md) | Capability match scoring, cron scheduling, intake drain, SQLite backing |
|
||||
| [quality-gates](./quality-gates.md) | `ci-checks.sh` as the pre-merge gate; why drift reached `main` before |
|
||||
| Page | What it covers |
|
||||
| ------------------------------------------- | --------------------------------------------------------------------------------------------- |
|
||||
| [agent-harness](./agent-harness.md) | The zot (agent) + Colibri (control plane) split; autospawn + RPC driver |
|
||||
| [cost-model](./cost-model.md) | Byte-stable prefixes, cache-hit metering, auto-escalation, T14 compaction |
|
||||
| [glasspane](./glasspane.md) | Agent state machine, JSONL streaming, AgentRuntime taxonomy, snapshot API |
|
||||
| [jail-confinement](./jail-confinement.md) | Persistent vs ephemeral jails, priv-mode policy, reuse of spawner confinement for MCP servers |
|
||||
| [mother-hive](./mother-hive.md) | Mother MCP architecture — forced-command SSH, single-home-in-colibri, peer auth, key-on-seed |
|
||||
| [naming-decisions](./naming-decisions.md) | Ledger of harness-neutral / architecture renames — shipped and in-flight |
|
||||
| [task-board](./task-board.md) | Capability match scoring, cron scheduling, intake drain, SQLite backing |
|
||||
| [quality-gates](./quality-gates.md) | `ci-checks.sh` as the pre-merge gate; why drift reached `main` before |
|
||||
| [contracts](./contracts.md) | Stable JSON schemas (run-manifest, runtime-inventory, provider-smoke), golden tests |
|
||||
| [store-schema](./store-schema.md) | SQLite coordination schema and migration discipline |
|
||||
| [external-mcp](./external-mcp.md) | MCP bridge for editors + external stdio MCP host; read/write/external-call gates |
|
||||
| [operator-cli](./operator-cli.md) | The `colibri` CLI as a thin typed Unix-socket client over the daemon API |
|
||||
| [tui](./tui.md) | Terminal dashboard client (colibri-tui) vs the colibri-glasspane state machine |
|
||||
| [runtime-inventory](./runtime-inventory.md) | Host runtime inventory + watchdog status reader; additive, read-only integrations |
|
||||
| [skills-catalog](./skills-catalog.md) | Read-only runtime consumer for reviewed Clawdie-AI skill artifacts |
|
||||
| [vault-provision](./vault-provision.md) | Vaultwarden-driven env-file provisioning into jails after agent spawn |
|
||||
| [deployment](./deployment.md) | Host installer (clawdie): ZFS layout, rc.d/systemd service, dry-run safety |
|
||||
|
|
|
|||
124
docs/wiki/operator-cli.md
Normal file
|
|
@ -0,0 +1,124 @@
|
|||
# Operator CLI (`colibri`)
|
||||
|
||||
← [index](./index.md)
|
||||
|
||||
The `colibri` binary is the operator's command-line interface to the daemon.
|
||||
It wraps a typed Unix-socket client (`DaemonClient`) and turns typed commands
|
||||
into newline-delimited JSON messages on the control-plane socket. It is not
|
||||
where policy lives — policy lives in the daemon behind the socket.
|
||||
|
||||
## Job of the CLI
|
||||
|
||||
The CLI has two responsibilities:
|
||||
|
||||
1. **Parse shell input** into strongly-typed commands.
|
||||
2. **Send those commands** to the daemon and print the JSON response.
|
||||
|
||||
It does not contain business logic about session compaction, task scheduling,
|
||||
or jail confinement. That keeps the CLI small and lets any other client (TUI,
|
||||
MCP bridge, web dashboard, tests) perform the same operations with the same
|
||||
protocol.
|
||||
|
||||
→ `crates/colibri-client/src/bin/colibri.rs` (argument parsing and `run` dispatch)
|
||||
|
||||
→ `crates/colibri-client/src/lib.rs` (`DaemonClient` request/response wrapper)
|
||||
|
||||
## Decisions
|
||||
|
||||
### One binary, one socket, one protocol
|
||||
|
||||
Every command — `status`, `snapshot`, `spawn-agent`, `create-task`,
|
||||
`register-tenant` — goes over the same Unix socket. The CLI builds a
|
||||
`DaemonClient`, serializes a `ColibriCommand`, writes one line ending in `\n`,
|
||||
and reads one `ColibriResponse` line back.
|
||||
|
||||
Because the protocol is newline-delimited JSON, operators can still debug with
|
||||
`nc -U` or similar when the CLI is not enough. The socket is the stable API;
|
||||
the CLI is a polished client.
|
||||
|
||||
→ `crates/colibri-daemon/src/lib.rs` (`ColibriCommand`, `ColibriResponse`)
|
||||
|
||||
→ `crates/colibri-daemon/src/socket.rs` (dispatch table)
|
||||
|
||||
### Socket resolution order matches other clients
|
||||
|
||||
The CLI resolves the daemon socket the same way the TUI and MCP bridge do:
|
||||
|
||||
1. `--socket PATH`
|
||||
2. `COLIBRI_DAEMON_SOCKET`
|
||||
3. `DaemonConfig::from_env().socket_path`
|
||||
|
||||
Sharing the resolution order means documentation, environment setup scripts,
|
||||
and operator muscle memory apply to every client.
|
||||
|
||||
→ `crates/colibri-client/src/bin/colibri.rs` (`default_socket_path`)
|
||||
|
||||
### No write-gating inside the CLI itself
|
||||
|
||||
Commands that mutate state (`create-task`, `kill-agent`, `set-cost-mode`,
|
||||
`register-tenant`) are not blocked by CLI flags. The gate is the Unix socket
|
||||
itself: the daemon is configured to listen on a unix socket with operator-only
|
||||
permissions, and the daemon validates each command. This avoids two parallel
|
||||
permission layers that could drift out of sync.
|
||||
|
||||
This is an intentional contrast with `colibri-mcp`, which exposes the daemon to
|
||||
editor assistants and therefore uses `COLIBRI_MCP_WRITE=1` as an explicit trust
|
||||
switch. An operator at the shell already has that trust by virtue of the socket.
|
||||
|
||||
→ [external-mcp](./external-mcp.md)
|
||||
|
||||
### Commands return JSON, not human prose
|
||||
|
||||
All successful CLI commands print pretty-printed JSON. This keeps the output
|
||||
scriptable (`colibri snapshot | jq '.panes[] | select(.state == "working")'`)
|
||||
and consistent with the socket protocol. If a command fails, the CLI prints the
|
||||
daemon's error message to stderr and exits non-zero.
|
||||
|
||||
→ `crates/colibri-client/src/lib.rs` (`request`, error handling)
|
||||
|
||||
### `spawn-agent` accepts jail confinement directly
|
||||
|
||||
The `--jail-name` and `--jail-root` flags on `spawn-local` and `spawn-agent`
|
||||
build a `JailConfig` that is sent to the daemon. The same type is re-exported
|
||||
from `colibri-daemon::spawner` so the CLI crate does not have to depend on the
|
||||
daemon crate just to build a config.
|
||||
|
||||
Pairing `--jail-name` with `--jail-root` is the only path that triggers vault
|
||||
provisioning after a spawn, because the daemon needs both the jail identity
|
||||
and the host-visible jail root.
|
||||
|
||||
→ `crates/colibri-client/src/lib.rs` (`JailConfig` re-export)
|
||||
|
||||
→ `crates/colibri-daemon/src/spawner.rs`
|
||||
|
||||
### Local sample agent lives next door
|
||||
|
||||
The same crate also ships `colibri-test-agent`, a tiny sample binary used by
|
||||
tests and the TUI's spawn shortcut. Keeping it in `colibri-client` keeps the
|
||||
sample close to its primary caller without adding a new crate.
|
||||
|
||||
→ `crates/colibri-client/src/bin/colibri_test_agent.rs`
|
||||
|
||||
## Notable commands
|
||||
|
||||
| Command                                                          | Purpose                           |
| ---------------------------------------------------------------- | --------------------------------- |
| `status`                                                         | daemon health, paths, cost mode   |
| `snapshot` / `glasspane-snapshot`                                | current pane radar view           |
| `list-sessions`                                                  | active agent sessions             |
| `spawn-local` / `spawn-agent`                                    | start an agent, optionally jailed |
| `kill AGENT_ID`                                                  | terminate a pane/agent            |
| `create-task` / `intake-task` / `claim-task` / `transition-task` | task-board workflow               |
| `set-cost-mode MODE`                                             | acknowledge/toggle cost mode      |
| `register-tenant` / `list-tenants`                               | vault provisioning bookkeeping    |
| `register-skill` / `list-skills`                                 | skill catalog maintenance         |
| `register-agent` / `list-agents`                                 | agent capability registration     |
|
||||
|
||||
## See also
|
||||
|
||||
- [tui](./tui.md) — the live terminal dashboard that uses the same `DaemonClient`
|
||||
- [glasspane](./glasspane.md) — the pane state machine behind `snapshot`
|
||||
- [task-board](./task-board.md) — commands that manipulate the task board
|
||||
- [store-schema](./store-schema.md) — SQLite entities queried by the CLI
|
||||
- [vault-provision](./vault-provision.md) — why `register-tenant` carries a jail root path
|
||||
- [external-mcp](./external-mcp.md) — another daemon client with write-gating
|
||||
88
docs/wiki/runtime-inventory.md
Normal file
|
|
@ -0,0 +1,88 @@
|
|||
# Runtime inventory and host status
|
||||
|
||||
← [index](./index.md)
|
||||
|
||||
Colibri discovers the host in two complementary ways:
|
||||
|
||||
1. **Runtime inventory** — a one-shot probe that reports versions installed on
|
||||
the machine (`node`, `npm`, `pi`, `zot`, package manager, OS, etc.).
|
||||
2. **Watchdog host status** — a read-only, newline-framed Unix-socket call to
|
||||
the Clawdie watchdog that returns live health metrics.
|
||||
|
||||
Both are intentionally additive: they read from Clawdie, they do not change
|
||||
it. This page records the design of those read-only integrations.
|
||||
|
||||
## Runtime inventory probe
|
||||
|
||||
The `colibri-runtime-inventory` binary (`src/bin/runtime_inventory.rs`) emits a
|
||||
single JSON object matching the `clawdie.runtime-version-inventory.v1` schema
|
||||
from `crates/colibri-contracts/src/lib.rs`.
|
||||
|
||||
### Detection strategy
|
||||
|
||||
| Field             | How it is detected                                                                                     |
| ----------------- | ------------------------------------------------------------------------------------------------------ |
| `host`            | `COLIBRI_HOST` → `HOSTNAME` → `hostname` command → `"unknown"`                                         |
| `os`              | `uname -sr` + target architecture; falls back to `std::env::consts`                                    |
| `node`            | `node --version`                                                                                       |
| `npm`             | `npm --version`                                                                                        |
| `npm_prefix`      | `npm config get prefix`                                                                                |
| `package_manager` | `pkg` on FreeBSD, otherwise `apt` / `dnf` / `brew`                                                     |
| `pi`              | `PI_BIN` → `~/.npm-global/bin/pi` → `pi --version` → package.json of `@earendil-works/pi-coding-agent` |
| `zot`             | `ZOT_BIN` → `zot --version` across PATH and candidate locations                                        |
|
||||
|
||||
### Why this shape
|
||||
|
||||
- `pi` is an npm package installed in `node_modules`, so version detection must
|
||||
fall back to reading its package manifest when `--version` is missing.
|
||||
- `zot` is a single Go binary, so a plain `--version` probe is correct.
|
||||
- `pi`/`zot` are optional; a host that only runs one agent runtime should
|
||||
still produce a valid inventory.
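
The fallback chains in the detection table all follow the same shape. As an example, the `host` chain might be sketched as below; `detect_host` is a hypothetical function with both probes injected as closures, and skipping empty or whitespace-only values is an assumption rather than documented behavior.

```rust
/// Resolve the host name: COLIBRI_HOST, then HOSTNAME, then the
/// `hostname` command, then the literal "unknown".
pub fn detect_host(
    env: impl Fn(&str) -> Option<String>,
    hostname_cmd: impl Fn() -> Option<String>,
) -> String {
    // Assumption: blank values count as missing.
    let non_empty = |v: Option<String>| {
        v.map(|s| s.trim().to_string()).filter(|s| !s.is_empty())
    };
    non_empty(env("COLIBRI_HOST"))
        .or_else(|| non_empty(env("HOSTNAME")))
        .or_else(|| non_empty(hostname_cmd()))
        .unwrap_or_else(|| "unknown".to_string())
}
```

A caller could pass `|k| std::env::var(k).ok()` for `env` and a closure that runs `hostname` through `std::process::Command` for `hostname_cmd`.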
|
||||
|
||||
## Watchdog host status
|
||||
|
||||
`crates/colibri-runtime/src/lib.rs` implements the watchdog reader. It connects
|
||||
over a Unix domain socket, sends `{"cmd":"status"}\n`, reads back one
|
||||
newline-terminated JSON line, and normalizes the response into `HostStatus`.
|
||||
|
||||
### Socket path resolution
|
||||
|
||||
The search order lets operators, services, and test harnesses override the
|
||||
socket location without recompiling:
|
||||
|
||||
1. `COLIBRI_WATCHDOG_SOCKET` (explicit override)
|
||||
2. `COLIBRI_SERVICE_NAME` (default `clawdie`) → `{service}-watchdog.sock`
|
||||
3. `TMP_IPC_DIR/{service}-watchdog.sock`
|
||||
4. `AGENT_TMP_DIR/ipc/{service}-watchdog.sock` or
|
||||
`CLAWDIE_TMP_DIR/ipc/{service}-watchdog.sock`
|
||||
5. `$HOME/clawdie-ai/tmp/ipc/{service}-watchdog.sock`
|
||||
6. `tmp/ipc/{service}-watchdog.sock` (final fallback)
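
One possible reading of this search order is sketched below. It assumes that the first configured variable wins without checking whether the socket file exists, and that `AGENT_TMP_DIR` is consulted before `CLAWDIE_TMP_DIR`; neither detail is confirmed by this page. `watchdog_socket_path` is a hypothetical name.

```rust
use std::path::PathBuf;

/// Resolve the watchdog socket path from an injected environment lookup.
pub fn watchdog_socket_path(env: impl Fn(&str) -> Option<String>) -> PathBuf {
    // 1. Explicit override wins outright.
    if let Some(explicit) = env("COLIBRI_WATCHDOG_SOCKET") {
        return PathBuf::from(explicit);
    }
    // 2. The service name only determines the file name.
    let service = env("COLIBRI_SERVICE_NAME").unwrap_or_else(|| "clawdie".to_string());
    let file = format!("{}-watchdog.sock", service);
    // 3. Dedicated IPC directory.
    if let Some(dir) = env("TMP_IPC_DIR") {
        return PathBuf::from(dir).join(file);
    }
    // 4. Agent or Clawdie tmp directory, with an `ipc` subdirectory.
    if let Some(tmp) = env("AGENT_TMP_DIR").or_else(|| env("CLAWDIE_TMP_DIR")) {
        return PathBuf::from(tmp).join("ipc").join(file);
    }
    // 5. Checkout under the home directory.
    if let Some(home) = env("HOME") {
        return PathBuf::from(home).join("clawdie-ai/tmp/ipc").join(file);
    }
    // 6. Relative final fallback.
    PathBuf::from("tmp/ipc").join(file)
}
```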
|
||||
|
||||
### Wire protocol
|
||||
|
||||
- Framing: one line, newline-terminated.
|
||||
- Request: `{"cmd":"status"}\n`.
|
||||
- Expected response: `{"ok": true, "data": { ... watchdog host fields ... }}`.
|
||||
- Timeout: 2 seconds by default, overridable in `WatchdogReadOptions`.
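
The request/response exchange described by these rules can be sketched with the standard library alone. `read_status_line` is a hypothetical helper, not the reader in `colibri-runtime`; it returns the raw response line and leaves JSON parsing and normalization to the caller.

```rust
use std::io::{BufRead, BufReader, Write};
use std::os::unix::net::UnixStream;
use std::time::Duration;

/// Send one status request and read one newline-terminated response line.
pub fn read_status_line(stream: &UnixStream, timeout: Duration) -> std::io::Result<String> {
    // Bound both directions so a stuck watchdog cannot hang the caller.
    stream.set_read_timeout(Some(timeout))?;
    stream.set_write_timeout(Some(timeout))?;

    // `&UnixStream` implements `Write`, so a shared reference is enough.
    let mut writer = stream;
    writer.write_all(b"{\"cmd\":\"status\"}\n")?;

    let mut line = String::new();
    BufReader::new(stream).read_line(&mut line)?;
    Ok(line.trim_end().to_string())
}
```

A caller would connect with `UnixStream::connect(path)` and pass `Duration::from_secs(2)` to match the documented default.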
|
||||
|
||||
### Normalization rules
|
||||
|
||||
`normalize_watchdog_status()` in `crates/colibri-runtime/src/lib.rs` is defensive:
|
||||
|
||||
- Missing fields default to `"unknown"` for strings, `0` for counters, and
|
||||
`false` for booleans.
|
||||
- `controlplane_status` is lifted from `controlplane.overallStatus`.
|
||||
- The original raw object is preserved under `HostStatus.raw` so callers can
|
||||
access fields Colibri does not yet model.
|
||||
|
||||
## Golden fixtures
|
||||
|
||||
`crates/colibri-contracts/tests/golden.rs` parses committed inventory and
|
||||
host-status manifests in `manifests/` and round-trips them through the Rust
|
||||
structs. Those fixtures come from real hosts (`osa`, `domedog`, `debby`, the
|
||||
operator USB) and are treated as cross-platform source material.
|
||||
|
||||
## See also
|
||||
|
||||
- [contracts](./contracts.md) — stable schemas for inventory and host-status.
|
||||
- [cost-model](./cost-model.md) — how runtime inventory feeds cost decisions.
|
||||
161
docs/wiki/skills-catalog.md
Normal file
|
|
@ -0,0 +1,161 @@
|
|||
# Skills catalog
|
||||
|
||||
← [index](./index.md)
|
||||
|
||||
`colibri-skills` is Colibri's read-only runtime consumer for Clawdie-AI skill
|
||||
artifacts. Clawdie-AI authors and reviews the skillpacks; Colibri indexes
|
||||
them, validates checksums, chunks searchable text, and exposes typed structs to
|
||||
the daemon, CLI, and TUI. This crate does not author skills.
|
||||
|
||||
→ `crates/colibri-skills/src/lib.rs`
|
||||
|
||||
→ `docs/COLIBRI-SKILLS-PLAN.md`
|
||||
|
||||
## Decisions
|
||||
|
||||
### Source of truth stays in Clawdie-AI
|
||||
|
||||
Skill artifacts live in the `clawdie-ai` repository, not in `colibri`. They are
|
||||
committed reviewed directories containing prose, screenshots, transcripts,
|
||||
scripts, a manifest, and a checksum file. `colibri-skills` imports these
|
||||
artifacts into Colibri's SQLite store at runtime.
|
||||
|
||||
This split preserves review discipline: a skill changes through a PR in its
|
||||
home repo, then Colibri re-indexes the checkout.
|
||||
|
||||
### Read-only, not authoring

The crate deliberately lacks "create skill" or "edit skill" operations. Those
belong in Clawdie-AI where human review and media pipelines run. Putting
authoring here would duplicate state and split review authority.

The import path is the Phase 1 target: scan the configured Clawdie-AI checkout,
parse manifests, verify checksums, and upsert into SQLite. The type scaffold
exists today; the importer, chunker, and FTS5 index are planned.

→ `docs/COLIBRI-SKILLS-PLAN.md` (Phases 1-7)

### Manifest-driven identity

Each skill directory contains a run manifest file. From it the importer derives:

- `skill_id`
- `display_name`
- `source_path` within the Clawdie-AI checkout
- pipeline stages and models used
- source media metadata

Any file not listed in the manifest can still be classified and indexed as an
artifact, but the manifest is the canonical identity document.

### Artifact classification by extension and filename

`ArtifactType::from_path` classifies files without relying on a sidecar:

- Python or shell files → `Script`
- paths containing `contact_sheet` → `ContactSheet`
- paths containing `run_manifest` and ending in `.json` → `Manifest`
- paths containing `sha256` or `checksum` → `Checksum`
- paths containing `report` and ending in `.json` → `Report`
- `.md` → `Document`
- `.jpg` / `.png` / `.webp` → `Image`
- `.txt` transcript files → `Transcript`
- anything else → `Other`

This heuristic keeps classification local and fast. Misclassified files can be
fixed by renaming within Clawdie-AI.

→ `crates/colibri-skills/src/lib.rs` (`ArtifactType::from_path`)
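
The rule list above can be sketched as a single function. This is an illustration only, not the code in `lib.rs`: the rule order follows the list, while the `py`/`sh` extensions for scripts, the case-insensitive matching, and the reading of ".txt transcript files" as "a `.txt` path containing `transcript`" are assumptions.

```rust
use std::path::Path;

/// Artifact kinds named in the classification list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArtifactType {
    Script,
    ContactSheet,
    Manifest,
    Checksum,
    Report,
    Document,
    Image,
    Transcript,
    Other,
}

impl ArtifactType {
    /// Classify by file name and extension only; file contents are never read.
    pub fn from_path(path: &Path) -> Self {
        // Assumption: matching is case-insensitive.
        let lower = path.to_string_lossy().to_lowercase();
        let ext = path
            .extension()
            .and_then(|e| e.to_str())
            .map(|e| e.to_lowercase())
            .unwrap_or_default();
        let ext = ext.as_str();

        // Assumption: "Python or shell files" means `.py` and `.sh`.
        if ext == "py" || ext == "sh" {
            return ArtifactType::Script;
        }
        // Name-based rules run before the generic extension rules, so
        // `contact_sheet.jpg` is a contact sheet rather than a plain image.
        if lower.contains("contact_sheet") {
            return ArtifactType::ContactSheet;
        }
        if lower.contains("run_manifest") && ext == "json" {
            return ArtifactType::Manifest;
        }
        if lower.contains("sha256") || lower.contains("checksum") {
            return ArtifactType::Checksum;
        }
        if lower.contains("report") && ext == "json" {
            return ArtifactType::Report;
        }
        match ext {
            "md" => ArtifactType::Document,
            "jpg" | "png" | "webp" => ArtifactType::Image,
            "txt" if lower.contains("transcript") => ArtifactType::Transcript,
            _ => ArtifactType::Other,
        }
    }
}
```

Because the rules are ordered, renaming a file in Clawdie-AI is enough to move it into a different class, which is the correction path the section describes.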

### Checksums are validated, then stored

The run manifest is accompanied by a checksum file. At import time the runtime
computes SHA-256 of each artifact and compares it to the committed checksum.
Failures are reported in `ImportSummary::checksum_failures` and prevent
`success()`.

Only the hash is stored in SQLite; image and media blobs stay on disk. The
catalog stores relative paths and hashes, not the binary content.
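
A rough sketch of how the comparison and the summary might fit together. Only `ImportSummary`, `checksum_failures`, and `success()` come from the text; the `artifacts_checked` field, the `compare_checksums` helper, and the map-based inputs are hypothetical. The SHA-256 computation itself is left out because the Rust standard library has no hash implementation for it; the sketch assumes the hex digests were already computed.

```rust
use std::collections::BTreeMap;

/// Hypothetical shape: only `checksum_failures` and `success()` are named in the text.
#[derive(Debug, Default)]
pub struct ImportSummary {
    pub artifacts_checked: usize,
    /// Relative paths whose computed hash did not match the committed one.
    pub checksum_failures: Vec<String>,
}

impl ImportSummary {
    /// An import succeeds only when no checksum failed.
    pub fn success(&self) -> bool {
        self.checksum_failures.is_empty()
    }
}

/// Compare committed hashes against hashes computed at import time.
/// Both maps go from relative path to hex-encoded SHA-256.
pub fn compare_checksums(
    committed: &BTreeMap<String, String>,
    computed: &BTreeMap<String, String>,
) -> ImportSummary {
    let mut summary = ImportSummary::default();
    for (path, expected) in committed {
        summary.artifacts_checked += 1;
        match computed.get(path) {
            Some(actual) if actual.eq_ignore_ascii_case(expected) => {}
            // A missing file and a different hash both count as failures.
            _ => summary.checksum_failures.push(path.clone()),
        }
    }
    summary
}
```

Keeping the failed paths in the summary, rather than aborting on the first mismatch, lets one import run report every damaged artifact at once.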

### Content is chunked into searchable units

The planned chunker turns skill content into `SkillChunk` rows:

- Markdown sections by heading
- Command blocks
- Code blocks
- Tables
- Transcript segments

Chunks are the unit of search and the unit shown in TUI or CLI results.
`SkillChunk` carries `line_start`/`line_end` so a hit can point back to the
source artifact.

→ `crates/colibri-skills/src/lib.rs` (`SkillChunk`, `ChunkType`)
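
Since the chunker is still planned, the following is only one possible shape for the "Markdown sections by heading" case. The `SkillChunk` field names are taken from the entity diagram on this page; `chunk_by_heading`, the 1-based inclusive line numbers, and the decision to keep the heading line inside the chunk content are assumptions.

```rust
/// Trimmed-down chunk: just the fields needed to point back at the source.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SkillChunk {
    pub heading: Option<String>,
    pub content: String,
    pub line_start: usize, // 1-based, inclusive (assumption)
    pub line_end: usize,   // 1-based, inclusive (assumption)
}

/// True for ATX headings such as `# Title` or `### Title`.
fn is_atx_heading(line: &str) -> bool {
    let hashes = line.chars().take_while(|c| *c == '#').count();
    (1..=6).contains(&hashes) && line[hashes..].starts_with(' ')
}

/// Push the accumulated lines as one chunk, skipping blank-only runs.
fn flush(
    chunks: &mut Vec<SkillChunk>,
    heading: Option<String>,
    body: &[&str],
    start: usize,
    end: usize,
) {
    if body.iter().all(|l| l.trim().is_empty()) {
        return;
    }
    chunks.push(SkillChunk {
        heading,
        content: body.join("\n"),
        line_start: start,
        line_end: end,
    });
}

/// Split Markdown into one chunk per heading. Text before the first heading
/// becomes a chunk with `heading: None`.
pub fn chunk_by_heading(markdown: &str) -> Vec<SkillChunk> {
    let mut chunks = Vec::new();
    let mut heading: Option<String> = None;
    let mut body: Vec<&str> = Vec::new();
    let mut start = 1usize;
    let mut in_fence = false;
    let mut last = 0usize;

    for (idx, line) in markdown.lines().enumerate() {
        let line_no = idx + 1;
        last = line_no;
        // Lines starting with `#` inside a fenced block are comments, not headings.
        if line.trim_start().starts_with("```") {
            in_fence = !in_fence;
        }
        if !in_fence && is_atx_heading(line) {
            flush(&mut chunks, heading.take(), &body, start, line_no - 1);
            body.clear();
            heading = Some(line.trim_start_matches('#').trim().to_string());
            start = line_no;
        }
        body.push(line);
    }
    flush(&mut chunks, heading, &body, start, last);
    chunks
}
```

A search hit on a chunk can then be reported as `relative_path:line_start-line_end`, which is the back-pointer the paragraph above describes.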

### SQLite + FTS5 as the runtime search backend

The target schema keeps three tables:

- `system_skills` — one row per skill
- `system_skill_artifacts` — one row per file
- `system_skill_chunks` — one row per searchable chunk, plus a virtual FTS5
  table for ranked text search

This matches the store's pragmatic relational model. If skill volumes grow
beyond tens of thousands of chunks, we can move the search index to PostgreSQL
with pgvector; until then, SQLite keeps the control-plane self-contained.

→ [store-schema](./store-schema.md)

→ `docs/COLIBRI-SKILLS-PLAN.md` (SQLite schema target)

### Status is a lifecycle marker, not a state machine

`SkillStatus` is `active`, `archived`, or `superseded`. There is no pending
review state because review happens in Clawdie-AI before import. Colibri simply
stops returning archived skills in default searches but keeps them in the store
for audit and explicit lookups.
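
A small sketch of the default-search filter described above. `SkillStatus` and its three values come from the text; `SkillRow`, `visible`, and the `include_archived` switch are hypothetical names. Superseded skills are left visible here only because the text singles out archived ones, which is an assumption about the real behavior.

```rust
/// The three lifecycle markers named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SkillStatus {
    Active,
    Archived,
    Superseded,
}

impl SkillStatus {
    /// Parse the lowercase form used in the text; unknown values are rejected.
    pub fn parse(raw: &str) -> Option<Self> {
        match raw {
            "active" => Some(SkillStatus::Active),
            "archived" => Some(SkillStatus::Archived),
            "superseded" => Some(SkillStatus::Superseded),
            _ => None,
        }
    }
}

/// Hypothetical row type standing in for a stored skill.
#[derive(Debug)]
pub struct SkillRow {
    pub skill_id: String,
    pub status: SkillStatus,
}

/// Default searches hide archived skills; `include_archived` models an
/// explicit lookup or an audit query.
pub fn visible(rows: &[SkillRow], include_archived: bool) -> Vec<&SkillRow> {
    rows.iter()
        .filter(|row| include_archived || row.status != SkillStatus::Archived)
        .collect()
}
```

Nothing is deleted by the filter: the archived row stays in the slice and reappears as soon as the caller asks for it.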

### Natural-language verification question

Each skill can carry a `verification` field like "can the user create and run
an Astro project?". This is not an executable test; it is the acceptance
criterion used during skill review and later during agent self-verification.

### Runtime commands are read-only

The CLI surface is planned as:

- `colibri list-skills`
- `colibri show-skill <id>`
- `colibri search-skills <query>`
- `colibri index-skills`
- `colibri verify-skill <id>`

`index-skills` refreshes the catalog from disk. The remaining commands query the
runtime store. None mutate the Clawdie-AI checkout.

→ [operator-cli](./operator-cli.md)

## Entity shape

```text
Skill
├─ skill_id, display_name, source_path, status, verification
├─ SkillManifest
│  ├─ run_id, created, notes
│  ├─ ManifestSource
│  ├─ [PipelineStage]
│  └─ [ModelUsage]
└─ [SkillArtifact]
   ├─ artifact_type, relative_path, file_name, mime_type, size_bytes, sha256_hash
   └─ [SkillChunk]
      ├─ chunk_type, heading, content, line_start, line_end, tokens_estimate
```

## See also

- [store-schema](./store-schema.md) — coordination and planned skill catalog tables
- [operator-cli](./operator-cli.md) — planned skill catalog CLI commands
- [task-board](./task-board.md) — agents will match claimed tasks to skills by capability

140
docs/wiki/store-schema.md
Normal file
@ -0,0 +1,140 @@
# Store schema

← [index](./index.md)

Colibri's coordination store is a single SQLite database owned by the `colibri`
service. It holds the task board, the registry of agents and skills, and the
vault tenant map. It is not a cache — it is durable state. Most writes happen
through the daemon's socket API, but the schema belongs to `colibri-store`.

→ `crates/colibri-store/src/schema.rs`

→ `crates/colibri-store/src/lib.rs`

## Decisions

### SQLite, not PostgreSQL, for the control-plane store

The store is SQLite because the control plane needs a single-file database that
is easy to back up, snapshot, inspect, and ship. PostgreSQL with pgvector is
planned for retrieval/long-term memory, but the task board and agent registry do
not need a server process.

The daemon batches related writes and relies on SQLite's WAL mode for concurrent
readers. This keeps the operator stack self-contained on a small bare-metal host.

### WAL + foreign keys by default

`Store::open` runs three pragmas on every startup:

- `journal_mode=WAL` — readers don't block writers.
- `synchronous=NORMAL` — a safe middle ground between full-synchronous and OFF.
- `foreign_keys=ON` — the task/agent FK is enforced.

These are not configurable at runtime. If we ever need different durability or
concurrency guarantees, we should make it explicit rather than letting the
connection inherit defaults.

→ `crates/colibri-store/src/lib.rs` (`Store::open`)

### Idempotent migrations only

Migrations run on every `Store::open`. They use `IF NOT EXISTS` tables and
indexes, so repeated runs are safe. We do not ship downward migrations; schema
evolution adds tables and columns only. If a destructive migration is ever
needed, it must be a deliberate manual step documented in a handoff.

→ `crates/colibri-store/src/schema.rs`

### Four tables for four concerns

| Table     | Concern                 | Key entity |
| --------- | ----------------------- | ---------- |
| `tasks`   | Task board              | `Task`     |
| `agents`  | Registered teammates    | `Agent`    |
| `skills`  | Team skill catalog      | `Skill`    |
| `tenants` | Vault/secret tenant map | `Tenant`   |

Tasks carry an `agent_id` foreign key into `agents`. Every other relationship is
loose — skills are not linked to agents, and tenants are referenced by their
`tenant_id` in socket commands and provisioning hooks.

→ `crates/colibri-store/src/schema.rs`

### Task-status CHECK constraint is the source of truth

`tasks.status` is constrained to `('queued','claimed','started','done','failed')`.
The Rust `TaskStatus` enum mirrors it, but the database is the final gate. A
command that tries to insert an unknown status fails at write time.

→ `crates/colibri-store/src/schema.rs`
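
One way the enum and the constraint could be kept in step is sketched below. The five status values come from the text; the `ALL` constant, `as_str`, `parse`, and `status_check_clause` are hypothetical helpers, and the real schema may simply hard-code the clause.

```rust
/// Mirror of the five values allowed by the `tasks.status` CHECK constraint.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskStatus {
    Queued,
    Claimed,
    Started,
    Done,
    Failed,
}

impl TaskStatus {
    /// Every variant, in the order the constraint lists them.
    pub const ALL: [TaskStatus; 5] = [
        TaskStatus::Queued,
        TaskStatus::Claimed,
        TaskStatus::Started,
        TaskStatus::Done,
        TaskStatus::Failed,
    ];

    /// The exact string stored in the `status` column.
    pub fn as_str(self) -> &'static str {
        match self {
            TaskStatus::Queued => "queued",
            TaskStatus::Claimed => "claimed",
            TaskStatus::Started => "started",
            TaskStatus::Done => "done",
            TaskStatus::Failed => "failed",
        }
    }

    /// Reject unknown strings early; the database would reject them anyway.
    pub fn parse(raw: &str) -> Option<Self> {
        Self::ALL.iter().copied().find(|status| status.as_str() == raw)
    }
}

/// Render the CHECK clause from the enum so the two lists cannot drift apart.
pub fn status_check_clause() -> String {
    let values: Vec<String> = TaskStatus::ALL
        .iter()
        .map(|status| format!("'{}'", status.as_str()))
        .collect();
    format!("CHECK (status IN ({}))", values.join(","))
}
```

Even with a helper like `parse` in front, the constraint stays the final gate: a write path that bypasses the enum still fails at insert time.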

### Agent capabilities stored as JSON, not normalized

`agents.capabilities` is a JSON blob like `["code","rust","freebsd"]`. We
avoided a separate capabilities table because capability tags are just
strings, and the team registry is small. Normalized joins would add schema
complexity without improving query power.

If capability metadata grows (weights, versions, required skills), we can split
it later; the current schema intentionally stays pragmatic.

→ `crates/colibri-store/src/lib.rs` (`register_agent`)

### Tenants encode the 1:1:1 jail/vault/collection map

`tenants` stores `tenant_id`, `jail_root_path`, and `collection_id` as UNIQUE
columns. The rule is `tenant_id = jail name = Vaultwarden collection`. This
lets `colibri-vault` look up a jail by name and know exactly which host path and
Vaultwarden collection to use when writing the environment file.

The tenant `status` column tracks the lifecycle:
`provisioned → active → stopped → destroyed`. It is independent of whether the
jail process is running; lifecycle management is a separate concern.

→ `crates/colibri-store/src/schema.rs` (comments on `tenants`)

### Default database path is platform-specific

The store path resolves as follows:

- `COLIBRI_DB_PATH` if set.
- FreeBSD: `/var/db/colibri/colibri.sqlite`.
- Linux/macOS: `$XDG_DATA_HOME/colibri/colibri.sqlite`, falling back to
  `$HOME/.local/share/colibri/colibri.sqlite`, then `/tmp`.

FreeBSD defaults to `/var/db` because that is the conventional local-state
directory for services. The Linux fallback respects XDG, so development on a
workstation feels normal.

→ `crates/colibri-store/src/lib.rs` (`default_db_path`)
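
The resolution order can be sketched as a pure function so it is easy to test. This is not the body of `default_db_path`: `resolve_db_path` and `default_db_path_sketch` are hypothetical names, treating empty variables as unset is an assumption, and the file name used under `/tmp` is a guess because the text only names the directory.

```rust
use std::path::PathBuf;

/// Resolve the database path from already-read environment values.
pub fn resolve_db_path(
    override_path: Option<&str>,
    is_freebsd: bool,
    xdg_data_home: Option<&str>,
    home: Option<&str>,
) -> PathBuf {
    // 1. An explicit `COLIBRI_DB_PATH` wins on every platform.
    if let Some(path) = override_path.filter(|p| !p.is_empty()) {
        return PathBuf::from(path);
    }
    // 2. FreeBSD uses the conventional local-state directory for services.
    if is_freebsd {
        return PathBuf::from("/var/db/colibri/colibri.sqlite");
    }
    // 3. Linux/macOS: XDG data dir, then the XDG default under `$HOME`.
    if let Some(xdg) = xdg_data_home.filter(|p| !p.is_empty()) {
        return PathBuf::from(xdg).join("colibri").join("colibri.sqlite");
    }
    if let Some(home) = home.filter(|p| !p.is_empty()) {
        return PathBuf::from(home).join(".local/share/colibri/colibri.sqlite");
    }
    // 4. Last resort. Assumption: the exact file name under `/tmp` is not
    //    given in the text.
    PathBuf::from("/tmp/colibri.sqlite")
}

/// Wire the pure resolver to the real process environment.
pub fn default_db_path_sketch() -> PathBuf {
    let read = |key: &str| std::env::var(key).ok();
    let (override_path, xdg, home) =
        (read("COLIBRI_DB_PATH"), read("XDG_DATA_HOME"), read("HOME"));
    resolve_db_path(
        override_path.as_deref(),
        cfg!(target_os = "freebsd"),
        xdg.as_deref(),
        home.as_deref(),
    )
}
```

Splitting the environment reads from the decision logic keeps the platform branch testable on any host.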

### JSON export for backups and parity tests

`Store::export_json()` dumps all four tables into one JSON object. It exists
for dual-run parity diffs, ad-hoc backups, and debugging. It is not the primary
query API; most readers should use the typed methods.

## Entity relationships

```text
tasks.agent_id ----------> agents.id

tasks agents skills tenants
----- ------ ------ -------
id id id tenant_id
agent_id FK name name jail_root_path
status capabilities description collection_id
title status category status
description created_at created_at created_at
created_at updated_at
updated_at
```

## See also

- [task-board](./task-board.md) — task lifecycle and capability matching
- [operator-cli](./operator-cli.md) — socket commands that write to these tables
- [vault-provision](./vault-provision.md) — how the tenants table drives env-file provisioning
- [jail-confinement](./jail-confinement.md) — jail names map to tenant rows
- [skills-catalog](./skills-catalog.md) — the read-only skills consumer

104
docs/wiki/tui.md
Normal file
@ -0,0 +1,104 @@
# Terminal dashboard (colibri-tui)

← [index](./index.md)

The TUI is Colibri's live terminal dashboard. It connects to the daemon's Unix
socket, pulls the `GlasspaneSnapshot`, and renders a color-coded table of
supervised panes. It is a **display client**, not part of the daemon, and not
the same thing as `colibri-glasspane`.

## Why it is not `colibri-glasspane`

`colibri-glasspane` is the **state machine** that decides what state an agent
is in from its JSONL events. `colibri-tui` is the **screen** that asks the
daemon "what does the radar look like right now?" and draws it.

| Artifact            | Role                                                     | Resident crate                                          |
| ------------------- | -------------------------------------------------------- | ------------------------------------------------------- |
| `colibri-glasspane` | Pane state machine, event ingestor, snapshot builder     | `crates/colibri-glasspane`                              |
| `colibri-tui`       | Terminal dashboard client with rows, colors, keybindings | `crates/colibri-glasspane-tui` (binary = `colibri-tui`) |

The split matters because the daemon, the MCP bridge, the CLI, and tests all
use `colibri-glasspane`. The TUI is just one consumer. If the TUI is not
installed, or crashes, agents keep running.

## Decisions

### Keep the daemon separate from any terminal UI

`colibri-tui` is a standalone process. It resolves the daemon socket the same
way the CLI does (`DaemonConfig::from_env().socket_path`), then calls
`client.glasspane_snapshot()` every two seconds. The daemon has no awareness of
crossterm or ratatui.

This is the same "service owns state, clients render it" pattern as the MCP
bridge and the CLI. It keeps Colibri headless-safe, which is required for an
`rc.d` service that must boot before any operator logs in.

→ `crates/colibri-glasspane-tui/src/main.rs` (socket resolution, refresh loop)

### TUI gets spawn/kill keys, not just read-only status

You can spawn a local test agent (`s`) and kill the selected pane (`x`) from
the dashboard. That overlaps with what the `colibri` CLI can already do,
but the experience is different: a CLI command is one-shot; the TUI is a live
supervision surface with a selected row and an immediate status bar.

We kept the action keys because the dashboard's job is to let an operator
notice and react — spot a stalled pane and kill it without leaving the
terminal.

→ `crates/colibri-glasspane-tui/src/main.rs` (`spawn_agent`, `kill_selected`)

### One taxonomy from one snapshot

The TUI does not parse agent stdout. It only reads the already-folded
`GlasspaneSnapshot`, so Pi, zot, and local test agents are rendered with the
same columns, colors, and state icons. The rendering code concerns itself only
with layout and keybindings; all semantic decisions live in
`colibri-glasspane`.

→ `crates/colibri-glasspane/src/lib.rs` (`AgentState`, `GlasspaneSnapshot`)

### Naming: the binary is `colibri-tui`, the crate is `colibri-glasspane-tui`

The crate directory is `colibri-glasspane-tui` because the package implements
"a TUI for the glasspane." The installed binary is named `colibri-tui`
because that is what an operator types. `CLAWDIE-STUDIO-PROPOSAL.md` and other
docs refer to `colibri-tui` as shorthand; there is no separate `colibri-tui`
crate.

This duality is currently accepted. If we ever add a second TUI surface (e.g.
a `colibri-tui-web` or `colibri-tui-gui`), the naming becomes confusing and
should be revisited.

## Current keybindings

| Key                    | Action                                          |
| ---------------------- | ----------------------------------------------- |
| `q` / `Esc`            | Quit, or close detail pane if open              |
| `r`                    | Refresh snapshot now                            |
| `s`                    | Spawn a local `colibri-test-agent`              |
| `x`                    | Kill the selected pane                          |
| `Enter`                | Open/close the detail pane for the selected row |
| `Tab` / `Shift-Tab`    | Cycle through distinct sessions                 |
| `j` / `k` or `↓` / `↑` | Navigate the pane table                         |

## When to use the TUI vs the CLI

Use the TUI when:

- You want a live, auto-refreshing view of all panes.
- You are picking a pane to inspect or kill visually.
- You are on an SSH session with only a terminal.

Use the `colibri` CLI when:

- You are scripting or piping output (`colibri snapshot | jq`).
- You need a command not bound to a key (e.g. `claim-task`, `set-cost-mode`).
- You want a one-shot answer without entering an alternate screen.

## See also

- [glasspane](./glasspane.md) — the pane state machine the TUI renders
- [operator-cli](./operator-cli.md) — the `colibri` CLI that shares the same socket client

163
docs/wiki/vault-provision.md
Normal file
@ -0,0 +1,163 @@
# Vault provision

← [index](./index.md)

`colibri-vault` fetches secrets from a Vaultwarden collection and writes them
into a freshly created jail as a `0600` env file. It is invoked as a post-spawn
hook from the daemon, not by a human operator at provision time. The human step
is registering a tenant mapping; the daemon does the secret fetch.

→ `crates/colibri-vault/src/lib.rs`

→ `crates/colibri-daemon/src/daemon.rs` (`provision_tenant_env`)

→ `docs/VAULT-PROVISION-RUNBOOK.md`

## Decisions

### Tenant = jail name = Vaultwarden collection

The `tenants` table stores a 1:1:1 map:

- `tenant_id` — the jail name.
- `jail_root_path` — the host-visible root of the jail.
- `collection_id` — the Vaultwarden collection name (kept equal to the jail name).

This means `colibri-vault` does not need a separate lookup table or configuration
file. It finds the collection by the jail name and knows the destination path
from the tenant row.

→ [store-schema](./store-schema.md)

### Provisioning is a post-spawn hook, not a separate command

When the daemon spawns an agent with both `--jail-name` and `--jail-root`, it
calls `provision_tenant_env` after the jail is up. If the jail name has no
matching tenant row, the hook no-ops. If provisioning fails, the agent still
starts, because a missing secret file should not leave the host with stale
partial jails. The daemon logs the failure.

→ `crates/colibri-daemon/src/socket.rs` (`jail_provision_target`)

### Fail-soft on missing tenant or vault error

The hook returns early, without failing the spawn, when:

- no tenant row matches the jail name;
- the stored `jail_root_path` does not match the spawned root; or
- the vault call fails.

These are warnings, not hard errors. The spawn itself succeeds. This reflects the
operational reality that secret tooling may be unavailable during boot or
experimental spawns, while the agent process should still be observable.
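
The three early-return cases can be sketched as one decision function. This is a simplified model, not the daemon's `provision_tenant_env`: the `Tenant` struct, the `ProvisionOutcome` enum, the closure standing in for the vault call, and `should_mark_active` are all hypothetical.

```rust
/// Hypothetical view of a row in the `tenants` table.
#[derive(Debug)]
pub struct Tenant {
    pub tenant_id: String,
    pub jail_root_path: String,
    pub collection_id: String,
}

/// What the hook decided. None of these outcomes aborts the spawn.
#[derive(Debug, PartialEq, Eq)]
pub enum ProvisionOutcome {
    /// No tenant row matches the jail name: no-op.
    NoTenant,
    /// The stored root differs from the spawned root: warn, no-op.
    RootMismatch,
    /// The vault call failed: warn, agent still starts.
    VaultFailed(String),
    /// The env file was written.
    Provisioned,
}

/// Fail-soft hook: every error path becomes an outcome value instead of
/// propagating up into the spawn.
pub fn provision_tenant_env_sketch(
    tenant: Option<&Tenant>,
    spawned_root: &str,
    fetch_and_write: impl FnOnce(&Tenant) -> Result<(), String>,
) -> ProvisionOutcome {
    let tenant = match tenant {
        Some(tenant) => tenant,
        None => return ProvisionOutcome::NoTenant,
    };
    if tenant.jail_root_path != spawned_root {
        return ProvisionOutcome::RootMismatch;
    }
    match fetch_and_write(tenant) {
        Ok(()) => ProvisionOutcome::Provisioned,
        Err(reason) => ProvisionOutcome::VaultFailed(reason),
    }
}

/// Only a successful provision should flip the tenant row to `active`.
pub fn should_mark_active(outcome: &ProvisionOutcome) -> bool {
    matches!(outcome, ProvisionOutcome::Provisioned)
}
```

Returning an outcome value rather than a `Result` makes the fail-soft policy visible in the type: the caller has nothing to propagate with `?`, so the spawn cannot be aborted by accident.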

### Path containment before any write

`colibri-vault::provision` canonicalizes the target directory and asserts it is
strictly under the configured jail-root base (`/usr/local/bastille/jails` by
default, overridable with `COLIBRI_JAIL_ROOT_BASE`). The check runs before
`create_dir_all`, so a symlink or `..` path that escapes the jails tree results
in `TargetEscapesRoot` before any file is created.

This is the same filesystem containment primitive reused by the external MCP
server spawner.

→ [jail-confinement](./jail-confinement.md)
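
A containment check of this kind might look like the sketch below. It is not the code in `colibri-vault`: `check_contained` and `ContainmentError` are hypothetical, and only the `TargetEscapesRoot` name comes from the text. Because `std::fs::canonicalize` requires the path to exist and the check must run before `create_dir_all`, the sketch canonicalizes the deepest existing ancestor and re-appends the not-yet-created tail, refusing any tail that contains `..`. Paths are assumed to be absolute.

```rust
use std::fs;
use std::path::{Component, Path, PathBuf};

#[derive(Debug, PartialEq, Eq)]
pub enum ContainmentError {
    /// The resolved target is not strictly below the base directory.
    TargetEscapesRoot,
    /// Canonicalization failed (for example, the base does not exist).
    Io(String),
}

/// Resolve `target` without creating it, then require it to sit strictly
/// under `base`. Returns the resolved path on success.
pub fn check_contained(base: &Path, target: &Path) -> Result<PathBuf, ContainmentError> {
    let base = fs::canonicalize(base).map_err(|e| ContainmentError::Io(e.to_string()))?;

    // Deepest ancestor that already exists; symlinks in it are resolved below.
    let existing = target
        .ancestors()
        .find(|candidate| candidate.exists())
        .ok_or(ContainmentError::TargetEscapesRoot)?;

    // The part that `create_dir_all` would create later. It must be made of
    // plain names only, so `..` cannot climb out after the check.
    let tail = target
        .strip_prefix(existing)
        .map_err(|_| ContainmentError::TargetEscapesRoot)?;
    if tail.components().any(|c| !matches!(c, Component::Normal(_))) {
        return Err(ContainmentError::TargetEscapesRoot);
    }

    let mut resolved =
        fs::canonicalize(existing).map_err(|e| ContainmentError::Io(e.to_string()))?;
    if !tail.as_os_str().is_empty() {
        resolved.push(tail);
    }

    // "Strictly under": the base itself is not an acceptable target.
    if resolved != base && resolved.starts_with(&base) {
        Ok(resolved)
    } else {
        Err(ContainmentError::TargetEscapesRoot)
    }
}
```

`Path::starts_with` compares whole components, so `/jails-evil` is not treated as being under `/jails`, which a plain string prefix test would get wrong.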

### Wrap the official `bw` CLI

We do not speak the Vaultwarden REST protocol directly. `colibri-vault` shells
out to the official `bw` CLI. This keeps authentication, session management, and
crypto off our plate.

The `bw` lifecycle is serialized across the process with a static `Mutex` because
`bw` keeps global state (one configured server and one session token per
process). Concurrent provisions would otherwise race on `bw config server` or
tear down each other's session.
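
The serialization can be illustrated with a process-wide lock around whatever work drives `bw`. The names `BW_LOCK` and `with_bw_session` are hypothetical, and recovering from a poisoned lock is an assumption about the desired policy, not something the text states.

```rust
use std::sync::Mutex;

/// One lock for the whole process, because `bw` keeps one configured server
/// and one session at a time.
static BW_LOCK: Mutex<()> = Mutex::new(());

/// Run `work` while holding the lock, so two provisions never interleave
/// their login, unlock, list, and logout steps.
pub fn with_bw_session<T>(work: impl FnOnce() -> T) -> T {
    // A poisoned lock only means an earlier holder panicked; the lock guards
    // no data of its own, so it is safe to keep using it.
    let _guard = BW_LOCK
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner());
    work()
}
```

The guard is dropped when the function returns, so the lock is released even if `work` returns early with an error value.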

### Bootstrap creds come from the daemon environment

The daemon is expected to receive three variables from the operator-provided
provider environment file:

- `BW_CLIENTID`
- `BW_CLIENTSECRET`
- `BW_PASSWORD`

Optional:

- `BW_SERVER` — the Vaultwarden host.
- `COLIBRI_JAIL_ROOT_BASE` — base path used for containment checks.

The CLI never sees these values; it only registers the tenant row that triggers
the hook.

→ [operator-cli](./operator-cli.md)

### Server-mismatch is fail-closed

If `BW_SERVER` is set and `bw` is already logged in to a different server,
`provision` returns `ServerMismatch`. We do not wipe state automatically because
cross-server confusion could leak credentials. An operator must `bw logout` if
they want to switch servers.

### Env-file content from login items and secure notes

Each Vaultwarden collection item becomes one or more `KEY=VALUE` lines:

- **Login item**: `item.name` becomes the key, `login.password` becomes the value.
- **Secure note**: each line is parsed as `KEY=VALUE` from the note body.

Keys are validated to `[A-Z0-9_]` after normalizing spaces, dashes, and dots
to underscores. Invalid keys are skipped with a warning.

Note: a key collision between two items produces a duplicate line. The consumer
is expected to ignore duplicates or define items accordingly.
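
The key rules can be sketched as follows. This is an illustration under assumptions, not the crate's code: `normalize_key`, `VaultItem`, and `render_env` are hypothetical names; trimming surrounding whitespace, rejecting lowercase keys rather than uppercasing them, and ignoring note lines without `=` are guesses at details the text does not spell out.

```rust
/// Replace spaces, dashes, and dots with underscores, then accept the key
/// only if every character is in `[A-Z0-9_]`.
pub fn normalize_key(raw: &str) -> Option<String> {
    // Assumption: surrounding whitespace is trimmed before normalizing.
    let key: String = raw
        .trim()
        .chars()
        .map(|c| match c {
            ' ' | '-' | '.' => '_',
            other => other,
        })
        .collect();
    let valid = !key.is_empty()
        && key
            .chars()
            .all(|c| c.is_ascii_uppercase() || c.is_ascii_digit() || c == '_');
    if valid {
        Some(key)
    } else {
        None
    }
}

/// Hypothetical, trimmed-down view of the two item kinds named in the text.
pub enum VaultItem {
    Login { name: String, password: String },
    SecureNote { body: String },
}

/// Render env-file text plus the list of raw keys that were skipped.
/// Duplicate keys are kept as-is, matching the note about collisions.
pub fn render_env(items: &[VaultItem]) -> (String, Vec<String>) {
    let mut out = String::new();
    let mut skipped = Vec::new();
    let mut push = |raw_key: &str, value: &str| match normalize_key(raw_key) {
        Some(key) => {
            out.push_str(&key);
            out.push('=');
            out.push_str(value);
            out.push('\n');
        }
        None => skipped.push(raw_key.to_string()),
    };
    for item in items {
        match item {
            VaultItem::Login { name, password } => push(name, password),
            VaultItem::SecureNote { body } => {
                for line in body.lines() {
                    // Split at the first `=` so values may contain `=`.
                    if let Some((key, value)) = line.split_once('=') {
                        push(key, value);
                    }
                }
            }
        }
    }
    (out, skipped)
}
```

Returning the skipped keys alongside the rendered text gives the caller what it needs to log the warnings the section mentions.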

### File mode and atomic-ish placement

The env file is written into the target directory and set to mode `0600`. The
target directory is created if it does not exist, but it must already resolve
under the jail-root base. The write is a single `std::fs::write`, then a
permission change; it is not an atomic swap. If the daemon crashes between the
write and the `chmod`, the file could be left with looser permissions. For
now, we accept this because the daemon creates the directory immediately
before the write and the target is inside the jail.
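
The two-step placement described above corresponds to something like the following Unix-only sketch. `write_env_file` is a hypothetical name; `std::fs::write`, `std::fs::set_permissions`, and `PermissionsExt::from_mode` are standard-library calls.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// Write the env file, then tighten it to owner read/write only.
/// Between the two calls the file has whatever mode the umask produced,
/// which is the window the section accepts for now.
pub fn write_env_file(path: &Path, contents: &str) -> io::Result<()> {
    fs::write(path, contents)?;
    fs::set_permissions(path, fs::Permissions::from_mode(0o600))
}
```

If the window ever needs closing, one option (not taken from the source) would be to create the file through `OpenOptions` with `OpenOptionsExt::mode(0o600)`, which applies the mode at creation time for files that do not yet exist.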

### Tenant status follows the provision state

`register_tenant` inserts the row with `status = provisioned`. After a successful
vault provision, the hook flips it to `active`. The row for a stopped or destroyed
jail may later be moved to `stopped` or `destroyed` by the operator or a teardown
flow.

Strictly, `provisioned` means the row is created; `active` means the secrets
have been materialized at least once.

## Flow

```text
register-tenant tenant_id jail_root collection_id
  |
  v
spawn-agent --jail-name tenant_id --jail-root jail_root
  |
  v
provision_tenant_env(tenant_id, jail_root)
  |-- no tenant row -> no-op
  |-- root mismatch -> warn, no-op
  |-- else
  v
bw login -> unlock -> list collection -> list items -> write env file @ 0600
  |
  v
set tenant status = active
agent starts running
```

## See also

- [store-schema](./store-schema.md) — how the tenant row is stored
- [jail-confinement](./jail-confinement.md) — how jails are created and confined
- [operator-cli](./operator-cli.md) — `register-tenant` and `spawn-agent` verbs
- [mother-hive](./mother-hive.md) — a related Vaultwarden-backed pubkey exchange
  used to authorize agents to call mother