colibri/docs/guide/architecture/host-operator-model.md
Sam & Claude 95c487546d
docs(guide): port 39 procedural docs from clawdie-ai to colibri
New docs/guide/ tree — canonical home for operator-facing procedural docs.
Starlight frontmatter added to all files. 0.12 alignment fixes applied:

- v0.11.0 → v0.12.0 throughout
- PI_TUI_PROVIDER/MODEL → DEEPSEEK_API_KEY
- Headless Codex login → Agent runtime setup (zot + RPC mode)
- /login and auth.json references removed
- pi → zot in provider-fallback spawn reference
- colibri-provider-verify (was pi-provider-smoke)
- Language cleanup: smoke test → verification, fake → test,
  can't self-fix → requires operator intervention,
  broken → unresponsive, Fix anything broken → Verify all checks pass

Two-tree model: docs/wiki/ (decisions) + docs/guide/ (procedural).
Single source of truth in colibri. clawdie-ai docs/public/ to be retired.
2026-06-26 09:16:43 +02:00

title: Host Operator Model

Current main uses the FreeBSD host as the operator surface.

There is no dedicated operator jail.

Identity split

Keep these roles separate:

  • operator account: the human login account on the host, for example sam
  • service account: the account that runs Clawdie, default clawdie
  • shared platform namespace: system
  • assistant display name: for example Atlas
  • tenant id: only for later additive tenants such as bob or jane

The root install is not modeled as tenant zero. It owns shared platform state.
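The split can be made concrete by modelling each role as its own field and rejecting configurations that blur them. The following is a minimal sketch, not code from the repository: the `HostIdentity` type, its field names, and `validateIdentity` are hypothetical, and the two checks reflect one assumed reading of "keep these roles separate".

```typescript
// Hypothetical model of the identity split; names are illustrative only.
interface HostIdentity {
  operatorAccount: string;   // human login on the host, e.g. "sam"
  serviceAccount: string;    // account that runs Clawdie, default "clawdie"
  platformNamespace: string; // shared platform namespace, "system"
  displayName: string;       // assistant display name, e.g. "Atlas"
  tenantId?: string;         // only set for later additive tenants
}

// Returns a list of problems; an empty list means the roles are kept apart.
function validateIdentity(id: HostIdentity): string[] {
  const problems: string[] = [];
  // Assumption: the human login and the service account are never the same user.
  if (id.operatorAccount === id.serviceAccount) {
    problems.push("operator account and service account must differ");
  }
  // The root install is not tenant zero, so a tenant may not reuse its namespace.
  if (id.tenantId !== undefined && id.tenantId === id.platformNamespace) {
    problems.push("tenant id must not reuse the shared platform namespace");
  }
  return problems;
}

const rootInstall: HostIdentity = {
  operatorAccount: "sam",
  serviceAccount: "clawdie",
  platformNamespace: "system",
  displayName: "Atlas",
};
```

Calling `validateIdentity(rootInstall)` returns an empty list because the root install carries no tenant id and uses distinct accounts.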

Core Rule

  • SSH into the FreeBSD host
  • run Ansible against the FreeBSD host
  • manage Bastille jails from the host
  • treat the Data Service, Git Service, Web Service, Browser Execution Template, and worker jails as host-managed infrastructure

This keeps the trust boundary simple and avoids a second operator layer inside the jailed runtime.

Shared platform resources

Root/shared state lives under fixed shared names:

  • system_ops
  • system_brain
  • system_skills
  • system_git
  • system_web

These belong to the shared platform regardless of the assistant display name.
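As a small illustration of that rule, the shared names can be derived from the platform namespace alone, with the display name deliberately absent from the inputs. This is a sketch under assumptions: `sharedResourceNames` and `SHARED_SUFFIXES` are hypothetical helpers, not identifiers from the codebase.

```typescript
// Suffixes taken from the fixed shared names listed above.
const SHARED_SUFFIXES = ["ops", "brain", "skills", "git", "web"] as const;

// Hypothetical helper: the assistant display name is intentionally not a
// parameter, so renaming the assistant cannot change any shared resource name.
function sharedResourceNames(namespace: string = "system"): string[] {
  return SHARED_SUFFIXES.map((suffix) => `${namespace}_${suffix}`);
}
```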

Responsibilities

The host owns:

  • Bastille lifecycle
  • warden0 bridge and PF/NAT
  • ZFS datasets and snapshots
  • rc.d service installation
  • .env
  • /etc/hosts managed internal block
  • deployment and verification steps

The service jails own:

  • PostgreSQL in the db jail when DB_RUNTIME=jail is explicitly chosen
  • the shared Git Service in the git jail
  • the shared Web Service in the cms jail (Strapi is optional, Ansible-managed when used)

Workers own only sandboxed execution.

Subnet Layout

The host uses a two-tier IP scheme:

Shared services: <subnet>.x (one per host, not per assistant name)

| Slot | Role | Notes |
| --- | --- | --- |
| <subnet>.1 | gateway | warden0 host bridge |
| <subnet>.2 | Git Service | shared git jail |
| <subnet>.3 | Web Service | shared cms jail (nginx, Astro) |
| <subnet>.4 | Local AI Models | shared ollama / llama.cpp runtime when enabled |
| <subnet>.5 | Data Service | optional db jail when DB_RUNTIME=jail |
| <subnet>.6 | Browser Execution Template | thick Chromium + Node template |

Worker range: <subnet>.101 and above is reserved for worker and automation jails.

Set WARDEN_SUBNET_BASE in .env to the private /24 you want to use. Repo examples often use 10.0.1, but a live host may use 192.168.72 or any other private subnet. When DB_RUNTIME=host, jails connect to Postgres at ${AGENT_SUBNET_BASE}.1:5432 (warden0 on the host).
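The address arithmetic behind this layout can be sketched as follows. The slot numbers come from the layout above. The role keys, the helper names, and the idea that workers are numbered sequentially from .101 up to .254 are assumptions made for illustration.

```typescript
// Slot numbers from the shared-services layout; the key names are hypothetical.
const SHARED_SLOTS = {
  gateway: 1,
  git: 2,
  web: 3,
  localAi: 4,
  data: 5,
  browserTemplate: 6,
} as const;

type SharedRole = keyof typeof SHARED_SLOTS;

const WORKER_RANGE_START = 101; // first worker octet, per the layout
const LAST_HOST_OCTET = 254;    // highest usable host address in a /24

// subnetBase is the first three octets, e.g. "10.0.1" or "192.168.72".
function sharedAddress(subnetBase: string, role: SharedRole): string {
  return `${subnetBase}.${SHARED_SLOTS[role]}`;
}

// Assumption: worker index 0 maps to .101, index 1 to .102, and so on.
function workerAddress(subnetBase: string, index: number): string {
  const octet = WORKER_RANGE_START + index;
  if (!Number.isInteger(index) || index < 0 || octet > LAST_HOST_OCTET) {
    throw new RangeError(`worker index ${index} does not fit in ${subnetBase}.0/24`);
  }
  return `${subnetBase}.${octet}`;
}
```

For example, `sharedAddress("10.0.1", "git")` yields `10.0.1.2`, and `workerAddress("192.168.72", 0)` yields `192.168.72.101`.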

Control vs Observe

The host controls: hostd, controlplane, and watchdog all run on the host with root access. They issue commands to jails via bastille, ZFS, PF, and rc.d.

Service jails provide persistent services. Current policy keeps git and cms thin, keeps db thick only when the optional jail runtime is used, and keeps workers thin.

Access Pattern

Default automation path:

operator -> ssh -> FreeBSD host -> bastille cmd -> db|git|cms|worker

That means:

  • no default sshd inside service jails
  • no default jail-to-host automation key chain
  • fewer secrets and fewer trust boundaries to maintain
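To make the path concrete, the sketch below builds the argument vector for running one command inside a jail through the host. It is an illustration under assumptions: `bastilleOverSsh` and `shellQuote` are hypothetical helpers, the host name is a placeholder, and privilege handling on the host is left out.

```typescript
// Quote one word for the remote POSIX shell; plain words pass through unchanged.
function shellQuote(word: string): string {
  if (/^[A-Za-z0-9_\/.:=-]+$/.test(word)) return word;
  return `'${word.replace(/'/g, `'\\''`)}'`;
}

// Builds: ssh <host> "bastille cmd <jail> <command...>"
// ssh hands the remote side a single command string, so every word is quoted
// before joining. Nothing here connects to a jail directly.
function bastilleOverSsh(host: string, jail: string, command: string[]): string[] {
  const remote = ["bastille", "cmd", jail, ...command].map(shellQuote).join(" ");
  return ["ssh", host, remote];
}

// Example: check a service inside the git jail from the operator's machine.
const argv = bastilleOverSsh("freebsd-host.example", "git", ["service", "sshd", "onestatus"]);
```

The resulting array can be passed to a process spawner as-is. The only SSH hop is the one to the host, which matches the rule that service jails run no sshd by default.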

Privileged Host Daemon

At runtime, the agent user never calls sudo. All privileged host operations go through hostd, a root daemon that listens on /var/run/clawdie-hostd.sock by default:

agent user process
  -> src/hostd/client.ts -> hostd(op, params)
    -> /var/run/clawdie-hostd.sock (Unix socket, mode 0660, group clawdie)
      -> src/hostd/daemon.ts (root)
        -> whitelisted op handler (bastille, zfs, pf, service, etc.)
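The last hop in that chain, the whitelisted op handler, can be sketched as a lookup table that refuses anything it does not know. This is not the actual daemon code: the op names, the result shape, and the `dispatch` and `safeName` helpers are assumptions, and the handlers only build an argument vector instead of executing it.

```typescript
type Params = Record<string, unknown>;
type Handler = (params: Params) => string[];

interface HostdResult {
  ok: boolean;
  argv?: string[];
  error?: string;
}

// Reject anything that is not a plain jail or dataset-style name.
function safeName(value: unknown): string {
  if (typeof value !== "string" || !/^[A-Za-z0-9_.-]+$/.test(value)) {
    throw new Error("invalid name parameter");
  }
  return value;
}

// Hypothetical whitelist: only ops registered here can ever run as root.
const HANDLERS: ReadonlyMap<string, Handler> = new Map<string, Handler>([
  ["jail.restart", (p) => ["bastille", "restart", safeName(p["jail"])]],
  ["service.status", (p) => ["service", safeName(p["name"]), "status"]],
]);

function dispatch(op: string, params: Params): HostdResult {
  const handler = HANDLERS.get(op);
  if (handler === undefined) {
    return { ok: false, error: `op not whitelisted: ${op}` };
  }
  try {
    return { ok: true, argv: handler(params) };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

Because the caller supplies only an op name and parameters, never a command line, the set of things the root daemon can do stays limited to what the table registers.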

Two rc.conf entries:

  • clawdie_hostd_enable=YES — root daemon, always on
  • clawdie_enable=YES (or AUTO) — user agent

The --step hostd setup step installs the rc.d script and starts the daemon. src/controlplane.ts checks hostd reachability at startup and every 5 minutes, and attempts to repair service jails and PF via hostd when needed.

Privilege Model

Two rc.d services, always separate:

| Service | User | Why |
| --- | --- | --- |
| clawdie_hostd | root | Needs bastille, ZFS, PF |
| clawdie | clawdie service user | No privilege needed; talks to hostd via socket |

The agent rc.d script uses daemon -u clawdie to drop from root to the service user before exec. The generated run-clawdie.sh wrapper sets HOME=/home/clawdie so that SSH keys and git identity in the service user's home directory resolve correctly. Runtime dirs (data/, logs/, groups/) are pre-created and chowned to the service user by setup/service.ts during setup, which avoids EACCES on first startup.

Identities

Keep operator identities explicit on the host:

  • interactive operator account: a human login on the host, for example sam, kept distinct from the clawdie service account
  • shared platform services installed through host-managed setup steps

Avoid building operational assumptions around an internal operator jail.