colibri/docs/COLIBRI-JAILED-AGENT-SPAWN-DESIGN.md
Sam & Claude a7565c49ad
fix(spawner): stage jail spawn files under daemon-owned home, not /var/run
Closes #135. The daemon stages per-spawn launch.sh/env.sh under the jail root;
the previous location /var/run/colibri-stage is root-owned, so the daemon
(running as clawdie) could not create per-spawn subdirs there — the second
jail-spawn EACCES, worked around in #134 by pre-creating the dir in
agent-jail-bootstrap.sh.

Move the default staging root to the daemon user's home,
/home/clawdie/.cache/colibri/stage, which clawdie owns by construction of the
jail account. create_dir_all now succeeds with no privileged pre-creation step,
and /home is persistent (unlike a tmpfs /var/run). The path is overridable via
COLIBRI_JAIL_STAGE_DIR, matching the daemon's other env-configurable paths.

- spawner.rs: const → staged_jail_run_dir() resolver; updated unit test.
- agent-jail-bootstrap.sh: drop the now-unnecessary install -d staging block
  and DAEMON_USER var (the #134 workaround).
- docs: update jailed-spawn design + truss analysis to the new location.

clippy clean; spawner suite green (21 tests); sh -n clean; touched docs pass
the markdown gate.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-21 17:37:32 +02:00

Colibri jailed agent spawn

Status: Accepted — implemented · Date: 13.jun.2026

How Colibri confines a spawned agent (e.g. pi) inside a FreeBSD jail, and how the unprivileged daemon gets the root that jails require. This describes the shipped code in crates/colibri-daemon/src/spawner.rs.

Why this lives in Colibri, not zot

Colibri is the supervisor and already spawns agents — spawner.rs runs the subprocess, captures its JSONL, and feeds glasspane. Confinement is a supervisor concern, so it lives here, and zot stays a clean upstream mirror. (zot's own swarm only spawns copies of zot and has no isolation, so it was never the right place for this.)

How it works

A spawn can carry an optional JailConfig; with none, the agent runs on the host as before. The field that is set picks the jail lifecycle:

  • name — enter an already-running persistent jail with jexec (created/destroyed out of band by rc.d / the operator). Takes precedence.
  • path — create an ephemeral jail with jail -c … command=<binary>, which exists only while the agent runs and is removed when it exits (no teardown needed).
  • root_path — host-visible root path for a named jail; required when staged env/working-dir payload delivery is needed. Falls back to path for ephemeral jails.
  • ip4 and user (both optional): ip4 defaults to inherit; user is the in-jail user and applies to the jexec path only.

jail_wrap() turns (binary, args) into the (program, argv) to exec. stdio is untouched — jexec, jail, and mdo all run the child in the foreground and inherit stdin/stdout — so the agent's JSON stream still reaches glasspane and the MCP host's stdin/stdout transport still works.
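The wrapping step can be sketched as below. This is a minimal illustration, not the code in spawner.rs: the field types, the function signature, the None return for an empty config, and the exact jail -c parameter set are all assumptions. The ip4 field is treated here as the value of the jail(8) ip4 parameter.

```rust
use std::path::PathBuf;

/// Hypothetical mirror of the fields listed above; the real JailConfig in
/// spawner.rs may use different types or carry more fields.
#[derive(Debug, Clone, Default)]
pub struct JailConfig {
    pub name: Option<String>,
    pub path: Option<PathBuf>,
    pub root_path: Option<PathBuf>,
    pub ip4: Option<String>,
    pub user: Option<String>,
}

/// Turn (binary, args) into the (program, argv) to exec.
/// Returns None when neither `name` nor `path` is set; how the real code
/// treats that case is an assumption of this sketch.
pub fn jail_wrap(
    cfg: &JailConfig,
    binary: &str,
    args: &[String],
) -> Option<(String, Vec<String>)> {
    if let Some(name) = &cfg.name {
        // Persistent jail: jexec [-U user] <jail> <command...>.
        // `name` takes precedence over `path`.
        let mut argv = Vec::new();
        if let Some(user) = &cfg.user {
            // -U selects a user from the jailed environment.
            argv.push("-U".to_string());
            argv.push(user.clone());
        }
        argv.push(name.clone());
        argv.push(binary.to_string());
        argv.extend(args.iter().cloned());
        return Some(("jexec".to_string(), argv));
    }

    // Ephemeral jail: created for this one command and, with no `persist`
    // parameter, removed once its last process exits.
    let path = cfg.path.as_ref()?;
    let ip4 = cfg.ip4.as_deref().unwrap_or("inherit");
    let mut argv = vec![
        "-c".to_string(),
        format!("path={}", path.display()),
        format!("ip4={ip4}"),
        // `command=` must come last; everything after it is the command line.
        format!("command={binary}"),
    ];
    argv.extend(args.iter().cloned());
    Some(("jail".to_string(), argv))
}
```

Because the result is only a (program, argv) pair, the caller can hand it to the same spawn path used for host agents, which is what keeps stdio handling unchanged.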

This is wired through the spawn-agent socket command (any caller can request a jail) and reused by the external-MCP host (colibri-mcp), which confines arbitrary third-party MCP servers the same way.

Privilege: how the unprivileged daemon gets root

Jail attach (jexec) and create (jail) are root-only, but colibri_daemon runs unprivileged. The deciding fact: FreeBSD mac_do rules are identity mappings (security.mac.do.rules=gid=0>uid=0 means "wheel may become root"), not command filters — so granting the daemon mdo access grants it full root, not just jexec. We choose the escalation per host via PrivMode (COLIBRI_JAIL_PRIV_MODE):

  • Live operator USB → mdo (default). The single operator already holds wheel→root, so a trusted local daemon is the same trust domain — mdo -u root reuses the image's existing mac_do plumbing, no new privileged binary.
  • Deployed / shared host → setuid helper. A socket-facing daemon with blanket root is a real escalation surface, so use a narrow setuid helper (/usr/local/libexec/colibri-jail-spawn) that only performs the jail spawn, and keep the daemon unprivileged.
  • Validated hosts with existing sudo policy → sudo. sudo -n can be used as an interim proof/ops mode when a narrow sudoers rule already permits the daemon user to run the jail command without prompting. Prefer the setuid helper for long-lived production hosts once packaged.
  • none — run the jail command directly (already root, or tests).
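One way to picture the four modes is as a prefix placed in front of the jail command. The sketch below is illustrative only: the variant names, the accepted strings for COLIBRI_JAIL_PRIV_MODE, and the parse/escalate functions are hypothetical, and the helper's calling convention (receiving the jail command as its arguments) is an assumption the source does not confirm.

```rust
/// Hypothetical shape of PrivMode; the real enum in spawner.rs may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PrivMode {
    Mdo,
    Helper,
    Sudo,
    /// Documented as `none`: run the jail command directly.
    Direct,
}

impl PrivMode {
    /// Parse a COLIBRI_JAIL_PRIV_MODE value. Unset or empty selects mdo,
    /// the documented default. The accepted spellings are assumptions.
    pub fn parse(raw: Option<&str>) -> Result<Self, String> {
        match raw.map(str::trim) {
            None | Some("") | Some("mdo") => Ok(Self::Mdo),
            Some("helper") => Ok(Self::Helper),
            Some("sudo") => Ok(Self::Sudo),
            Some("none") => Ok(Self::Direct),
            Some(other) => Err(format!("unknown priv mode: {other}")),
        }
    }

    /// Read the mode from the process environment.
    pub fn from_env() -> Result<Self, String> {
        Self::parse(std::env::var("COLIBRI_JAIL_PRIV_MODE").ok().as_deref())
    }

    /// Prefix the jail command with the escalation for this mode.
    pub fn escalate(self, program: &str, argv: &[String]) -> (String, Vec<String>) {
        let prefix: Vec<&str> = match self {
            Self::Mdo => vec!["mdo", "-u", "root"],
            // -n: fail instead of prompting for a password.
            Self::Sudo => vec!["sudo", "-n"],
            // Assumption: the helper takes the jail command as its arguments.
            Self::Helper => vec!["/usr/local/libexec/colibri-jail-spawn"],
            Self::Direct => vec![],
        };
        let mut full: Vec<String> = prefix.iter().map(|s| s.to_string()).collect();
        full.push(program.to_string());
        full.extend(argv.iter().cloned());
        // `full` always holds at least `program`, so index 0 exists.
        let rest = full.split_off(1);
        (full.remove(0), rest)
    }
}
```

Note that the mdo and sudo prefixes grant whatever the host policy allows; only the helper mode narrows what the daemon can do as root, which is the reason the text above prefers it for shared hosts.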

Staged env payloads

When a jailed spawn needs env vars or a working dir, prepare_spawn_command() writes two files into a staged directory: a 0600 env.sh (sorted, single-quoted exports) and a launch.sh wrapper. The directory lives under the jail's root_path at /home/clawdie/.cache/colibri/stage/<id>/. That is the daemon user's home, so the daemon creates it with no privileged step; the location is overridable via COLIBRI_JAIL_STAGE_DIR. The jail command runs /bin/sh launch.sh, which sources the env file and cds to the working dir before exec-ing the agent binary. This sidesteps the env-passthrough problem entirely: nothing relies on jexec or mdo inheriting env vars.
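A rough sketch of how the two files could be rendered follows. The function names render_env_sh and render_launch_sh, and the signature given to staged_jail_run_dir, are illustrative assumptions rather than the real API; keys are assumed to be valid shell identifiers and are not validated here.

```rust
use std::collections::BTreeMap;
use std::path::PathBuf;

/// Single-quote a value for /bin/sh. An embedded quote closes the string,
/// emits an escaped quote, and reopens it: it's -> 'it'\''s'.
fn sh_quote(value: &str) -> String {
    format!("'{}'", value.replace('\'', r"'\''"))
}

/// Render env.sh: one `export KEY='value'` line per variable.
/// BTreeMap iterates in key order, which gives the sorted output.
pub fn render_env_sh(env: &BTreeMap<String, String>) -> String {
    env.iter()
        .map(|(k, v)| format!("export {k}={}\n", sh_quote(v)))
        .collect()
}

/// Render launch.sh: source the env file, cd to the working dir if one is
/// set, then exec the agent so it replaces the shell and keeps its stdio.
pub fn render_launch_sh(
    env_file: &str,
    workdir: Option<&str>,
    binary: &str,
    args: &[String],
) -> String {
    let mut script = String::from("#!/bin/sh\n");
    script.push_str(&format!(". {}\n", sh_quote(env_file)));
    if let Some(dir) = workdir {
        script.push_str(&format!("cd {} || exit 1\n", sh_quote(dir)));
    }
    let mut cmd = vec![sh_quote(binary)];
    cmd.extend(args.iter().map(|a| sh_quote(a)));
    script.push_str(&format!("exec {}\n", cmd.join(" ")));
    script
}

/// Resolve the staging root: the override if set and non-empty, otherwise
/// the documented default. Passing the env value in is a choice made here
/// for testability; the real resolver may read the variable itself.
pub fn staged_jail_run_dir(override_dir: Option<&str>) -> PathBuf {
    match override_dir {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => PathBuf::from("/home/clawdie/.cache/colibri/stage"),
    }
}
```

Single-quoting every value means nothing in an env var is expanded or split by the shell, so the agent sees exactly the bytes the caller supplied.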

The staged directory is cleaned up when the agent stops, fails, exits early, or encounters a poll error. The same mechanism is used by the external-MCP host for jailed MCP servers.
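The source only states that cleanup happens on each of those exit paths, not how. One way to get that guarantee is an RAII guard whose Drop removes the directory, sketched below. StagedDir and its methods are hypothetical names, and tying cleanup to Drop is a technique chosen for this illustration, not a description of the shipped code.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::os::unix::fs::OpenOptionsExt;
use std::path::{Path, PathBuf};

/// Hypothetical guard for one spawn's staged directory. Dropping it removes
/// the directory, so every exit path that drops the guard also cleans up.
pub struct StagedDir {
    dir: PathBuf,
}

impl StagedDir {
    /// Create <stage_root>/<id>/, including any missing parents.
    pub fn create(stage_root: &Path, id: &str) -> io::Result<Self> {
        let dir = stage_root.join(id);
        fs::create_dir_all(&dir)?;
        Ok(Self { dir })
    }

    /// Write a file that only the owner can read or write (mode 0600).
    /// The mode applies when the file is created and is subject to umask.
    pub fn write_private(&self, name: &str, body: &str) -> io::Result<PathBuf> {
        let path = self.dir.join(name);
        let mut file = OpenOptions::new()
            .write(true)
            .create(true)
            .truncate(true)
            .mode(0o600)
            .open(&path)?;
        file.write_all(body.as_bytes())?;
        Ok(path)
    }

    pub fn path(&self) -> &Path {
        &self.dir
    }
}

impl Drop for StagedDir {
    fn drop(&mut self) {
        // Best effort: a failed removal must not mask the spawn's own error.
        let _ = fs::remove_dir_all(&self.dir);
    }
}
```

With a guard like this held alongside the child process handle, a normal stop, a failed spawn, an early exit, and a poll error all release the directory the same way, without a separate cleanup call on each branch.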

Open items

  • Teardown: ephemeral jail -c command= self-cleans; reaping a deeply nested in-jail process tree may want a process-group kill (follow-up).
  • Jail filesystem provisioning (ISO / deploy): the jailed binary needs its runtime + work dir — a pre-provisioned persistent jail, or nullfs mounts for an ephemeral one.

References

  • crates/colibri-daemon/src/spawner.rs: JailConfig, PrivMode, jail_wrap, prepare_spawn_command, PreparedSpawnCommand
  • crates/colibri-daemon/src/lib.rs + socket.rs: jail on the spawn-agent command
  • crates/colibri-mcp/src/external.rs: jailed external MCP servers