Fold in the agreed refinements from Codex's review. Proposal is now
explicit about what's decided and what gates on Phase 0.6.
Status: PROPOSAL — pending Phase 0.6 Bastille/ZFS clone lifecycle
validation. Hard gate before any BROWSER-JAIL.md or BROWSER-JAIL-HANDOFF.md
edits.
Key decisions baked in:
- VNET-safe naming throughout (browserop, browserclean, browsertaskNNN);
the hyphenated names in the original proposal would have been rejected
by bastille (confirmed in Phase 0.5 viability).
- Two templates, default "clean"; "operator" requires explicit
authorization, not a silent model tool-call grant.
- Sealed-snapshot clones: Chromium is stopped and SQLite quiesced before
the snapshot used as the clone source is taken. No cloning from a live
mutable profile.
- No external ingress to template jail; controlplane-mediated refresh
only.
- Defer Firefox sync. Manual refresh in MVP.
- PF approach: static ruleset matching pfctl table "browser_tasks";
per-clone membership via pfctl -t browser_tasks -T add/delete. No full
firewall reload per session.
- Watchdog rule: templates are infrastructure (must stay up); task
clones are session-owned ephemeral resources (disappearance is normal).
- Screenshots/audit stored outside clone datasets so they survive
clone destruction.
- 7-step cleanup order codified (service stop, chrome TERM/KILL,
unmount, bastille stop, PF delete, IP release, zfs destroy).
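A minimal sketch of the cleanup order as a fixed-sequence runner. The step names follow the list above; the `StepAction` callbacks, the `runCleanup` name, and the continue-on-failure behavior are assumptions for illustration, not the proposal's implementation. `pfTableArgs` only builds the standard `pfctl -t <table> -T add|delete <addr>` argument list for the "browser_tasks" table.

```typescript
// Step names mirror the cleanup order listed above.
const CLEANUP_ORDER = [
  "service-stop",
  "chrome-term-kill",
  "unmount",
  "bastille-stop",
  "pf-delete",
  "ip-release",
  "zfs-destroy",
] as const;

type CleanupStep = (typeof CLEANUP_ORDER)[number];

// Hypothetical: each action performs one step and throws on failure.
type StepAction = () => void;

interface StepResult {
  step: CleanupStep;
  ok: boolean;
  error?: string;
}

// Builds the pfctl argv for per-clone table membership changes.
function pfTableArgs(op: "add" | "delete", ip: string): string[] {
  return ["pfctl", "-t", "browser_tasks", "-T", op, ip];
}

// Runs every step in the fixed order. Assumption: a failed step is
// recorded and the remaining steps still run, so a later reaper pass
// can retry only what is left.
function runCleanup(actions: Record<CleanupStep, StepAction>): StepResult[] {
  const results: StepResult[] = [];
  for (const step of CLEANUP_ORDER) {
    try {
      actions[step]();
      results.push({ step, ok: true });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({ step, ok: false, error: message });
    }
  }
  return results;
}
```

Keeping the order in one constant means the reaper and the normal session-close path cannot drift apart on sequencing.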
New sections:
- Seven additional open questions from Codex review (VNET naming/IP
pool, sealed snapshot mechanics, profile clone correctness, RCTL
limits, screenshot lifetime, orphan reaper, operator-template
auth UX).
- Phase 0.6 spike (GATING) with hard acceptance criteria:
10 sequential cycles with zero orphans, clone+start latency with
median <2s and p95 <5s, sealed-template cookie visible in clone,
idempotent reaper.
- Failure modes that change the verdict.
- Status summary table.
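The numeric part of the Phase 0.6 gate could be checked roughly as below. This is a sketch under assumptions: `CycleResult` and `meetsLatencyGate` are hypothetical names, latencies are in milliseconds, and p95 uses the nearest-rank definition (with 10 samples that is the slowest cycle). The cookie-visibility and reaper-idempotence criteria are not covered here.

```typescript
// Hypothetical record of one clone+start+destroy cycle.
interface CycleResult {
  latencyMs: number; // clone+start latency
  orphans: number;   // leftover jails/datasets/PF entries after cleanup
}

// Conventional median over an ascending-sorted array.
function median(sorted: number[]): number {
  const n = sorted.length;
  if (n === 0) throw new Error("no samples");
  const mid = Math.floor(n / 2);
  const upper = sorted[mid] as number;
  if (n % 2 === 1) return upper;
  return ((sorted[mid - 1] as number) + upper) / 2;
}

// Nearest-rank percentile over an ascending-sorted array.
function percentile(sorted: number[], p: number): number {
  if (sorted.length === 0) throw new Error("no samples");
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1] as number;
}

// Gate from the acceptance criteria: at least 10 cycles, zero orphans,
// median < 2 s and p95 < 5 s.
function meetsLatencyGate(cycles: CycleResult[]): boolean {
  if (cycles.length < 10) return false;
  const sorted = cycles.map((c) => c.latencyMs).sort((a, b) => a - b);
  const noOrphans = cycles.every((c) => c.orphans === 0);
  return noOrphans && median(sorted) < 2000 && percentile(sorted, 95) < 5000;
}
```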
No edits to BROWSER-JAIL.md or BROWSER-JAIL-HANDOFF.md.
Significant architectural change vs the current BROWSER-JAIL.md design.
Replaces "one long-lived jail + per-task BrowserContext" with persistent
template jails (operator-browser, clean-browser) + ephemeral per-task
ZFS clones.
Motivation: the current design has no story for persistent operator
logins. Every task starts blank — 2FA on every run, no usable workflow
for authenticated services. Cloning a thick template via ZFS is
~constant-time, plays to clawdie's existing platform strengths
(bastille, hostd, ZFS), and gives per-task jail-level isolation rather
than BrowserContext-level.
Status: PROPOSAL — six open questions documented for Codex review
before any BROWSER-JAIL.md edits or implementation reshape. Specifically
seeks Codex's read on bastille clone operational smoothness at per-task
rate, watchdog tolerance for fluctuating jail count, and PF rule
generation cost per clone.
No changes to BROWSER-JAIL.md or BROWSER-JAIL-HANDOFF.md in this commit —
those land after the proposal is accepted or amended.
Replace per-call persist:false with a session-level record mode set at
open_session, immutable for the session's life. Three modes:
- off: nothing written to disk; model still sees screenshots in
context.
- transient: last N=50 screenshots in a FIFO ring buffer per session.
Default. Enough for post-hoc debugging without unbounded
growth.
- audit: persist all with 7d retention. Explicit opt-in for
sensitive operations.
Default resolution: explicit param → tenant default → system default
("transient"). MVP hardcodes the system default; tenant overrides are
Phase 2.
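A small sketch of the default resolution and the transient ring buffer described above. `resolveRecordMode` and `ScreenshotRing` are hypothetical names, and screenshots are represented by string ids only; the real storage layer is not shown.

```typescript
type RecordMode = "off" | "transient" | "audit";

// System default as stated above; hardcoded in the MVP.
const SYSTEM_DEFAULT: RecordMode = "transient";

// Resolution order: explicit param, then tenant default, then system default.
function resolveRecordMode(explicit?: RecordMode, tenantDefault?: RecordMode): RecordMode {
  return explicit ?? tenantDefault ?? SYSTEM_DEFAULT;
}

// FIFO ring of the last N screenshot ids for one session (N=50 by default).
class ScreenshotRing {
  private items: string[] = [];
  private capacity: number;

  constructor(capacity: number = 50) {
    this.capacity = capacity;
  }

  // Adds an id and returns the evicted oldest id, if the ring was full.
  push(id: string): string | undefined {
    this.items.push(id);
    if (this.items.length > this.capacity) {
      return this.items.shift();
    }
    return undefined;
  }

  list(): readonly string[] {
    return this.items;
  }
}
```

Because the mode is fixed at open_session, the resolved value can be stored once on the session and never recomputed.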
Rationale: screenshots serve three different jobs (agent's eyes,
debugging trace, forensic audit), and a single retention policy can't
serve all three without either drowning in disk or losing audit value.
The dashcam analogy in the doc covers this directly. Per-call
persistence flags are messy and per-tenant audit-flagging at session
level was the wrong granularity.
Also:
- Credential-exfiltration mitigation in the threat model now describes
the off/audit levers an operator has.
- Future enhancement noted: browser.freeze_session to promote a
transient ring buffer to audit retention without restarting.
- Phase 1A handoff updated: POST /sessions accepts record, response
echoes it; /screenshot persistence behavior tied to session record
mode with explicit test points.
Move the spike workspace from the gitignored tmp/ scratch dir into
scripts/browser-jail-spike/ so Codex (or anyone) can re-run it on
FreeBSD with the keys already configured on the host. Self-contained:
fixtures, CDP renderer, OpenAI-compat harness, scorer, plus the
committed screenshots and ground-truth JSON so the experiment is
reproducible without re-rendering.
Claude Opus 4.7 baseline included in results/ (17/17 PASS at 30 px,
mean 1 px). Pending columns:
- GPT-4o via OPENAI_API_KEY
- GLM-4V via ZAI_API_KEY (pi's existing provider)
- UI-TARS-7B via vLLM if/when an endpoint exists
Path references in VISION-GROUNDING-FINDINGS.md and
BROWSER-JAIL-HANDOFF.md updated to match the new location.
Make the docs renderer name match its purpose, add CMS_DOCS_SITE_PATH with ASTRO_SITE_PATH compatibility, and update docs publishing paths.
---
Build: pass | Tests: pass — 2372 passed (704 files)
Every file under docs/internal/ ends up in the bootstrap/skills-memory
artifact (per metadata.json: "Full project docs, internal docs, identity
files, and skill definitions"). Stale handoffs, dated build reports,
single-commit reviews, and superseded design notes were polluting the
embedding index with low-signal chunks.
Removed:
- TLS-CERT-LIFECYCLE-HANDOFF.md, GLASSPANE-FREEBSD-HANDOFF.md,
CMS-ASTRO-SOURCE-OF-TRUTH-HANDOFF.md (handoffs whose work has landed)
- HOST-DB-READINESS-REVIEW.md, HOST-DB-REBOOT-REVIEW.md,
HOST-DB-RECOVERY-PLAN.md, SYSTEM-NAMESPACE-BRANCH-REVIEW.md
(commit/branch reviews self-marked as historical)
- BUILD-TEST-REPORT-06.APR.2026.md, test-results.md (dated snapshots)
- DEBUG_CHECKLIST.md (Feb 2026 known-issues list, top item already fixed)
- BOOTABLE-ISO-PLAN-V1.md (V1 plan; ISO-FIRST-BOOT-IMPLEMENTATION.md is now
the source of truth)
- STRAPI-FREEBSD-SETUP.md, PI-SKILLS-INTEGRATION.md, CODEX-FREEBSD.md
(workarounds and one-off design notes for resolved/superseded paths)
- REFACTOR-PLAN.md, nanoclaw-architecture-final.md, AGENT-HARNESS-V2.md,
AGENT-SKILLS-VS-REALITY.md (older planning/architecture docs whose
decisions are now in code or ARCHITECTURE.md)
- BUILTIN-KNOWLEDGE-SPEC.md, LOCAL-KNOWLEDGE-BOOTSTRAP.md (early specs
superseded by SKILLS-ARTIFACT-V1-PLAN.md)
- HEARTBEAT.md (design doc; implementation lives in scripts/heartbeat.sh
and src/controlplane-heartbeat.ts)
- POSTGRES-PERMISSIONS.md (one-off fix recipe)
- RUNTIME-MANIFEST-DESIGN.md (status: Implemented; design is in code now)
Updates to remaining files patch broken cross-links:
- ARCHITECTURE.md drops the two table rows pointing at deleted docs
- doc/THREE-BIRD-ARCHITECTURE.md drops Strapi-setup link references
- docs/internal/SKILLS-ARTIFACT-V1-PLAN.md drops the "Depends on" line
- docs/internal/SUDO_REPLACEMENT.md trims its list of internal docs that
reference sudo
- .agent/skills/setup and .agent/skills/docs-deployment drop pointers to
REFACTOR-PLAN and DEBUG_CHECKLIST
Net: 23 files deleted, 7566 lines removed. docs/internal/ goes from 41 to
18 markdown files. The artifact's next refresh will see proportionally
less noise in retrieval.
---
Build: FAIL | Tests: FAIL — 16 failed
Sweep active code, tests, identity files, public docs, CMS seed content, and stale handoffs so old assistant-name fixtures no longer leak into current Clawdie/system-namespace behavior. Keep the skills-memory SQL artifact unchanged per regeneration policy.
---
Build: pass
Tests: pass — 2197 passed (164 files)
---
Build: pass | Tests: pass — 2197 passed (650 files)
Phase 7e jail isolation, coordinator validation, and daily sync check
are the only remaining deletion criteria.
---
Build: pass | Tests: untested — doc only
Update harness validation checklist based on confirmed Telegram /model flow; clarify PF isolation checks (use TCP probes since ping is blocked in jails).
- All 6 original bugs fixed, 3 new found in fixes (2 fixed, 2 open trivial/low)
- Remaining: /model e2e, Phase 7e jail isolation, coordinator scenarios
- Full commit history table (20 commits this session)
- Architecture notes updated with advisory lock flow and budget breakdown
---
Build: pass | Tests: untested — doc only
6 bugs found in FreeBSD agent's 6 commits:
- HIGH: dead code ensureSymlinkOnlyWhenMissing() in setup/cms.ts
- HIGH: three competing bastille list parsers need unification
- MEDIUM: ensureSitemapStub() may be unnecessary after zod removal
- MEDIUM: dashboard column parsing may be wrong
- LOW: budget allocation magic numbers undocumented
- TRIVIAL: double blank line in .gitignore
---
Build: pass | Tests: untested — doc only
The hostd-bridge now routes through the controlplane API instead of
a direct Unix socket. 6 files updated:
- ARCHITECTURE.md: jail isolation section — hostd via API, no socket mount
- doc/CONTROLPLANE-ARCHITECTURE.md: hostd tree shows API proxy route
- doc/CONTROLPLANE-MESSAGE-CONTRACT.md: add POST /api/controlplane/hostd
endpoint with request/response examples
- docs/public/operate/security.md: hostd section describes HTTP proxy
model with CONTROLPLANE_SHARED_SECRET auth
- .env.example: document CONTROLPLANE_HOST_IP (default 10.0.1.1)
- doc/HANDOFF-ISO-AGENT.md: add sections 4 (hostd API proxy) and 5
(legacy agent ID removal) to breaking changes
Build: pass | Tests: not run (Linux) (Sam & Claude)
- controlplane-runner.ts: remove CANONICAL_AGENT_MAP and 3 legacy entries
from AGENT_JAIL_MAP (sysadmin, db-admin, git-admin). Legacy IDs were
removed from the DB schema in 0f7fbc4 — these mappings are dead code.
resolveCanonicalAgentId now returns input unchanged.
- Delete doc/HANDOFF-JAIL-EXTENSIONS.md — resolved by re-running
'sudo just setup-agent-jails' which writes the PI_EXTENSIONS_DIR mount.
Build: pass | Tests: not run (Linux) (Sam & Claude)
Correctness:
- controlplane-db.ts: add claimTask() — atomically updates task status
from 'pending' to 'in_progress' via conditional UPDATE. Returns false
if already claimed, preventing double-execution between onTaskCreated
callback and heartbeat loop.
- controlplane-heartbeat.ts: claim task before running agent in both
the on-demand task loop and the per-agent heartbeat task pickup.
Skip with 'task_already_claimed' if race lost.
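The claim pattern can be sketched as a compare-and-set on the status column. This is an illustration, not the code in controlplane-db.ts: `TaskStore`, `InMemoryTaskStore`, and the SQL in the comment (table and column names) are assumptions, and the sketch is synchronous where the real database call would be async.

```typescript
type TaskStatus = "pending" | "in_progress";

interface TaskStore {
  // Returns the number of rows changed, like a SQL driver's row count.
  // Assumed SQL equivalent:
  //   UPDATE tasks SET status = $3 WHERE id = $1 AND status = $2
  updateStatusIf(id: string, from: TaskStatus, to: TaskStatus): number;
}

// In-memory stand-in for the database, for illustration only.
class InMemoryTaskStore implements TaskStore {
  private rows = new Map<string, TaskStatus>();

  add(id: string): void {
    this.rows.set(id, "pending");
  }

  updateStatusIf(id: string, from: TaskStatus, to: TaskStatus): number {
    if (this.rows.get(id) !== from) return 0;
    this.rows.set(id, to);
    return 1;
  }
}

// True only for the caller whose conditional update changed the row.
function claimTask(store: TaskStore, id: string): boolean {
  return store.updateStatusIf(id, "pending", "in_progress") === 1;
}
```

A caller that gets `false` skips the task, which is where the 'task_already_claimed' skip reason would apply.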
Dead code removed:
- config.ts: remove ENCRYPTED_DIR, ENCRYPTED_SCREENSHOTS_DIR,
TMP_SKILLS_DIR, SCREENSHOTS_DIR, AGENT_LOG_FILE, AGENT_ERROR_LOG
(never imported outside config.ts)
- controlplane-db.ts: remove getAgentById (never imported)
Docs:
- GIT-JAIL-FORGEJO-HANDOFF.md: note SSH automation in setup/git.ts,
GIT_LOCAL_URL injection, local-first skills update
- MULTI-PROVIDER-ARCHITECTURE.md: note current default zai/glm-5-turbo
and just pi-config for runtime changes
Build: pass | Tests: not run (Linux) (Sam & Claude)
When the model hits a context-window-exceeded error, send a short guidance reply and avoid infinite cursor-rollback retries.
Also record the incident and the local wrapper regeneration note in the session handoff.
Critical:
- config.ts: add CONTROLPLANE_SHARED_SECRET to readEnvFile allowlist so it
actually gets read from .env (was dead code — agent auth always rejected).
Added envConfig fallback (was only reading process.env).
- controlplane-api.ts: use config export instead of raw process.env.
- .env.example: document CONTROLPLANE_SHARED_SECRET, fix bind host default
to 127.0.0.1.
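The allowlist and fallback behavior can be sketched as follows. The function names, the parsing rules, and the key list are illustrative assumptions; only the idea (a key must be on the allowlist to be read from .env, and process.env is consulted before the file-derived config) comes from the notes above.

```typescript
// Parses KEY=VALUE lines, keeping only allowlisted keys.
// Assumption: no quoting or multi-line values in this sketch.
function parseEnvFile(content: string, allowlist: readonly string[]): Record<string, string> {
  const out: Record<string, string> = {};
  for (const line of content.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.startsWith("#")) continue;
    const eq = trimmed.indexOf("=");
    if (eq <= 0) continue;
    const key = trimmed.slice(0, eq).trim();
    if (allowlist.indexOf(key) !== -1) {
      out[key] = trimmed.slice(eq + 1).trim();
    }
  }
  return out;
}

// process.env value first, then the value read from .env.
function resolveSetting(
  key: string,
  processEnv: Record<string, string | undefined>,
  envConfig: Record<string, string>,
): string | undefined {
  return processEnv[key] || envConfig[key];
}
```

A key missing from the allowlist is silently dropped by the parser, which is how a configured secret can look unset at runtime.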
Reliability:
- controlplane-heartbeat.ts: wrap writeSessionEntry in try/catch — disk full
no longer prevents task status updates or agent reply delivery.
- controlplane-heartbeat.ts: per-agent try/catch in loop — one throwing
agent (DB error) no longer starves remaining agents for that tick.
- db.ts, memory-pg.ts, skills-pg.ts: add connectionTimeoutMillis: 30000 to
all pg.Pool instances. Prevents indefinite blocking on pool exhaustion.
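The per-agent isolation amounts to moving the try/catch inside the loop. A simplified, synchronous sketch (the real heartbeat is presumably async; `runTick` and `TickReport` are hypothetical names):

```typescript
interface TickReport {
  ran: string[];
  failed: string[];
}

// Runs each agent in turn; a throwing agent is recorded and the loop
// moves on, so later agents still get their turn in this tick.
function runTick(agentIds: string[], runAgent: (id: string) => void): TickReport {
  const report: TickReport = { ran: [], failed: [] };
  for (const id of agentIds) {
    try {
      runAgent(id);
      report.ran.push(id);
    } catch {
      report.failed.push(id);
    }
  }
  return report;
}
```

With the try/catch outside the loop instead, the first throw would end the tick and skip every agent after it.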
Observability:
- channels/telegram.ts: warn when TELEGRAM_BOT_TOKEN is missing instead of
silently disabling the channel.
- index.ts: startup config validation warns for missing TELEGRAM_BOT_TOKEN,
CONTROLPLANE_SHARED_SECRET, and OPENAI_API_KEY.
Build: pass | Tests: not run (Linux) (Sam & Claude)
Agent replies frequently contain unbalanced backticks, underscores, and
asterisks, which break Telegram's strict Markdown parser. Every code
message triggered a failed Markdown attempt followed by a plain-text
retry, wasting an API call and adding latency. Send plain text directly;
it's reliable and fast.
Also:
- Delete stale doc/HANDOFF-GIT-JAIL-OPENSSH.md (SUPERSEDED)
- Update doc/GIT-JAIL-PLAN.md Phase 2 as automated by setup/git.ts
Build: pass | Tests: not run (Linux) (Sam & Claude)
Thin jails share the host's /usr and /bin, so /usr/sbin/sshd is
already available. The openssh-portable ports package is unnecessary.
The Linux agent was right in 29439b2. My 4cc2f7d adding
openssh-portable was based on a wrong assumption. This corrects it.
Updated handoff doc to reflect the correction.
---
Build: pass | Tests: pass — 1530 passed (91 files)
The Linux agent removed openssh from git-jail packages on the assumption
that sshd is in FreeBSD base. That is wrong for thin jails, which don't
inherit base services. The openssh-portable fix is already merged on main.
---
Build: pass | Tests: pass — 1530 passed (91 files)
- setup/git.ts: add setupGitJailSsh() — generates host keys, writes
sshd_config (keys-only), creates git user with git-shell restricted
shell, deploys operator SSH public key to authorized_keys, enables
sshd. Fully idempotent, key sourced from SSH_PUBLIC_KEY or
GIT_SSH_KEY_PATH.pub or ~/.ssh/id_ed25519.pub.
- infra/packages/git-jail.txt: add openssh package
- Delete run-clawdie.sh and the run-mevy.sh symlink from git; add run-*.sh
to .gitignore. setup/service.ts already generates run-${AGENT_NAME}.sh
at install time — the checked-in template was redundant.
- .env.example: document GIT_SSH_KEY_PATH for git-admin agent
- Update HANDOFF-ISO-AGENT.md with run-*.sh change
Build: pass | Tests: not run (Linux) (Sam & Claude)