clawdie-ai/doc/BROWSER-JAIL-HANDOFF.md
Operator & Codex 6d662d5d3b Add browser clone hostd lifecycle ops
---
Build: pass | Tests: pass — 2395 passed (175 files)
2026-05-11 18:18:44 +02:00

Browser Jail Handoff

From: Claude/Codex collaboration
Date: 11 May 2026
Status: IN-PROGRESS. Phase 0.6 PASS; Phase 1 implementation next.

Design reference: docs/internal/BROWSER-JAIL.md
Validation record: docs/internal/BROWSER-JAIL-CLONE-LIFECYCLE-VALIDATION.md
UI-TARS direction: docs/internal/UI-TARS-ADOPTION.md
Template proposal history: BROWSER-JAIL-TEMPLATE-CLONE-PROPOSAL.md

Deletion Criteria

  • Redefined Phase 0.6 passes: cookie injection round-trip, cross-clone injection, 3 clean cycles.
  • docs/internal/BROWSER-JAIL.md is updated from design target to implementation record.
  • setup/browser-jail.ts or successor provisioning path creates the fixed thick browser template.
  • hostd/controlplane clone cleanup and reaper behavior are implemented and tested.
  • Full suite passes after implementation.

Current Baseline

The browser/operator substrate now has a fixed registry slot:

jail:      browser
IP:        WARDEN_BROWSER_IP, default <subnet>.6
live host: 192.168.72.6
shape:     thick Bastille VNET jail
boot:      off
packages:  chromium, node22, npm-node22
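
The address rule in the baseline (an explicit WARDEN_BROWSER_IP, otherwise `<subnet>.6`) could be resolved roughly as follows. This is a minimal sketch: `resolveBrowserIp`, `isIpv4`, and the subnet-prefix argument are hypothetical names, not the project's config code.

```typescript
// Sketch only: resolveBrowserIp and subnetPrefix are hypothetical.
// The handoff states only that WARDEN_BROWSER_IP defaults to <subnet>.6.
type Env = Record<string, string | undefined>;

const BROWSER_SUFFIX = 6; // fixed registry slot from the baseline

function isIpv4(value: string): boolean {
  const parts = value.split(".");
  return (
    parts.length === 4 &&
    parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255)
  );
}

function resolveBrowserIp(env: Env, subnetPrefix: string): string {
  // An explicit override wins; otherwise fall back to <subnet>.6.
  const candidate = env["WARDEN_BROWSER_IP"] ?? `${subnetPrefix}.${BROWSER_SUFFIX}`;
  if (!isIpv4(candidate)) {
    throw new Error(`invalid browser jail IP: ${candidate}`);
  }
  return candidate;
}
```

With an empty environment and the prefix `192.168.72`, this yields the live host address listed above.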

The old thin browserop validation jail was destroyed after the thick browser template was created and verified. Do not recreate browserop unless you are reading historical validation steps.

Current branch:

main

Most recent relevant commits:

3ea26f2 Validate browser clone cookie injection
35ddc4a Polish browser-jail doc constellation
55a6dee Align browser jail design docs
54f612e Fix browser jail registry slot
8855543 Pivot template+clone proposal to credentials store + CDP injection
f893828 Document browser clone validation findings

Decisions Already Made

  • Use one fixed, thick, credential-free browser template.
  • Do not keep credentials in templates or Chromium profile bytes.
  • Use Clawdie-owned credentials store + CDP cookie injection.
  • MVP credentials are cookies-only, domain-filtered, tenant-scoped, and grant-token-gated.
  • localStorage, IndexedDB, passkeys/WebAuthn, and credential write-back from task sessions are out of MVP.
  • Future task isolation uses browsertaskNNN clones.
  • Jail names must stay alphanumeric for Bastille VNET.
  • browser_run_task is the pi-facing high-level tool; pi gets one compact result.
  • UI-TARS is the reference loop/operator shape; do not invent a parallel model loop unless adaptation fails.
  • Screenshots are session-level: off | transient | audit, default transient, N=50 ring buffer.
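
The naming decisions above (browsertaskNNN clones, alphanumeric-only jail names) can be enforced with a small helper. The sketch below is illustrative: the function names are hypothetical, and the three-digit zero padding is inferred from the `browsertask001` example rather than stated as a rule.

```typescript
// Hypothetical helpers; the 3-digit width is an assumption based on "browsertask001".
const CLONE_PREFIX = "browsertask";

function cloneName(index: number): string {
  if (!Number.isInteger(index) || index < 1 || index > 999) {
    throw new RangeError(`clone index out of range: ${index}`);
  }
  return `${CLONE_PREFIX}${String(index).padStart(3, "0")}`;
}

function isValidJailName(name: string): boolean {
  // Alphanumeric only: no hyphens, underscores, or dots.
  return /^[A-Za-z0-9]+$/.test(name);
}

function isTaskClone(name: string): boolean {
  // Distinguishes task clones from the fixed "browser" template.
  return /^browsertask\d{3}$/.test(name);
}
```

Keeping the template name (`browser`) outside the `isTaskClone` pattern means cleanup code that filters on it cannot select the template by accident.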

Phase 0.5 — FreeBSD Viability

Complete. See docs/internal/BROWSER-JAIL-FREEBSD-VIABILITY.md.

Confirmed:

  • FreeBSD pkg Chromium works in a Bastille jail.
  • /usr/local/bin/chrome launches headless with CDP.
  • puppeteer-core@24.43.0 works against system Chromium.
  • DOM read and PNG screenshot smoke passed.

Phase 0.6 — Redefined Validation

The original profile-byte clone validation was paused after the cookie/profile failure. This was not a substrate failure; it means that authenticated Chromium profile bytes are not a reliable credential transport. The redefined CDP cookie-injection validation passed.

Task 1 — Cookie injection round-trip

  • Start browser.
  • Start a tiny local HTTP server inside the jail that can set and echo a cookie.
  • Launch Chromium with an empty profile.
  • Navigate normally to set the cookie; avoid CDP-only page.setCookie for the seed.
  • Export via CDP Network.getAllCookies.
  • Store exported cookies outside the jail; filesystem under repo tmp/ is OK for this validation.
  • Restart Chromium with another empty profile.
  • Inject via CDP Network.setCookie.
  • Navigate to the echo endpoint and verify the request sends the cookie.
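
The export and injection steps above map onto two CDP calls, `Network.getAllCookies` and `Network.setCookie`. The sketch below shows one way to wrap them. It assumes a session object with a `send(method, params)` method (Puppeteer's CDPSession has a compatible shape); `CdpSession`, `StoredCookie`, `exportCookies`, and `injectCookies` are hypothetical names, and the stored field set is an illustrative subset.

```typescript
// Minimal CDP surface so the sketch runs without puppeteer-core (assumption).
type CdpSession = {
  send(method: string, params?: Record<string, unknown>): Promise<unknown>;
};

// Hypothetical storage shape; null expires marks a session cookie.
type StoredCookie = {
  name: string;
  value: string;
  domain: string;
  path: string;
  secure: boolean;
  httpOnly: boolean;
  expires: number | null;
};

async function exportCookies(cdp: CdpSession): Promise<StoredCookie[]> {
  const reply = (await cdp.send("Network.getAllCookies")) as {
    cookies: Array<Omit<StoredCookie, "expires"> & { expires: number }>;
  };
  // Keep only the fields needed for re-injection; CDP reports
  // session cookies with a negative expires value.
  return reply.cookies.map((c) => ({
    name: c.name,
    value: c.value,
    domain: c.domain,
    path: c.path,
    secure: c.secure,
    httpOnly: c.httpOnly,
    expires: c.expires >= 0 ? c.expires : null,
  }));
}

async function injectCookies(cdp: CdpSession, cookies: StoredCookie[]): Promise<void> {
  for (const cookie of cookies) {
    const { expires, ...rest } = cookie;
    // Omit expires entirely for session cookies.
    const params: Record<string, unknown> = expires === null ? rest : { ...rest, expires };
    const reply = (await cdp.send("Network.setCookie", params)) as { success: boolean };
    if (!reply.success) {
      throw new Error(`Network.setCookie rejected cookie ${cookie.name}`);
    }
  }
}
```

Failing loudly on a rejected cookie keeps a partially injected session from being mistaken for an authenticated one.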

Task 2 — Cross-clone injection

  • Snapshot/clone the fixed thick browser template to browsertask001.
  • Patch jail name/IP/VNET config as needed.
  • Start the clone.
  • Start Chromium with an empty profile in the clone.
  • Inject cookies exported in Task 1.
  • Verify cookie is visible/sent through Puppeteer and HTTP request headers.
  • Destroy clone cleanly.

Task 3 — 3-cycle lifecycle loop

  • Run 3 sequential clone → start → inject → smoke → destroy cycles.
  • Record clone+start+inject latency; the median target is < 2 s, and p95 is informational only at 3 cycles.
  • Verify no browsertask* jails remain.
  • Verify no browsertask* datasets remain.
  • Verify no stale btNNN epairs remain.
  • Verify no Chromium processes remain.
  • Deliberately orphan one clone and run reaper twice; second run must be a no-op success.
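
The reaper requirement in the last step (a second run must be a no-op success) amounts to idempotency over the three leftover resource kinds checked above. The sketch below models that against in-memory state only. `HostState`, `reap`, the `bt<digits>` epair pattern, and the dataset path layout are assumptions; a real reaper would query Bastille, ZFS, and ifconfig.

```typescript
// In-memory stand-in for host state (assumption, for illustration only).
type HostState = { jails: string[]; datasets: string[]; epairs: string[] };

const isCloneJail = (name: string): boolean => /^browsertask\d+$/.test(name);
const isCloneEpair = (name: string): boolean => /^bt\d+/.test(name);

// Removes every task-clone leftover and reports what it removed.
// Running it again on a clean state returns an empty list and changes nothing.
function reap(state: HostState): string[] {
  const removed: string[] = [];
  const sweep = (kind: keyof HostState, isOrphan: (name: string) => boolean): void => {
    const keep: string[] = [];
    for (const name of state[kind]) {
      if (isOrphan(name)) removed.push(`${kind}:${name}`);
      else keep.push(name);
    }
    state[kind] = keep;
  };
  sweep("jails", isCloneJail);
  // Match on the last path component of the dataset name.
  sweep("datasets", (d) => isCloneJail(d.split("/").pop() ?? ""));
  sweep("epairs", isCloneEpair);
  return removed;
}
```

Because the sweep is driven by what currently exists rather than by a record of what was created, a missing resource is simply skipped instead of raising an error on the second pass.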

Phase 0.6 results

Passing snapshot: phase06-injection-11.maj.2026-1603.

Median timings:

  • clone: 113 ms
  • Bastille jail start: 1006 ms
  • Chromium CDP readiness: 1502 ms
  • CDP injection + smoke: 593 ms
  • clone + jail start + injection smoke: 1718 ms
  • clone + jail start + Chromium ready + injection smoke: 3211 ms
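
For reference, a median over per-cycle samples and a check against the < 2 s target can be computed as in the sketch below. The helper names are hypothetical, and the sample values in the usage line are illustrative placeholders, not measurements from the validation run.

```typescript
// Median of a non-empty list of latency samples in milliseconds.
function median(samplesMs: number[]): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const upper = sorted[mid] as number;
  if (sorted.length % 2 === 1) return upper;
  const lower = sorted[mid - 1] as number;
  return (lower + upper) / 2;
}

// True when the median is strictly under the target.
function meetsMedianTarget(samplesMs: number[], targetMs: number): boolean {
  return median(samplesMs) < targetMs;
}

// Illustrative placeholder samples for three cycles (not real data).
const exampleCycleMs = [1900, 1500, 1700];
const exampleOk = meetsMedianTarget(exampleCycleMs, 2000);
```

Applied to the medians listed above, the 1718 ms clone + jail start + injection figure is under 2000 ms, while the 3211 ms figure that also includes Chromium CDP readiness is not.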

Final state after validation:

  • browser remains stopped, boot off.
  • no browsertask* jails/datasets remain.
  • no btNNN epairs remain.
  • no Chromium or cookie-server processes remain.

Implementation Work After Phase 0.6 Passes

Order of work — important

The sections below are organized by component, not by build order. Build order matters because each layer depends on the one below:

  1. Setup / infra (template provisioning, packages, quotas) — must exist first; everything else clones or operates against it.
  2. hostd / lifecycle (clone, destroy, reaper, PF table ops) — required before any task clone can be spawned by the controlplane.
  3. Browser backend (jail-side HTTP API in the clone) — required before the controlplane can issue actions.
  4. Controlplane: credentials store + grant tokens — required before credential_mode: "operator" works at all.
  5. Controlplane: ClawdieBrowserOperator + injection wiring — connects credentials to clones via CDP at session open.
  6. Controlplane: browser.run_task MCP endpoint + audit — the high-level surface that external clients call.
  7. pi clawdie-harness extension: browser_run_task tool — pi-facing surface; cannot land before (6) because there is nothing to call.
  8. End-to-end smoke (mock → real-model trivial task → mixed operator workflow).

Pulling a step ahead of its dependencies leaves it with nothing working to integrate against. The corollary: pi-side browser_run_task test cases require the full stack and should not be the first integration target.

Setup / infra

  • Add setup/browser-jail.ts or equivalent install step.
  • Add infra/packages/browser-jail.txt.
  • Add browser to infra/jails.yaml at suffix .6.
  • Add WARDEN_BROWSER_IP / BROWSER_JAIL_IP config support.
  • Ensure provisioning creates a thick VNET jail and sets boot off.
  • Ensure pkg-cache mount is temporary for install and removed after install.
  • Add initial ZFS quotas.

hostd / lifecycle

  • Add narrow hostd ops for browser clone/create/destroy/reaper cleanup.
  • Use static PF table browser_tasks with add/delete per clone IP.
  • Implement forced-unmount fallback after busy dataset destroy failure.
  • Remove stale epairs before retrying a clone name.
  • Use rc.d/PID-file shutdown, not broad pkill.
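
Several of these ops reduce to short, fixed command lines. The sketch below builds them as argv arrays and shows the forced-unmount fallback as a single retry. It assumes a hypothetical `Run` callback that executes a command and reports success; `pfTableArgs`, `epairDestroyArgs`, and `destroyDataset` are illustrative names, not hostd's actual op names. The commands themselves use standard `pfctl -t ... -T add|delete`, `ifconfig <if> destroy`, and `zfs unmount -f` forms.

```typescript
// Hypothetical runner: executes argv and reports whether it succeeded.
type Run = (argv: string[]) => { ok: boolean };

const PF_TABLE = "browser_tasks"; // static table named in this handoff

function pfTableArgs(action: "add" | "delete", ip: string): string[] {
  return ["pfctl", "-t", PF_TABLE, "-T", action, ip];
}

function epairDestroyArgs(ifname: string): string[] {
  return ["ifconfig", ifname, "destroy"];
}

// Try a normal destroy first; on failure (for example a busy dataset),
// force the unmount and retry exactly once.
function destroyDataset(run: Run, dataset: string): boolean {
  if (run(["zfs", "destroy", "-r", dataset]).ok) return true;
  run(["zfs", "unmount", "-f", dataset]);
  return run(["zfs", "destroy", "-r", dataset]).ok;
}
```

Passing argv arrays instead of shell strings keeps the ops narrow: each op can validate its single variable argument (an IP, an interface name, a dataset) before anything is executed.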

Browser backend

  • Jail-side plain HTTP API on internal network only.
  • Health endpoint.
  • Session open/close.
  • Navigate/screenshot/click/type/scroll/read_dom.
  • Screenshot persistence modes.
  • Resource limits/deadlines/max sessions.
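
The screenshot persistence modes could be backed by a small per-session recorder. The sketch below rests on one reading of the decision recorded earlier (off | transient | audit, default transient, N=50 ring buffer): it assumes `off` stores nothing, `transient` keeps only the last N frames, and `audit` keeps every frame. `ScreenshotRecorder` is a hypothetical class, and storage here is in memory only.

```typescript
type ScreenshotMode = "off" | "transient" | "audit";

// Hypothetical per-session recorder; mode semantics are an assumption.
class ScreenshotRecorder {
  private readonly mode: ScreenshotMode;
  private readonly capacity: number;
  private frames: Uint8Array[] = [];

  constructor(mode: ScreenshotMode = "transient", capacity: number = 50) {
    if (!Number.isInteger(capacity) || capacity < 1) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.mode = mode;
    this.capacity = capacity;
  }

  record(png: Uint8Array): void {
    if (this.mode === "off") return;
    this.frames.push(png);
    // Ring-buffer behaviour: drop the oldest frame once over capacity.
    if (this.mode === "transient" && this.frames.length > this.capacity) {
      this.frames.shift();
    }
  }

  // Returns a copy so callers cannot mutate the buffer.
  snapshot(): Uint8Array[] {
    return this.frames.slice();
  }
}
```

With the defaults, recording 60 frames leaves the 50 most recent ones available.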

Controlplane

  • Credentials store schema and encryption.
  • Grant token validation for credential_mode: "operator".
  • Domain-filtered cookie injection via CDP.
  • Audit events for all browser actions and injection events.
  • UI-TARS-compatible ClawdieBrowserOperator.
  • High-level browser.run_task / pi browser_run_task result.
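
The selection step that precedes domain-filtered injection can be expressed as a pure filter over stored cookies. The sketch below assumes hypothetical `CookieRecord` and `Grant` shapes; the real credentials store schema and grant token format are not defined in this handoff. It treats a grant as a tenant, a list of allowed domains, and an expiry.

```typescript
// Hypothetical shapes; the real schema and token format are still open work.
type CookieRecord = { tenantId: string; domain: string; name: string; value: string };
type Grant = { tenantId: string; domains: string[]; expiresAtMs: number };

// A cookie domain matches an allowed domain when it is that domain
// or a subdomain of it. A leading dot on the cookie domain is ignored.
function domainMatches(cookieDomain: string, allowedDomain: string): boolean {
  const cookie = cookieDomain.replace(/^\./, "").toLowerCase();
  const allowed = allowedDomain.toLowerCase();
  return cookie === allowed || cookie.endsWith("." + allowed);
}

function selectCookies(store: CookieRecord[], grant: Grant, nowMs: number): CookieRecord[] {
  if (nowMs >= grant.expiresAtMs) {
    throw new Error("grant token expired");
  }
  return store.filter(
    (c) =>
      c.tenantId === grant.tenantId &&
      grant.domains.some((d) => domainMatches(c.domain, d)),
  );
}
```

Matching on a dot boundary rather than a bare suffix avoids treating `evilexample.com` as covered by a grant for `example.com`.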

Known Failure Modes / Lessons

  • Thin jail raw clones preserved fstab paths pointing back to the template and caused /bin/sh failures. This is why the fixed template is thick.
  • Failed starts can leave stale epairs; reaper must explicitly destroy them.
  • Chromium Singleton* files must not be present in any profile that is reused; the new design sidesteps this by avoiding profile-byte credential reuse altogether.
  • Busy ZFS datasets occurred even when fstat showed no obvious holder; forced unmount fallback is required.
  • --password-store=basic did not make encrypted cookie profile cloning reliable.

Validation Commands

Before any commit:

just build
npm test

The full-suite footer (the Build/Tests line) must come from a fresh run and be accurate.

Open Questions

Decided before Phase 1:

  • Credentials backend: Postgres. Filesystem is acceptable only for validation artifacts.
  • Refresh UX: controlplane-streamed browser clone.

Escalate before changing:

  • browser template name/IP slot,
  • cookies-only MVP scope,
  • UI-TARS-compatible loop direction,
  • no credentials in template/profile bytes,
  • screenshot recording semantics.

Delete After

When all deletion criteria are checked, remove this handoff:

git rm doc/BROWSER-JAIL-HANDOFF.md