Operator & Codex 6c549e7ad0 Rename browser validation assets

---
Build: pass | Tests: pass — 2383 passed (175 files)

2026-05-11 17:32:22 +02:00


Browser Jail — Template + Clone Proposal

Author: Claude (proposal) + Codex (review, Phase 0.6 validation)
Date: 11 May 2026
Status: HISTORICAL. Superseded by docs/internal/BROWSER-JAIL.md and docs/internal/UI-TARS-ADOPTION.md. Retained for the pivot reasoning (why profile-byte credential cloning was dropped, why session credential modes replaced two-template policy). Do not update this doc with new decisions; update BROWSER-JAIL.md instead.

Companion docs (authoritative): docs/internal/BROWSER-JAIL.md, UI-TARS-ADOPTION.md, BROWSER-JAIL-CLONE-LIFECYCLE-VALIDATION.md

Template + clone as a substrate pattern is alive. The specific feature that a sealed Chromium profile carries operator authentication into clones is dead. Architecture pivots accordingly.

Pivot summary (11 May 2026)

Phase 0.6 ran on FreeBSD and produced this split:

Layer	Validated	Status
ZFS clone mechanics (~40 ms)	yes	viable
Bastille start of cloned jail (~1 s) after config patching	yes	viable
Stale epair / busy-dataset cleanup	partially	needs explicit reaper logic
Chromium encrypted profile inheritance	no	not adopted

Cookies survived as SQLite rows in the cloned profile but did not decrypt or present to Chromium in the clone. Cause is Chromium's profile-bound encryption and host/app identity, not the substrate. See BROWSER-JAIL-CLONE-LIFECYCLE-VALIDATION.md.

Pivot: decouple substrate from identity.

Substrate (template + clone) keeps doing what it's good at: install amortization, jail-level isolation, fast spawn, clean teardown.
Identity moves out of the profile: a clawdie-owned credentials store holds cookies (and only cookies in MVP); they are injected per-session via CDP at clone start.
The fixed template becomes credential-free. The canonical jail is now browser at subnet suffix .6. It is a thick browser-capable template; operator mode is a session credential mode, not a separate authenticated template.

The operator_grant_token from UI-TARS-ADOPTION.md and the earlier pi-integration commits now authorizes credential injection, not access to a magic pre-authenticated template. That is a stronger model — injection is domain-filtered, audited, and revocable.

What changes

Previous design (superseded in BROWSER-JAIL.md):

One long-lived browser jail.
Per-task BrowserContext inside a shared Chromium.
Every task starts blank — no logins persist.
Credentials story punts entirely.

Adopted direction for validation: one credential-free thick browser template + ephemeral per-task ZFS clones + a clawdie-owned credentials store + per-session CDP cookie injection.

HOST (FreeBSD + ZFS)
│
├── browser            (TEMPLATE, persistent, fixed IP suffix .6, thick, credential-free)
│    ├── chromium installed once (~2 GiB)
│    ├── Node 22 + puppeteer-core present
│    ├── NO operator cookies, NO logins, NO authenticated profile state
│    └── session policy decides whether credentials may be injected
│
├── browsertask001     (EPHEMERAL clone of browser)
├── browsertask002     (EPHEMERAL clone of browser)
│    ├── zfs clone — fast, no re-install
│    ├── own jail, own IP (from browser-task pool), own PF table membership
│    ├── Chromium starts clean; cookies injected via CDP after start
│    └── destroyed at session end
│
├── pkg cache mount → shared into the template for installs
│
├── /var/db/browser-jail/sessions/      ← screenshots/audit live OUTSIDE
│    /<tenant>/<session_id>/             clone datasets, on the host
│
├── credentials store                    ← NEW, controlplane-owned
│    /var/db/clawdie/credentials/       (or postgres-backed)
│    ├── per (tenant, domain) cookie set
│    ├── encrypted at rest
│    └── refreshed via interactive refresh session
│
└── PF: static ruleset matching pfctl table "browser_tasks"
     ├── per-clone: pfctl -t browser_tasks -T add/delete <ip>
     └── no per-clone full PF reload
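
The per-clone PF table updates in the diagram can be sketched as a small argv builder. This is a minimal illustration, not the hostd implementation: `pfTableArgv` and `isIpv4` are hypothetical names, and only the `pfctl -t browser_tasks -T add/delete <ip>` form comes from the diagram above.

```typescript
// Hypothetical helper names; only the pfctl table syntax is taken from the doc.
type PfTableOp = "add" | "delete";

// Conservative dotted-quad check so a malformed value never reaches pfctl.
function isIpv4(addr: string): boolean {
  const parts = addr.split(".");
  return (
    parts.length === 4 &&
    parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255)
  );
}

// Builds the argv for a single table membership change (no full PF reload).
function pfTableArgv(op: PfTableOp, ip: string): string[] {
  if (!isIpv4(ip)) {
    throw new Error(`not an IPv4 address: ${ip}`);
  }
  return ["pfctl", "-t", "browser_tasks", "-T", op, ip];
}
```

Returning an argv array rather than a shell string keeps the clone IP out of any shell interpolation when the command is eventually executed.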

Naming note (preserved): Bastille VNET rejects - and _ in jail names. All template and task jail names are alphanumeric: browser, browsertask001. This keeps Bastille VNET happy and gives the browser runtime a fixed spot in the jail registry after Data Service (db).
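
A minimal sketch of the naming rule, assuming task jails are numbered with three zero-padded digits as in `browsertask001`. The function names and the 1 to 999 range are assumptions for illustration, not part of the proposal.

```typescript
// Prefix taken from the diagram; the numbering range is an assumption.
const TASK_PREFIX = "browsertask";

// Formats a task index as an alphanumeric jail name, e.g. 1 -> browsertask001.
function taskJailName(index: number): string {
  if (!Number.isInteger(index) || index < 1 || index > 999) {
    throw new RangeError(`task index out of range: ${index}`);
  }
  return TASK_PREFIX + String(index).padStart(3, "0");
}

// True when the name contains only letters and digits (no '-' or '_').
function isVnetSafeName(jailName: string): boolean {
  return /^[a-z0-9]+$/i.test(jailName);
}
```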

Per-task cost: ZFS clone + jail start + CDP cookie injection. Phase 0.6 already measured clone + start at sub-second to ~1 s in best cases (with forced cleanup needed for repeat cycles). Injection is one CDP round-trip per cookie set, on the order of 10–50 ms. Median target stays < 2 s end-to-end.

Install cost: ~2 GiB × 1 thick template, one-time.

Credentials: never inherited via profile bytes. They are injected from clawdie's store at session start, the cloned profile sees them in normal Chromium memory, and clone destruction discards the injected state along with the rest of the jail.

Component impact

Component	Today (per `BROWSER-JAIL.md`)	With pivoted template+clone model
hostd	`bastille start browser` once	New ops: `clone_browser(template)`, `destroy_clone(name)`, `force_unmount_clone(name)` (reaper fallback). No "sealed snapshot" management needed.
controlplane	session_id → BrowserContext inside one jail	session_id → ephemeral jail name + IP. Owns credentials store. Performs cookie injection at session start. Validates `operator_grant_token` before injection.
watchdog	One jail to health-check	Templates are infra (must stay up); `browsertask*` jails are session-owned ephemeral resources. Disappearance is normal once a session closes.
MCP proxy	All sessions → one fixed jail IP	Per-session jail IP lookup; transparent dispatch.
PF	Static ruleset on one jail	One static ruleset matching pfctl table `browser_tasks`; `pfctl -t browser_tasks -T add/delete` per clone — no full reloads.
Install	Once, ~2 GiB	Once for the single thick template, ~2 GiB. The template remains dumb.
Per-task spawn	~50 ms	Target median < 2 s (clone + start + cookie inject).
Per-task teardown	Close context	The 7-step cleanup sequence, ending in `zfs destroy` of the clone (~1 s).
Audit log	Same	Same — proxy writes, screenshots stored outside clone dataset.
Credentials lifecycle	None	NEW. Refresh workflow + encrypted store + injection. See Credentials store + injection.

Credentials store + injection

This is the new identity layer. It replaces the "operator template carries authentication" idea.

Storage

Path: postgres-backed (preferred; reuses clawdie's existing encrypted secret-storage story, either better-auth-style or via pgcrypto) OR a filesystem store under /var/db/clawdie/credentials/.
Scope: per (tenant_id, domain). Cookies for github.com and stripe.com are distinct rows; injection is per-domain.
Encryption at rest: mandatory. Plaintext cookie bytes never on disk. Use clawdie's existing encryption mechanism (decision: reuse better-auth secret encryption OR a dedicated hostd-wrapped key; pick during implementation).
Schema (sketch):

credentials.cookies
  - id
  - tenant_id
  - domain                  ('github.com')
  - cookies_encrypted       (encrypted JSON array of CDP cookie shapes)
  - created_at
  - last_refreshed_at
  - last_injected_at
  - approved_grant_scope    (which grant scopes can inject this row)
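
One way the row and its encrypted column could look in code, as a sketch only. The proposal leaves the encryption mechanism open, so AES-256-GCM via `node:crypto` is an assumption here, as are the field types, the `base64(iv || tag || ciphertext)` layout, and every function name.

```typescript
import { Buffer } from "node:buffer";
import { createCipheriv, createDecipheriv, randomBytes, randomUUID } from "node:crypto";

// Mirrors the schema sketch above; the TypeScript types are assumptions.
interface CookieRow {
  id: string;
  tenant_id: string;
  domain: string;
  cookies_encrypted: string; // base64(iv || auth tag || ciphertext)
  created_at: Date;
  last_refreshed_at: Date | null;
  last_injected_at: Date | null;
  approved_grant_scope: string[];
}

// Encrypts the cookie array with a 32-byte key; a fresh 12-byte IV per call.
function encryptCookies(cookies: unknown[], key: Buffer): string {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const body = Buffer.concat([cipher.update(JSON.stringify(cookies), "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]).toString("base64");
}

// Reverses encryptCookies; throws if the key is wrong or the blob was altered.
function decryptCookies(blob: string, key: Buffer): unknown[] {
  const raw = Buffer.from(blob, "base64");
  const decipher = createDecipheriv("aes-256-gcm", key, raw.subarray(0, 12));
  decipher.setAuthTag(raw.subarray(12, 28));
  const body = Buffer.concat([decipher.update(raw.subarray(28)), decipher.final()]);
  return JSON.parse(body.toString("utf8")) as unknown[];
}

// Builds a row for one (tenant, domain) pair; plaintext never enters the row.
function newCookieRow(
  tenantId: string,
  domain: string,
  cookies: unknown[],
  key: Buffer,
  scope: string[],
): CookieRow {
  const now = new Date();
  return {
    id: randomUUID(),
    tenant_id: tenantId,
    domain,
    cookies_encrypted: encryptCookies(cookies, key),
    created_at: now,
    last_refreshed_at: now,
    last_injected_at: null,
    approved_grant_scope: scope,
  };
}
```

An authenticated mode such as GCM is used in the sketch so that a tampered or wrongly keyed blob fails loudly at decrypt time instead of yielding garbage cookies.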

Refresh workflow

operator → controlplane "credential refresh" endpoint
  │
  ▼
controlplane spins up a refresh-mode browsertask clone
  - same substrate (template + clone), no cookies injected
  - clone is operator-driven (interactive, not agent-driven)
  - operator navigates, logs in, clears 2FA, etc.
  │
  ▼
operator signals "done" in controlplane UI
  │
  ▼
controlplane:
  - CDP Network.getAllCookies on the clone's Chromium
  - filter by domains the operator wants saved
  - encrypt + write to credentials store, partitioned by (tenant, domain)
  │
  ▼
clone destroyed (normal 7-step cleanup)

The refresh clone is ephemeral. Credentials end up in the store, never on the browser template filesystem.
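
The "filter by domains the operator wants saved" step might look like the following sketch. The `CdpCookie` shape lists only a subset of the fields CDP returns, and treating subdomains of a wanted domain as matches is an assumption; the proposal only says that scope is per domain and never wildcarded.

```typescript
// Subset of the cookie fields returned by CDP; trimmed for illustration.
interface CdpCookie {
  name: string;
  value: string;
  domain: string;
  path: string;
  expires: number;
  httpOnly: boolean;
  secure: boolean;
  session: boolean;
  sameSite?: "Strict" | "Lax" | "None";
}

// Assumption: ".github.com" and "api.github.com" both belong to "github.com".
function domainMatches(cookieDomain: string, wanted: string): boolean {
  const host = cookieDomain.replace(/^\./, "").toLowerCase();
  const want = wanted.toLowerCase();
  return host === want || host.endsWith("." + want);
}

// Groups cookies under the first wanted domain they match; drops the rest.
function partitionByDomain(cookies: CdpCookie[], wanted: string[]): Map<string, CdpCookie[]> {
  const out = new Map<string, CdpCookie[]>();
  for (const d of wanted) {
    out.set(d, []);
  }
  for (const c of cookies) {
    const match = wanted.find((w) => domainMatches(c.domain, w));
    if (match !== undefined) {
      out.get(match)!.push(c);
    }
  }
  return out;
}
```

Each map entry corresponds to one (tenant, domain) row in the store; anything the operator did not ask to save is discarded with the clone.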

Template shape

The canonical template is one thick jail:

name: browser
IP suffix: .6 (WARDEN_BROWSER_IP, default <subnet>.6)
ZFS dataset: zroot/clawdie-runtime/jails/browser
contents: FreeBSD base + Chromium + Node 22 + npm + browser validation deps
boot: off by default; controlplane starts/clones it intentionally
credentials: none

Thick is intentional here. The browser template is a golden execution image, not a small ordinary service jail. A thick jail makes snapshots/clones more self-contained and avoids the thin-jail nullfs/fstab surprises Phase 0.6 hit when raw-cloning browserop. ZFS clones stay cheap because they share blocks until written.

Injection at session start (per task)

new session opens with credential_mode:"operator" + valid operator_grant_token
  │
  ▼
controlplane:
  - decode the grant token's allowed domain scope
  - fetch from credentials store: cookies for (tenant_id, domains in scope)
  - decrypt
  - clone the template, start jail, start Chromium
  │
  ▼
ClawdieBrowserOperator (or session bootstrap inside the clone):
  - CDP Network.setCookie for each cookie in scope
  - verify cookie presence (optional smoke probe to a test endpoint)
  │
  ▼
session ready; pi/UI-TARS proceeds normally

credential_mode:"clean" sessions skip injection entirely. Same substrate, no cookies, no operator_grant_token required.
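
A hedged sketch of the injection step. `StoredCookie`, `toSetCookieParams`, `injectCookies`, and the `CdpSend` callback are hypothetical names; `CdpSend` stands in for whatever CDP session wrapper ClawdieBrowserOperator ends up using. Only the `Network.setCookie` method name comes from the flow above.

```typescript
// Cookie as held in the credentials store (assumed shape).
interface StoredCookie {
  name: string;
  value: string;
  domain: string;
  path: string;
  expires: number; // seconds since epoch; ignored for session cookies
  httpOnly: boolean;
  secure: boolean;
  session: boolean;
  sameSite?: "Strict" | "Lax" | "None";
}

// Parameters passed to Network.setCookie (subset).
interface SetCookieParams {
  name: string;
  value: string;
  domain: string;
  path: string;
  secure: boolean;
  httpOnly: boolean;
  sameSite?: "Strict" | "Lax" | "None";
  expires?: number;
}

// Maps a stored cookie to setCookie parameters; session cookies get no expiry.
function toSetCookieParams(c: StoredCookie): SetCookieParams {
  const params: SetCookieParams = {
    name: c.name,
    value: c.value,
    domain: c.domain,
    path: c.path,
    secure: c.secure,
    httpOnly: c.httpOnly,
  };
  if (c.sameSite !== undefined) params.sameSite = c.sameSite;
  if (!c.session) params.expires = c.expires;
  return params;
}

// Hypothetical transport: sends one CDP command and resolves with its result.
type CdpSend = (method: string, params: object) => Promise<unknown>;

// Injects cookies one by one and returns how many the browser accepted.
async function injectCookies(send: CdpSend, cookies: StoredCookie[]): Promise<number> {
  let accepted = 0;
  for (const c of cookies) {
    const result = (await send("Network.setCookie", toSetCookieParams(c))) as { success?: boolean } | undefined;
    if (result?.success) accepted += 1;
  }
  return accepted;
}
```

Comparing the returned count with the number of cookies in scope gives a cheap check before the optional smoke probe.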

What the grant token authorizes

Refined from earlier:

operator_grant_token = {
  iss: controlplane,
  tenant_id,
  allowed_domains: ["github.com", "stripe.com"],
  expires_at,
  origin_session_id   // which operator-authorized task spawned this
}

Injection is the only thing the token authorizes. Pi cannot use the token to do anything except invoke browser_run_task({ credential_mode: "operator" }) with domains within the token's scope.
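
The checks the controlplane applies before injecting could be sketched as below. This assumes `expires_at` is epoch milliseconds and that signature or issuer verification happens elsewhere; `checkGrant` and the reason strings are illustrative names, not an existing API.

```typescript
// Token shape from the sketch above; expires_at as epoch ms is an assumption.
interface OperatorGrantToken {
  iss: string;
  tenant_id: string;
  allowed_domains: string[];
  expires_at: number;
  origin_session_id: string;
}

type GrantCheck =
  | { ok: true }
  | { ok: false; reason: "expired" | "wrong_tenant" | "wildcard_scope" | "scope_mismatch" };

// Validates tenant, expiry, and domain scope. Does NOT verify a signature.
function checkGrant(
  token: OperatorGrantToken,
  tenantId: string,
  requested: string[],
  nowMs: number,
): GrantCheck {
  if (token.expires_at <= nowMs) return { ok: false, reason: "expired" };
  if (token.tenant_id !== tenantId) return { ok: false, reason: "wrong_tenant" };
  // MVP rule: domains are explicit, so any wildcard entry invalidates the token.
  if (token.allowed_domains.some((d) => d.includes("*"))) {
    return { ok: false, reason: "wildcard_scope" };
  }
  const allowed = new Set(token.allowed_domains.map((d) => d.toLowerCase()));
  const inScope = requested.length > 0 && requested.every((d) => allowed.has(d.toLowerCase()));
  return inScope ? { ok: true } : { ok: false, reason: "scope_mismatch" };
}
```

Passing `nowMs` in rather than reading the clock inside the function keeps expiry behavior easy to test.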

Lumina + real browser (unchanged but simplified)

If the host runs Lumina with a real Firefox, the operator uses it locally for their own browsing. The "credential refresh" workflow above replaces the earlier "drive the template directly" pattern: the operator interacts with the refresh-mode clone, not the template, so the browser template stays pristine.

Firefox-profile-to-Chromium-cookies sync is still deferred. The cookies an operator wants persistently stored in clawdie's credential store are entered through the refresh workflow.

Open questions — refined per Codex review

1. One template or two?

One: browser. Default credential mode is clean.

The distinction is now session policy, not template contents. browser is always credential-free at rest. A session may request operator-mode credential injection only when paired with a valid operator_grant_token; otherwise it runs clean.

This is the reframe Codex pushed for: session credential mode, not template-contains-auth.

2. Who chooses template / session mode at open?

Explicit param + policy constraint + token validation.

open_session({
  "credential_mode": "clean" | "operator",
  "record":          "off" | "transient" | "audit",
  "domains":         ["github.com"]       // required when credential_mode == "operator"
})

Defaults if params omitted:

credential_mode = "clean"
record          = "transient"
domains         = []

Tenant policy can further restrict (e.g., disable credential_mode: "operator" for a tenant). Controlplane validates operator_grant_token whenever operator credential mode is requested; rejects with a clear error if missing, expired, scope-mismatched, or wrong tenant.
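
The defaults and the "domains required in operator mode" rule can be expressed as a small resolver. A sketch under the parameter names given above; `resolveOpenSession` itself is a hypothetical function, and tenant policy and token validation are deliberately left out.

```typescript
type CredentialMode = "clean" | "operator";
type RecordMode = "off" | "transient" | "audit";

// Caller-supplied parameters; every field may be omitted.
interface OpenSessionParams {
  credential_mode?: CredentialMode;
  record?: RecordMode;
  domains?: string[];
}

// Fully resolved parameters after defaults are applied.
interface ResolvedSession {
  credential_mode: CredentialMode;
  record: RecordMode;
  domains: string[];
}

// Applies the documented defaults, then enforces the operator-mode constraint.
function resolveOpenSession(params: OpenSessionParams = {}): ResolvedSession {
  const resolved: ResolvedSession = {
    credential_mode: params.credential_mode ?? "clean",
    record: params.record ?? "transient",
    domains: params.domains ?? [],
  };
  if (resolved.credential_mode === "operator" && resolved.domains.length === 0) {
    throw new Error('domains is required when credential_mode is "operator"');
  }
  return resolved;
}
```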

3. Per-domain credential scope (resolved differently than before)

MVP scope is domain-filtered cookies, not full-access.

The earlier proposal accepted "full-access on operator template" because profile-byte cloning gave you all cookies whether you wanted them or not. With explicit injection, filtering by domain is free — we only inject what the grant token's allowed_domains lists. Use it.

MVP rules:

Cookies only. No localStorage, IndexedDB, passkeys, OAuth tokens in separate stores.
Domains explicit in the grant token; no wildcard scope.
One grant token per task, not per session. Reusing the same set of domains across multiple tasks requires the operator to repeat the originating authorization.

Out of scope for MVP (acknowledged):

localStorage / IndexedDB — many modern apps store auth tokens there. Adds origin scoping + freshness + audit semantics. Phase 2 work.
Passkeys / WebAuthn — hardware-bound, not portable. Sites that require passkey-only auth are not supported by clawdie agent flows.
Credential persistence back from task sessions. Tasks do not write to the credentials store. Refresh is a separate, operator-driven workflow.

4. Credential refresh workflow (replaces "template refresh")

The old proposal had "operator refreshes template, snapshot, clones source from snapshot." That mechanism is replaced by the credential refresh workflow described in Credentials store + injection.

Open detail: where the operator actually does the refresh. Options:

A. Controlplane web UI streams the refresh-mode clone's Chromium back to the operator via WebRTC / a VNC-style channel. Clean but real engineering work.
B. Operator drives the refresh-mode clone via X-forwarding / VNC / a SPICE-style proxy from a known operator workstation. Simpler to ship, less integrated.
C. Operator uses Lumina on the host directly, logs into services in a local Firefox, then a controlplane action exports cookies via Marionette / WebExtension into the credentials store. Avoids the refresh-mode clone entirely. Different shape, worth considering.

Defer the choice. All three are compatible with the credentials store contract; pick during refresh-workflow implementation.

5. Auth on the template jail itself (unchanged, simpler)

The browser template is credential-free. The "no external ingress to operator template" concern softens — even if someone reached the template's CDP port, they'd find no cookies. Still: keep PF locked to controlplane access only, on principle. No reason to relax the rule.

6. Lumina + Firefox integration: defer (unchanged)

7. Sealed snapshot mechanics (mostly retired)

Codex spent meaningful Phase 0.6 effort working around Chromium SingletonLock files, encryption survival, and profile lock acquisition. With profile-byte inheritance dropped, this complexity goes away:

Clones are made from any reasonably-recent snapshot of the template root dataset. No "sealing" required.
SingletonLock / SingletonSocket / SingletonCookie cleanup is still needed if the template is left running between snapshots — but if the template is normally stopped (only started for refresh, then stopped again), the files don't exist to lock.
Encryption-survival is irrelevant — there's nothing to decrypt.

Phase 0.6's documentation of this complexity remains useful as a record of the path we avoided.

Seven additional questions (status after pivot)

#	Question	Status
A	VNET-safe naming + IP pool from `browser-task` range	Still required; unchanged
B	Sealed-snapshot mechanics	Retired — no sealing needed
C	Chromium profile clone correctness (cookies, passkeys, SQLite)	Confirmed broken; dropped
D	Per-clone RCTL limits	Still required; unchanged
E	Screenshots/audit outside clone dataset	Still required; unchanged
F	Orphan clone reaper, idempotent, forced-unmount fallback	Still required; Phase 0.6 confirmed forced unmount path
G	Operator credential authorization UX	Refined — now "credential refresh UX" and "grant-token UX." Two surfaces.

New questions arising from the pivot:

H. Credentials store backend. Postgres with pgcrypto vs. filesystem-backed? Lean postgres for transactional safety + existing better-auth integration.
I. Credential rotation cadence + alerts. When cookies expire on the service side, injection still proceeds; the task hits a "please log in" page. Detect this and surface to the operator. Out of MVP scope but the refresh-workflow design should leave a hook for it.
J. Audit semantics for injection. Every injection event written to the audit log with (tenant, session, domains, grant_token_id). Cleartext cookies never logged.
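
For question J, one possible shape of the audit record is sketched below. The event name, the `cookie_count` field, and the function name are assumptions; the (tenant, session, domains, grant_token_id) tuple is from the text. The builder accepts a count rather than the cookies themselves, so cleartext values cannot reach the log by construction.

```typescript
// Audit record for one injection; contains no cookie names or values.
interface InjectionAuditEvent {
  event: "credential_injection";
  at: string; // ISO-8601 timestamp
  tenant_id: string;
  session_id: string;
  domains: string[];
  grant_token_id: string;
  cookie_count: number; // assumption: a count is useful and not sensitive
}

// Builds the record from identifiers only; domains are normalized and sorted.
function buildInjectionAudit(
  tenantId: string,
  sessionId: string,
  domains: string[],
  grantTokenId: string,
  cookieCount: number,
  at: Date = new Date(),
): InjectionAuditEvent {
  const unique = Array.from(new Set(domains.map((d) => d.toLowerCase()))).sort();
  return {
    event: "credential_injection",
    at: at.toISOString(),
    tenant_id: tenantId,
    session_id: sessionId,
    domains: unique,
    grant_token_id: grantTokenId,
    cookie_count: cookieCount,
  };
}
```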

Cleanup order (every clone destroy must follow this)

Unchanged from the earlier revision, but Phase 0.6 confirmed step 7 needs forced unmount as a fallback path:

1. Stop the in-jail browser HTTP service       (kill the Node CDP server)
2. TERM, then KILL Chromium                    (PID-file based, not pkill)
3. Unmount nullfs / pkg-cache / screenshot mounts
4. bastille stop <clone>
5. pfctl -t browser_tasks -T delete <clone_ip>
6. Release the IP back to the pool
7. zfs destroy <clone_dataset>
   on busy-dataset failure: zfs unmount -f then retry destroy

Reaper must support the forced-unmount step. Phase 0.6 documented that fstat did not always show obvious holders for "busy" datasets — this is treated as a normal cleanup case, not an alert.

Production should use an rc.d service inside the browser jail for Chromium lifecycle, not ad hoc PID management:

bastille cmd <clone> service clawdie_browser stop
bastille stop <clone>
zfs destroy -r <clone_dataset>
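
The cleanup order, including the forced-unmount fallback, can be sketched as a function over an injected command runner. Everything here is illustrative: `destroyClone`, `CloneRef`, and the `Run` callback are hypothetical, and the sketch assumes the `clawdie_browser` rc.d service stops both the in-jail HTTP service and Chromium (steps 1 and 2).

```typescript
// Hypothetical executor: runs argv on the host, returns true on exit status 0.
type Run = (argv: string[]) => boolean;

interface CloneRef {
  jail: string;     // e.g. browsertask001
  ip: string;       // address from the browser-task pool
  dataset: string;  // ZFS dataset backing the clone
  mounts: string[]; // nullfs / pkg-cache / screenshot mount points
}

// Runs the 7 steps in order and returns the labels of steps that failed.
// It keeps going after a failure so a reaper can call it repeatedly.
function destroyClone(c: CloneRef, run: Run, releaseIp: (ip: string) => void): string[] {
  const failed: string[] = [];
  const step = (label: string, argv: string[]): void => {
    if (!run(argv)) failed.push(label);
  };

  // Steps 1-2: stop the browser service (assumed to cover Node and Chromium).
  step("service stop", ["bastille", "cmd", c.jail, "service", "clawdie_browser", "stop"]);
  // Step 3: unmount auxiliary mounts.
  for (const m of c.mounts) step(`umount ${m}`, ["umount", m]);
  // Step 4: stop the jail.
  step("bastille stop", ["bastille", "stop", c.jail]);
  // Step 5: drop the clone from the PF table.
  step("pf delete", ["pfctl", "-t", "browser_tasks", "-T", "delete", c.ip]);
  // Step 6: return the IP to the pool.
  releaseIp(c.ip);
  // Step 7: destroy the dataset; on failure force-unmount and retry once.
  if (!run(["zfs", "destroy", c.dataset])) {
    run(["zfs", "unmount", "-f", c.dataset]);
    if (!run(["zfs", "destroy", c.dataset])) failed.push("zfs destroy");
  }
  return failed;
}
```

An empty return value means the clone is fully gone; a non-empty one gives the reaper a list of steps to report or retry.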

Phase 0.6 — clone lifecycle validation (REDEFINED)

Original Phase 0.6 acceptance criteria assumed profile-byte credential inheritance. With the pivot, the validation reshapes around cookie injection.

Owner: Codex. Branch: continue on main. Starts from the new fixed thick browser jail at 192.168.72.6 / WARDEN_BROWSER_IP; the old thin browserop validation jail has been retired.

Acceptance criteria (HARD)

1. Cookie injection round-trip within the template.
- Start a small local HTTP server inside the template that sets and reads a known cookie.
- Set the cookie via real Chromium navigation (not just CDP).
- Export cookies via CDP Network.getAllCookies.
- Persist to a clawdie-side store (filesystem is fine for validation).
- Restart Chromium with a new, empty profile in the template.
- Inject the saved cookies via CDP Network.setCookie.
- Navigate back to the local HTTP server; verify the cookie is sent in the request headers.
2. Cookie injection across the clone boundary.
- Repeat (1) but the second Chromium runs in a fresh browsertask001 clone (no profile state) instead of the template.
- Confirms injection works post-clone, not just within-template.
3. Three-cycle clone → start → inject → smoke → destroy loop.
- 3 sequential cycles complete end-to-end with zero orphaned datasets, mounts, epairs, or Chromium processes.
- Median clone+start+inject latency < 2 s.
- Cookie visible after injection (Puppeteer verifies via a request to the test endpoint).
- Idempotent reaper: deliberately orphan a clone, run reaper twice, no errors, no residual state.
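
The latency criterion reduces to a median over the measured cycles. A minimal sketch, with hypothetical function names and with the 2 s target expressed in milliseconds; the sample values in the usage note are made up for illustration and are not Phase 0.6 measurements.

```typescript
// Median of a non-empty list of samples; does not mutate the input.
function median(samples: number[]): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// True when the median clone+start+inject latency is strictly under target.
function meetsLatencyTarget(samplesMs: number[], targetMs: number = 2000): boolean {
  return median(samplesMs) < targetMs;
}
```

For example, three cycles measured at 900, 1100, and 2400 ms have a median of 1100 ms and pass, even though one cycle exceeded 2 s; the criterion is about the median, not the worst case.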

Failure modes that change the verdict (still)

If cookie injection doesn't reliably work cross-clone (e.g., some sites require additional headers or device IDs): revisit per-domain scripted login rather than cookie persistence.
If 3 cycles can complete cleanly but resource leaks appear at higher rates: scale-test before adoption.
If the busy-dataset failure persists after the forced-unmount path: investigate whether nullfs unmount ordering or vnode lifecycle holds need more explicit cleanup.

What lands if adopted

In order, after Phase 0.6 (redefined) passes:

1. Reshape docs/internal/BROWSER-JAIL.md to template + clone + credentials store + injection. Templates documented as credential-free. Threat model updated.
2. Update doc/BROWSER-JAIL-HANDOFF.md Phase 1A steps:
- Step 1: template creation (credential-free thick browser jail).
- Step 2: clone lifecycle + reaper (hostd ops + forced-unmount path).
- Step 3: credentials store + encryption + schema.
- Step 4: CDP injection wiring in ClawdieBrowserOperator.
- Step 5: refresh workflow (likely the slowest piece; can split phases).
- Steps 6+ from earlier (HTTP service inside clones, PF wiring).
3. New BROWSER-JAIL-CREDENTIALS-OPS.md operator-facing doc covering the refresh workflow, grant tokens, expected expiration behavior.
4. Implementation:
- setup/browser-jail.ts creates one thick, credential-free browser template jail at WARDEN_BROWSER_IP.
- hostd gains clone_browser, destroy_clone, force_unmount_clone.
- controlplane gains credentials store + injection + grant-token validation.
- Watchdog learns "templates infra, tasks ephemeral."
- PF gets static browser_tasks ruleset.

None of that happens until Phase 0.6 (redefined) is green.

Status summary

Item	Decided	Status
Template + clone as substrate pattern	Yes	Phase 0.6 confirmed viable
One fixed thick `browser` template, default credential mode `clean`	Yes	Credential-free at rest
Explicit template param + policy	Yes	—
Sealed-snapshot clones	No (retired)	Not needed under injection model
Profile-byte credential inheritance	No (dropped)	Phase 0.6 confirmed not viable
Credentials store + CDP injection	Yes (new)	Phase 0.6 redefined to validate this
No external ingress to template	Yes	—
Defer Firefox sync	Yes	—
Cookies-only MVP, domain-filtered	Yes	—
`operator_grant_token` gates injection, not template access	Yes	Refined meaning
VNET-safe names	Yes	Names preserved despite semantic shift
Static PF ruleset + `pfctl -t`	Yes	—
Cleanup order (7 steps, forced unmount fallback)	Yes	Phase 0.6 confirmed forced unmount path
Screenshots/audit outside clone dataset	Yes	—
Orphan reaper, idempotent	Required	Phase 0.6 redefined validation step
Per-clone RCTL limits	Required	—
Per-domain templates (`browsergithub` etc.)	No — not needed	Domain filter is now per-grant-token, not per-template
localStorage / IndexedDB	No — Phase 2+	—
Passkeys / WebAuthn	No — unsupported	Hardware-bound, not portable
Lumina + Firefox sync	No — defer	—
