Browser Jail
Date: 11 May 2026
Status: DESIGN (Phase 0.6 injection validation passed; ready for Phase 1 implementation)
Phase: 0.5 viability PASS → 0.6 injection validation PASS → 1 implementation
Clawdie's browser-computer-use backend is a FreeBSD/Bastille browser execution template plus future ephemeral task clones. The jail layer executes browser operations only; the controlplane owns MCP, auth, audit, credentials, task state, and the UI-TARS-compatible model loop.
This document is the current implementation target. The prior long-lived shared
Chromium-context model is superseded by the fixed thick browser template and
per-task clone direction, validated by the redefined Phase 0.6 injection run.
Related docs
| Doc | Role |
|---|---|
| `docs/internal/BROWSER-JAIL.md` (this) | Current design + implementation target |
| `doc/BROWSER-JAIL-HANDOFF.md` | Next tasks: Phase 1 implementation sequence |
| `docs/internal/BROWSER-JAIL-FREEBSD-VIABILITY.md` | Phase 0.5 record (PASS) |
| `docs/internal/BROWSER-JAIL-CLONE-LIFECYCLE-VALIDATION.md` | Phase 0.6 record (PASS for CDP injection; profile-byte inheritance dropped) |
| `docs/internal/UI-TARS-ADOPTION.md` | Direction: UI-TARS as the agent-loop reference; clawdie owns substrate |
| `docs/internal/VISION-GROUNDING-FINDINGS.md` | Vision-grounding validation; input for model selection |
| `doc/BROWSER-JAIL-TEMPLATE-CLONE-PROPOSAL.md` | Historical: pivot reasoning only; do not update |
Goals
- Server-side headless browser automation on FreeBSD.
- One fixed, credential-free thick browser template jail at a stable registry slot.
- Future per-task jail clones for stronger isolation and clean teardown.
- Controlplane-owned credentials store with explicit CDP cookie injection.
- UI-TARS-compatible operator surface: screenshot → prediction → execute.
- pi integration as one compact `browser_run_task` result, not screenshot spam in JSONL history.
Non-goals (MVP)
- Model logic inside the jail.
- Storing operator credentials inside jail profiles or templates.
- Browser profile-byte authentication cloning.
- Downloads/uploads.
- localStorage / IndexedDB credential injection.
- Passkeys / WebAuthn portability.
- Electron/nut-js desktop control on the FreeBSD server path.
Fixed jail registry slot
The canonical browser substrate is:
name: browser
ip: WARDEN_BROWSER_IP, default <subnet>.6
live IP: 192.168.72.6 on the current validation host
shape: thick Bastille VNET jail
boot: off by default
packages: chromium, node22, npm-node22
workspace: /opt/browser-validation for current validation scripts
credentials: none
Registry source of truth:
infra/jails.yaml
infra/packages/browser-jail.txt
Why thick:
- The browser template is a golden execution image, not a small ordinary service jail.
- Thick snapshots/clones are self-contained and avoid the thin-jail nullfs/fstab clone failures observed in Phase 0.6.
- ZFS clones remain cheap because unchanged blocks are shared.
- Chromium's package payload dominates the size; the copied base userland is acceptable for reproducibility.
The old thin browserop validation jail is retired. Operator mode is not a
separate authenticated template; it is a session credential mode authorized
by controlplane policy.
Architecture
External clients / tasks
├── Claude Desktop MCP
├── UI-TARS-compatible runner
├── pi browser_run_task
└── Clawdie controlplane tasks
│
▼
controlplane (host, clawdie service)
├── better-auth / operator auth
├── tenant + grant-token policy
├── credentials store + CDP cookie injection
├── audit log
├── screenshot retention store
├── UI-TARS-compatible GUIAgent runner
├── hostd calls for Bastille/ZFS/PF lifecycle
└── per-session route table: session_id → browsertaskNNN/IP
│
▼
browsertaskNNN clone (future Phase 1 runtime)
├── cloned from thick browser template
├── plain HTTP API only, reachable from controlplane
├── Chromium from FreeBSD pkg
├── Node 22 + puppeteer-core/CDP bridge
├── no credentials at rest before injection
└── destroyed at session end
The fixed browser jail is the template/source image. Runtime task work should
happen in browsertaskNNN clones. During validation, the
same HTTP/CDP code may be exercised directly inside browser to avoid clone
noise, but production semantics are clone-backed.
Controlplane responsibilities
The controlplane owns every security-sensitive and product-facing concern:
- MCP/HTTP tool surface.
- better-auth session validation.
- tenant and operator identity resolution.
- `operator_grant_token` validation.
- credentials storage, decryption, and domain filtering.
- CDP cookie injection at session open.
- audit writes before forwarding actions to the jail.
- screenshot recording/retention outside clone datasets.
- clone lifecycle through hostd.
- orphan reaper and forced-unmount fallback.
- UI-TARS-compatible browser task loop.
The jail owns only:
- starting Chromium,
- accepting plain HTTP requests from controlplane,
- executing browser actions via CDP,
- returning screenshots/DOM/action results.
No MCP server, long-lived auth secret, model loop, or audit DB access belongs in the jail.
Credentials store + injection
MVP scope
- Cookies only.
- Domain-filtered.
- Tenant-scoped.
- Grant-token-gated for operator mode.
- Encrypted at rest.
- No persistence back from task sessions.
Out of MVP:
- localStorage,
- IndexedDB,
- passkeys/WebAuthn,
- automatic credential refresh when a site expires a session.
Session credential modes
{
"credential_mode": "clean | operator",
"domains": ["github.com"],
"record": "off | transient | audit"
}
Defaults:
credential_mode = clean
record = transient
domains = []
credential_mode: "clean" never injects cookies.
credential_mode: "operator" requires a valid operator_grant_token whose
scope matches the tenant and requested domains. The token authorizes injection
only; it does not grant shell access, jail access, or blanket cookie export.
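The defaults and the operator-mode gate above can be sketched as a small normalization step. This is a minimal illustration, not the controlplane's code: `SessionOptions`, `normalize_session_options`, and the `has_grant_token` flag are hypothetical names, and real grant validation involves more than a boolean.

```python
from dataclasses import dataclass

CREDENTIAL_MODES = ("clean", "operator")
RECORD_MODES = ("off", "transient", "audit")


@dataclass(frozen=True)
class SessionOptions:
    credential_mode: str = "clean"   # documented default
    record: str = "transient"        # documented default
    domains: tuple = ()              # documented default: no domains


def normalize_session_options(raw: dict, has_grant_token: bool) -> SessionOptions:
    """Fill in defaults, then reject values the design does not allow."""
    opts = SessionOptions(
        credential_mode=raw.get("credential_mode", "clean"),
        record=raw.get("record", "transient"),
        domains=tuple(raw.get("domains", ())),
    )
    if opts.credential_mode not in CREDENTIAL_MODES:
        raise ValueError(f"unknown credential_mode: {opts.credential_mode!r}")
    if opts.record not in RECORD_MODES:
        raise ValueError(f"unknown record mode: {opts.record!r}")
    if opts.credential_mode == "operator" and not has_grant_token:
        # Operator mode is only reachable with an operator_grant_token.
        raise PermissionError("operator mode requires an operator_grant_token")
    return opts
```

An empty request therefore resolves to a clean, transiently recorded session with no domains.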
operator_grant_token schema
Concrete shape (resolved 11 May 2026):
operator_grant_token = {
jti: uuid # unique token id, audit key
iss: "clawdie-cp" # issuer, fixed
tenant_id: uuid
origin_session_id: uuid # the operator-authorized session that issued this token
operator_id: uuid
allowed_domains: ["github.com", "stripe.com"] # exact host match; no wildcards
issued_at: iso8601
expires_at: iso8601 # short-lived: 15 min default, configurable per tenant
single_use: true | false # true = consumed by the first injection call
}
- Format: clawdie-internal opaque token (random `jti`) → controlplane looks up the rest in its own table. Not a JWT: no signature verification is needed because issuance and validation both happen inside the controlplane.
- Storage: Postgres `auth.operator_grant_tokens` table; rows expire and are GC'd by the existing cron pattern. The cleartext token lives only in the originating session's response and in the env of the pi task it spawns.
- Issuance: controlplane issues during operator-authorized task creation (Telegram approval, dashboard action). Issued tokens are audit-logged.
- Validation: at `browser_run_task` / `open_session` entry, controlplane verifies that `jti` exists and is not expired, that `tenant_id` matches the caller's tenant, and that `allowed_domains` is a superset of the request's `domains`. On `single_use: true`, the row is marked consumed atomically.
- Revocation: delete the row by `jti`. Pending validations against the same `jti` fail.
The `single_use` default for MVP is `true`: each grant authorizes one task.
Long-running approvals are explicit: the operator sets `single_use: false`
and a longer `expires_at` when they want to chain multiple tasks under one
grant.
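The validation rules can be sketched against an in-memory table. This is an illustration under assumptions: the `tokens` dict stands in for the Postgres table, the `consumed` field and the `GrantError` type are hypothetical, and a real implementation would mark the row consumed with an atomic conditional update rather than a dict write.

```python
from datetime import datetime, timedelta, timezone


class GrantError(Exception):
    """Raised when a grant token fails any validation rule."""


def validate_grant(tokens: dict, jti: str, tenant_id: str,
                   domains: list, now: datetime) -> dict:
    row = tokens.get(jti)
    if row is None:
        raise GrantError("unknown or revoked jti")
    if row.get("consumed"):
        raise GrantError("single-use grant already consumed")
    if now >= row["expires_at"]:
        raise GrantError("grant expired")
    if row["tenant_id"] != tenant_id:
        raise GrantError("tenant mismatch")
    # allowed_domains must be a superset of the requested domains (exact host match).
    if not set(domains) <= set(row["allowed_domains"]):
        raise GrantError("requested domains exceed grant scope")
    if row["single_use"]:
        # Stand-in for an atomic "mark consumed" update in the database.
        row["consumed"] = True
    return row
```

Revocation in this sketch is simply deleting the key, which makes every later lookup fail at the first check.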
Store sketch
Preferred backend: Postgres, because it keeps credentials transactional, backupable, tenant-scoped, and close to Clawdie's existing secret story. Filesystem storage is acceptable only for early validation runs.
credentials.cookies
id
tenant_id
domain
cookies_encrypted # encrypted JSON array of CDP cookie shapes
created_at
last_refreshed_at
last_injected_at
grant_scope
Refresh workflow
Credential refresh is an operator-driven workflow, separate from task sessions:
operator requests refresh
→ controlplane starts a refresh browser session/clone
→ operator logs in interactively
→ controlplane exports cookies via CDP Network.getAllCookies
→ controlplane filters selected domains
→ encrypted cookies are written to credentials store
→ refresh clone/session is destroyed
The refresh UX is resolved as the controlplane-streamed browser view (see Decided). Shapes considered:
- controlplane-streamed browser view,
- VNC/X-forwarding/SPICE-style access,
- future Lumina/Firefox cookie export bridge.
The storage/injection contract is independent of that UX choice.
Injection workflow
open browser task with credential_mode=operator
→ validate operator_grant_token
→ select cookies for (tenant, allowed domains)
→ decrypt in controlplane
→ clone browser → browsertaskNNN
→ start Chromium with empty profile
→ inject cookies via CDP Network.setCookie
→ verify via smoke probe where possible
→ hand session to UI-TARS/operator loop
Task sessions do not write updated cookies back to the store. If cookies expire, the task should fail clearly and ask for a refresh workflow.
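The "select cookies for (tenant, allowed domains)" step can be sketched as a pure filter that turns exported cookie objects into `Network.setCookie` parameters. Assumptions are marked in the comments: stripping a leading dot before the exact host comparison is a choice made here (the design does not say how `.github.com` or subdomain cookies are treated), and `select_cookies` is a hypothetical name.

```python
# Fields that Network.setCookie accepts and that exported cookies carry.
# Export-only fields such as "size" and "session" are dropped.
SET_COOKIE_FIELDS = ("name", "value", "domain", "path",
                     "secure", "httpOnly", "sameSite", "expires")


def select_cookies(stored: list, allowed_domains: list) -> list:
    """Keep cookies whose domain exactly matches an allowed host."""
    allowed = {d.lower() for d in allowed_domains}
    selected = []
    for cookie in stored:
        # Assumption: ".github.com" counts as github.com; subdomains do not.
        host = cookie["domain"].lstrip(".").lower()
        if host not in allowed:
            continue
        params = {k: cookie[k] for k in SET_COOKIE_FIELDS if k in cookie}
        # Session cookies are exported with a negative expiry; omit it on import.
        if cookie.get("session") or params.get("expires", 0) < 0:
            params.pop("expires", None)
        selected.append(params)
    return selected
```

Keeping this step free of I/O makes it straightforward to cover in the cross-session leakage tests before any CDP call is made.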
Tool surface
Primitive browser tools remain useful for debugging and MCP compatibility:
| Tool | Purpose | Returns |
|---|---|---|
| `browser.open_session` | Start a browser session/clone with `{credential_mode, domains, record}`. | `{ session_id, credential_mode, record, started_at }` |
| `browser.navigate` | Load a URL. | `{ status, final_url, title }` |
| `browser.screenshot` | Capture viewport by default; `full_page: true` explicit. | `{ image_base64, width, height, captured_at, persisted_path? }` |
| `browser.click` | Click `(x,y)` or CSS selector. | `{ success, after_screenshot? }` |
| `browser.type` | Type text into focused element or selector. | `{ success, after_screenshot? }` |
| `browser.scroll` | Scroll page/selector. | `{ success, after_screenshot? }` |
| `browser.read_dom` | Return truncated DOM for grounding fallback. | `{ html, truncated_to }` |
| `browser.close_session` | Stop browser service, destroy clone/session resources. | `{ closed_at }` |
Normal product integration should prefer one high-level call:
browser.run_task({ instruction, credential_mode, domains, record, max_steps })
pi exposes this as browser_run_task and receives one compact result:
{
"status": "finished | max_steps | error | aborted",
"summary": "final answer or useful task summary",
"result_data": {},
"trace_id": "controlplane trace/session id",
"step_count": 8,
"final_screenshot_path": "/var/db/browser-jail/sessions/.../final.png"
}
Screenshots stay in the UI-TARS loop and Clawdie recording store. They are not appended turn-by-turn into pi JSONL history.
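A small sketch of how the compact result might be assembled from a step trace without carrying screenshot payloads along. The function name, the shape of the `steps` entries, and the example path are hypothetical; only the result keys and status values come from the schema above.

```python
import json

FINAL_STATUSES = ("finished", "max_steps", "error", "aborted")


def compact_result(status: str, summary: str, trace_id: str, steps: list,
                   final_screenshot_path: str, result_data=None) -> dict:
    """Build the single result object handed back to pi."""
    if status not in FINAL_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    return {
        "status": status,
        "summary": summary,
        "result_data": result_data or {},
        "trace_id": trace_id,
        # Only the count leaves the loop; per-step screenshots stay in the store.
        "step_count": len(steps),
        "final_screenshot_path": final_screenshot_path,
    }


# Hypothetical trace: each step still holds its screenshot payload internally.
steps = [{"action": "click", "image_base64": "AAAA"},
         {"action": "type", "image_base64": "BBBB"}]
result = compact_result("finished", "done", "trace-1", steps, "/tmp/final.png")
serialized = json.dumps(result)
```

The serialized result references the final screenshot by path only, which is what keeps pi's JSONL history small.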
UI-TARS-compatible operator
Clawdie should adapt UI-TARS' mature GUI-agent loop shape rather than inventing a separate parser/loop:
instruction
→ screenshot()
→ model prediction
→ execute(prediction)
→ repeat until finished/max_steps/error
ClawdieBrowserOperator should expose:
- `screenshot()` backed by `browser.screenshot`,
- `execute(prediction)` translating parsed UI-TARS actions to browser tools,
- `finished()` / close semantics backed by controlplane session cleanup.
The model loop runs in controlplane or an external UI-TARS-compatible client, never inside the jail.
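The loop shape can be sketched independently of any model or browser. This is not UI-TARS' API: the `Operator` protocol, the `predict` callable, and the `{"action": ...}` prediction shape are placeholders chosen for the illustration.

```python
from typing import Callable, Protocol, Tuple


class Operator(Protocol):
    def screenshot(self) -> bytes: ...
    def execute(self, prediction: dict) -> None: ...


def run_loop(operator: Operator,
             predict: Callable[[str, bytes], dict],
             instruction: str,
             max_steps: int) -> Tuple[str, int]:
    """screenshot -> prediction -> execute, until finished/max_steps/error."""
    steps = 0
    try:
        while steps < max_steps:
            image = operator.screenshot()
            prediction = predict(instruction, image)
            if prediction.get("action") == "finished":
                return "finished", steps
            operator.execute(prediction)
            steps += 1
    except Exception:
        # Any operator or model failure ends the task with an error status.
        return "error", steps
    return "max_steps", steps
```

The returned status and step count map onto the `status` and `step_count` fields of the compact task result.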
Network policy
Ingress:
- Browser clone HTTP API reachable only from controlplane on the internal jail network.
- No public ingress.
- No MCP/auth endpoint inside the jail.
Egress:
- Public web egress allowed for browser tasks.
- PF denies internal targets by default:
- RFC1918,
- loopback,
- link-local,
- IPv6 ULA/link-local,
- cloud metadata (`169.254.169.254`).
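The deny list can be written down as explicit networks, which is handy for test fixtures that check the PF ruleset against expected targets. PF remains the enforcement point; this sketch only classifies literal addresses, and `is_denied_target` is a hypothetical helper name.

```python
import ipaddress

DENIED_NETWORKS = [ipaddress.ip_network(n) for n in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # RFC1918
    "127.0.0.0/8", "::1/128",                         # loopback
    "169.254.0.0/16",   # IPv4 link-local, which contains 169.254.169.254
    "fc00::/7",         # IPv6 ULA
    "fe80::/10",        # IPv6 link-local
)]


def is_denied_target(address: str) -> bool:
    """True if a literal IP address falls in a range the egress policy denies."""
    ip = ipaddress.ip_address(address)
    return any(ip.version == net.version and ip in net
               for net in DENIED_NETWORKS)
```

Note that the cloud metadata address needs no separate entry here because it sits inside the IPv4 link-local range.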
Future PF shape for clones:
- static ruleset,
- table `browser_tasks`,
- add/delete clone IPs with `pfctl -t browser_tasks -T add/delete`,
- no full PF reload per clone.
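A sketch of how a hostd operation might build the table-update command as an argument vector rather than a shell string. The helper name and the example address are hypothetical; the `pfctl -t <table> -T add|delete <address>` form is the one named above.

```python
import ipaddress

PF_TABLE = "browser_tasks"


def pf_table_command(action: str, clone_ip: str) -> list:
    """Return the pfctl argv that adds or deletes one clone IP in the table."""
    if action not in ("add", "delete"):
        raise ValueError(f"unsupported table action: {action!r}")
    # Parsing rejects anything that is not a literal address before it reaches pfctl.
    ip = ipaddress.ip_address(clone_ip)
    return ["pfctl", "-t", PF_TABLE, "-T", action, str(ip)]


# Hypothetical clone address on the validation subnet.
add_cmd = pf_table_command("add", "192.168.72.101")
del_cmd = pf_table_command("delete", "192.168.72.101")
```

Because only table membership changes, the static ruleset is never reloaded when a clone comes or goes.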
Screenshot recording
Recording is session-level and immutable for the session:
| Mode | Disk | Retention | Use |
|---|---|---|---|
| `off` | none | n/a | disposable tests or sensitive sessions the operator chooses not to record |
| `transient` | N=50 FIFO ring buffer | until close/eviction | default action-loop debugging |
| `audit` | every screenshot | 7d default | sensitive operations needing forensic trace |
Path:
/var/db/browser-jail/sessions/<tenant_id>/<session_id>/<seq>.png
Screenshots live on the host/controlplane side, outside clone datasets. Clone destroy must not delete audit material.
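The `transient` mode's FIFO behavior and the path layout can be sketched together. `TransientRecorder` and `screenshot_path` are hypothetical names, the unpadded integer `<seq>` is an assumption, and the sketch only tracks paths: writing and deleting files is left to the caller.

```python
from collections import deque
from pathlib import PurePosixPath

SESSIONS_ROOT = PurePosixPath("/var/db/browser-jail/sessions")


def screenshot_path(tenant_id: str, session_id: str, seq: int) -> PurePosixPath:
    return SESSIONS_ROOT / tenant_id / session_id / f"{seq}.png"


class TransientRecorder:
    """Tracks the newest `capacity` screenshot paths; older ones are evicted."""

    def __init__(self, tenant_id: str, session_id: str, capacity: int = 50):
        self.tenant_id = tenant_id
        self.session_id = session_id
        self.capacity = capacity
        self.paths = deque()
        self.seq = 0

    def record(self) -> list:
        """Register the next screenshot; return paths the caller should delete."""
        self.seq += 1
        self.paths.append(screenshot_path(self.tenant_id, self.session_id, self.seq))
        evicted = []
        while len(self.paths) > self.capacity:
            evicted.append(self.paths.popleft())
        return evicted
```

With the default capacity, the 51st screenshot evicts the first, so disk use per transient session stays bounded.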
Clone lifecycle
Task clone names are alphanumeric for Bastille VNET compatibility:
browsertask001
browsertask002
...
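A sketch of name allocation under the `browsertaskNNN` pattern. Picking the lowest free number is an assumption made here, not a stated policy, and a reused name still needs its stale epairs cleared first; `next_clone_name` is a hypothetical helper.

```python
import re

CLONE_NAME = re.compile(r"^browsertask(\d{3})$")


def clone_name(n: int) -> str:
    if not 1 <= n <= 999:
        raise ValueError("clone index must fit three digits")
    return f"browsertask{n:03d}"


def next_clone_name(existing: list) -> str:
    """Return the lowest-numbered clone name not present in `existing`."""
    used = set()
    for name in existing:
        match = CLONE_NAME.match(name)
        if match:
            used.add(int(match.group(1)))
    for n in range(1, 1000):
        if n not in used:
            return clone_name(n)
    raise RuntimeError("no free clone names")
```

Names that do not match the pattern, such as the `browser` template itself, are ignored when computing the next slot.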
Expected create path:
- ensure `browser` template is stopped/quiescent,
- snapshot or clone from a known-good template state,
- call hostd `browser-clone-create` to clone datasets, patch jail name/IP/VNET config, clear stale epairs, and add the clone IP to PF `browser_tasks`,
- start jail,
- start browser HTTP service / Chromium,
- inject cookies if `credential_mode=operator`,
- run task loop.
Destroy/reaper order:
- call hostd `browser-clone-destroy` / `browser-clone-reap`,
- stop in-jail browser HTTP service through rc.d,
- TERM Chromium by PID file,
- KILL by PID only as fallback,
- unmount nullfs/pkg-cache/session mounts,
- `bastille stop <clone>`,
- remove clone IP from PF table,
- release IP,
- `zfs destroy -r <clone_dataset>`,
- on busy dataset: `browser-clone-force-unmount` / `zfs unmount -f` and retry destroy,
- remove stale epairs before retrying a clone name.
Hostd operations added for Phase 1A:
- `browser-clone-create`: clone the thick `browser` datasets from a named snapshot, patch Bastille/VNET config, clear stale `btNNN` epairs, and add the task IP to PF table `browser_tasks`.
- `browser-clone-destroy`: stop the in-jail browser service, perform PID-file-targeted shutdown, stop the jail, remove PF membership, clear epairs, and destroy clone datasets with forced-unmount retry.
- `browser-clone-reap`: idempotent destroy path for orphaned clones.
- `browser-clone-force-unmount`: narrow reaper fallback for busy clone datasets.
Broad pkill chrome is not production behavior. Use rc.d service stop or
PID-file-targeted shutdown.
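A sketch of PID-file-targeted shutdown: TERM first, KILL only after a grace period. It assumes a POSIX host; the function name, return values, grace period, and the rule that PIDs at or below 1 are treated as invalid are choices made for this illustration, not part of the hostd contract.

```python
import os
import signal
import time


def stop_by_pid_file(pid_file: str, grace_seconds: float = 5.0,
                     poll_interval: float = 0.1) -> str:
    """Stop exactly the process named in pid_file; never match by name."""
    try:
        with open(pid_file) as fh:
            pid = int(fh.read().strip())
    except (FileNotFoundError, ValueError):
        return "not-running"
    if pid <= 1:
        # 0 and negative values address process groups; refuse them outright.
        return "not-running"
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return "not-running"
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        try:
            os.kill(pid, 0)  # signal 0 only checks that the PID still exists
        except ProcessLookupError:
            return "terminated"
        time.sleep(poll_interval)
    try:
        os.kill(pid, signal.SIGKILL)  # fallback, still targeted at one PID
    except ProcessLookupError:
        return "terminated"
    return "killed"
```

A missing or unparsable PID file is reported as `not-running`, which keeps a repeated destroy or reap call harmless.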
Resource quotas and limits
Starting points:
zroot/<runtime>/jails/browser quota=10G
zroot/<runtime>/browser-screenshots quota=20G
Per-clone runtime limits remain required:
- memory RCTL,
- openfiles RCTL,
- per-session deadline,
- max concurrent browser tasks per tenant/host.
Exact values should be tuned after Phase 0.6/Phase 1 measurements.
Threat model highlights
Credential exfiltration
Credentials are not stored in templates or clone profiles before injection. Injection is domain-filtered and grant-token-gated. Cleartext cookies are never logged. Screenshot recording mode is explicit, and typed text is redacted in audit logs.
SSRF/internal network probing
PF denies internal address ranges and metadata endpoints. The browser has public egress but should not reach private service jails or the host controlplane.
Jail compromise
The jail has no audit DB credentials, no MCP auth secret, and no credentials store. A compromised clone can affect its own task session but not the controlplane-owned store or audit trail.
Resource exhaustion
ZFS quotas, RCTL, deadlines, max sessions, and reaper cleanup bound disk, memory, CPU, and process leaks.
Profile bleed
The rejected profile-byte clone model is not used. Cookie injection starts from an empty profile per task clone. Cross-session cookie leakage remains an integration test requirement.
Validation state
Phase 0.5 — FreeBSD Chromium/CDP viability
Passed. See BROWSER-JAIL-FREEBSD-VIABILITY.md.
Confirmed:
- FreeBSD 15 jail can run pkg Chromium headless.
- `puppeteer-core` works against system Chromium over CDP.
- screenshots and DOM reads work.
Phase 0.6 — clone lifecycle / credential injection
Passed. See BROWSER-JAIL-CLONE-LIFECYCLE-VALIDATION.md.
Learned:
- ZFS clone mechanics are viable.
- Bastille cloned jail start is viable after config patching.
- Thin-jail fstab/nullfs details were painful; fixed template is now thick.
- Stale epairs and busy datasets require explicit cleanup/reaper logic.
- Chromium encrypted profile-byte credential inheritance is not viable.
- CDP cookie injection works inside `browser` and across the clone boundary.
Passing run:
- cookie export/import round-trip in `browser`: pass,
- cookie injection across `browser` → `browsertask001`: pass,
- 3-cycle clone/start/inject/smoke/destroy: pass,
- idempotent orphan reaper: pass.
Median timings from the passing run:
- clone: 113 ms,
- Bastille jail start: 1006 ms,
- Chromium CDP readiness: 1502 ms,
- CDP injection + smoke: 593 ms,
- clone + jail start + injection smoke: 1718 ms,
- clone + jail start + Chromium ready + injection smoke: 3211 ms.
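The two composite figures do not equal the sums of the per-phase medians, which is expected if the composites are medians over per-cycle totals (an assumption here; the validation record is the authority). A quick arithmetic check using only the numbers above:

```python
medians_ms = {
    "clone": 113,
    "jail_start": 1006,
    "chromium_ready": 1502,
    "inject_smoke": 593,
}
reported_ms = {"without_chromium_wait": 1718, "with_chromium_wait": 3211}

# Sum of per-phase medians for each composite path.
sum_without = (medians_ms["clone"] + medians_ms["jail_start"]
               + medians_ms["inject_smoke"])
sum_with = sum_without + medians_ms["chromium_ready"]

# Difference between the reported composite and the naive sum.
delta_without = reported_ms["without_chromium_wait"] - sum_without
delta_with = reported_ms["with_chromium_wait"] - sum_with

print(sum_without, delta_without)  # 1712 6
print(sum_with, delta_with)        # 3214 -3
```

Both gaps are a few milliseconds, so either reading supports the same conclusion: a ready-to-use authenticated clone takes a little over three seconds.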
Phases
| Phase | Status | Output |
|---|---|---|
| 0 — Design | COMPLETE | Browser jail docs + UI-TARS direction |
| 0.5 — FreeBSD viability | PASS | Chromium + CDP confirmed |
| 0.6 — Injection clone validation | PASS | cookie injection + 3-cycle clone lifecycle |
| 1 — Browser backend implementation | READY | setup/browser-jail, hostd clone ops, jail HTTP API |
| 2 — Controlplane integration | PENDING | credentials store, grant token policy, browser_run_task |
| 3 — Refresh UX | PENDING | operator credential refresh flow |
| 4 — Downloads/uploads | DEFERRED | explicit security design needed |
Decided
Resolved 11 May 2026 after Phase 0.6 pivot:
- Credentials store backend: Postgres. Reuses clawdie's existing encryption/transaction story; tenant-scoped via row constraints; single backup path. Filesystem store retained only as a validation-time scaffold, removed before Phase 1 implementation lands.
- Refresh UX: controlplane-streamed clone. The operator interacts with a refresh-mode `browsertaskNNN` clone via a web-streamed channel served by the controlplane. This is the only option that works regardless of the operator's machine; the VNC / X-forwarding alternatives were discarded because they assume a configured operator workstation.
Blocking before adoption (gates remain):
- CDP cookie injection must work reliably across clone boundary.
- Clone cleanup/reaper must be idempotent.
- PF table strategy must be validated with task clone IPs.
- hostd must expose narrow clone/destroy/force-clean operations.