10 commits

Author SHA1 Message Date
6d662d5d3b Add browser clone hostd lifecycle ops
---
Build: pass | Tests: pass — 2395 passed (175 files)
2026-05-11 18:18:44 +02:00
6c549e7ad0 Rename browser validation assets
---
Build: pass | Tests: pass — 2383 passed (175 files)
2026-05-11 17:32:22 +02:00
3ea26f231d Validate browser clone cookie injection
---
Build: pass | Tests: pass — 2383 passed (175 files)
2026-05-11 16:19:12 +02:00
Operator & Claude Code
35ddc4afb2 Polish browser-jail doc constellation
Small follow-ups after the design alignment sweep, surfaced as
improvements worth landing before Phase 0.6 starts.

- BROWSER-JAIL.md: add a "Related docs" index at the top so a new reader
  can find current design vs phase records vs direction vs history
  without grepping. Resolve the two non-blocking choices: credentials
  store backend = Postgres, refresh UX = controlplane-streamed clone.
  Add a concrete operator_grant_token schema (jti/iss/tenant_id/
  origin_session_id/operator_id/allowed_domains/issued_at/expires_at/
  single_use) with clawdie-internal opaque-token storage in postgres,
  validation rules, and revocation semantics.

- BROWSER-JAIL-HANDOFF.md: add an "Order of work" preface to the
  implementation section so the component-organized checklists are read
  with the dependency chain in mind. Setup → hostd → backend →
  credentials/grants → operator/injection → run_task endpoint → pi
  extension → smoke. pi-side test cases require the full stack and
  should not be the first integration target.

- VISION-GROUNDING-FINDINGS.md: add a "Role under UI-TARS adoption"
  footer reframing the doc as Phase 1 model-selection input (which
  vision model to pair with the UI-TARS-compatible runner), not "should
  we build vision grounding."

- BROWSER-JAIL-TEMPLATE-CLONE-PROPOSAL.md: mark HISTORICAL. Retained
  for pivot reasoning (why profile-byte cloning was dropped). New
  decisions land in BROWSER-JAIL.md, not here.
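
The operator_grant_token fields listed above could be modeled roughly as follows. This is a minimal sketch, not the schema from BROWSER-JAIL.md: the field names come from this commit message, while the field types, epoch-second timestamps, subdomain matching, the `checkGrant` helper, and its reason strings are all assumptions made for illustration.

```typescript
// Field names are taken from the commit message; every type here is an assumption.
interface OperatorGrantToken {
  jti: string;
  iss: string;
  tenant_id: string;
  origin_session_id: string;
  operator_id: string;
  allowed_domains: string[];
  issued_at: number; // epoch seconds (assumed)
  expires_at: number; // epoch seconds (assumed)
  single_use: boolean;
}

type GrantCheck = { ok: true } | { ok: false; reason: string };

interface GrantContext {
  tenantId: string;
  hostname: string;
  nowSec: number;
  usedJtis: Set<string>; // stand-in for the Postgres-backed token store
}

// Hypothetical validation; the actual rules are defined in BROWSER-JAIL.md.
function checkGrant(token: OperatorGrantToken, ctx: GrantContext): GrantCheck {
  if (token.tenant_id !== ctx.tenantId) return { ok: false, reason: "tenant_mismatch" };
  if (ctx.nowSec < token.issued_at) return { ok: false, reason: "not_yet_valid" };
  if (ctx.nowSec >= token.expires_at) return { ok: false, reason: "expired" };

  // Assumption: a listed domain also covers its subdomains.
  const host = ctx.hostname.toLowerCase();
  const domainOk = token.allowed_domains.some((d) => {
    const dom = d.toLowerCase();
    return host === dom || host.endsWith("." + dom);
  });
  if (!domainOk) return { ok: false, reason: "domain_not_allowed" };

  // A single-use token is consumed on its first successful check.
  if (token.single_use) {
    if (ctx.usedJtis.has(token.jti)) return { ok: false, reason: "already_used" };
    ctx.usedJtis.add(token.jti);
  }
  return { ok: true };
}
```

In this sketch, revocation could be approximated by adding a `jti` to `usedJtis` ahead of time; the real revocation semantics may differ.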

Nothing in this commit changes the architecture or blocks any track.
All of these items are doc hygiene that future implementation work will
appreciate.
2026-05-11 15:12:26 +02:00
55a6dee215 Align browser jail design docs
---
Build: pass | Tests: pass — 2383 passed (175 files)
2026-05-11 15:05:31 +02:00
ba33a349cc Document UI-TARS adoption direction
---
Build: pass | Tests: pass — 2382 passed (175 files)
2026-05-11 12:33:13 +02:00
Operator & Claude Code
f2a5c59273 Session-level screenshot recording modes (off/transient/audit)
Replace per-call persist:false with a session-level record mode set at
open_session, immutable for the session's life. Three modes:

- off:       nothing written to disk; model still sees screenshots in
             context.
- transient: last N=50 screenshots in a FIFO ring buffer per session.
             Default. Enough for post-hoc debugging without unbounded
             growth.
- audit:     persist all with 7d retention. Explicit opt-in for
             sensitive operations.
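
The transient mode's bounded buffer can be illustrated with a small FIFO ring. This is a sketch only: the class name `ScreenshotRing` and its methods are hypothetical, and the real implementation may store file paths and delete evicted files from disk instead of holding items in memory.

```typescript
// Hypothetical per-session FIFO ring; N=50 is the default from the commit message.
class ScreenshotRing<T> {
  private readonly capacity: number;
  private items: T[] = [];

  constructor(capacity: number = 50) {
    if (!Number.isInteger(capacity) || capacity < 1) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
  }

  // Adds an item and returns the evicted (oldest) item once the ring is full.
  push(item: T): T | undefined {
    this.items.push(item);
    return this.items.length > this.capacity ? this.items.shift() : undefined;
  }

  // Oldest-first copy of what is currently retained.
  snapshot(): T[] {
    return [...this.items];
  }

  get size(): number {
    return this.items.length;
  }
}
```

A caller would typically use the value returned by `push` to delete the evicted screenshot, so disk usage stays bounded at N files per session.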

Default resolution: explicit param → tenant default → system default
("transient"). MVP hardcodes the system default; tenant overrides are
Phase 2.
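
The default-resolution chain described above can be sketched in a few lines. The function and constant names here are hypothetical, and the tenant default is passed in as a plain argument because tenant overrides are not part of the MVP.

```typescript
type RecordMode = "off" | "transient" | "audit";

// System default named in the commit message; the MVP hardcodes it.
const SYSTEM_DEFAULT_RECORD_MODE: RecordMode = "transient";

// Resolution order: explicit param, then tenant default, then system default.
function resolveRecordMode(explicit?: RecordMode, tenantDefault?: RecordMode): RecordMode {
  return explicit ?? tenantDefault ?? SYSTEM_DEFAULT_RECORD_MODE;
}

// Hypothetical session factory: the mode is fixed at open time and frozen,
// mirroring "immutable for the session's life".
function openSession(id: string, record?: RecordMode): Readonly<{ id: string; record: RecordMode }> {
  return Object.freeze({ id, record: resolveRecordMode(record) });
}
```

Resolving once at `open_session` and freezing the result keeps every later screenshot call on the same policy, which is the point of moving the setting from per-call to session level.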

Rationale: screenshots serve three different jobs (agent's eyes,
debugging trace, forensic audit), and a single retention policy can't
serve all three without either drowning in disk or losing audit value.
The dashcam analogy in the doc covers this directly. Per-call
persistence flags are messy and per-tenant audit-flagging at session
level was the wrong granularity.

Also:

- Credential-exfiltration mitigation in the threat model now describes
  the off/audit levers an operator has.
- Future enhancement noted: browser.freeze_session to promote a
  transient ring buffer to audit retention without restarting.
- Phase 1A handoff updated: POST /sessions accepts record, response
  echoes it; /screenshot persistence behavior tied to session record
  mode with explicit test points.
2026-05-11 11:19:23 +02:00
Operator & Claude Code
2458485974 Fold Phase 0.5 storage/lifecycle decisions into browser-jail design
Architectural decisions surfaced by the FreeBSD viability spike, locked
in before Phase 1A code lands so the implementation has a clear contract:

- Jail lifecycle: long-lived, not per-task. Records the install cost
  (~189 pkgs, ~442 MiB download, ~2 GiB resident, several minutes).
  Pkg cache must be mounted at jail creation via the existing
  mountPkgCacheInJail helper.
- Screenshot persistence: default-persist preserved (audit
  completeness). 24h normal retention, 7d audit-flagged. ZFS-quota-backed
  with FIFO eviction within retention windows under pressure. Per-call
  persist:false opt-out for throwaway captures.
- ZFS quota model: two datasets, two quotas — jail base 10 GiB,
  screenshots 20 GiB (starting points).
- Screenshot default: viewport-only. full_page:true is explicit.
  Audit/end-of-task captures use it; action loops don't.
- Threat model: new "Heavy package surface" entry. Containment via the
  jail, deny-internal egress, no host mounts beyond pkg cache, audit
  log written by the proxy, downloads/uploads disabled.
- Phase 0 status: COMPLETE (design + vision findings + handoff all on
  main; viability doc on browser-jail-spike).
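
The retention and eviction policy in the screenshot-persistence item can be sketched as a pure planning function. This is an illustration under assumptions: the `Shot` shape and `planEviction` name are hypothetical, quota usage is tracked as a byte sum and not read from ZFS, and under pressure the oldest retained file is evicted first regardless of its audit flag, which the commit message does not specify.

```typescript
interface Shot {
  id: string;
  bytes: number;
  createdAtMs: number;
  auditFlagged: boolean;
}

const HOUR_MS = 3600000;
const NORMAL_RETENTION_MS = 24 * HOUR_MS; // 24h normal retention
const AUDIT_RETENTION_MS = 7 * 24 * HOUR_MS; // 7d for audit-flagged captures

// Returns the ids to delete: expired shots first, then oldest-first (FIFO)
// among still-retained shots until usage fits within the quota.
function planEviction(shots: Shot[], nowMs: number, quotaBytes: number): string[] {
  const evict: string[] = [];
  const kept: Shot[] = [];
  for (const s of shots) {
    const limit = s.auditFlagged ? AUDIT_RETENTION_MS : NORMAL_RETENTION_MS;
    if (nowMs - s.createdAtMs > limit) evict.push(s.id);
    else kept.push(s);
  }

  kept.sort((a, b) => a.createdAtMs - b.createdAtMs);
  let used = kept.reduce((sum, s) => sum + s.bytes, 0);
  for (const s of kept) {
    if (used <= quotaBytes) break;
    evict.push(s.id);
    used -= s.bytes;
  }
  return evict;
}
```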
2026-05-11 10:57:16 +02:00
466ad73cee Document browser jail FreeBSD viability
---
Build: pass | Tests: pass — 2382 passed (175 files)
2026-05-11 10:44:42 +02:00
Operator & Claude Code
3070fa323f Add browser-jail design, threat model, and Phase 0 spike artifacts
Three coordinated docs that anchor the FreeBSD-hosted headless browser
work:

- docs/internal/BROWSER-JAIL.md — full design (architecture, MCP tool
  surface, isolation model, auth via better-auth, PF egress policy,
  screenshot retention, audit logging) and a threat-model section
  covering SSRF, credential leakage, cross-session bleed, audit
  poisoning, and resource exhaustion.
- docs/internal/VISION-GROUNDING-FINDINGS.md — spike methodology
  (3 deterministic HTML fixtures, DOM-extracted ground truth,
  30 px tolerance, identical prompt across models). Claude Opus 4.7
  column complete: 17/17 PASS, mean 1 px, max 8 px. GPT-4o, GLM-4V,
  and UI-TARS columns pending — harness ready under
  tmp/browser-jail-spike/.
- doc/BROWSER-JAIL-HANDOFF.md — Codex handoff for Phase 0.5 (FreeBSD
  viability spike) and Phase 1 (jail HTTP service + controlplane MCP
  proxy + PF rules) with per-commit validation requirements.
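
The pass/fail and mean/max figures in the vision-grounding item can be reproduced by a scorer along these lines. This is a sketch under assumptions: the error metric is taken to be the Euclidean distance between the predicted point and the DOM-extracted ground-truth point, and the `Trial` shape and `scoreTrials` name are hypothetical; the actual harness under tmp/browser-jail-spike/ may measure error differently.

```typescript
interface Point {
  x: number;
  y: number;
}

interface Trial {
  label: string;
  predicted: Point; // coordinates returned by the model
  truth: Point; // ground truth extracted from the DOM
}

interface Score {
  passed: number;
  total: number;
  meanPx: number;
  maxPx: number;
}

// A trial passes when its pixel error is within the tolerance (30 px in the spike).
function scoreTrials(trials: Trial[], tolerancePx: number = 30): Score {
  const errors = trials.map((t) =>
    Math.hypot(t.predicted.x - t.truth.x, t.predicted.y - t.truth.y),
  );
  const passed = errors.filter((e) => e <= tolerancePx).length;
  const meanPx = errors.length > 0 ? errors.reduce((a, b) => a + b, 0) / errors.length : 0;
  const maxPx = errors.length > 0 ? Math.max(...errors) : 0;
  return { passed, total: trials.length, meanPx, maxPx };
}
```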

Runtime constraint baked in: Node v22+ everywhere on the FreeBSD path,
no Bun. CDP client is puppeteer-core against system-pkg Chromium —
full Playwright avoided due to FreeBSD bundling gaps.
2026-05-11 09:58:14 +02:00