Small follow-ups after the design alignment sweep, surfaced as
improvements worth landing before Phase 0.6 starts.
- BROWSER-JAIL.md: add a "Related docs" index at the top so a new reader
can find current design vs phase records vs direction vs history
without grepping. Resolve the two non-blocking choices: credentials
store backend = Postgres, refresh UX = controlplane-streamed clone.
Add a concrete operator_grant_token schema (jti/iss/tenant_id/
origin_session_id/operator_id/allowed_domains/issued_at/expires_at/
single_use) with clawdie-internal opaque-token storage in Postgres,
validation rules, and revocation semantics.
- BROWSER-JAIL-HANDOFF.md: add an "Order of work" preface to the
implementation section so the component-organized checklists are read
with the dependency chain in mind: setup → hostd → backend →
credentials/grants → operator/injection → run_task endpoint → pi
extension → smoke. The pi-side test cases require the full stack and
should not be the first integration target.
- VISION-GROUNDING-FINDINGS.md: add a "Role under UI-TARS adoption"
footer reframing the doc as Phase 1 model-selection input (which
vision model to pair with the UI-TARS-compatible runner), not "should
we build vision grounding."
- BROWSER-JAIL-TEMPLATE-CLONE-PROPOSAL.md: mark HISTORICAL. Retained
for pivot reasoning (why profile-byte cloning was dropped). New
decisions land in BROWSER-JAIL.md, not here.
Nothing in this commit changes the architecture or blocks any track.
All four items are doc hygiene that will make future implementation
work easier.
Replace per-call persist:false with a session-level record mode set at
open_session, immutable for the session's life. Three modes:
- off: nothing written to disk; model still sees screenshots in
context.
- transient: last N=50 screenshots in a FIFO ring buffer per session.
Default. Enough for post-hoc debugging without unbounded
growth.
- audit: persist all with 7d retention. Explicit opt-in for
sensitive operations.
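The transient mode's eviction behavior can be sketched with a bounded
deque. This is an in-memory stand-in only: the class name and methods
are hypothetical, and the real buffer is presumably file-backed, since
only "off" is described as writing nothing to disk.

```python
from collections import deque
from typing import List

TRANSIENT_RING_SIZE = 50  # N=50 from the mode description above


class TransientScreenshotRing:
    """Keeps only the most recent screenshots for one session."""

    def __init__(self, capacity: int = TRANSIENT_RING_SIZE) -> None:
        # A deque with maxlen drops the oldest entry on overflow (FIFO).
        self._frames = deque(maxlen=capacity)

    def push(self, frame: bytes) -> None:
        self._frames.append(frame)

    def frames(self) -> List[bytes]:
        # Oldest first, newest last.
        return list(self._frames)


ring = TransientScreenshotRing()
for i in range(60):
    ring.push(f"frame-{i}".encode())
```

After 60 pushes the ring holds frames 10 through 59; the first ten have
been evicted.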
Default resolution: explicit param → tenant default → system default
("transient"). MVP hardcodes the system default; tenant overrides are
Phase 2.
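The resolution order above can be sketched as follows. RecordMode and
resolve_record_mode are illustrative names, not the actual API, and the
tenant_default argument stands in for the Phase 2 override that the MVP
does not implement.

```python
from enum import Enum
from typing import Optional


class RecordMode(str, Enum):
    OFF = "off"
    TRANSIENT = "transient"
    AUDIT = "audit"


SYSTEM_DEFAULT = RecordMode.TRANSIENT


def resolve_record_mode(explicit: Optional[str] = None,
                        tenant_default: Optional[str] = None) -> RecordMode:
    """Explicit param wins, then tenant default, then system default."""
    for candidate in (explicit, tenant_default):
        if candidate is not None:
            # RecordMode(...) raises ValueError for an unknown mode name.
            return RecordMode(candidate)
    return SYSTEM_DEFAULT
```

Because the mode is immutable for the session's life, a caller would
resolve it once at open_session and store the result on the session.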
Rationale: screenshots serve three different jobs (agent's eyes,
debugging trace, forensic audit), and a single retention policy can't
serve all three without either drowning in disk or losing audit value.
The dashcam analogy in the doc covers this directly. Per-call
persistence flags were messy, and per-tenant audit-flagging at session
level was the wrong granularity.
Also:
- Credential-exfiltration mitigation in the threat model now describes
the off/audit levers an operator has.
- Future enhancement noted: browser.freeze_session to promote a
transient ring buffer to audit retention without restarting the
session.
- Phase 1A handoff updated: POST /sessions accepts record, response
echoes it; /screenshot persistence behavior tied to session record
mode with explicit test points.
Architectural decisions surfaced by the FreeBSD viability spike, locked
in before Phase 1A code lands so the implementation has a clear contract:
- Jail lifecycle: long-lived, not per-task. Records the install cost
(~189 pkgs, ~442 MiB download, ~2 GiB resident, several minutes).
Pkg cache must be mounted at jail creation via the existing
mountPkgCacheInJail helper.
- Screenshot persistence: default-persist preserved (audit
completeness). 24h normal retention, 7d audit-flagged. ZFS-quota-backed
with FIFO eviction within retention windows under pressure. Per-call
persist:false opt-out for throwaway captures.
- ZFS quota model: two datasets, two quotas — jail base 10 GiB,
screenshots 20 GiB (starting points).
- Screenshot default: viewport-only. full_page:true is explicit.
Audit/end-of-task captures use it; action loops don't.
- Threat model: new "Heavy package surface" entry. Containment via the
jail, deny-internal egress, no host mounts beyond pkg cache, audit
log written by the proxy, downloads/uploads disabled.
- Phase 0 status: COMPLETE (design + vision findings + handoff all on
main; viability doc on browser-jail-spike).