clawdie-ai

Author	SHA1	Message	Date
Operator & Codex	6d662d5d3b	Add browser clone hostd lifecycle ops --- Build: pass | Tests: pass — 2395 passed (175 files)	2026-05-11 18:18:44 +02:00
Operator & Codex	6c549e7ad0	Rename browser validation assets --- Build: pass | Tests: pass — 2383 passed (175 files)	2026-05-11 17:32:22 +02:00
Operator & Codex	3ea26f231d	Validate browser clone cookie injection --- Build: pass | Tests: pass — 2383 passed (175 files)	2026-05-11 16:19:12 +02:00
Operator & Claude Code	35ddc4afb2	Polish browser-jail doc constellation Small follow-ups after the design alignment sweep, surfaced as improvements worth landing before Phase 0.6 starts. - BROWSER-JAIL.md: add a "Related docs" index at the top so a new reader can find current design vs phase records vs direction vs history without grepping. Resolve the two non-blocking choices: credentials store backend = Postgres, refresh UX = controlplane-streamed clone. Add a concrete operator_grant_token schema (jti/iss/tenant_id/origin_session_id/operator_id/allowed_domains/issued_at/expires_at/single_use) with clawdie-internal opaque-token storage in postgres, validation rules, and revocation semantics. - BROWSER-JAIL-HANDOFF.md: add an "Order of work" preface to the implementation section so the component-organized checklists are read with the dependency chain in mind. Setup → hostd → backend → credentials/grants → operator/injection → run_task endpoint → pi extension → smoke. pi-side test cases require the full stack and should not be the first integration target. - VISION-GROUNDING-FINDINGS.md: add a "Role under UI-TARS adoption" footer reframing the doc as Phase 1 model-selection input (which vision model to pair with the UI-TARS-compatible runner), not "should we build vision grounding." - BROWSER-JAIL-TEMPLATE-CLONE-PROPOSAL.md: mark HISTORICAL. Retained for pivot reasoning (why profile-byte cloning was dropped). New decisions land in BROWSER-JAIL.md, not here. Nothing in this commit changes the architecture or blocks any track. All five items are doc hygiene that future implementation work will appreciate.	2026-05-11 15:12:26 +02:00
Operator & Codex	55a6dee215	Align browser jail design docs --- Build: pass | Tests: pass — 2383 passed (175 files)	2026-05-11 15:05:31 +02:00
Operator & Codex	ba33a349cc	Document UI-TARS adoption direction --- Build: pass | Tests: pass — 2382 passed (175 files)	2026-05-11 12:33:13 +02:00
Operator & Claude Code	f2a5c59273	Session-level screenshot recording modes (off/transient/audit) Replace per-call persist:false with a session-level record mode set at open_session, immutable for the session's life. Three modes: - off: nothing written to disk; model still sees screenshots in context. - transient: last N=50 screenshots in a FIFO ring buffer per session. Default. Enough for post-hoc debugging without unbounded growth. - audit: persist all with 7d retention. Explicit opt-in for sensitive operations. Default resolution: explicit param → tenant default → system default ("transient"). MVP hardcodes the system default; tenant overrides are Phase 2. Rationale: screenshots serve three different jobs (agent's eyes, debugging trace, forensic audit), and a single retention policy can't serve all three without either drowning in disk or losing audit value. The dashcam analogy in the doc covers this directly. Per-call persistence flags are messy and per-tenant audit-flagging at session level was the wrong granularity. Also: - Credential-exfiltration mitigation in the threat model now describes the off/audit levers an operator has. - Future enhancement noted: browser.freeze_session to promote a transient ring buffer to audit retention without restarting. - Phase 1A handoff updated: POST /sessions accepts record, response echoes it; /screenshot persistence behavior tied to session record mode with explicit test points.	2026-05-11 11:19:23 +02:00
Operator & Claude Code	2458485974	Fold Phase 0.5 storage/lifecycle decisions into browser-jail design Architectural decisions surfaced by the FreeBSD viability spike, locked in before Phase 1A code lands so the implementation has a clear contract: - Jail lifecycle: long-lived, not per-task. Records the install cost (~189 pkgs, ~442 MiB download, ~2 GiB resident, several minutes). Pkg cache must be mounted at jail creation via the existing mountPkgCacheInJail helper. - Screenshot persistence: default-persist preserved (audit completeness). 24h normal retention, 7d audit-flagged. ZFS-quota-backed with FIFO eviction within retention windows under pressure. Per-call persist:false opt-out for throwaway captures. - ZFS quota model: two datasets, two quotas — jail base 10 GiB, screenshots 20 GiB (starting points). - Screenshot default: viewport-only. full_page:true is explicit. Audit/end-of-task captures use it; action loops don't. - Threat model: new "Heavy package surface" entry. Containment via the jail, deny-internal egress, no host mounts beyond pkg cache, audit log written by the proxy, downloads/uploads disabled. - Phase 0 status: COMPLETE (design + vision findings + handoff all on main; viability doc on browser-jail-spike).	2026-05-11 10:57:16 +02:00
Operator & Codex	466ad73cee	Document browser jail FreeBSD viability --- Build: pass | Tests: pass — 2382 passed (175 files)	2026-05-11 10:44:42 +02:00
Operator & Claude Code	3070fa323f	Add browser-jail design, threat model, and Phase 0 spike artifacts Three coordinated docs that anchor the FreeBSD-hosted headless browser work: - docs/internal/BROWSER-JAIL.md — full design (architecture, MCP tool surface, isolation model, auth via better-auth, PF egress policy, screenshot retention, audit logging) and a threat-model section covering SSRF, credential leakage, cross-session bleed, audit poisoning, and resource exhaustion. - docs/internal/VISION-GROUNDING-FINDINGS.md — spike methodology (3 deterministic HTML fixtures, DOM-extracted ground truth, 30 px tolerance, identical prompt across models). Claude Opus 4.7 column complete: 17/17 PASS, mean 1 px, max 8 px. GPT-4o, GLM-4V, and UI-TARS columns pending — harness ready under tmp/browser-jail-spike/. - doc/BROWSER-JAIL-HANDOFF.md — Codex handoff for Phase 0.5 (FreeBSD viability spike) and Phase 1 (jail HTTP service + controlplane MCP proxy + PF rules) with per-commit validation requirements. Runtime constraint baked in: Node v22+ everywhere on the FreeBSD path, no Bun. CDP client is puppeteer-core against system-pkg Chromium — full Playwright avoided due to FreeBSD bundling gaps.	2026-05-11 09:58:14 +02:00
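
Commit 35ddc4afb2 above lists the fields of the operator_grant_token schema but not the validation rules themselves. As a rough sketch only, the field list could be typed and checked as below. The field names come from the commit message. Everything else is an assumption made for illustration and is not taken from BROWSER-JAIL.md: the `checkGrant` helper, the epoch-second timestamps, the subdomain-matching rule, and the in-memory `consumedJtis` set (the commit says the real store is Postgres).

```typescript
// Field names are from commit 35ddc4afb2; types and units are assumptions.
interface OperatorGrantToken {
  jti: string;
  iss: string;
  tenant_id: string;
  origin_session_id: string;
  operator_id: string;
  allowed_domains: string[];
  issued_at: number; // assumed: epoch seconds
  expires_at: number; // assumed: epoch seconds
  single_use: boolean;
}

type GrantCheck = { ok: true } | { ok: false; reason: string };

// Hypothetical validation helper. The real rules live in the design doc;
// this only illustrates checks the field list makes possible.
function checkGrant(
  token: OperatorGrantToken,
  hostname: string,
  nowSec: number,
  consumedJtis: Set<string>, // stand-in for the Postgres-backed store
): GrantCheck {
  if (nowSec < token.issued_at) {
    return { ok: false, reason: "not yet valid" };
  }
  if (nowSec >= token.expires_at) {
    return { ok: false, reason: "expired" };
  }
  if (token.single_use && consumedJtis.has(token.jti)) {
    return { ok: false, reason: "already used" };
  }
  const host = hostname.toLowerCase();
  // Assumed rule: exact match or a subdomain of an allowed domain.
  const allowed = token.allowed_domains.some((d) => {
    const dom = d.toLowerCase();
    return host === dom || host.endsWith("." + dom);
  });
  if (!allowed) {
    return { ok: false, reason: "domain not allowed" };
  }
  return { ok: true };
}
```

Matching on a leading dot keeps a lookalike host such as `evilexample.com` from passing a grant issued for `example.com`.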

10 commits
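
Commit f2a5c59273 above describes a session-level record mode with a fixed resolution order (explicit param, then tenant default, then system default) and a FIFO ring buffer of the last 50 screenshots for the transient mode. A minimal sketch of those two rules follows. The names `RecordMode`, `resolveRecordMode`, and `ScreenshotRing` are hypothetical and not taken from the clawdie-ai codebase. Only the mode names, the default of "transient", and N=50 come from the commit message.

```typescript
// Mode names are from commit f2a5c59273; identifiers are illustrative.
type RecordMode = "off" | "transient" | "audit";

const SYSTEM_DEFAULT: RecordMode = "transient";
const TRANSIENT_CAPACITY = 50; // N=50 per the commit message

// Resolution order: explicit param, then tenant default, then system default.
function resolveRecordMode(
  explicit?: RecordMode,
  tenantDefault?: RecordMode,
): RecordMode {
  return explicit ?? tenantDefault ?? SYSTEM_DEFAULT;
}

// FIFO ring buffer: once full, each push evicts the oldest entry.
class ScreenshotRing<T> {
  private items: T[] = [];
  private capacity: number;

  constructor(capacity: number = TRANSIENT_CAPACITY) {
    this.capacity = capacity;
  }

  // Returns the evicted item, if any, so a caller could delete its file.
  push(item: T): T | undefined {
    this.items.push(item);
    return this.items.length > this.capacity ? this.items.shift() : undefined;
  }

  // Copy of the retained items, oldest first.
  snapshot(): T[] {
    return [...this.items];
  }
}
```

Since the commit makes the mode immutable for the session's life, a caller would resolve it once at open_session and store the result rather than re-resolving per screenshot.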