Commit graph

294 commits

Author SHA1 Message Date
Sam & Claude
79368f64bc Fix two env-dependent test assertions (Sam & Claude)
setup/cms.test.ts: mock AGENT_DOMAIN='clawdie.si' via vi.mock so
server_name assertions don't depend on env/registry at test time.

src/explanation-grounder.test.ts: accept either PGPASSWORD branch or
no-password branch — OPS_DB_URL may or may not carry a password
depending on the install environment.

Both tests now pass on debby (Linux/Debian 13) and should be
environment-independent on all platforms.
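
The two branches the grounder test now accepts can be illustrated with a minimal sketch. `pgEnvFromUrl` is a hypothetical helper, not the code in src/explanation-grounder.ts, and it assumes OPS_DB_URL is an ordinary postgres:// URL. It shows why an assertion pinned to one branch depends on the install environment.

```typescript
// Hypothetical helper: derive extra environment for a psql child process
// from a connection URL. The name and shape are illustrative only.
function pgEnvFromUrl(dbUrl: string): Record<string, string> {
  // Capture the optional password between "user:" and "@host".
  const match = /^[a-z][a-z0-9+.-]*:\/\/[^:@\/]*(?::([^@\/]*))?@/i.exec(dbUrl);
  const password = match?.[1];
  // Password branch: export PGPASSWORD. No-password branch: export nothing.
  return password ? { PGPASSWORD: decodeURIComponent(password) } : {};
}
```

A test that tolerates both outcomes checks only that the result is either empty or carries PGPASSWORD, rather than asserting one specific shape.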

---
Build: pass | Tests: FAIL — 17 failed
2026-05-27 08:45:00 +02:00
83feb0d736 Avoid npm install lifecycle for Clawdie installer
---
Build: pass | Tests: pass — 2451 passed (182 files)
2026-05-12 21:16:17 +02:00
Operator & Claude Code
2c9c031fea Fix browser rc.d NODE_PATH and update stale Decided section (Sam & Claude)
- setup/browser-jail.ts: NODE_PATH=/opt/browser-validation/node_modules -> /opt/clawdie/node_modules to match where ensureBrowserBackendDeps actually installs deps
- docs/internal/BROWSER-JAIL.md: Decided section now says host-resident xpra over SSH, not controlplane-streamed clone
2026-05-12 18:18:12 +02:00
8145830ebf Align first boot provider setup with Codex recommendation
---
Build: pass | Tests: pass — 2441 passed (181 files)
2026-05-12 09:56:02 +02:00
fcd1172939 Add browser jail HTTP backend
---
Build: pass | Tests: pass — 2400 passed (176 files)
2026-05-11 19:16:46 +02:00
5eeb51d68b Provision browser template jail
---
Build: pass | Tests: pass — 2383 passed (175 files)
2026-05-11 16:38:49 +02:00
54f612edf2 Fix browser jail registry slot
---
Build: pass | Tests: pass — 2383 passed (175 files)
2026-05-11 14:53:12 +02:00
dcae5878fa Replace jail sudo path with hostd bastille-cmd
Route jail exec through a new hostd bastille-cmd operation, remove the agent-jail sudoers requirement, and fall back to repository ownership when elevation does not provide SUDO_* metadata.

---
Build: pass | Tests: n/a (Vitest not run in this Linux environment per repo policy)

---
Build: pass | Tests: pass — 2375 passed (704 files)
2026-05-10 22:48:04 +02:00
391ed30cb0 Add mac_do verification notes for FreeBSD 15 (Codex)
Document the FreeBSD 15 mac_do rule shape and expose soft setup verification for module/rule state without enforcing live host changes.

---
Build: pass | Tests: pass — 2373 passed (704 files)
2026-05-10 21:54:29 +02:00
8b3cf07e59 Silence ACL verification stderr noise (Codex)
Check jail root existence before getfacl and capture getfacl stderr so soft ACL verification failures do not pollute test output.

---
Build: pass | Tests: pass — 2370 passed (704 files)
2026-05-10 21:35:15 +02:00
f1dc7ea6df Drop stale jail and agent migration paths (Codex)
Remove completed controlplane agent-id migration, simplify jail-name resolution to current canonical names, and drop SUDO_UID ownership fallback from service setup.

---
Build: pass | Tests: pass — 2370 passed (704 files)
2026-05-10 21:30:17 +02:00
50a915c414 Drop Astro docs path compatibility noise (Codex)
Remove the ASTRO_SITE_PATH alias and stale STRIPPED/refactor comments now that CMS_DOCS_SITE_PATH is the canonical docs project path.

---
Build: pass | Tests: pass — 2372 passed (704 files)

2026-05-10 20:47:10 +02:00
d96cac3632 Remove clawdie-site cleanup compatibility (Codex)
Drop the temporary cleanup helper and all remaining clawdie-site references now that the docs project path is clawdie-docs.

---
Build: pass | Tests: pass — 2372 passed (704 files)
2026-05-10 20:13:35 +02:00
Operator & Claude Code
f3accc155a Clean up legacy clawdie-site mount + skill references
setup/cms.ts: removeLegacyDocsBootstrapMount() now strips any fstab
entry whose source path ends with bootstrap/cms/clawdie-site and
best-effort umounts the stale target before adding the new
clawdie-docs mount. Without this, existing installs would carry a
broken fstab line pointing at a renamed directory; the mount works
in-memory until the next bastille restart, then fails confusingly.
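
The fstab filtering described above might look roughly like the following. This is a sketch under assumptions: `stripLegacyMounts` is a hypothetical pure function, not the body of removeLegacyDocsBootstrapMount(), and it assumes whitespace-separated fstab fields with the mount source first and the target second.

```typescript
const LEGACY_SUFFIX = "bootstrap/cms/clawdie-site";

// Drop fstab lines whose first field (the mount source) ends with the
// legacy suffix. Comments, blank lines, and all other entries pass through.
function stripLegacyMounts(fstab: string): { kept: string; removedTargets: string[] } {
  const removedTargets: string[] = [];
  const keptLines = fstab.split("\n").filter((line) => {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.startsWith("#")) return true;
    const [source, target] = trimmed.split(/\s+/);
    const normalized = (source ?? "").replace(/\/+$/, "");
    if (!normalized.endsWith(LEGACY_SUFFIX)) return true;
    if (target) removedTargets.push(target); // candidates for a best-effort umount
    return false;
  });
  return { kept: keptLines.join("\n"), removedTargets };
}
```

Returning the removed targets separately keeps the text transformation testable off-host; the caller can then attempt the umount and ignore failures.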

.agent/skills/{ansible-freebsd,astro,strapi}: replace clawdie-site
with clawdie-docs in skill text so agents are pointed at the new
project path.

NOTE: bootstrap/skills-memory/artifact.sql still needs to be
regenerated on the host via `just refresh-skills-artifact` once
OPENROUTER_API_KEY is set there — embedding regen cannot run from
this Linux side.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-10 19:58:46 +02:00
e3ad322d3b Rename Astro docs project to clawdie-docs (Sam & Claude)
Make the docs renderer name match its purpose, add CMS_DOCS_SITE_PATH with ASTRO_SITE_PATH compatibility, and update docs publishing paths.

---
Build: pass | Tests: pass — 2372 passed (704 files)
2026-05-10 19:49:39 +02:00
1934d12bd1 Clarify public and internal domain defaults (Sam & Claude)
Leave AGENT_DOMAIN blank until a real public DNS name exists, keep home.arpa for internal jail/service names, and mark ZFS as required for Bastille jails.

---
Build: pass
Tests: pass — 59 passed (4 files)

---
Build: pass | Tests: pass — 2372 passed (704 files)
2026-05-10 18:49:08 +02:00
Operator & Claude Code
bbff424272 Skip redundant install-cert after acme.sh --renew
The 10 May 2026 force-renew run reloaded nginx twice back-to-back: once
because acme.sh's --renew ran the saved Le_ReloadCmd, and again because
setup/tls.ts unconditionally followed it with --install-cert. Short-circuit
the second call when Le_RealKeyPath, Le_RealFullchainPath, and Le_ReloadCmd
in the domain conf already match our canonical values; first issue and
no-prior-conf force-issue paths still install as before.

Also: per-cert failures no longer strand the rest of the batch. The run
loop aggregates failures, still installs the renewal cron, then exits 1
with FAILED_LABELS surfaced in the status line.
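
The short-circuit can be sketched as a comparison between the saved domain conf and the canonical values. `installCertNeeded` and `CanonicalInstall` are hypothetical names, and the sketch assumes the conf stores plain `KEY='value'` lines; acme.sh may store some values in an encoded form, which a real comparison would need to decode first.

```typescript
interface CanonicalInstall {
  keyPath: string;
  fullchainPath: string;
  reloadCmd: string;
}

// Parse KEY='value' lines from a domain conf into a map.
function parseConf(text: string): Map<string, string> {
  const values = new Map<string, string>();
  for (const line of text.split("\n")) {
    const eq = line.indexOf("=");
    if (eq <= 0) continue;
    const key = line.slice(0, eq).trim();
    const raw = line.slice(eq + 1).trim();
    values.set(key, raw.replace(/^'(.*)'$/, "$1"));
  }
  return values;
}

// True when --install-cert still has to run: a missing conf or any
// mismatch with the canonical values means the install is not redundant.
function installCertNeeded(confText: string, want: CanonicalInstall): boolean {
  const conf = parseConf(confText);
  return !(
    conf.get("Le_RealKeyPath") === want.keyPath &&
    conf.get("Le_RealFullchainPath") === want.fullchainPath &&
    conf.get("Le_ReloadCmd") === want.reloadCmd
  );
}
```

An empty conf yields `true`, which matches the first-issue and no-prior-conf paths that still install as before.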

---
Build: FAIL | Tests: FAIL — 16 failed
2026-05-10 16:56:52 +02:00
771e19e1c7 Drop FreeBSD 14 support (Sam & Claude)
Require the tracked FreeBSD 15.x line during install and environment checks, and align docs and skill compatibility metadata with 15.x only.

---
Build: pass
Tests: pass — 37 passed (2 files)

---
Build: pass | Tests: pass — 2363 passed (701 files)
2026-05-10 16:31:40 +02:00
a7f183fc33 Harden TLS renewal cron handling
---
Build: pass | Tests: pass — 2361 passed (700 files)
2026-05-10 12:12:14 +02:00
b7bb3a7537 Add TLS lifecycle smoke test
---
Build: pass | Tests: pass — 2360 passed (700 files)
2026-05-10 11:47:20 +02:00
60d1ad2d63 Implement TLS lifecycle apply path
---
Build: pass | Tests: pass — 2357 passed (700 files)
2026-05-10 11:29:29 +02:00
Operator & Claude Code
c4709ab404 Draft setup/tls.ts skeleton for acme.sh codification (Sam & Claude)
The doctor checks at a2f2be6 surfaced that acme.sh is missing on the
live host: cert files exist at canonical nginx paths but nothing is
renewing them. clawdie.si expires 2 June 2026 (~23 days); docs.clawdie.si
expires 11 June 2026. The handoff doc's Step 4 ("optional, codify
acme.sh setup") is now necessary, not optional.

This commit lands the skeleton so the work doesn't get forgotten and
Codex has a concrete starting point. Default --dry-run means the step
is safe to call against the host without mutation.

setup/tls.ts shape:

- Phase 1: ensure acme.sh installed (pkg install or curl fallback —
  Codex confirms which works on the deployed FreeBSD release)
- Phase 2: register Let's Encrypt account with $ACME_EMAIL
- Phase 3+4: per cert in DEFAULT_MANAGED_CERTS, --issue if missing or
  --force-renew, --install-cert at canonical paths with reload hook
- Phase 5: --install-cronjob for renewal

Pure helpers tested from dev box (12 tests):
- buildAcmeIssueArgs / buildAcmeInstallCertArgs argument shapes
- parseArgs CLI surface (--apply, --cert, --email, --force-renew,
  unknown-arg rejection)
- DEFAULT_MANAGED_CERTS sanity (clawdie + docs labels, www SAN)
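
For illustration, the CLI surface listed above could be parsed along these lines. This is a sketch, not the parseArgs in setup/tls.ts: the `TlsOptions` shape, the repeatable `--cert`, and the error wording are assumptions.

```typescript
interface TlsOptions {
  apply: boolean; // false means dry-run, the default
  forceRenew: boolean;
  certs: string[]; // labels selected with --cert; empty means all
  email?: string;
}

function parseArgsSketch(argv: string[]): TlsOptions {
  const opts: TlsOptions = { apply: false, forceRenew: false, certs: [] };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    // Flags that take a value consume the next argument.
    const value = (): string => {
      const next = argv[++i];
      if (next === undefined) throw new Error(`${arg} requires a value`);
      return next;
    };
    switch (arg) {
      case "--apply":
        opts.apply = true;
        break;
      case "--force-renew":
        opts.forceRenew = true;
        break;
      case "--cert":
        opts.certs.push(value());
        break;
      case "--email":
        opts.email = value();
        break;
      default:
        // Unknown arguments fail loudly instead of being ignored.
        throw new Error(`unknown argument: ${arg}`);
    }
  }
  return opts;
}
```

Defaulting `apply` to false is what makes an accidental invocation against the host a dry run.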

Phases 2–4 throw `TODO(Codex)` errors when --apply is set, so
accidental invocation fails loud rather than half-acting. Phase 1 and
Phase 5 are implemented (idempotent, low-risk). Wired into setup/index
as `tls` step.

Handoff doc updated with the post-doctor finding, the skeleton
status, and a manual one-liner Codex can use to renew clawdie.si
before 2 June 2026 in case the codification work isn't ready in time.

---
2291 tests passing locally (+12 from new tls tests). Pre-existing
argon2/controlplane-*/cms.test failures unchanged.

---
Build: FAIL | Tests: FAIL — 16 failed
2026-05-10 11:11:59 +02:00
5c382ae5cb Redact skills memory DB URL in setup status (Sam & Codex)
---
Build: pass | Tests: pass — 2260 passed (671 files)
2026-05-09 15:36:17 +02:00
149d90196f Fix dnsmasq deploy address handling (C&C)
---
Build: pass | Tests: pass — 2247 passed (666 files)
2026-05-09 13:03:15 +02:00
Operator & Claude Code
fee3b1a36a Make setup/dns deploy and start dnsmasq on the host (Sam & Claude)
The dns step previously rendered dnsmasq.conf to a project-local
tmp/ path and stopped there — the platform classified dnsmasq as a
shared service but never actually deployed, enabled, or started it.
This is the dnsmasq blind spot.

What changed:

- On FreeBSD as root, the rendered config now lands at
  /usr/local/etc/dnsmasq.conf with a .bak of any prior content.
- sysrc dnsmasq_enable=YES is set so the service comes up on boot.
- service dnsmasq restart fires only when the config actually changed
  or the service is not currently running. Idempotent.
- Off-host (tests, dev) the step still writes to tmp/dnsmasq.conf
  and skips all lifecycle ops, so unit tests and dry-runs are unchanged.
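
The idempotent restart rule above reduces to a small predicate. `shouldRestart` is a hypothetical name; the sketch assumes the caller has already read the previously deployed config (or found none) and probed the service state.

```typescript
// Restart only when the deployed config would change or the service is down.
// `previous` is null when no config was deployed before.
function shouldRestart(previous: string | null, rendered: string, running: boolean): boolean {
  const configChanged = previous !== rendered;
  return configChanged || !running;
}
```

Running the step twice with the same rendered config and a healthy service therefore triggers no second restart.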

Upstream resolvers were the other half of the gap — the previous
config had `no-resolv` with no `server=` directive, making it unusable
as a system resolver standalone. We now resolve upstream from, in
priority order:

1. $DNSMASQ_UPSTREAM (comma- or space-separated; operator override)
2. non-loopback nameservers parsed from /etc/resolv.conf
3. 1.1.1.1 + 9.9.9.9 fallback

Each step renders `server=<ip>` lines so dnsmasq can forward queries
outside the local zone.
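
The priority order above can be sketched as a pure function. `resolveUpstreams` and `renderServerLines` are hypothetical names, not the helpers in setup/dns; the override string and the resolv.conf contents are passed in so the logic can be tested off-host.

```typescript
const FALLBACK_UPSTREAMS = ["1.1.1.1", "9.9.9.9"];

function isLoopback(ip: string): boolean {
  return ip.startsWith("127.") || ip === "::1";
}

function resolveUpstreams(override: string | undefined, resolvConf: string): string[] {
  // 1. Operator override, comma- or space-separated.
  const fromEnv = (override ?? "").split(/[,\s]+/).filter((s) => s !== "");
  if (fromEnv.length > 0) return fromEnv;

  // 2. Non-loopback nameservers parsed from resolv.conf content.
  const fromResolv: string[] = [];
  for (const line of resolvConf.split("\n")) {
    const ip = /^\s*nameserver\s+(\S+)/.exec(line)?.[1];
    if (ip !== undefined && !isLoopback(ip)) fromResolv.push(ip);
  }
  if (fromResolv.length > 0) return fromResolv;

  // 3. Public fallback.
  return [...FALLBACK_UPSTREAMS];
}

// Render one server=<ip> directive per upstream.
function renderServerLines(upstreams: string[]): string {
  return upstreams.map((ip) => `server=${ip}`).join("\n");
}
```

Filtering loopback addresses matters because a resolv.conf that already points at 127.0.0.1 would otherwise make dnsmasq forward to itself.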

Deliberately NOT changed: /etc/resolv.conf. Pointing the host at
dnsmasq for system-wide resolution is a separate operator decision:
getting it wrong can take a host offline. The structured status now emits
SETUP_DNS with DEPLOYED/SYSRC_ENABLED/SERVICE state so /publishreport
can surface where the resolver stands.

---
13 dns tests pass (up from 5). Pre-existing cms.test.ts failure in
the wider setup/ run is unrelated to this change.

---
Build: FAIL | Tests: FAIL — 16 failed
2026-05-09 12:49:57 +02:00
Operator & Claude Code
5c54aea011 Wire system-update into orchestrator + daily cron (Sam & Claude)
system-update was complete but unwired — only callable directly via
`npx tsx setup/system-update.ts`. This commit:

- Adds it to setup/index.ts STEPS so `npm run setup -- --step system-update`
  works the same as the other lifecycle steps.
- Adds a sibling `system-update-cron` step that drops a managed wrapper
  at /usr/local/sbin/clawdie-system-update and a cron entry at
  /etc/cron.d/clawdie-system-update. Default schedule is 06:50 daily so
  the morning patch state lands before the 08:00 operator status report.
- Folds `pkg audit -F` into the system-update run — read-only CVE scan
  that always executes (even in dry-run) and surfaces vulnerable count
  in the structured status.
- Adds a reboot-pending detector that compares running kernel (uname -r)
  to installed userland (freebsd-version). When a kernel patch lands,
  REBOOT_PENDING=yes appears in the status; the platform never reboots
  itself — the operator decides.
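
A minimal sketch of the reboot-pending comparison, assuming the helper receives the raw stdout of `uname -r` and `freebsd-version` as strings. The signature shown here is an assumption, not the one in setup/system-update.ts, and the version strings in any usage are illustrative.

```typescript
// True when the running kernel string differs from the installed version
// string. Both inputs are raw command output, so trailing newlines are trimmed.
function rebootPending(unameR: string, freebsdVersion: string): boolean {
  return unameR.trim() !== freebsdVersion.trim();
}

// Status fragment in the REBOOT_PENDING=yes|no form mentioned above.
function rebootStatus(unameR: string, freebsdVersion: string): string {
  return `REBOOT_PENDING=${rebootPending(unameR, freebsdVersion) ? "yes" : "no"}`;
}
```

The function only reports; acting on the result stays with the operator.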

Cadence is daily, not weekly: freebsd-update fetches are cheap, security
patches benefit from same-day rollout, and pairing with the morning
report makes the result legible. Heavier `pkg upgrade` (full userland
refresh, not just CVE scan) is a separate question for later.
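
The daily 06:50 schedule maps onto a single system-crontab line. The builder below is a sketch: `buildCronEntry` and `Schedule` are hypothetical names, and running the wrapper as root is an assumption.

```typescript
interface Schedule {
  hour: number;
  minute: number;
}

// Build one /etc/cron.d-style line. System crontab fields are:
// minute hour day-of-month month day-of-week user command
function buildCronEntry(schedule: Schedule, wrapperPath: string, user = "root"): string {
  const { hour, minute } = schedule;
  if (!Number.isInteger(hour) || hour < 0 || hour > 23) {
    throw new Error(`invalid hour: ${hour}`);
  }
  if (!Number.isInteger(minute) || minute < 0 || minute > 59) {
    throw new Error(`invalid minute: ${minute}`);
  }
  return `${minute} ${hour} * * * ${user} ${wrapperPath}\n`;
}
```

With `{ hour: 6, minute: 50 }` and the wrapper path from this commit, the result is a line that fires every day at 06:50.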

Tests cover the new pure helpers (parsePkgAudit, rebootPending) plus
the cron entry/wrapper builders. The orchestrator wiring is mechanical.

---
Targeted tests pass (system-update + system-update-cron, 21 tests).
Codex to validate end-to-end on host: install the cron module, confirm
/etc/cron.d/clawdie-system-update lands, confirm the wrapper is exec'd
on the next 06:50, and confirm the structured status reaches the 08:00
report pipeline.

---
Build: FAIL | Tests: FAIL — 16 failed

2026-05-09 12:40:39 +02:00
5e0bd9eb12 Add pi update skill and package rename
---
Build: pass | Tests: pass — 2227 passed (656 files)
2026-05-09 12:35:22 +02:00
34e2265ad9 Apply Clawdie brand overlay to docs (Sam & Codex)
Add a small Starlight CSS overlay, Clawdie triangle logo, and header links that align docs.clawdie.si with the clawdie.si landing palette while keeping the default docs typography.

---

Build: pass

Tests: pass — 2 passed (1 file)

---
Build: pass | Tests: pass — 2221 passed (656 files)

2026-05-09 08:08:18 +02:00
247d4cdd0c Fix docs site navigation and Slovenian locale (Sam & Codex)
Autogenerate the docs sidebar from the public content tree, sync Slovenian docs into the Starlight content copy, remove stale Astro-only English and guide duplicates, use honest 404s for missing docs pages, and repair stale Codeberg links.

---

Build: pass

Tests: pass — 2221 passed (166 files)

---
Build: pass | Tests: pass — 2221 passed (656 files)
2026-05-08 17:31:40 +02:00
33750fd5c9 Keep landing redirects on HTTPS (Sam & Codex)
Make the cms jail root redirect emit an HTTPS clawdie.si target when served behind host nginx, and align the sample host vhost certificate paths with the live clawdie certificate layout.

---

Build: pass

Tests: pass — 2221 passed (166 files)

---
Build: pass | Tests: pass — 2221 passed (656 files)
2026-05-08 12:25:14 +02:00
576438c9cb Wire clawdie.si landing publishing (Sam & Codex)
Mount and deploy the platform landing Astro site from the CMS setup step, add the cms jail nginx server block for clawdie.si/www.clawdie.si, and surface platform landing/docs availability in /publishreport.

---

Build: pass

Tests: pass — 2221 passed (166 files)

---
Build: pass | Tests: pass — 2221 passed (656 files)
2026-05-08 10:06:59 +02:00
10f83c46d2 Reject malformed pi auth primitives (Sam & Codex)
Treat boolean and numeric auth.json provider entries as unusable credentials while preserving non-empty string, array, and object entries. Extend pi-config tests for empty arrays and primitive malformed entries.
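
The acceptance rule can be sketched as a type check over one provider entry. `isUsableAuthEntry` is a hypothetical name, and treating whitespace-only strings and empty objects as unusable is an assumption beyond what the message states.

```typescript
// Decide whether one auth.json provider entry can serve as a credential.
function isUsableAuthEntry(entry: unknown): boolean {
  if (typeof entry === "string") return entry.trim() !== "";
  if (Array.isArray(entry)) return entry.length > 0;
  if (typeof entry === "object" && entry !== null) {
    return Object.keys(entry).length > 0;
  }
  // Booleans, numbers, null, and undefined are malformed primitives.
  return false;
}
```

Taking `unknown` as the input forces every branch to narrow explicitly, which suits data read from a hand-edited JSON file.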

---

Build: pass

Tests: pass — 2210 passed (166 files)

---
Build: pass | Tests: pass — 2210 passed (654 files)
2026-05-07 20:11:27 +02:00
f3817166ed Tighten Codex auth validation (Sam & Codex)
Sync the Codex login docs into Astro mirrors, reject empty pi auth.json provider entries before falling back to env credentials, and make pi-config report missing provider auth rather than API-key-only state.

---

Build: pass

Tests: pass — 2210 passed (166 files)

---
Build: pass | Tests: pass — 2210 passed (654 files)
2026-05-07 17:08:38 +02:00
97243a8092 Support Codex auth in pi-config
Recognize Pi auth.json credentials for OAuth-backed providers such as openai-codex, keep env and Ollama validation intact, and document the headless Codex login flow in install/operator docs.

---
Build: pass | Tests: pass — 2209 passed (654 files)
2026-05-07 16:58:37 +02:00
f6acf8e256 Prune stale first-boot docs and scripts (Sam & Codex)
Make the first-boot implementation spec self-contained, remove the superseded secrets handoff and obsolete manual jail setup scripts, and align hostname defaulting with the assistant-name separation rule. Update PostgreSQL permission notes and sync the public first-boot page into Astro docs.

---

Build: pass

Tests: pass — 2197 passed (164 files)

---
Build: pass | Tests: pass — 2197 passed (650 files)
2026-05-07 12:40:47 +02:00
6de0ed87ab Remove legacy Mevy references (Sam & Codex)
Sweep active code, tests, identity files, public docs, CMS seed content, and stale handoffs so old assistant-name fixtures no longer leak into current Clawdie/system-namespace behavior. Keep the skills-memory SQL artifact unchanged per regeneration policy.

---

Build: pass

Tests: pass — 2197 passed (164 files)

---
Build: pass | Tests: pass — 2197 passed (650 files)
2026-05-07 11:16:40 +02:00
d5c5c39144 Clean up system-namespace test debt (Sam & Codex)
Rewrite skipped PLATFORM_* identity tests against the current constant-based platform model, remove stale mock exports/comments, and delete the completed routing handoff.

---

Build: pass

Tests: pass — 2197 passed (164 files)

---
Build: pass | Tests: pass — 2197 passed (650 files)
2026-05-07 10:51:54 +02:00
3345123b3f Harden db jail selection in system update
---
Build: pass | Tests: pass — 9 passed (1 file)

---
Build: pass | Tests: pass — 2190 passed (648 files)
2026-05-06 09:58:22 +02:00
c4239b2b11 Align jail policy and add system update path
---
Build: pass | Tests: pass — 51 passed (3 files)

---
Build: pass | Tests: pass — 2189 passed (648 files)
2026-05-06 09:43:08 +02:00
aabc403648 Align subnet defaults and public jail docs
---
Build: pass | Tests: pass — 71 passed (3 files)

---
Build: pass | Tests: pass — 2163 passed (630 files)
2026-05-05 22:23:42 +02:00
98a6fd8900 Add btop to glasspane host baseline
---
Build: pass | Tests: pass — 2147 passed (625 files)
2026-05-05 16:57:14 +02:00
e5b65cd21a Remove duplicate PostgreSQL client prerequisite
---
Build: pass | Tests: pass — 2147 passed (625 files)
2026-05-05 15:52:52 +02:00
1751678000 Use fd-find in FreeBSD host baseline
---
Build: pass | Tests: pass — 2147 passed (625 files)
2026-05-05 15:34:45 +02:00
Operator & Claude Code
0fcac57e42 Use RUNTIME_ID for setup-side label interpolation
Follow-up to a99f971: covers the remaining ${TENANT_ID} interpolation
sites that produced leading-hyphen / empty-path values on root installs.

- setup/ollama.ts, setup/llama-cpp.ts: preferred jail names
- setup/sanoid.ts: tenant-era home candidate
- setup/hosts.ts: jail-name discovery filter (+ test mock)
- src/telegram-commands.ts: status identity line, suppress empty
  tenant clause on root installs

Root-detection sites that key off TENANT_ID === '' are intentionally
left untouched; the invariant is preserved.
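
The leading-hyphen problem can be reproduced in a few lines. Both functions are hypothetical, and the sketch assumes RUNTIME_ID is always non-empty while TENANT_ID is the empty string on root installs; the value "clawdie" is illustrative only.

```typescript
// Old shape: interpolating an empty tenant id yields a leading hyphen.
function legacyJailName(tenantId: string, service: string): string {
  return `${tenantId}-${service}`;
}

// New shape: labels come from a runtime id that must never be empty.
function preferredJailName(runtimeId: string, service: string): string {
  if (runtimeId === "") throw new Error("RUNTIME_ID must not be empty");
  return `${runtimeId}-${service}`;
}

const broken = legacyJailName("", "ollama"); // "-ollama"
const fixed = preferredJailName("clawdie", "ollama"); // "clawdie-ollama"
```

Root detection can keep testing `TENANT_ID === ''` because only label construction moves to the runtime id.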

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---
Build: FAIL | Tests: FAIL — 15 failed
2026-05-04 06:31:21 +02:00
7acf771a3b Stabilize regression-driven test fixtures
---
Build: pass | Tests: pass — 2137 passed (622 files)
2026-05-03 20:58:27 +02:00
5c685f1285 Harden dashboard password reset flow
---
Build: pass | Tests: FAIL — 26 failed
2026-05-03 20:45:48 +02:00
8414953776 Harden controlplane agent API key lookup
---
Build: pass | Tests: FAIL — 26 failed
2026-05-03 18:10:46 +02:00
2874dbae4f Add Telegram dashboard password reset
---
Build: pass | Tests: FAIL — 31 failed
2026-05-03 10:31:40 +02:00
0d6414a75f Use per-agent controlplane API tokens
---
Build: pass | Tests: FAIL — 4 failed (pre-existing tenant fixture issue in src/controlplane-api.test.ts)
2026-05-03 07:20:51 +02:00
f4cb61bad5 Serve root public domains from CMS jail
---
Build: pass | Tests: FAIL — 11 failed | 2089 passed | 4 skipped (2104)
2026-05-02 22:12:51 +02:00