Commit graph

93 commits

Author SHA1 Message Date
975f37f895 feat(install): add versioned setup and system contracts
---
Build: pass | Tests: pass — Tests  2000 passed (2000)
2026-04-27 10:06:44 +02:00
d5182ec480 docs+setup: clarify install mode names
---
Build: pass | Tests: pass — Tests  1992 passed (1992)
2026-04-27 09:07:18 +02:00
bcb27d4d56 feat(install): backfill setup from inspect output
---
Build: pass | Tests: FAIL — Tests  2 failed | 1989 passed (1991)
2026-04-27 08:55:21 +02:00
7b14e27783 feat(install): add shell-based inspect mode
---
Build: pass | Tests: pass — Tests  1991 passed (1991)
2026-04-27 08:47:56 +02:00
2ab3fa050a refactor(setup): unify operator auth entrypoints
---
Build: pass | Tests: pass — Tests  1991 passed (1991)
2026-04-27 08:13:36 +02:00
Operator & claude
a16838b772 docs(handoff): record adopt-mode decisions + flag operator-auth unification
Round 5 in the handoff doc captures the five agreed adopt-mode
decisions (INSTALL_MODE field, fill-blanks default, identity
mismatch blocks, Telegram identity changes require explicit flag,
fingerprint gate) so they survive into Codex's design doc.

Implementation doc gets an "Adopt Mode (V1.1)" section with the
proposed 4-task split + per-field freeze contract table, plus a
task-4 followup subsection naming the legacy `operators` table
sync gap and the unification plan with Codex's
setup/operator-auth.ts. scripts/set-operator.ts gets a TODO(unify)
header pointing at the same gap.

first-boot.md notes adopt mode is V1.1 and to back up before
reflashing until then.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---
Build: pass | Tests: FAIL — Tests  3 failed | 1972 passed (1975)
2026-04-27 07:12:55 +02:00
Operator & claude
b9e771316d feat(setup): add set-operator script for post-install dashboard credentials
Lands task 4 from the ISO first-boot implementation split as a
standalone scripts/set-operator.ts (matches existing scripts/
convention — no clawdie-admin umbrella). Reuses
ensureControlplaneBootstrapOperator() for the Better Auth signUp
path. Prompts password via stdin with echo suppressed; refuses
non-TTY runs; updates OPERATOR_PASSWORD in .env (mode 0600).
First-set only — rotation goes through the dashboard.
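
A minimal sketch of the .env update step described above, assuming a plain KEY=value file format. The helper name upsertEnvVar is hypothetical and is not taken from scripts/set-operator.ts.

```typescript
// Hypothetical helper, not the actual scripts/set-operator.ts code.
// Replaces an existing KEY=... line or appends one; other lines are untouched.
function upsertEnvVar(content: string, key: string, value: string): string {
  // Strip trailing newlines so the result always ends with exactly one.
  const trimmed = content.replace(/\n+$/, "");
  const lines = trimmed === "" ? [] : trimmed.split("\n");
  const index = lines.findIndex((line) => line.startsWith(`${key}=`));
  if (index >= 0) {
    lines[index] = `${key}=${value}`;
  } else {
    lines.push(`${key}=${value}`);
  }
  return lines.join("\n") + "\n";
}
```

When the result is written back with Node's fs.writeFileSync and a mode option of 0o600, that mode only applies if the file is newly created, so an existing .env would also need an explicit chmod to end up at 0600.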

Both planning docs updated to drop "notional" references and point
at the real npm run set-operator -- <email> command.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---
Build: pass | Tests: FAIL — Tests  3 failed | 1972 passed (1975)
2026-04-27 06:41:53 +02:00
1389e17ec4 fix(runtime): align startup brief and test status paths
---
Build: pass | Tests: pass — Tests  1951 passed (1951)
2026-04-26 12:48:47 +02:00
1e87f34121 feat(dashboard): expand operator tenant and publish view
---
Build: FAIL | Tests: FAIL

---
Build: FAIL | Tests: FAIL
2026-04-26 08:49:24 +02:00
af2648be87 fix(reports): keep test status artifacts in repo tmp
---
Build: FAIL | Tests: FAIL
2026-04-26 07:48:43 +02:00
Operator & claude
1759a8bd85 feat(reports): add structured test/build report
Reads JSON status files written by scripts/write-test-build-status.sh
so /testreport reflects the last real build/test run instead of model
memory. Missing or stale status degrades to "unknown" with an action
note rather than fabricating success.
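
The degrade-to-unknown behaviour can be sketched roughly as follows. The field names (result, finishedAt) and the function classifyStatus are assumptions for illustration; the real status files written by scripts/write-test-build-status.sh may look different.

```typescript
// Hypothetical result shape: a known state, or "unknown" plus an action note.
type Reported =
  | { state: "pass" | "fail"; note: null }
  | { state: "unknown"; note: string };

// Classifies raw status-file contents; pass null when the file is missing.
function classifyStatus(raw: string | null, nowMs: number, maxAgeMs: number): Reported {
  if (raw === null) {
    return { state: "unknown", note: "status file missing; run the build/test script" };
  }
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return { state: "unknown", note: "status file is not valid JSON" };
  }
  if (typeof value !== "object" || value === null) {
    return { state: "unknown", note: "status file is malformed" };
  }
  const record = value as Record<string, unknown>;
  const result = record.result;
  if (result !== "pass" && result !== "fail") {
    return { state: "unknown", note: "status file is malformed" };
  }
  const finishedAt = record.finishedAt;
  const finishedMs = typeof finishedAt === "string" ? Date.parse(finishedAt) : NaN;
  if (Number.isNaN(finishedMs)) {
    return { state: "unknown", note: "status file has no usable timestamp" };
  }
  if (nowMs - finishedMs > maxAgeMs) {
    return { state: "unknown", note: "status is stale; rerun the build/test script" };
  }
  return { state: result, note: null };
}
```

Taking the current time and the age limit as parameters keeps the function pure, so the stale path can be exercised without touching the filesystem or the clock.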

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---
Build: pass | Tests: pass — Tests  1914 passed (1914)

---
Build: pass | Tests: pass — Tests  1917 passed (1917)

---
Build: pass | Tests: pass — Tests  1921 passed (1921)
2026-04-26 07:44:21 +02:00
0d9ad52922 fix(controlplane): stop git push token burn in jail
---
Build: FAIL | Tests: FAIL
2026-04-25 19:37:54 +02:00
d8cbd5ca70 chore(multitenant): harden agent workflow and README sync
Move the multitenant agent-workflow decision into repo docs, enforce effective author/committer identities in the pre-commit hook, and replace the shell-based README version rewrite with a reusable Node helper.

---
Build: pass | Tests: pass — node scripts/update-readme-version.mjs --check; sh -n hooks/pre-commit

---
Build: FAIL | Tests: FAIL — Tests  58 failed | 1109 passed (1167)

---
Build: FAIL | Tests: FAIL — Tests  58 failed | 1107 passed (1165)
2026-04-25 07:58:18 +02:00
9605c7ad81 refactor(multitenant): collapse planTenantApply allowedResources duplication
Drop the allowedResources field from TenantApplyPlan — it was derived
field-for-field from resourceChecklist already, which was exactly the
"triplicate representation" flagged in the handoff's consolidation list.
Update scripts/tenant-lifecycle.ts to compute the same lists from the
checklist when it prints, and drop the tautological equality assertions
from the test (resourceChecklist is now the single source).
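
As an illustration of deriving the printed lists from the checklist instead of storing them twice, here is a small sketch. The entry shape and the name allowedResourcesFrom are hypothetical; only the three resource kinds (databases, worker jails, datasets) come from the commit messages in this log.

```typescript
// Hypothetical types: the real TenantApplyPlan and checklist entries are not shown here.
type ResourceKind = "database" | "worker-jail" | "dataset";

interface ChecklistEntry {
  kind: ResourceKind;
  name: string;
}

// Groups checklist entries by kind at print time, so the checklist stays
// the single source and no second copy can drift out of sync.
function allowedResourcesFrom(checklist: ChecklistEntry[]): Record<ResourceKind, string[]> {
  const grouped: Record<ResourceKind, string[]> = {
    database: [],
    "worker-jail": [],
    dataset: [],
  };
  for (const entry of checklist) {
    grouped[entry.kind].push(entry.name);
  }
  return grouped;
}
```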

---
Build: pass | Tests: pass — 33 passed (1 file)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-24 19:12:12 +02:00
d8f43fc4a0 Clean up controlplane naming consumers
Fix the remaining operator-surface drift after the naming cutover. This aligns controlplane defaults around ai.<base>, makes the dashboard use the shared display-date helper and approved controlplane host, reuses the derived code-service hostname in Forgejo config, and fixes local-host syncing so underscore-form tenant jails are no longer skipped.

---
Build: pass | Tests: pass — 67 passed (5 files)
2026-04-24 16:50:08 +02:00
9fea739140 Finish controlplane naming propagation sweep
---
Build: pass | Tests: pass — 122 passed (7 files)
2026-04-24 16:03:47 +02:00
0c690d2065 Surface tenant naming overrides in apply plan
2026-04-24 11:06:20 +02:00
3992503522 Clarify tenant apply normalization hints
Explain in tenant-apply output when an existing tenant still carries declared non-default state, so operators can distinguish current tenant-specific carryover from the smaller V2 default for new tenants.

---
Build: pass | Tests: pass — 31 passed (1 file)
2026-04-24 10:17:41 +02:00
ae7a109da4 Consolidate tenant apply contract shape
Reduce duplication in planTenantApply by treating the resource checklist as the canonical resource list, deriving blockers from preflight state, and trimming redundant action-policy payloads.

---
Build: pass | Tests: pass — 31 passed (1 file)
2026-04-24 10:10:44 +02:00
daf29fa332 Add tenant apply resource checklist
Refine tenant-apply dry runs with per-resource status entries so databases, worker jails, and datasets are reported as explicit future-create candidates instead of only appearing inside summary sections.

---
Build: pass | Tests: pass — 31 passed (1 file)
2026-04-24 09:40:04 +02:00
253cdcecb6 Classify tenant apply policy actions
Refine tenant-apply planning so future automatic candidates, manual-only steps, and permanent out-of-scope actions are reported explicitly instead of being implied by generic prose.

---
Build: pass | Tests: pass — 31 passed (1 file)
2026-04-24 09:36:31 +02:00
2d3f2253c9 Define tenant apply preflight policy
Turn tenant-apply into a structured preflight contract that marks what already passes in the declarative model, what remains manual, and what still blocks any future automatic host mutation.

---
Build: pass | Tests: pass — 31 passed (1 file)
2026-04-24 09:32:10 +02:00
36827ab478 Add dry-run tenant apply planning
Introduce a separate tenant-apply contract that describes what a future live apply would be allowed to touch, what prerequisites it would require, and what stays explicitly manual or out of scope.

---
Build: pass | Tests: pass — 28 passed (1 file)
2026-04-24 09:14:37 +02:00
59c4006938 refactor(multitenant): platform domain config, richer CLI, comment-safe registry
- platform record now accepts internal_domain and internal_base; tenant
  internal-domain derivation honors platform.internal_base instead of
  hard-coding home.arpa
- validateTenantRecord now rejects a tenant whose internal_domain
  collides with the platform internal_domain
- tenant-lifecycle CLI now accepts --internal-domain, --service, and
  repeatable --dataset flags; tenant-list now prints
  id\tservice\tinternal-domain\tdisplay-name
- writeTenantRegistry preserves YAML comments and key order via the
  yaml Document API instead of parse/stringify round-tripping
- platformHostd{SocketPath,PidFile} now use normalizeResourceId
  directly so platform-side helpers stop calling normalizeTenantId
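
The first two bullets above could look roughly like this. The record shapes, the "<tenant-id>.<internal_base>" derivation rule, and the function names are assumptions for illustration and are not the repository's actual validateTenantRecord.

```typescript
// Hypothetical record shapes; only the field names internal_domain and
// internal_base come from the commit message.
interface PlatformRecord {
  internal_domain?: string;
  internal_base?: string;
}

interface TenantRecord {
  id: string;
  internal_domain?: string;
}

// Fallback used only when the platform record does not set internal_base.
const DEFAULT_INTERNAL_BASE = "home.arpa";

// Assumed rule: an explicit tenant domain wins, otherwise "<id>.<base>".
function tenantInternalDomain(tenant: TenantRecord, platform: PlatformRecord): string {
  if (tenant.internal_domain) return tenant.internal_domain;
  const base = platform.internal_base ?? DEFAULT_INTERNAL_BASE;
  return `${tenant.id}.${base}`;
}

// Returns a list of validation errors; empty means the tenant is acceptable.
function validateTenantDomain(tenant: TenantRecord, platform: PlatformRecord): string[] {
  const errors: string[] = [];
  const domain = tenantInternalDomain(tenant, platform).toLowerCase();
  if (platform.internal_domain && domain === platform.internal_domain.toLowerCase()) {
    errors.push(`tenant ${tenant.id}: internal_domain collides with platform internal_domain`);
  }
  return errors;
}
```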

Build: pass | Tests: pass — 1783 passed (114 files); two failing
suites (vision.test.ts, controlplane-api.test.ts) are pre-existing
on origin/multitenant and unrelated to this change.
2026-04-24 09:13:20 +02:00
b48e073848 Define tenant provisioning contract
Turn tenant planning into an explicit declarative contract that states which logical resources belong to a tenant and which host-level concerns remain intentionally out of scope.

---
Build: pass | Tests: pass — 20 passed (1 file)
2026-04-24 08:48:48 +02:00
56fbddb616 Define tenant removal safety boundaries
Make tenant removal planning distinguish declarative registry changes from protected platform resources, and block removal when a tenant overlaps platform identity or shared services.

---
Build: pass | Tests: pass — 18 passed (1 file)
2026-04-24 08:41:25 +02:00
311f663523 Harden tenant lifecycle validation
Reject empty tenant input, normalize read-path lookups, and treat shared platform resource aliases as reserved so lifecycle validation catches underscore and hyphen collisions consistently.
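
One way to catch underscore and hyphen collisions consistently is to compare ids only after folding both separators to the same character. The sketch below assumes that approach; canonicalId and checkTenantId are hypothetical names and are not the repository's normalizeTenantId.

```typescript
// Hypothetical normalizer: folds case and treats "_" and "-" as the same separator.
function canonicalId(raw: string): string {
  return raw.trim().toLowerCase().replace(/[_-]+/g, "-");
}

// Returns an error message, or null when the id is acceptable.
function checkTenantId(raw: string, reservedAliases: string[]): string | null {
  const id = canonicalId(raw);
  if (id === "") return "tenant id must not be empty";
  // Reserved aliases are normalized the same way so both spellings collide.
  const reserved = new Set(reservedAliases.map(canonicalId));
  if (reserved.has(id)) {
    return `tenant id "${raw}" collides with a reserved platform alias`;
  }
  return null;
}
```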

---
Build: pass | Tests: pass — 25 passed (2 files)
2026-04-24 08:38:29 +02:00
e040f5cfcc Add tenant lifecycle removal planning
Keep tenants as logical platform identities, preserve human display names while normalizing system ids, and add a dry-run removal path plus stronger registry validation.

---
Build: pass | Tests: pass — 28 passed (3 files)
2026-04-24 08:32:45 +02:00
ac160ea7f0 Refine tenant model to logical platform identities
Drop tenant home/repo workspace fields from the registry and lifecycle planning so tenants remain logical identities inside one platform deployment.

---
Build: pass | Tests: pass — 13 passed (2 files)
2026-04-24 08:13:58 +02:00
b8fd655f02 Refactor V2 identity and platform ownership model
Make the multitenant branch use a clean PLATFORM_*/TENANT_* model, remove active AGENT_NAME runtime usage, collapse hostd ownership into the shared platform, add operator audit surfaces, and add read-only tenant lifecycle commands.

---
Build: pass | Tests: pass — 151 passed (14 files)
2026-04-24 07:49:09 +02:00
c65c289f08 refactor(multitenant): make tenant and platform identity explicit
Replace ambiguous AGENT_NAME usage across runtime, setup, and helper scripts with explicit TENANT_ID or platform runtime identity where appropriate. Keep AGENT_NAME as a compatibility boundary instead of the primary source for shared runtime naming.

---
Build: pass | Tests: pass — 138 passed (10 files)
2026-04-23 21:41:42 +02:00
66a36a6548 refactor(multitenant): centralize controlplane session paths
Introduce a shared controlplane paths helper and use it in runtime plus operator tooling. This removes another tenant-derived path assumption and aligns controlplane session logs with the actual tmp-based layout used by the platform.

---
Build: pass | Tests: pass — 105 passed (7 files)
2026-04-23 10:11:40 +02:00
c8cfa898de refactor(multitenant): platform-scope worker jail naming
Move controlplane worker jail naming off tenant identity and onto the shared platform service identity. Also update operator-facing controlplane scripts so their error messages describe the platform service instead of implying tenant ownership.

---
Build: pass | Tests: pass — 103 passed (6 files)
2026-04-23 09:52:15 +02:00
0d801e6ecf refactor(multitenant): move shared runtime names to platform scope
Continue the platform runtime split by moving shared watchdog and controlplane defaults off tenant-derived names. Operator-facing dashboard and controlplane defaults now use the platform service identity, with tests covering the new config and socket behavior.

---
Build: pass | Tests: pass — 103 passed (6 files)
2026-04-23 09:26:36 +02:00
42393a2f99 fix(system-health): use display date format
Render the Generated timestamp using src/display-date.ts (DD.mmm.YYYY HH:MM:SS) instead of ISO 8601.

---
Build: pass | Tests: not run
2026-04-22 09:56:08 +02:00
374b3a8982 chore(harness): remove duplicate types and note local parser
Remove duplicate JailRow type in scripts/dashboard.ts and clarify that the pi extension keeps a local bastille parser copy. Also trims extra blank line.

---
Build: pass | Tests: pass — agent-runner
2026-04-22 09:49:19 +02:00
fb4104e0c3 fix(cli): correct jail-list columns (Sam & Codex)
Use shared bastille list parser so IP/name/state are correct for wide bastille output.

---
Build: pass | Tests: pass — 1683 passed (104 files)
2026-04-21 23:45:26 +02:00
2765bae250 refactor(bastille): unify list parsing (Sam & Codex)
- Add src/bastille-list.ts parser supporting wide + legacy formats (Up/Down/Stopped)
- Use shared parser in scripts dashboard/system-health
- Extend pi harness extension tests to cover wide table

---
Build: pass | Tests: pass — 1683 passed (104 files)
2026-04-21 23:36:37 +02:00
5a052718f5 fix(controlplane): repair agent-task scripts
- Add required Authorization header (CONTROLPLANE_SHARED_SECRET)
- Support selecting assigned role via `just agent-task "..." db-admin`
- Update agent-task-status to understand `task_id` and list recent tasks
- Update harness handoff Phase 7e example

---
Build: pass | Tests: pass — 103 files, 1680 tests
2026-04-21 22:23:02 +02:00
f3b2c0189a fix: restore harness validation commands (Sam & Codex)
- Fix `just dashboard`: correct hostd socket default, default output dir `html/dashboard`, mkdir output dir
- Fix `just system-health`: parse `bastille list` correctly + call hostd `service-status` with `name`
- Update harness validation handoff checkboxes + results

---
Build: pass | Tests: pass — 1674 passed (102 files)
2026-04-21 21:07:06 +02:00
e97a1dec2c chore: commit backfill-embeddings maintenance script
Repairs memory_chunks rows missing vector embeddings — useful after
embedding API outages, provider switches, or fresh installs. Has
dry-run mode and rate-limit backoff. Not a skill; run manually when
semantic memory search degrades.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 09:01:23 +02:00
7a0d3888d5 fix: update all stale PostgreSQL 17 references to 18
data17 path and postgresql17 package refs were never updated when PG was
upgraded to 18. Fixes setup scripts, skills, docs, tests, and archived
playbooks to match the running system (PG 18.3, /var/db/postgres/data).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-18 09:12:48 +00:00
Charlie Root
9498ad28bd fix: replace hardcoded 'clawdie' with AGENT_NAME across 22 files
All hardcoded 'clawdie' references in production code now derive from
AGENT_NAME (default: 'clawdie'). This makes the mevy canary strategy
reliable — changing AGENT_NAME is all that's needed.

Changes:
- Hardcoded paths: CMS_WEBROOT, ASTRO_SITE_PATH, verify checks,
  controlplane dashboard dir, sessions dir, output dir, chown user
- Prometheus metrics: prefixed with AGENT_NAME for multi-install dashboards
- hostd log strings: use AGENT_NAME instead of 'clawdie-hostd'
- MCP server name: derived from AGENT_NAME
- Skill modify patches: container image and mount allowlist use AGENT_NAME
- SQL migration file renamed: clawdie-brain-hybrid-upgrade → brain-hybrid-upgrade
- Temp dir prefixes: all use AGENT_NAME

Kept as-is (correct pattern):
- 'clawdie' as default fallback when AGENT_NAME is unset
- .pi/extensions/clawdie-harness/ directory (pi package identity)
- html/docs-clawdie-si/ (public docs site URL)

---
Build: pass | Tests: pass — 1527 passed, 3 failed (2 files, pre-existing)
2026-04-15 21:41:41 +00:00
749431cad4 feat(phase5): system awareness + operator dashboard
Adds src/system-state.ts which collects jail/ZFS/PF/budget/task status
into a compact summary string. Injects that summary into every pi agent
session via --append-system-prompt so agents know what's running before
acting. Adds scripts/dashboard.ts which generates a self-contained HTML
operator dashboard (no LLM — plain TypeScript). Wires dashboard regen
into the controlplane heartbeat loop via CONTROLPLANE_DASHBOARD_INTERVAL.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---
Build: pass | Tests: FAIL — Tests  40 failed | 766 passed (806)

---
Build: pass | Tests: FAIL — Tests  40 failed | 766 passed (806)
2026-04-14 04:52:25 +00:00
2160e7859e feat(phase4): just front door + safety rules + bugfix (Sam & Claude)
Rewrites justfile with 8 groups, 35+ recipes with docstrings.
Creates 8 CLI helper scripts (jail-status, hostd-cli, system-health,
agent-status, agent-task, agent-task-status, agent-logs, jail-provision).
Adds 4 hostd safety rules to safety.yaml (destroy, rollback, zfs-destroy, pf).
Fixes task_create empty assigned_to bug in controlplane-tools.ts.

Build: pass | Tests: not run (Linux)
2026-04-14 01:28:31 +02:00
af659e7c56 feat(phase4): just front door — 55 recipes + 10 CLI helper scripts
Adds justfile with 8 grouped recipe sections covering build, jail
management, skill catalog, agent ops, and system admin. Adds scripts for
skill-list/add/sync, jail-status, system-health, agent-task/status/logs,
harness-check, and hostd-cli. Fixes project root derivation to use
import.meta.url instead of process.cwd() so scripts work regardless of
invocation directory.
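
For reference, deriving a root from the module location rather than the working directory usually looks like the sketch below. The function name projectRootFrom and the assumption that scripts sit one directory below the root are illustrative, and the path handling assumes a POSIX host.

```typescript
import { fileURLToPath } from "node:url";
import { dirname, resolve } from "node:path";

// Resolves a project root relative to the calling module's own file,
// so the result does not depend on where the script was invoked from.
// levelsUp is how many directories separate the script from the root
// (1 for a file directly under scripts/); that layout is an assumption.
function projectRootFrom(moduleUrl: string, levelsUp = 1): string {
  const scriptDir = dirname(fileURLToPath(moduleUrl));
  const parents = new Array<string>(levelsUp).fill("..");
  return resolve(scriptDir, ...parents);
}

// In an ES module this would be called as:
//   const PROJECT_ROOT = projectRootFrom(import.meta.url);
```

Unlike process.cwd(), import.meta.url is fixed by where the file lives, which is why the result stays the same when a recipe is run from a subdirectory.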

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---
Build: pass | Tests: FAIL — Tests  40 failed | 766 passed (806)
2026-04-13 23:26:22 +00:00
2983111a5d feat(phase2): add skill library catalog and tooling
- agent/library.yaml: catalog of all 48 skills (37 operational + 11 features),
  4 agents with skill assignments, 2 prompt refs
- src/skill-library.ts: loadLibrary, searchSkills, getSkillContent,
  getAgentSkills, validateLibrary, reloadLibrary
- scripts/skill-list.ts: grouped table output with color, optional search query
- scripts/skill-add.ts: add skill from local/codeberg/github/raw source
- scripts/skill-sync.ts: refresh all remote-sourced skills in cache
- scripts/validate-library.ts: validate all local: sources exist on disk
- .agent/identities/: COORDINATOR, SYSADMIN, DB_ADMIN, GIT_ADMIN stubs
- .agent/context/FREEBSD.md: FreeBSD gotchas context for agents

Typecheck passes. `just skill-list` and `just skill-search` ready to wire up
in the justfile (Phase 4).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---
Build: pass | Tests: pass — Tests  942 passed (942)
2026-04-13 23:06:50 +00:00
afec91d1ac docs: fix 19 stale/broken issues across docs.clawdie.si and markdown sources (Sam & Claude)
- Fix stale version badge (v0.8.0 → v0.10.0) in docs/index.html
- Replace broken src/jail-runner.ts references with src/agent-runner.ts
- Replace broken <script src=/js/shared.js> in iso.html with inline JS
- Fix broken Codeberg links in nginx-ssl.html and nanoclaw-upstream.html
- Add missing changelog entries (v0.7.2, v1.0.2, v1.0.3, v0.10.0)
- Unify sidebar navigation across all 12 HTML pages
- Fix PI-SKILLS-INTEGRATION.md three-database contradiction
- Fix controlplane-install.md hardcoded clawdie0 reference
- Add missing OPS_DB_PASSWORD to CLEAN-RESET-PI-TUI preseed script
- Reposition Strapi as optional in deployment-models.md
- Update gen-changelog.ts to output docs.clawdie-si layout with sidebar

---
Build: pass | Tests: not run (Linux)
2026-04-12 13:36:33 +02:00
80d170a123 feat(harness): replace Codex controlplane runner with Aider+Pi (Sam & Claude)
The controlplane harness now uses Aider as the multi-agent orchestrator
instead of Codex. Aider launches via --message for non-interactive task
execution, with the same tmux glass-pane and log streaming as before.

Runtime changes:
- src/config.ts: ControlplaneRunner 'codex' -> 'aider', env vars renamed
- src/controlplane-codex-runner.ts -> controlplane-aider-runner.ts:
  CodexRunOptions -> AiderRunOptions, runCodexTask -> runAiderTask,
  adjusted CLI args for Aider (--message, --model)
- src/controlplane-heartbeat.ts: updated imports and branch logic
- setup/agent-cli-check.ts: pi and aider first, others optional
- setup/install.ts: added printAiderTip()
- scripts/glass.sh: right pane now launches aider instead of shell

Config:
- .env.example: CONTROLPLANE_CODEX_* -> CONTROLPLANE_AIDER_*

Docs:
- AGENTS.md: updated agent identity and CLI prerequisite
- README.md: Aider+Pi as primary driver
- doc/CONTROLPLANE-MESSAGE-CONTRACT.md: runner modes updated

Codex, Claude Code, and other CLIs remain installed and available
as optional tools. Only the harness default changed.

---
Build: pass | Tests: not run (Linux)
2026-04-12 08:26:20 +02:00
eb4ceac068 Align npm global paths and Aider install docs
---
Build: pass | Tests: pass — Tests  874 passed (874)
2026-04-12 06:19:36 +00:00