Operator & Claude Code 398bdd5f5f Prune stale docs/internal handoffs, reviews, and superseded plans

Every file under docs/internal/ ends up in the bootstrap/skills-memory
artifact (per metadata.json: "Full project docs, internal docs, identity
files, and skill definitions"). Stale handoffs, dated build reports,
single-commit reviews, and superseded design notes were polluting the
embedding index with low-signal chunks.

Removed:
- TLS-CERT-LIFECYCLE-HANDOFF.md, GLASSPANE-FREEBSD-HANDOFF.md,
  CMS-ASTRO-SOURCE-OF-TRUTH-HANDOFF.md (handoffs whose work has landed)
- HOST-DB-READINESS-REVIEW.md, HOST-DB-REBOOT-REVIEW.md,
  HOST-DB-RECOVERY-PLAN.md, SYSTEM-NAMESPACE-BRANCH-REVIEW.md
  (commit/branch reviews self-marked as historical)
- BUILD-TEST-REPORT-06.APR.2026.md, test-results.md (dated snapshots)
- DEBUG_CHECKLIST.md (Feb 2026 known-issues list, top item already fixed)
- BOOTABLE-ISO-PLAN-V1.md (V1 plan; ISO-FIRST-BOOT-IMPLEMENTATION.md is now
  the source of truth)
- STRAPI-FREEBSD-SETUP.md, PI-SKILLS-INTEGRATION.md, CODEX-FREEBSD.md
  (workarounds and one-off design notes for resolved/superseded paths)
- REFACTOR-PLAN.md, nanoclaw-architecture-final.md, AGENT-HARNESS-V2.md,
  AGENT-SKILLS-VS-REALITY.md (older planning/architecture docs whose
  decisions are now in code or ARCHITECTURE.md)
- BUILTIN-KNOWLEDGE-SPEC.md, LOCAL-KNOWLEDGE-BOOTSTRAP.md (early specs
  superseded by SKILLS-ARTIFACT-V1-PLAN.md)
- HEARTBEAT.md (design doc; implementation lives in scripts/heartbeat.sh
  and src/controlplane-heartbeat.ts)
- POSTGRES-PERMISSIONS.md (one-off fix recipe)
- RUNTIME-MANIFEST-DESIGN.md (status: Implemented; design is in code now)

Updates to remaining files patch broken cross-links:
- ARCHITECTURE.md drops the two table rows pointing at deleted docs
- doc/THREE-BIRD-ARCHITECTURE.md drops Strapi-setup link references
- docs/internal/SKILLS-ARTIFACT-V1-PLAN.md drops the "Depends on" line
- docs/internal/SUDO_REPLACEMENT.md trims its list of internal docs that
  reference sudo
- .agent/skills/setup and .agent/skills/docs-deployment drop pointers to
  REFACTOR-PLAN and DEBUG_CHECKLIST

Net: 23 files deleted, 7566 lines removed. docs/internal/ goes from 41 to
18 markdown files. After the artifact's next refresh, retrieval should
surface proportionally fewer low-signal chunks.

---
Build: FAIL | Tests: FAIL — 16 failed

2026-05-10 13:34:27 +02:00

Architecture Overview

Last Updated: 16.apr.2026

Clawdie is a self-hosted AI assistant platform running on FreeBSD. It uses Bastille jails for service isolation, PostgreSQL for all data, and a multi-agent control plane for task orchestration.

High-Level Layout

FreeBSD Host (ZFS)
├── Agent Service (runs as AGENT_NAME user, port 3100)
│   ├── Telegram bot (message intake)
│   ├── HTTP REST API (control plane + health/metrics)
│   ├── Unified scheduler (task routing, heartbeats, budgets)
│   ├── Control plane runner (spawns pi/aider per task)
│   └── Watchdog (health checks, concurrency control)
│
├── hostd daemon (root, Unix socket)
│   └── Privileged ops: bastille, zfs, pf
│
├── PostgreSQL 18 (on host by default; db jail is opt-in via DB_RUNTIME=jail)
│   ├── {agent}_ops      — tasks, agents, activity, budgets, approvals
│   ├── {agent}_skills   — built-in knowledge (read-only artifact)
│   └── {agent}_memory   — user/agent dynamic memory, pgvector embeddings
│
└── Bastille Jails
    ├── db        (.3) — Data Service: PostgreSQL (only when DB_RUNTIME=jail; host is default)
    ├── cms       (.4) — Web Service: nginx + Astro static site
    ├── git       (.6) — Code Service: bare repos + Forgejo (optional)
    ├── llama-cpp (.5) — Local LLM inference (optional)
    ├── worker    (.101) — General worker jail (legacy)
    ├── db-worker (.211) — DB Admin agent jail (Phase 7)
    ├── git-worker (.212) — Git Admin agent jail (Phase 7)
    └── ctrl-worker (.213) — Coordinator agent jail (Phase 7)
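
The parenthesised suffixes in the layout above read as the last octet of each jail's address. A minimal sketch of that mapping, assuming the suffix is appended to a three-octet base such as the `WARDEN_SUBNET_BASE` default of `10.0.0`; `JAIL_SUFFIX` and `jailAddress` are hypothetical names, and `infra/jails.yaml` remains the real source of truth for jail IPs:

```typescript
// Hypothetical lookup table: suffix values are copied from the layout above.
const JAIL_SUFFIX: Record<string, number> = {
  db: 3,
  cms: 4,
  "llama-cpp": 5,
  git: 6,
  worker: 101,
  "db-worker": 211,
  "git-worker": 212,
  "ctrl-worker": 213,
};

// Assumption: a jail address is "<subnet base>.<suffix>".
function jailAddress(jail: string, subnetBase: string = "10.0.0"): string {
  const suffix = JAIL_SUFFIX[jail];
  if (suffix === undefined) {
    throw new Error(`unknown jail: ${jail}`);
  }
  return `${subnetBase}.${suffix}`;
}
```

Under these assumptions, `jailAddress("cms")` yields `10.0.0.4`, and passing a different base moves every jail to that subnet at once.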

Agent System

One agent runs per installation. The agent has a name (AGENT_NAME, default: clawdie) and runs as a FreeBSD service under that user.

Roles

Role	Budget	Heartbeat	Purpose
Orchestrator	80%	On-demand	Primary decision-maker, responds to Telegram
Sysadmin	10%	Daily	System health checks, ZFS, PF, jails
DB Admin	5%	On-demand	PostgreSQL maintenance, migrations
Git Admin	5%	On-demand	Repository management, backups

Each role has an identity file in .agent/identities/ that gets injected when the agent spawns for that role.
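
As an illustration of the Budget column, the sketch below splits one overall allowance across the four roles. It assumes the percentages are shares of a single total; the source does not say which unit the budget is measured in, and `allocateBudget` is a hypothetical helper rather than part of the codebase:

```typescript
// Percentages copied from the Roles table.
const BUDGET_SHARE_PERCENT = {
  Orchestrator: 80,
  Sysadmin: 10,
  "DB Admin": 5,
  "Git Admin": 5,
} as const;

type Role = keyof typeof BUDGET_SHARE_PERCENT;

// Assumption: each role receives its percentage of one shared total.
function allocateBudget(total: number): Record<Role, number> {
  const allocation = {} as Record<Role, number>;
  for (const role of Object.keys(BUDGET_SHARE_PERCENT) as Role[]) {
    allocation[role] = (total * BUDGET_SHARE_PERCENT[role]) / 100;
  }
  return allocation;
}
```

With a total of 1000 units, the Orchestrator would receive 800 and the two admin roles 50 each.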

Task Flow

Telegram message / API request
  → Control plane queues task
  → Scheduler assigns to specialist role
  → Runner spawns pi/aider with role identity + budget + `--no-skills`
  → Agent gets: identity file + skill index + (on FreeBSD) pi extension tools
  → Output captured, activity logged
  → Response routed back to channel

Prompt Assembly

Context	Source	Frequency	Path
Identity	`.agent/identities/{ROLE}.md` + SOUL/USER/IDENTITY files	Per-run	Both (controlplane + telegram)
Runtime manifest	`src/runtime-manifest.ts` (repo/skills/capabilities)	Fresh per-message	Injected into main prompt
Skill index	`agent/library.yaml` → one-line summaries	Per-run	Controlplane (pi)
Profile rules	`src/pi-profile.ts`	Per-run	Telegram only
System state	`src/system-state.ts` (live hostd/ZFS/PF)	Per-run	Telegram only
Pi extension tools	`.pi/extensions/clawdie-harness/`	Per-run	Telegram only (needs loading for controlplane)

Runtime manifest (<runtime-manifest> block):

- Generated fresh from local sources: .git config, agent/library.yaml, built-in artifact metadata
- Answers: "What repo am I running from? What branch? What skills exist? What specialists can I coordinate?"
- Injected as a compact XML-like block (~50 tokens); this closes the coherence gap where agent infrastructure facts were invisible to the model
- See src/runtime-manifest.ts for the implementation
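
To make the shape of such a block concrete, here is a minimal sketch of a renderer. Only the `<runtime-manifest>` tag name comes from this document; the `RuntimeManifest` fields, the line format inside the block, and `renderManifest` itself are assumptions for illustration, and the real logic lives in src/runtime-manifest.ts:

```typescript
// Hypothetical manifest shape, mirroring the four questions the block answers.
interface RuntimeManifest {
  repo: string;
  branch: string;
  skills: string[];
  specialists: string[];
}

// Renders a compact XML-like block; the inner "key: value" layout is assumed.
function renderManifest(m: RuntimeManifest): string {
  return [
    "<runtime-manifest>",
    `repo: ${m.repo}`,
    `branch: ${m.branch}`,
    `skills: ${m.skills.join(", ")}`,
    `specialists: ${m.specialists.join(", ")}`,
    "</runtime-manifest>",
  ].join("\n");
}
```

Keeping each fact on one short line is what would let a block like this stay within a few dozen tokens.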

Skills are injected as a compact index (~200 tokens) instead of full content (~15,000+ tokens). The full SKILL.md for any skill is available on demand through the skills_search extension tool.
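
A minimal sketch of how a one-line-per-skill index could be built. The `SkillEntry` shape is an assumption (the schema of agent/library.yaml is not shown here), and `buildSkillIndex` is a hypothetical name:

```typescript
// Hypothetical entry shape; the real library.yaml schema may differ.
interface SkillEntry {
  name: string;
  summary: string;
}

// Collapses each summary to a single line so the index stays compact.
function buildSkillIndex(skills: SkillEntry[]): string {
  return skills
    .map((s) => `- ${s.name}: ${s.summary.replace(/\s+/g, " ").trim()}`)
    .join("\n");
}
```

The index only names each skill and says what it is for; the full text is fetched separately when the agent actually needs it.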

Jail Isolation (Phase 7)

When CONTROLPLANE_JAIL_ISOLATION=YES, specialist agents run inside dedicated thin jails. Each jail gets scoped secrets (DB creds for db-worker, SSH keys for git-worker) and restricted network access via PF. The feature flag defaults to NO.

Jail agents reach hostd through the controlplane API (POST /api/controlplane/hostd), not through a direct Unix socket. The API authenticates the request and proxies it to the hostd daemon, so jails need no socket mount, only network access to CONTROLPLANE_HOST_IP:CONTROLPLANE_API_PORT.
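
The proxy call could be assembled roughly as follows. Only the path and the POST method come from this document; the `http` scheme, the `Authorization: Bearer` header, and the `{ op, args }` body are assumptions, and `buildHostdRequest` is a hypothetical helper:

```typescript
interface HostdProxyRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds (but does not send) a request for the controlplane hostd proxy.
function buildHostdRequest(
  hostIp: string,
  apiPort: number,
  sharedSecret: string,
  op: string,
  args: string[],
): HostdProxyRequest {
  return {
    url: `http://${hostIp}:${apiPort}/api/controlplane/hostd`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Assumed auth scheme; the real header name is not documented here.
      Authorization: `Bearer ${sharedSecret}`,
    },
    // Assumed payload shape: an operation name plus its arguments.
    body: JSON.stringify({ op, args }),
  };
}
```

A jail agent could then hand the result to `fetch(req.url, req)`; nothing in that call needs access to the Unix socket.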

Split-Brain Database

All three databases run on the same PostgreSQL 18 instance, each with its own user and permissions:

Database	Contents	Write Pattern
`{agent}_ops`	Tasks, agents, activity log, budgets, approvals, auth	Frequent writes
`{agent}_skills`	Preloaded knowledge chunks with pgvector embeddings	Read-only after import
`{agent}_memory`	User facts, agent memories, semantic search via pgvector	Moderate writes

Multiple agents on the same host share the PostgreSQL instance but get their own set of 3 databases (e.g., clawdie_ops + atlas_ops).
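
The naming scheme can be sketched in a few lines; `databaseNames` is a hypothetical helper, but the `{agent}_ops`, `{agent}_skills`, and `{agent}_memory` pattern is the one from the table above:

```typescript
interface AgentDatabases {
  ops: string;
  skills: string;
  memory: string;
}

// Derives the three per-agent database names from the agent name.
function databaseNames(agentName: string): AgentDatabases {
  return {
    ops: `${agentName}_ops`,
    skills: `${agentName}_skills`,
    memory: `${agentName}_memory`,
  };
}
```

Two agents named `clawdie` and `atlas` therefore never collide: each gets its own `_ops`, `_skills`, and `_memory` database on the shared instance.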

Configuration

All runtime config comes from .env in the project root. Key variables:

Variable	Purpose	Default
`AGENT_NAME`	Agent identity	`clawdie`
`DB_RUNTIME`	PostgreSQL location	`host`
`CONTROLPLANE_JAIL_ISOLATION`	Enable per-specialist jails	`NO`
`WARDEN_SUBNET_BASE`	Jail IP subnet	`10.0.0`
`CONTROLPLANE_PORT`	API port	`3100`
`CONTROLPLANE_SHARED_SECRET`	API auth for agent subprocesses	(empty)
`CONTROLPLANE_BIND_HOST`	API listen address	`0.0.0.0`
`AGENT_MAX_INBOUND_CHARS`	Inbound message cap	`12000`
`AGENT_SESSION_MAX_BYTES`	Session rollover threshold	`2000000`
`PI_TUI_PROVIDER`	LLM provider	(required)

Secrets (DB passwords, API keys) are generated by setup/secrets.ts and stored in .env.
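
A minimal sketch of resolving these variables with the defaults from the table. The variable names and defaults come from this document; the `Config` shape, the validation rules, and `loadConfig` are assumptions, and the environment is passed in as a plain object so that parsing of .env stays out of scope:

```typescript
type Env = Record<string, string | undefined>;

interface Config {
  agentName: string;
  dbRuntime: "host" | "jail";
  jailIsolation: boolean;
  wardenSubnetBase: string;
  controlplanePort: number;
  controlplaneSharedSecret: string;
  controlplaneBindHost: string;
  maxInboundChars: number;
  sessionMaxBytes: number;
  piTuiProvider: string;
}

// Reads an integer variable, falling back to the documented default.
function intFrom(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value)) {
    throw new Error(`${key} must be an integer, got "${raw}"`);
  }
  return value;
}

function loadConfig(env: Env): Config {
  const provider = env["PI_TUI_PROVIDER"];
  if (!provider) throw new Error("PI_TUI_PROVIDER is required");

  const rawRuntime = env["DB_RUNTIME"] ?? "host";
  let dbRuntime: "host" | "jail";
  if (rawRuntime === "host" || rawRuntime === "jail") {
    dbRuntime = rawRuntime;
  } else {
    throw new Error(`DB_RUNTIME must be "host" or "jail", got "${rawRuntime}"`);
  }

  return {
    agentName: env["AGENT_NAME"] ?? "clawdie",
    dbRuntime,
    jailIsolation: (env["CONTROLPLANE_JAIL_ISOLATION"] ?? "NO") === "YES",
    wardenSubnetBase: env["WARDEN_SUBNET_BASE"] ?? "10.0.0",
    controlplanePort: intFrom(env, "CONTROLPLANE_PORT", 3100),
    controlplaneSharedSecret: env["CONTROLPLANE_SHARED_SECRET"] ?? "",
    controlplaneBindHost: env["CONTROLPLANE_BIND_HOST"] ?? "0.0.0.0",
    maxInboundChars: intFrom(env, "AGENT_MAX_INBOUND_CHARS", 12000),
    sessionMaxBytes: intFrom(env, "AGENT_SESSION_MAX_BYTES", 2000000),
    piTuiProvider: provider,
  };
}
```

Failing fast on a missing `PI_TUI_PROVIDER` reflects its "(required)" entry in the table; every other variable falls back to its listed default.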

Infrastructure as Code

- infra/jails.yaml: single source of truth for all jail definitions (IPs, packages, services, mounts)
- setup/bastille-helpers.ts: shared provisioner (create, start, install packages, configure services)
- setup/install.ts: 20-step install orchestrator with ZFS checkpoints
- justfile: CLI front door with 60+ recipes for common operations

Channels

Messages arrive via Telegram (grammy bot) or HTTP API. The router dispatches to the control plane, which queues tasks and assigns them to specialist agents.

Documentation Map

Topic	File
Agent development conventions	`AGENTS.md`
Contributing guide	`CONTRIBUTING.md`
Control plane architecture	`doc/CONTROLPLANE-ARCHITECTURE.md`
Agent roles and skills	`doc/CONTROLPLANE-AGENT-ROLES.md`
API message contracts	`doc/CONTROLPLANE-MESSAGE-CONTRACT.md`
Multi-LLM provider routing	`doc/MULTI-PROVIDER-ARCHITECTURE.md`
Docs localization pipeline	`doc/THREE-BIRD-ARCHITECTURE.md`
Install guide	`docs/public/install/install.md`
Deployment models	`docs/public/architecture/deployment-models.md`
Disaster recovery	`docs/public/operate/db-disaster-recovery.md`
