# Architecture Overview

**Last Updated:** 16.apr.2026

Clawdie is a self-hosted AI assistant platform running on FreeBSD. It uses Bastille jails for service isolation, PostgreSQL for all data, and a multi-agent control plane for task orchestration.

## High-Level Layout

```
FreeBSD Host (ZFS)
├── Agent Service (runs as AGENT_NAME user, port 3100)
│   ├── Telegram bot (message intake)
│   ├── HTTP REST API (control plane + health/metrics)
│   ├── Unified scheduler (task routing, heartbeats, budgets)
│   ├── Control plane runner (spawns pi/aider per task)
│   └── Watchdog (health checks, concurrency control)
│
├── hostd daemon (root, Unix socket)
│   └── Privileged ops: bastille, zfs, pf
│
├── PostgreSQL 18 (on host by default; db jail is opt-in via DB_RUNTIME=jail)
│   ├── {agent}_ops      — tasks, agents, activity, budgets, approvals
│   ├── {agent}_skills   — built-in knowledge (read-only artifact)
│   └── {agent}_memory   — user/agent dynamic memory, pgvector embeddings
│
└── Bastille Jails
    ├── db        (.3) — Data Service: PostgreSQL (only when DB_RUNTIME=jail; host is default)
    ├── cms       (.4) — Web Service: nginx + Astro static site
    ├── git       (.6) — Code Service: bare repos + Forgejo (optional)
    ├── llama-cpp (.5) — Local LLM inference (optional)
    ├── worker    (.101) — General worker jail (legacy)
    ├── db-worker (.211) — DB Admin agent jail (Phase 7)
    ├── git-worker (.212) — Git Admin agent jail (Phase 7)
    └── ctrl-worker (.213) — Coordinator agent jail (Phase 7)
```
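The `(.N)` suffixes in the layout read as host octets within the jail subnet. A minimal sketch, assuming each address is formed by appending the octet to `WARDEN_SUBNET_BASE` (default `10.0.0`); the `jailIp` helper and the `JAIL_OCTETS` map are hypothetical and only mirror the tree above, since the authoritative definitions live in `infra/jails.yaml`:

```typescript
// Host octets copied from the layout above. This map is illustrative only;
// the real jail definitions are kept in infra/jails.yaml.
const JAIL_OCTETS: Record<string, number> = {
  db: 3,
  cms: 4,
  "llama-cpp": 5,
  git: 6,
  worker: 101,
  "db-worker": 211,
  "git-worker": 212,
  "ctrl-worker": 213,
};

// Hypothetical helper: joins the subnet base and a jail's host octet.
function jailIp(jail: string, subnetBase: string = "10.0.0"): string {
  const octet = JAIL_OCTETS[jail];
  if (octet === undefined) {
    throw new Error(`unknown jail: ${jail}`);
  }
  return `${subnetBase}.${octet}`;
}

console.log(jailIp("db")); // 10.0.0.3
console.log(jailIp("ctrl-worker", "192.168.50")); // 192.168.50.213
```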

## Agent System

One agent runs per installation. The agent has a name (`AGENT_NAME`, default: `clawdie`) and runs as a FreeBSD service under that user.

### Roles

| Role         | Budget | Heartbeat | Purpose                                      |
| ------------ | ------ | --------- | -------------------------------------------- |
| Orchestrator | 80%    | On-demand | Primary decision-maker, responds to Telegram |
| Sysadmin     | 10%    | Daily     | System health checks, ZFS, PF, jails         |
| DB Admin     | 5%     | On-demand | PostgreSQL maintenance, migrations           |
| Git Admin    | 5%     | On-demand | Repository management, backups               |

Each role has an identity file in `.agent/identities/` that gets injected when the agent spawns for that role.
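To make the budget column concrete, the sketch below splits a total budget across the four roles using the percentages from the table. The unit of the budget (tokens, currency, or something else) is not stated here, so the `splitBudget` helper and the role keys are assumptions for illustration only:

```typescript
type Role = "orchestrator" | "sysadmin" | "db-admin" | "git-admin";

// Percentages from the roles table; they sum to 100.
const BUDGET_SHARE: Record<Role, number> = {
  orchestrator: 80,
  sysadmin: 10,
  "db-admin": 5,
  "git-admin": 5,
};

// Hypothetical helper: splits a total budget, in whatever unit the
// deployment tracks, across roles. Fractions are rounded down.
function splitBudget(total: number): Record<Role, number> {
  const result = {} as Record<Role, number>;
  for (const role of Object.keys(BUDGET_SHARE) as Role[]) {
    result[role] = Math.floor((total * BUDGET_SHARE[role]) / 100);
  }
  return result;
}

const shares = splitBudget(1_000_000);
console.log(shares.orchestrator); // 800000
console.log(shares["db-admin"]); // 50000
```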

### Task Flow

```
Telegram message / API request
  → Control plane queues task
  → Scheduler assigns to specialist role
  → Runner spawns pi/aider with role identity + budget + --no-skills
  → Agent gets: identity file + skill index + (on FreeBSD) pi extension tools
  → Output captured, activity logged
  → Response routed back to channel
```
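The flow above can be read as a task record moving through a fixed sequence of stages. The following is a rough sketch of that idea, not the control plane's actual data model: the stage names, the `Task` shape, and the `advance` helper are all hypothetical.

```typescript
// Stage names are illustrative labels for the steps in the flow above.
const STAGES = ["queued", "assigned", "running", "logged", "responded"] as const;
type Stage = (typeof STAGES)[number];

interface Task {
  id: number;
  channel: "telegram" | "api";
  role?: string;
  stage: Stage;
}

// Moves a task to the next stage; the final stage is terminal.
function advance(task: Task): Task {
  const next = STAGES[STAGES.indexOf(task.stage) + 1];
  if (next === undefined) {
    throw new Error(`task ${task.id} is already in its final stage`);
  }
  return { ...task, stage: next };
}

let task: Task = { id: 1, channel: "telegram", stage: "queued" };
task = { ...advance(task), role: "sysadmin" }; // scheduler assigns a role
task = advance(task); // runner spawns the agent
console.log(task.stage); // running
```

Returning a new object from `advance` keeps each transition explicit, which makes it easier to log every step as activity.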

### Prompt Assembly

| Context            | Source                                                   | Frequency         | Path                                           |
| ------------------ | -------------------------------------------------------- | ----------------- | ---------------------------------------------- |
| Identity           | `.agent/identities/{ROLE}.md` + SOUL/USER/IDENTITY files | Per-run           | Both (controlplane + telegram)                 |
| Runtime manifest   | `src/runtime-manifest.ts` (repo/skills/capabilities)     | Fresh per-message | Injected into main prompt                      |
| Skill index        | `agent/library.yaml` → one-line summaries                | Per-run           | Controlplane (pi)                              |
| Profile rules      | `src/pi-profile.ts`                                      | Per-run           | Telegram only                                  |
| System state       | `src/system-state.ts` (live hostd/ZFS/PF)                | Per-run           | Telegram only                                  |
| Pi extension tools | `.pi/extensions/clawdie-harness/`                        | Per-run           | Telegram only (needs loading for controlplane) |

**Runtime manifest** (`<runtime-manifest>` block):
- Generated fresh from local sources: `.git` config, `agent/library.yaml`, built-in artifact metadata
- Answers: "What repo am I running from? What branch? What skills exist? What specialists can I coordinate?"
- Injected as a compact XML-like block (~50 tokens). It closes the coherence gap where agent infrastructure facts were invisible to the model
- See `src/runtime-manifest.ts` for implementation
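As an illustration of what such a block might look like, here is a minimal sketch that renders a `<runtime-manifest>` block from a plain object. The field names and inner tag names are hypothetical; the real shape is whatever `src/runtime-manifest.ts` emits.

```typescript
// Field names are hypothetical; the real shape is defined in
// src/runtime-manifest.ts.
interface RuntimeManifest {
  repo: string;
  branch: string;
  skills: string[];
  specialists: string[];
}

// Renders a compact XML-like block. Values are not escaped, so this is
// only suitable for trusted, locally generated strings.
function renderManifest(m: RuntimeManifest): string {
  return [
    "<runtime-manifest>",
    `  <repo>${m.repo}</repo>`,
    `  <branch>${m.branch}</branch>`,
    `  <skills>${m.skills.join(", ")}</skills>`,
    `  <specialists>${m.specialists.join(", ")}</specialists>`,
    "</runtime-manifest>",
  ].join("\n");
}

console.log(
  renderManifest({
    repo: "clawdie-ai",
    branch: "main",
    skills: ["example-skill"],
    specialists: ["sysadmin", "db-admin", "git-admin"],
  }),
);
```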

Skills are injected as a compact index (~200 tokens) instead of full content (~15,000+ tokens). The full `SKILL.md` for a skill is available on demand through the `skills_search` extension tool.
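The index idea can be sketched as one line per skill, keeping only the first line of each summary. The `SkillEntry` shape, the skill names, and the `buildSkillIndex` helper below are hypothetical; the real index is derived from `agent/library.yaml`.

```typescript
// Hypothetical entry shape; the real index is derived from agent/library.yaml.
interface SkillEntry {
  name: string;
  summary: string;
}

// Emits one "- name: summary" line per skill, keeping only the first
// line of each summary so the index stays compact.
function buildSkillIndex(skills: SkillEntry[]): string {
  return skills
    .map((s) => `- ${s.name}: ${(s.summary.split("\n")[0] ?? "").trim()}`)
    .join("\n");
}

const index = buildSkillIndex([
  { name: "zfs-snapshots", summary: "Create and roll back ZFS snapshots.\nLonger details follow." },
  { name: "pf-rules", summary: "  Edit and reload PF rules. " },
]);
console.log(index);
// - zfs-snapshots: Create and roll back ZFS snapshots.
// - pf-rules: Edit and reload PF rules.
```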

### Jail Isolation (Phase 7)

When `CONTROLPLANE_JAIL_ISOLATION=YES`, specialist agents run inside dedicated thin jails. Each jail gets scoped secrets (DB credentials for db-worker, SSH keys for git-worker) and restricted network access via PF. The feature flag defaults to `NO`.

Jail agents reach hostd **through the controlplane API** (`POST /api/controlplane/hostd`), not via direct Unix socket. The API authenticates the request and proxies to the hostd daemon. This means no socket mount is needed inside jails — only network access to `CONTROLPLANE_HOST_IP:CONTROLPLANE_API_PORT`.
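A sketch of how a jailed agent might assemble such a request follows. Only the `POST /api/controlplane/hostd` route comes from the text above; the `http` scheme, the `authorization` header format, the payload fields, and the `buildHostdRequest` helper are assumptions made for illustration.

```typescript
interface HostdRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Assumptions: the scheme, header name, and JSON payload shape are
// hypothetical. Only the route itself is taken from the documentation.
function buildHostdRequest(
  hostIp: string,
  apiPort: number,
  secret: string,
  payload: Record<string, unknown>,
): HostdRequest {
  return {
    url: `http://${hostIp}:${apiPort}/api/controlplane/hostd`,
    method: "POST",
    headers: {
      "content-type": "application/json",
      authorization: `Bearer ${secret}`,
    },
    body: JSON.stringify(payload),
  };
}

// Placeholder address and payload, for illustration only.
const req = buildHostdRequest("10.0.0.1", 3100, "example-secret", {
  op: "example-op",
});
console.log(req.url); // http://10.0.0.1:3100/api/controlplane/hostd
```

The returned object has the same field names that `fetch` accepts in its options argument, so it could be sent with `fetch(req.url, req)` from a runtime that provides `fetch`.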

## Split-Brain Database

All three databases run on the same PostgreSQL 18 instance, each with its own user and permissions:

| Database         | Contents                                                 | Write Pattern          |
| ---------------- | -------------------------------------------------------- | ---------------------- |
| `{agent}_ops`    | Tasks, agents, activity log, budgets, approvals, auth    | Frequent writes        |
| `{agent}_skills` | Preloaded knowledge chunks with pgvector embeddings      | Read-only after import |
| `{agent}_memory` | User facts, agent memories, semantic search via pgvector | Moderate writes        |

Multiple agents on the same host share the PostgreSQL instance, but each gets its own set of three databases (e.g., `clawdie_ops` and `mevy_ops`).
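The naming scheme is simple enough to express directly. A minimal sketch, where the `databaseNames` helper is hypothetical and only the `{agent}_ops`, `{agent}_skills`, and `{agent}_memory` pattern comes from the table above:

```typescript
type DbKind = "ops" | "skills" | "memory";

// Hypothetical helper: derives the three per-agent database names
// from the agent name, following the {agent}_<kind> pattern.
function databaseNames(agentName: string): Record<DbKind, string> {
  return {
    ops: `${agentName}_ops`,
    skills: `${agentName}_skills`,
    memory: `${agentName}_memory`,
  };
}

console.log(databaseNames("clawdie").ops); // clawdie_ops
console.log(databaseNames("mevy").memory); // mevy_memory
```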

## Configuration

All runtime config comes from `.env` in the project root. Key variables:

| Variable                      | Purpose                         | Default    |
| ----------------------------- | ------------------------------- | ---------- |
| `AGENT_NAME`                  | Agent identity                  | `clawdie`  |
| `DB_RUNTIME`                  | PostgreSQL location             | `host`     |
| `CONTROLPLANE_JAIL_ISOLATION` | Enable per-specialist jails     | `NO`       |
| `WARDEN_SUBNET_BASE`          | Jail IP subnet                  | `10.0.0`   |
| `CONTROLPLANE_PORT`           | API port                        | `3100`     |
| `CONTROLPLANE_SHARED_SECRET`  | API auth for agent subprocesses | (empty)    |
| `CONTROLPLANE_BIND_HOST`      | API listen address              | `0.0.0.0`  |
| `AGENT_MAX_INBOUND_CHARS`     | Inbound message cap             | `12000`    |
| `AGENT_SESSION_MAX_BYTES`     | Session rollover threshold      | `2000000`  |
| `PI_TUI_PROVIDER`             | LLM provider                    | (required) |

Secrets (DB passwords, API keys) are generated by `setup/secrets.ts` and stored in `.env`.
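The defaults in the table can be applied with a small loader like the one below. This is a sketch under stated assumptions, not the project's actual config code: the `Config` shape and the `loadConfig` and `parseDbRuntime` helpers are hypothetical, and the environment is passed in as a plain record so the example does not depend on how `.env` is parsed.

```typescript
interface Config {
  agentName: string;
  dbRuntime: "host" | "jail";
  jailIsolation: boolean;
  subnetBase: string;
  port: number;
  sharedSecret: string;
  bindHost: string;
  maxInboundChars: number;
  sessionMaxBytes: number;
  provider: string;
}

type Env = Record<string, string | undefined>;

function parseDbRuntime(value: string): "host" | "jail" {
  if (value === "host" || value === "jail") return value;
  throw new Error(`DB_RUNTIME must be "host" or "jail", got "${value}"`);
}

// Hypothetical loader: applies the documented defaults and rejects a
// missing PI_TUI_PROVIDER, the only variable marked as required.
function loadConfig(env: Env): Config {
  const provider = env["PI_TUI_PROVIDER"];
  if (!provider) {
    throw new Error("PI_TUI_PROVIDER is required");
  }
  return {
    agentName: env["AGENT_NAME"] ?? "clawdie",
    dbRuntime: parseDbRuntime(env["DB_RUNTIME"] ?? "host"),
    jailIsolation: (env["CONTROLPLANE_JAIL_ISOLATION"] ?? "NO") === "YES",
    subnetBase: env["WARDEN_SUBNET_BASE"] ?? "10.0.0",
    port: Number(env["CONTROLPLANE_PORT"] ?? "3100"),
    sharedSecret: env["CONTROLPLANE_SHARED_SECRET"] ?? "",
    bindHost: env["CONTROLPLANE_BIND_HOST"] ?? "0.0.0.0",
    maxInboundChars: Number(env["AGENT_MAX_INBOUND_CHARS"] ?? "12000"),
    sessionMaxBytes: Number(env["AGENT_SESSION_MAX_BYTES"] ?? "2000000"),
    provider,
  };
}

// "example-provider" is a placeholder value, not a real provider name.
const config = loadConfig({ PI_TUI_PROVIDER: "example-provider" });
console.log(config.port); // 3100
console.log(config.dbRuntime); // host
```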

## Infrastructure as Code

- `infra/jails.yaml` — Single source of truth for all jail definitions (IPs, packages, services, mounts)
- `setup/bastille-helpers.ts` — Shared provisioner (create, start, install packages, configure services)
- `setup/install.ts` — 20-step install orchestrator with ZFS checkpoints
- `justfile` — CLI front door with 60+ recipes for common operations

## Channels

Messages arrive via Telegram (grammY bot) or the HTTP API. The router dispatches to the control plane, which queues tasks and assigns them to specialist agents.

## Documentation Map

| Topic                         | File                                            |
| ----------------------------- | ----------------------------------------------- |
| Agent development conventions | `AGENTS.md`                                     |
| Contributing guide            | `CONTRIBUTING.md`                               |
| Control plane architecture    | `doc/CONTROLPLANE-ARCHITECTURE.md`              |
| Agent roles and skills        | `doc/CONTROLPLANE-AGENT-ROLES.md`               |
| API message contracts         | `doc/CONTROLPLANE-MESSAGE-CONTRACT.md`          |
| Multi-LLM provider routing    | `doc/MULTI-PROVIDER-ARCHITECTURE.md`            |
| Docs localization pipeline    | `doc/THREE-BIRD-ARCHITECTURE.md`                |
| Harness evolution plan        | `docs/internal/AGENT-HARNESS-V2.md`             |
| Skills architecture           | `docs/internal/nanoclaw-architecture-final.md`  |
| Install guide                 | `docs/public/install/install.md`                |
| Deployment models             | `docs/public/architecture/deployment-models.md` |
| Disaster recovery             | `docs/public/operate/db-disaster-recovery.md`   |