clawdie-ai/docs/internal/POSTGRES-MEMORY.md
Sam & Claude faf060e0ce docs: introduce Layered Memory Fabric terminology (Sam & Codex)
Replaces public split-brain wording with Layered Memory Fabric, documents the skills/brain/ops planes, and sketches the shared FreeBSD/Linux install contract around PostgreSQL, ZFS/OpenZFS, and platform isolation adapters.\n\nChecks: npx --yes prettier@3 --check touched docs/html; git diff --check

---
Build: pass | Tests: FAIL — 1 failed
2026-06-13 21:32:50 +02:00

6.4 KiB

PostgreSQL Memory Plan

This document defines the PostgreSQL-backed memory and platform database plan for Clawdie.

Decision

Default: host PostgreSQL with DB_RUNTIME=host. Optional: dedicated FreeBSD jail for the Data Service with DB_RUNTIME=jail.

Both paths run:

  • PostgreSQL 18
  • pgvector
  • pgcrypto
  • uuid-ossp

The database is mandatory. Clawdie does not run without a healthy PostgreSQL backend for memory and operational state.

Current runtime defaults

Host runtime (DB_RUNTIME=host) is the current default:

  • jails reach PostgreSQL on ${SUBNET_BASE}.1
  • DB_HOST resolves to the host address unless explicitly overridden
  • ZFS datasets:
    • zroot/${ZFS_PREFIX}/pgdata/var/db/postgres/data
    • zroot/${ZFS_PREFIX}/pgwal/var/db/postgres/wal

Optional jail runtime (DB_RUNTIME=jail) uses the current jail registry:

  • role label: db
  • on-disk jail name: current service-prefixed db jail (for example clawdie-db)
  • hostname: db.${AGENT_INTERNAL_DOMAIN}
  • current repo registry default:
    • subnet base: 10.0.1
    • gateway: 10.0.1.1
    • db IP suffix: .5
    • bridge: warden0

Do not hardcode old .3 or 10.0.0.x examples. Resolve the live jail address from infra/jails.yaml plus env overrides.

Why host PostgreSQL is the default

  • lower operational complexity than a separate db jail on every install
  • simpler upgrade and restart path
  • fewer moving parts during onboarding and recovery
  • still compatible with ZFS-backed storage and PostgreSQL extensions

The optional jail path still exists for installs that explicitly want DB isolation or a service-jail layout.

Thick vs thin

If the optional db jail is used, keep it thick.

Reason:

  • database state is persistent, not ephemeral
  • rollback and snapshots should remain self-contained
  • database upgrades should not depend on a thin base being in lockstep

Initial scope

First installation milestone:

  1. provision host PostgreSQL or the optional db jail
  2. install PostgreSQL 18
  3. initialize and start the service
  4. install postgresql18-contrib
  5. enable pgcrypto, uuid-ossp, and vector
  6. validate local access
  7. snapshot before schema work

If the optional db jail path is used, also enable:

sudo bastille config db set allow.sysvipc 1
sudo bastille restart db

Without that, service postgresql initdb can fail with shared-memory errors.

Deployment path

Current canonical automation lives in:

  • setup/db.ts
  • setup/jail-provision.ts
  • src/jail-schema.ts
  • infra/jails.yaml

The default path is host PostgreSQL.

If DB_RUNTIME=jail is chosen, the current repo-default jail create shape is:

  • network: vnet
  • IP: 10.0.1.5
  • bridge: warden0
  • gateway: 10.0.1.1

Canonical example for the current registry default:

sudo bastille create -T -B -g 10.0.1.1 clawdie-db 15.0-RELEASE 10.0.1.5/24 warden0

Treat missing default route or bad addressing as a provisioning defect. Fix the registry/create path rather than layering manual one-off network fixes.

Restore path

The deployment design should support restore from the start:

  1. provision host PostgreSQL or create db
  2. install PostgreSQL 18
  3. initialize cluster
  4. enable extensions
  5. optionally restore from .sql or PostgreSQL custom dump
  6. validate
  7. snapshot

Baseline resources

  • minimal: 1G RAM / 10G / 1 vCPU
  • balanced: 2G RAM / 15G / 1 vCPU

Snapshot points

  • @fresh
  • @postgres18-ready
  • @pre-schema
  • @post-extensions

ZFS note

Two different decisions matter here:

  1. ashift
  2. dataset properties

ashift is a pool/vdev decision and cannot be changed later. For modern 4 KiB devices, the expected value is usually 12.

Conservative starting settings for PostgreSQL data:

  • compression=lz4
  • atime=off
  • recordsize=16K

Database families and naming model

PostgreSQL hosts both shared platform DBs and tenant-derived DBs.

Shared platform DBs use the system_* prefix:

  • system_brain
  • system_skills
  • system_ops
  • system_git
  • system_web

Shared roles include:

  • system_brain
  • system_reader
  • system_ops
  • system_git
  • system_web

Tenant DB names are derived from dbSlug(tenantId) in src/db-identifiers.ts. Examples:

  • <slug>_brain
  • <slug>_skills
  • <slug>_ops
  • <slug>_forgejo

Do not derive DB names from ASSISTANT_NAME.

Layered Memory Fabric responsibilities

Three core data planes drive Clawdie's runtime behavior. This is not a "split-brain" design: PostgreSQL remains the common data fabric, while each plane has a clear lifecycle and ownership boundary.

Plane Default shared DB Purpose
Skills system_skills Reviewed read-only skills, install docs, operator workflows
Brain system_brain Dynamic conversation memory, preferences, compaction summaries
Ops system_ops Messages, tasks, sessions, routing, registered groups

Embeddings and full-text indexes accelerate recall, but PostgreSQL rows and committed skill artifacts remain the source of truth. On FreeBSD the fabric is usually protected with ZFS datasets and jail-aware service boundaries. On Linux, the same logical database contract can run against host PostgreSQL or a native service/container path, with OpenZFS preferred when available.

Additional platform service DBs also live in PostgreSQL when enabled:

  • system_git
  • system_web

Memory schema

The memory schema consists of three layers:

  1. memories — base table (session summaries, metadata)
  2. memory_chunks — chunked text with full-text search
  3. memory_embeddings — vector embeddings per chunk

See:

Next snapshot policy step

Once PostgreSQL is stable, extend snapshot policy to other persistent services with lighter retention than the database:

  • database: strongest retention as critical_data
  • git service: moderate retention as persistent_service
  • web service: moderate retention as persistent_service

Validation note

The PostgreSQL 18 + pgvector path has been validated in both host-runtime and service-jail-oriented development flows. Current repo defaults, however, are host-first. Any doc or skill that treats the db jail as the default is stale.