PostgreSQL Memory Plan

This document defines the PostgreSQL memory database architecture for Clawdie.

Decision

Default: dedicated FreeBSD jail named ${AGENT_NAME}-db. Optional: host-based PostgreSQL when DB_RUNTIME=host is set in .env.

Both paths run:

  • PostgreSQL 18
  • pgvector
  • pgcrypto and uuid-ossp

The database is mandatory. Clawdie will not start without a healthy connection to the memory database.
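
The startup requirement above can be approximated with a readiness probe. The following is a minimal sketch, assuming the PostgreSQL client tool `pg_isready` is installed wherever Clawdie starts. The function names and the default host and port are illustrative and are not taken from the Clawdie codebase.

```shell
#!/bin/sh
# Hypothetical startup gate. Function names and defaults are assumptions.

# Translate a pg_isready exit status into a short label.
pg_ready_label() {
  case "$1" in
    0) echo "accepting" ;;    # server is accepting connections
    1) echo "rejecting" ;;    # server is up but rejecting (e.g. still starting)
    2) echo "no-response" ;;  # nothing answered the connection attempt
    3) echo "no-attempt" ;;   # invalid parameters, no attempt was made
    *) echo "unknown" ;;
  esac
}

# Return 0 only when the database accepts connections.
require_db() {
  db_host=${1:-10.0.0.3}
  db_port=${2:-5432}
  pg_isready -q -h "$db_host" -p "$db_port"
  rc=$?
  if [ "$rc" -eq 0 ]; then
    return 0
  fi
  echo "memory database not healthy: $(pg_ready_label "$rc")" >&2
  return 1
}
```

A caller could run `require_db 10.0.0.3 5432 || exit 1` before starting any component that needs memory access.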

Canonical jail identity (DB_RUNTIME=jail):

  • jail: ${AGENT_NAME}-db (e.g. clawdie-db)
  • hostname: db.${AGENT_INTERNAL_DOMAIN} (e.g. db.clawdie.home.arpa)
  • provisioning: thick
  • networking: vnet
  • IP: ${SUBNET_BASE}.3 (e.g. 10.0.0.3)
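
The identity values above all derive from three variables. A small sketch of that derivation follows; the function name is hypothetical, while the variables and the resulting patterns are the ones listed above.

```shell
#!/bin/sh
# Derive the db jail identity from the documented variables.
# db_jail_identity is a hypothetical helper name.
db_jail_identity() {
  : "${AGENT_NAME:?AGENT_NAME is required}"
  : "${AGENT_INTERNAL_DOMAIN:?AGENT_INTERNAL_DOMAIN is required}"
  : "${SUBNET_BASE:?SUBNET_BASE is required}"
  printf 'jail=%s-db\n' "$AGENT_NAME"
  printf 'hostname=db.%s\n' "$AGENT_INTERNAL_DOMAIN"
  printf 'ip=%s.3\n' "$SUBNET_BASE"
}
```

With `AGENT_NAME=clawdie`, `AGENT_INTERNAL_DOMAIN=clawdie.home.arpa`, and `SUBNET_BASE=10.0.0`, this prints the example values given in the list.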

This is the preferred local memory backend. It avoids having to reproduce a full local Supabase stack up front.

Host runtime (DB_RUNTIME=host) uses ZFS datasets:

  • zroot/${ZFS_PREFIX}/pgdata, mounted at /var/db/postgres/data
  • zroot/${ZFS_PREFIX}/pgwal, mounted at /var/db/postgres/wal

Use DB_COMPRESSION=lz4 (default) or DB_COMPRESSION=zstd to tune dataset compression.
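
As a sketch of how the host-runtime datasets might be created, the helper below prints the `zfs create` commands without running them. It assumes each entry above names a dataset followed by its mountpoint, and it accepts only the two compression values named above. The function name is hypothetical.

```shell
#!/bin/sh
# Print (do not run) the zfs commands for the host-runtime datasets.
# Assumption: pgdata mounts at /var/db/postgres/data, pgwal at /var/db/postgres/wal.
host_dataset_commands() {
  prefix=${ZFS_PREFIX:?ZFS_PREFIX is required}
  comp=${DB_COMPRESSION:-lz4}
  case "$comp" in
    lz4|zstd) ;;
    *) echo "unsupported DB_COMPRESSION: $comp" >&2; return 1 ;;
  esac
  for pair in pgdata:/var/db/postgres/data pgwal:/var/db/postgres/wal; do
    name=${pair%%:*}     # text before the first colon
    mount=${pair#*:}     # text after the first colon
    echo "zfs create -p -o compression=$comp -o mountpoint=$mount zroot/$prefix/$name"
  done
}
```

Printing first keeps the step reviewable; the output can be run as root once it looks right.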

Why

  • native FreeBSD packages
  • good fit for ZFS-backed jails
  • one database can hold relational memory data and vectors
  • lower operational complexity than a Linux VM or multiple specialized services

Thick vs Thin

Use a thick jail for db.

Reason:

  • database is a persistent service, not an ephemeral worker
  • upgrades and rollback should be self-contained
  • host coupling should be minimized
  • ZFS snapshots are more useful when the jail shape is stable

Initial scope

First installation milestone:

  1. create or prepare the db jail
  2. install PostgreSQL 18
  3. enable allow.sysvipc for the jail
  4. initialize and start the service
  5. install postgresql18-contrib
  6. enable pgcrypto, uuid-ossp, and vector
  7. validate local access
  8. snapshot
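
Steps 6 and 7 reduce to a few SQL statements. The sketch below emits them from shell functions so they can be piped into `psql`; the function names are hypothetical. Note that `uuid-ossp` must be double-quoted because of the hyphen, and that extensions are enabled per database.

```shell
#!/bin/sh
# Emit the SQL for step 6 (enable extensions).
extensions_sql() {
  cat <<'SQL'
CREATE EXTENSION IF NOT EXISTS pgcrypto;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS vector;
SQL
}

# Emit a query for step 7 that lists the extensions actually installed.
validation_sql() {
  cat <<'SQL'
SELECT extname, extversion
FROM pg_extension
WHERE extname IN ('pgcrypto', 'uuid-ossp', 'vector')
ORDER BY extname;
SQL
}
```

One possible invocation, assuming a superuser named `postgres` and a target database that already exists: `extensions_sql | psql -U postgres -d clawdie_brain -v ON_ERROR_STOP=1`, followed by the same pipeline with `validation_sql`.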

Deployment path

Fixed defaults — no install-time questions. The db jail is auto-created by npm run setup -- --step db.

Current deployment path:

  • network: vnet
  • IP: ${SUBNET_BASE}.3 (e.g. 10.0.0.3)
  • bridge: warden0
  • gateway: ${SUBNET_BASE}.1 (e.g. 10.0.0.1)

Canonical create command for that path:

sudo bastille create -T -B -g 10.0.0.1 clawdie-db 15.0-RELEASE 10.0.0.3/24 warden0
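
The same command can be derived from the documented variables rather than hard-coded. This is a sketch only: the function name and the RELEASE argument are hypothetical, and the flags are copied from the command above without adding any others.

```shell
#!/bin/sh
# Print the create command for the db jail from AGENT_NAME and SUBNET_BASE.
# RELEASE is passed in by the caller; bridge name warden0 is the documented default.
db_create_command() {
  release=${1:?usage: db_create_command RELEASE}
  : "${AGENT_NAME:?AGENT_NAME is required}"
  : "${SUBNET_BASE:?SUBNET_BASE is required}"
  echo "sudo bastille create -T -B -g ${SUBNET_BASE}.1 ${AGENT_NAME}-db ${release} ${SUBNET_BASE}.3/24 warden0"
}
```

With `AGENT_NAME=clawdie` and `SUBNET_BASE=10.0.0`, calling `db_create_command 15.0-RELEASE` reproduces the canonical command.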

If a VNET db jail comes up without a default route, treat that as a provisioning defect:

  • the create path is missing the explicit -g 10.0.0.1 flag
  • fix the create command rather than adding the route manually and forgetting the root cause
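
To detect the missing-route defect early, the routing table inside the jail can be checked after creation. The sketch below reads `netstat -rn -f inet` output on standard input and succeeds only if the default route points at the expected gateway. The function name is hypothetical.

```shell
#!/bin/sh
# Succeed only if stdin contains a "default" route via the given gateway.
# Expects routing-table text in the column layout printed by netstat -rn.
has_default_route() {
  gateway=${1:?usage: has_default_route GATEWAY}
  awk -v gw="$gateway" '
    $1 == "default" && $2 == gw { found = 1 }
    END { exit !found }
  '
}
```

One way to feed it, assuming the jail is named clawdie-db: `sudo bastille cmd clawdie-db netstat -rn -f inet | has_default_route 10.0.0.1`. A non-zero exit status means the create path should be fixed.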

Another required jail-side prerequisite discovered during real bring-up:

sudo bastille config clawdie-db set allow.sysvipc 1
sudo bastille restart clawdie-db

Without that, service postgresql initdb can fail with shared-memory errors.

Restore path

The deployment design should support restore from the start.

Target future flow:

  1. create db
  2. install PostgreSQL 18
  3. enable allow.sysvipc
  4. initialize cluster
  5. enable extensions
  6. optionally restore from .sql or PostgreSQL custom dump
  7. validate
  8. snapshot
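
Step 6 needs a different tool depending on the dump type. A minimal sketch of that choice follows, relying on the fact that `pg_dump` custom-format archives begin with the bytes `PGDMP`. Anything else is treated as a plain SQL script, so tar and directory format dumps are not handled here. The function name is hypothetical.

```shell
#!/bin/sh
# Print the restore command that matches the dump file's format.
restore_command() {
  dump=${1:?usage: restore_command FILE DBNAME}
  dbname=${2:?usage: restore_command FILE DBNAME}
  # Custom-format archives start with the magic string "PGDMP".
  magic=$(head -c 5 "$dump" 2>/dev/null)
  if [ "$magic" = "PGDMP" ]; then
    echo "pg_restore --exit-on-error -d $dbname $dump"
  else
    echo "psql -v ON_ERROR_STOP=1 -d $dbname -f $dump"
  fi
}
```

Both printed commands stop at the first error, which keeps a failed restore from looking like a partial success before the validate and snapshot steps.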

Baseline resources

  • minimal: 1G RAM / 10G disk / 1 vCPU
  • balanced: 2G RAM / 15G disk / 1 vCPU

Snapshot points

  • @fresh
  • @postgres18-ready
  • @pre-schema
  • @post-extensions
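
Each snapshot point maps to one `zfs snapshot` call. The helper below is a sketch that prints the command for a given label; the dataset backing the db jail is not specified in this document, so it is passed in as a parameter. The function name is hypothetical.

```shell
#!/bin/sh
# Print the recursive snapshot command for a dataset and label.
# Accepts the label with or without the leading "@".
snapshot_command() {
  dataset=${1:?usage: snapshot_command DATASET LABEL}
  label=${2:?usage: snapshot_command DATASET LABEL}
  label=${label#@}
  echo "zfs snapshot -r ${dataset}@${label}"
}
```

For example, `snapshot_command zroot/example/db @fresh` prints `zfs snapshot -r zroot/example/db@fresh`, where `zroot/example/db` is a placeholder dataset name.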

ZFS note

Two different decisions matter here:

  1. ashift
  2. dataset properties

ashift is set per vdev when the vdev is created and cannot be changed afterwards. For modern devices with 4 KiB sectors, the expected value is usually 12.

Dataset-level settings can still be tuned later. Conservative starting settings for PostgreSQL data are:

  • compression=lz4
  • atime=off
  • recordsize=16K
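
The numbers above are related by simple arithmetic: ashift is the base-2 logarithm of the sector size, and PostgreSQL's default block size is 8 KiB. The sketch below works both out; the function names are hypothetical.

```shell
#!/bin/sh
# Sector size in bytes implied by an ashift value (2 to the power of ashift).
sector_bytes() {
  echo $((1 << $1))
}

# Number of PostgreSQL pages that fit in one ZFS record.
# Both sizes are in KiB; the page size defaults to PostgreSQL's 8 KiB.
pages_per_record() {
  recordsize_kib=$1
  page_kib=${2:-8}
  echo $((recordsize_kib / page_kib))
}
```

`sector_bytes 12` gives 4096, matching 4 KiB devices, and `pages_per_record 16` gives 2, so a 16K record holds two default-sized PostgreSQL pages.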

Split-brain architecture

Three databases in one jail — all mandatory:

| Database | Role | Lifecycle |
| --- | --- | --- |
| Agent System Skills | Preloaded read-only skills, install docs, operator workflows | Updated by pulling a new versioned release |
| User/Agent Memory | Dynamic conversation memory, user preferences, agent context | Grows with use; follows its own backup lifecycle |
| Operational State | Messages, tasks, sessions, routing, registered groups | High-frequency read/write from message router |

Current role split

  • PostgreSQL (Agent System Skills): preloaded read-only knowledge, chunked + embedded before release
  • PostgreSQL (User/Agent Memory): long-term memory backend with hybrid search (full-text + vector)
  • PostgreSQL (Operational State): real-time message routing, task scheduling, session tracking

Databases

Agent System Skills:

  • Name: ${AGENT_NAME}_skills (e.g. clawdie_skills)
  • Role: ${AGENT_NAME}_reader
  • Access: read-only at runtime
  • Retrieval: PostgreSQL full-text search by default, with optional vector use later

User/Agent Memory:

  • Name: ${AGENT_NAME}_brain (e.g. clawdie_brain)
  • Role: ${AGENT_NAME}_brain
  • Retrieval: hybrid memory search over the memory backend
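
The two databases and roles named above could be provisioned roughly as follows. This is a sketch under several assumptions not stated in this document: the skills database is owned by the superuser running the script, the brain role owns the brain database, tables live in the `public` schema, and authentication is configured separately, so no passwords are set. The function name is hypothetical.

```shell
#!/bin/sh
# Emit psql input that creates the skills and brain databases and roles.
# Ownership and schema choices are assumptions, see the text above.
provision_sql() {
  agent=${AGENT_NAME:?AGENT_NAME is required}
  cat <<SQL
CREATE ROLE ${agent}_reader LOGIN;
CREATE ROLE ${agent}_brain LOGIN;
CREATE DATABASE ${agent}_skills;
CREATE DATABASE ${agent}_brain OWNER ${agent}_brain;
REVOKE ALL ON DATABASE ${agent}_skills FROM PUBLIC;
GRANT CONNECT ON DATABASE ${agent}_skills TO ${agent}_reader;
\connect ${agent}_skills
GRANT USAGE ON SCHEMA public TO ${agent}_reader;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO ${agent}_reader;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO ${agent}_reader;
SQL
}
```

Granting only CONNECT, USAGE, and SELECT is what keeps the reader role read-only at runtime. The default-privileges statement covers only tables later created by the role that runs this script.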

Schema

The schema consists of three layers:

  1. memories — base table (session summaries, metadata)
  2. memory_chunks — chunked text with full-text search
  3. memory_embeddings — vector embeddings per chunk
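
To make the three layers concrete, here is an illustrative sketch of what such a schema could look like. Only the three table names come from this document. Every column name, the `english` text-search configuration, the HNSW index, and the default embedding dimension of 768 are assumptions chosen for the example. It requires the `vector` extension to be enabled.

```shell
#!/bin/sh
# Emit an illustrative three-layer schema. Column names are hypothetical.
# Usage: memory_schema_sql [EMBEDDING_DIMENSION]
memory_schema_sql() {
  dim=${1:-768}
  case "$dim" in
    ''|*[!0-9]*) echo "dimension must be a positive integer" >&2; return 1 ;;
  esac
  cat <<SQL
CREATE TABLE memories (
  id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  summary text NOT NULL,
  metadata jsonb NOT NULL DEFAULT '{}'::jsonb,
  created_at timestamptz NOT NULL DEFAULT now()
);
CREATE TABLE memory_chunks (
  id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  memory_id uuid NOT NULL REFERENCES memories(id) ON DELETE CASCADE,
  content text NOT NULL,
  tsv tsvector GENERATED ALWAYS AS (to_tsvector('english', content)) STORED
);
CREATE INDEX memory_chunks_tsv_idx ON memory_chunks USING gin (tsv);
CREATE TABLE memory_embeddings (
  chunk_id uuid PRIMARY KEY REFERENCES memory_chunks(id) ON DELETE CASCADE,
  embedding vector(${dim}) NOT NULL
);
CREATE INDEX memory_embeddings_hnsw_idx
  ON memory_embeddings USING hnsw (embedding vector_cosine_ops);
SQL
}
```

Keeping embeddings in their own table, keyed by chunk, means the full-text layer works without any vectors present and embeddings can be rebuilt without touching the chunk text.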

See:

Next snapshot policy step

Once db is stable, extend snapshot policy to the other persistent service jails with lighter retention than the database:

  • db: strongest retention as critical-data
  • git: moderate retention as persistent-service
  • cms: moderate retention as persistent-service

Validation Note

The PostgreSQL 18 + pgvector installation path documented here was validated on 9 March 2026 by the operator and Codex in the db jail on the current FreeBSD host.