Sam & Claude 2ed6245b11 feat(backup): add backup script and restore runbook

npm run backup exports all critical state to a portable tarball:
  - messages.db (SQLite — all chats, tasks, sessions)
  - memory_db.sql + skills_db.sql (pg_dump from db jail)
  - .env, groups/, mount-allowlist.json

Takes ZFS snapshots via hostd before export. Flags:
  --skip-skills   skip skills_db (large, regenerable)
  --output <dir>  write archive to specific directory
  --no-snapshot   skip ZFS snapshots

setup/sanoid.ts: add management jail dataset to snapshot retention policy.
docs/sessions/2026-03-16-backup-restore.md: full restore runbook covering
SQLite, PostgreSQL, ZFS rollback, hardware migration, and cron automation.

---
Build: pass | Tests: pass — 489 passed (48 files)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---
Build: pass | Tests: pass — Tests  489 passed | 10 skipped (499)

2026-03-16 11:17:46 +00:00

Session: Backup & Restore

Date: 16.mar.2026
Topic: Backup and restore procedure for a Clawdie installation

What needs backing up

Clawdie state lives in several places, listed below. Everything in the table is needed for a full restore, although skills_db can be regenerated instead of restored.

| What | Where | Why critical |
|---|---|---|
| messages.db | `store/messages.db` | All conversation history, tasks, sessions, chat metadata |
| memory_db | PostgreSQL in db jail | Agent memory, user memories, embeddings |
| skills_db | PostgreSQL in db jail | Built-in knowledge vectors (can be regenerated, but slow) |
| .env | Project root | All config: API keys, db passwords, domain, subnet, agent name |
| groups/ | Project root | Registered Telegram groups and channel configs |
| mount-allowlist.json | `~/.config/clawdie-cp/` | Security allowlist for jail mounts |
| ZFS datasets | `zroot/${ZFS_PREFIX}/jails/{db,git,cms,management}` | Full jail filesystems via Sanoid snapshots |

What you do not need to back up:

data/health/ — rebuilt on startup
data/ipc/ — ephemeral, cleared on restart
data/sessions/ — logs only, not required for restore
tmp/ — ephemeral
node_modules/ — npm install
dist/ — npm run build
artifacts/skills.db — regenerated by npm run setup -- --step skills-memory

Running a backup

npm run backup

Creates ~/clawdie-backup-DD.mmm.YYYY-HHMM.tar.gz. Also takes ZFS snapshots of the db, git, cms, and management jail datasets via hostd (requires hostd running).

Options:

npm run backup -- --skip-skills          # skip skills_db (large, regenerable)
npm run backup -- --output /mnt/usb      # write to a specific directory
npm run backup -- --no-snapshot          # skip ZFS snapshots (e.g. no root)

Requires pg_dump on the host. If missing:

pkg install postgresql17-client
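
A preflight check can catch a missing client tool before the export starts. The sketch below is illustrative only: `missing_tools` and `require_tools` are hypothetical helper names, and the assumption that the backup needs `tar` in addition to `pg_dump` is a guess based on the tarball output, not something the script documents.

```shell
#!/bin/sh
# Print each named tool that is not found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Return 0 when every tool is available; otherwise report on stderr and return 1.
require_tools() {
  missing=$(missing_tools "$@")
  [ -z "$missing" ] && return 0
  printf 'missing tools: %s\n' "$(printf '%s' "$missing" | tr '\n' ' ')" >&2
  return 1
}
```

Possible usage: `require_tools pg_dump tar && npm run backup`.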

Tarball contents

clawdie-backup-16.mar.2026-1130/
  manifest.json          — agent name, timestamp, list of included items
  messages.db            — SQLite database
  memory_db.sql          — pg_dump --clean (DROP + CREATE before data)
  skills_db.sql          — pg_dump --clean (omitted with --skip-skills)
  env                    — copy of .env (contains API keys — store securely)
  groups/                — group config files
  mount-allowlist.json   — mount security allowlist
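
Before relying on an archive, it may be worth checking that the extracted directory holds every item listed above. This is a minimal sketch, not part of the backup script: `missing_backup_items` is a hypothetical name, it only tests for presence, and it does not parse manifest.json because the manifest format is not shown here.

```shell
#!/bin/sh
# Print the expected items that are absent from an extracted backup directory.
# skills_db.sql is not checked because --skip-skills legitimately omits it.
missing_backup_items() {
  dir=$1
  for item in manifest.json messages.db memory_db.sql env mount-allowlist.json; do
    [ -f "$dir/$item" ] || printf '%s\n' "$item"
  done
  [ -d "$dir/groups" ] || printf '%s\n' "groups/"
  return 0
}
```

Possible usage: `missing_backup_items clawdie-backup-16.mar.2026-1130` prints nothing when the archive is complete.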

Restore procedure

Step 1 — Fresh FreeBSD host

Run the standard install through at least the jails step:

npm run install-all -- --from pf --to jails

The db jail must exist and PostgreSQL must be running before restoring data.

Step 2 — Restore .env

cp path/to/backup/env .env

Do this before any of the steps below. The database passwords in .env are needed for everything that follows.

Step 3 — Restore messages.db

Stop the agent first if it's running:

npm stop   # or: kill $(cat store/clawdie.pid)

Then:

cp path/to/backup/messages.db store/messages.db
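
A slightly more defensive variant of the copy is sketched here. It assumes two things the runbook does not state: that a quick header check is a useful sanity test, and that the live database may have been running in WAL mode, in which case stale `-wal` and `-shm` files next to the restored file should not be left behind. `is_sqlite_file` and `restore_sqlite` are hypothetical helper names.

```shell
#!/bin/sh
# Succeed only if the file starts with the 15-character SQLite 3 magic string.
is_sqlite_file() {
  [ -f "$1" ] && [ "$(head -c 15 "$1")" = "SQLite format 3" ]
}

# Validate src, remove stale WAL sidecar files beside dst, then copy src to dst.
restore_sqlite() {
  src=$1
  dst=$2
  is_sqlite_file "$src" || { echo "not a SQLite file: $src" >&2; return 1; }
  rm -f "$dst-wal" "$dst-shm"
  cp "$src" "$dst"
}
```

Possible usage: `restore_sqlite path/to/backup/messages.db store/messages.db`. The header check only rules out obviously wrong files; it is not a substitute for a full integrity check.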

Step 4 — Restore PostgreSQL

The db jail must be running. Verify:

jls -N name | grep db
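
A running jail does not guarantee that PostgreSQL inside it is already accepting connections. One way to wait for it is a small retry loop around `pg_isready`, which ships with PostgreSQL's client tools. The `retry` helper below is a hypothetical sketch, and the attempt count and delay are arbitrary.

```shell
#!/bin/sh
# Run a command up to $1 times, sleeping $2 seconds after each failed attempt.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}
```

Possible usage, once `DB_HOST` is set to the db jail address: `retry 10 3 pg_isready -q -h "$DB_HOST"`.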

Restore memory_db:

# Get DB_HOST from .env (WARDEN_DB_IP; if it is unset, use the .3 address of the subnet)
# Anchored patterns skip comments and longer key names; -f2- keeps values that contain "="
DB_HOST=$(grep '^WARDEN_DB_IP=' .env | cut -d= -f2-)
MEMORY_DB_USER=$(grep '^MEMORY_DB_USER=' .env | cut -d= -f2-)
MEMORY_DB_NAME=$(grep '^MEMORY_DB_NAME=' .env | cut -d= -f2-)

psql -h "$DB_HOST" -U "$MEMORY_DB_USER" "$MEMORY_DB_NAME" < path/to/backup/memory_db.sql
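
Reading values out of .env comes up several times in this step, so a small helper can keep the lookups consistent. The sketch below assumes plain `KEY=value` lines, optionally wrapped in one pair of quotes; whether this project's .env uses quoting or `export` prefixes is not shown here, and `env_get` is a hypothetical name.

```shell
#!/bin/sh
# Print the value of KEY ($1) from a KEY=value file ($2).
# Commented lines and longer key names are ignored, the last assignment wins,
# and one pair of surrounding double or single quotes is stripped.
env_get() {
  key=$1
  file=$2
  sed -n "s/^${key}=//p" "$file" | tail -n 1 |
    sed -e 's/^"\(.*\)"$/\1/' -e "s/^'\(.*\)'\$/\1/"
}
```

Possible usage: `DB_HOST=$(env_get WARDEN_DB_IP .env)`. Note that psql prompts for a password unless one is supplied through `PGPASSWORD` or `~/.pgpass`.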

Restore skills_db (if included):

# DB_HOST is reused from the memory_db step above
SKILLS_DB_USER=$(grep '^SKILLS_DB_USER=' .env | cut -d= -f2-)
SKILLS_DB_NAME=$(grep '^SKILLS_DB_NAME=' .env | cut -d= -f2-)

psql -h "$DB_HOST" -U "$SKILLS_DB_USER" "$SKILLS_DB_NAME" < path/to/backup/skills_db.sql

If skills_db.sql was not included in the backup, regenerate instead:

npm run setup -- --step skills-memory

Step 5 — Restore groups and mount allowlist

cp -r path/to/backup/groups/ ./groups/
mkdir -p ~/.config/clawdie-cp
cp path/to/backup/mount-allowlist.json ~/.config/clawdie-cp/mount-allowlist.json

Step 6 — Start the agent

npm start

Verify health:

npm run doctor

ZFS restore (disk failure / full reinstall)

If you have Sanoid snapshots and need to restore a full jail filesystem:

# List available snapshots
zfs list -t snapshot zroot/${ZFS_PREFIX}/jails/db

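# Roll back to a specific snapshot (destroys all data written after it).
# Plain zfs rollback only accepts the most recent snapshot; to go further back,
# add -r, which also destroys every snapshot newer than the target.
zfs rollback zroot/${ZFS_PREFIX}/jails/db@autosnap_2026-03-16_04:00:00_hourly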

# Or: receive from a syncoid stream (migration to new hardware)
# On old host:
syncoid zroot/${ZFS_PREFIX}/jails/db user@newhost:zroot/${ZFS_PREFIX}/jails/db

# Then on new host:
bastille start clawdie-db
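
Because rolling back past the latest snapshot requires `zfs rollback -r`, which destroys every newer snapshot, it can help to list what would be lost first. The filter below is a sketch with a hypothetical name (`snapshots_after`); it expects snapshot names on stdin, oldest first.

```shell
#!/bin/sh
# Read snapshot names (oldest first) on stdin and print the ones that come
# after the target ($1), i.e. the snapshots `zfs rollback -r` would destroy.
# Exits non-zero if the target is not in the list at all.
snapshots_after() {
  awk -v target="$1" '
    found { print }
    $0 == target { found = 1 }
    END { exit found ? 0 : 1 }
  '
}
```

Possible usage: `zfs list -H -t snapshot -o name -s creation "zroot/${ZFS_PREFIX}/jails/db" | snapshots_after "zroot/${ZFS_PREFIX}/jails/db@autosnap_2026-03-16_04:00:00_hourly"`. Empty output with a zero exit status means the target is already the newest snapshot.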

Sanoid snapshot retention (configured in setup/sanoid.ts):

| Dataset | Hourly | Daily | Monthly |
|---|---|---|---|
| db | 24 | 14 | 3 |
| git, cms, management | 12 | 7 | 2 |
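
For orientation, the retention counts above map onto sanoid.conf template stanzas roughly as follows. This is a sketch only: what setup/sanoid.ts actually writes is not shown here, the template names passed in are hypothetical, and `sanoid_template` is an illustrative helper. The keys `hourly`, `daily`, `monthly`, `autosnap`, and `autoprune` are standard Sanoid options.

```shell
#!/bin/sh
# Print a sanoid.conf template stanza for the given retention counts.
# Arguments: template name, hourly count, daily count, monthly count.
sanoid_template() {
  name=$1
  hourly=$2
  daily=$3
  monthly=$4
  printf '[template_%s]\n' "$name"
  printf '\thourly = %s\n\tdaily = %s\n\tmonthly = %s\n' "$hourly" "$daily" "$monthly"
  printf '\tautosnap = yes\n\tautoprune = yes\n'
}
```

Possible usage: `sanoid_template db 24 14 3` for the db dataset and `sanoid_template jails 12 7 2` for git, cms, and management.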

Migration to new hardware

1. Run npm run backup on the old host.
2. Transfer the tarball to the home directory on the new host (scp or USB).
3. Clone the repo on the new host: git clone ...
4. Extract the tarball: tar -xzf clawdie-backup-*.tar.gz
5. Follow the restore procedure above (Steps 1–6).
6. Optional: use syncoid to transfer the ZFS jails (preserves the full filesystem and snapshot history).
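
When the tarball travels over scp or a USB stick, comparing checksums on both hosts is a cheap way to confirm it arrived intact. The helper below is a sketch with a hypothetical name (`file_sha256`); it picks whichever digest tool is installed, since FreeBSD ships `sha256` while Linux typically has `sha256sum`.

```shell
#!/bin/sh
# Print the SHA-256 hex digest of a file using whichever tool is present:
# sha256 (FreeBSD), sha256sum (Linux), or shasum (macOS and others).
file_sha256() {
  if command -v sha256 >/dev/null 2>&1; then
    sha256 -q "$1"
  elif command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d' ' -f1
  else
    shasum -a 256 "$1" | cut -d' ' -f1
  fi
}
```

Possible usage: run `file_sha256 ~/clawdie-backup-*.tar.gz` on the old host, run it again on the new host after the transfer, and compare the two digests before extracting.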

Automating backups (cron)

Weekly export on Sunday at 02:00, keeping last 4:

# crontab -e (as clawdie user)
0 2 * * 0  cd /home/clawdie/clawdie-ai && npm run backup -- --skip-skills --output /mnt/backup >> logs/backup.log 2>&1

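Pair it with a cleanup job that keeps only the 4 newest tarballs. Schedule it after the backup has finished (02:30 here; adjust to how long the export takes) and select by modification time: the DD.mmm.YYYY file names do not sort chronologically, and negative counts such as head -n -4 are a GNU extension that is not portable to FreeBSD.

# After the backup line:
30 2 * * 0  ls -t /mnt/backup/clawdie-backup-*.tar.gz | tail -n +5 | xargs rm -f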
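
The retention step can also be written as a small function, which makes the keep count explicit and easier to test. This is a minimal sketch under stated assumptions: `prune_backups` is a hypothetical name, file modification time is taken as the backup time, and the backup directory path contains no whitespace.

```shell
#!/bin/sh
# Delete all but the newest $2 backup tarballs in directory $1, by mtime.
prune_backups() {
  dir=$1
  keep=$2
  # ls -t lists newest first; tail -n +N starts printing at line N,
  # so everything after the first $keep entries is removed.
  ls -t "$dir"/clawdie-backup-*.tar.gz 2>/dev/null |
    tail -n +"$((keep + 1))" |
    while IFS= read -r f; do rm -f -- "$f"; done
}
```

Possible usage: `prune_backups /mnt/backup 4`. Selecting by modification time avoids relying on the DD.mmm.YYYY file names, which do not sort in date order.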

What the backup does NOT cover

Jail filesystem contents (nginx config, Strapi data, Git repos) — covered by ZFS snapshots via Sanoid, not this script. Use syncoid for full jail migration.
SSL certificates — stored in the cms jail. Back up via ZFS or re-issue from Let's Encrypt.
Tailscale state — re-authenticate after restore with tailscale up.
SSH keys on the host — back up ~/.ssh/ separately.
