layered-soul/skills/zfs-snapshot-audit/SKILL.md
Sam & Claude e013f32145 skill: add zfs-snapshot-audit — detect orphaned ZFS snapshots and sanoid config gaps
Discovered 2026-06-22: zroot/home/clawdie was missing from sanoid config,
allowing 10 autosnaps from April to accumulate 23.6G of dead weight.
Skill covers: pool/dataset audit, sanoid coverage check, safe destroy
of orphaned snapshots, template reference, and pitfall avoidance.
2026-06-22 06:59:42 +02:00

name: zfs-snapshot-audit
description: Audit ZFS snapshots and sanoid config; find orphaned/leaked snapshots, verify dataset coverage, safely destroy dead weight.
category: freebsd

ZFS Snapshot Audit & Sanoid Coverage

Use this when disk space is tight and you suspect ZFS snapshots are holding dead weight, or when you need to verify sanoid coverage across datasets.

1. Quick check — pool and dataset overview

zpool list zroot
zfs get -H -o property,value used,refer,usedbysnapshots,available zroot/home/clawdie
df -h /

If usedbysnapshots is high (>1G), investigate.
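
To apply the same threshold across every dataset in the pool instead of checking one at a time, a sketch along these lines may help. The function name `flag_heavy_datasets` and the 1 GiB default are illustrative assumptions, not part of the original skill; the function only reads `name<TAB>bytes` pairs of the kind `zfs list -Hp` prints.

```shell
# Hypothetical helper: print datasets whose snapshot usage exceeds a threshold.
# Reads "name<TAB>usedbysnapshots-in-bytes" lines on stdin.
flag_heavy_datasets() {
    threshold=${1:-1073741824}   # assumed default: 1 GiB, in bytes
    awk -F '\t' -v limit="$threshold" '
        ($2 + 0) > (limit + 0) { printf "%s\t%.1fG\n", $1, $2 / 1073741824 }
    '
}

# Usage on a live system (-H drops the header, -p prints exact byte counts):
#   zfs list -Hp -o name,usedbysnapshots -r zroot | flag_heavy_datasets
```

Keeping the filter separate from the `zfs` call means the threshold logic can be tried on sample input before it is pointed at a real pool.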

2. List all snapshots with size

zfs list -t snapshot -o name,used,creation -S creation -r zroot 2>/dev/null

Look for snapshots older than the sanoid retention policy allows; anything past the longest retention window should already have been pruned.
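
Spotting stale snapshots by eye gets tedious on a large pool. The sketch below filters by age instead. `snapshots_older_than` is a hypothetical name, and the 90-day cutoff in the usage comment is only an illustration; pick a value that matches the longest retention in your own templates.

```shell
# Hypothetical helper: list snapshots created more than DAYS days ago.
# Reads "name<TAB>creation-epoch" lines on stdin.
# The second argument (NOW, epoch seconds) is optional and mainly for testing.
snapshots_older_than() {
    days=$1
    now=${2:-$(date +%s)}
    cutoff=$((now - days * 86400))
    awk -F '\t' -v cutoff="$cutoff" '($2 + 0) < (cutoff + 0) { print $1 }'
}

# Usage on a live system (-p prints creation as seconds since the epoch):
#   zfs list -Hp -t snapshot -o name,creation -r zroot | snapshots_older_than 90
```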

3. Check sanoid coverage

Config lives at /usr/local/etc/sanoid/sanoid.conf. Every dataset that has autosnap snapshots should have a matching [dataset] entry with a template. Without one, sanoid --prune-snapshots won't touch them — they accumulate forever.

# List all datasets with snapshots
zfs list -H -o name -t snapshot -r zroot 2>/dev/null | awk -F@ '{print $1}' | sort -u

# Cross-reference against sanoid config
grep '^\[' /usr/local/etc/sanoid/sanoid.conf

Any dataset with snapshots but no sanoid entry is orphaned, unless a parent entry covers it with recursive = yes.
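
The cross-reference can also be automated. The following is a minimal sketch, not the skill's own tooling: `orphaned_datasets` is a hypothetical helper that compares snapshot names against the `[section]` headers of a config file. It assumes each covered dataset has its own section and does not account for parents that cover children through `recursive = yes`.

```shell
# Hypothetical helper: print datasets that have snapshots but no [dataset]
# section in the given sanoid config. Reads snapshot names on stdin.
# Assumption: datasets covered only via a parent's "recursive = yes" are
# still reported, so review the output before acting on it.
orphaned_datasets() {
    awk -v conf="$1" '
        BEGIN {
            # Collect every [section] name from the config file.
            while ((getline line < conf) > 0)
                if (match(line, /^\[[^]]+\]/))
                    covered[substr(line, 2, RLENGTH - 2)] = 1
            close(conf)
        }
        {
            sub(/@.*/, "")    # keep only the dataset part of the name
            if (!($0 in covered) && !seen[$0]++) print
        }
    '
}

# Usage on a live system:
#   zfs list -H -o name -t snapshot -r zroot \
#       | orphaned_datasets /usr/local/etc/sanoid/sanoid.conf
```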

4. Available templates (sanoid.conf)

Template               Hourly  Daily  Monthly  Use case
operator_home_minimal  6       3      0        Operator home dir
operator_home_full     24      14     3        Full home retention
persistent_service     12      7      2        Jails (cms, git, etc.)
critical_data          24      14     3        Databases (pgdata, pgwal)

5. Destroy orphaned snapshots

Safe: destroying ZFS snapshots does NOT touch the live filesystem. Only the unique blocks held by that point-in-time copy are freed. Verify first:

# Confirm: usedbysnapshots = dead weight, referenced = live data
zfs get -H -o property,value used,refer,usedbysnapshots zroot/home/clawdie

Destroy individual snapshots:

zfs destroy zroot/home/clawdie@autosnap_2026-04-20_00:15:00_daily

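Or destroy a contiguous range. In zfs destroy, % is range syntax (first%last, where either end may be left blank), not a glob, so @autosnap_% does not match by prefix. A range includes every snapshot created between the two endpoints, whatever its name. Preview with -n -v, then drop -n once the list looks right:

zfs destroy -nv zroot/home/clawdie@autosnap_2026-04-20_00:15:00_daily%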
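
To select snapshots by name prefix without relying on how zfs destroy interprets patterns, the names can be filtered first and handed over one at a time. This is a sketch under assumptions: `select_snapshots` is a hypothetical helper, and the `autosnap_` default prefix is taken from the snapshot names shown above.

```shell
# Hypothetical helper: keep only snapshots of DATASET whose snapshot name
# starts with PREFIX. Reads full snapshot names on stdin, one per line.
# Snapshots of child datasets are excluded because "DATASET@" must match.
select_snapshots() {
    dataset=$1
    prefix=${2:-autosnap_}
    awk -v want="$dataset@$prefix" 'index($0, want) == 1'
}

# Dry run first: -n makes no changes, -v prints what would be destroyed.
#   zfs list -H -o name -t snapshot -r zroot/home/clawdie \
#       | select_snapshots zroot/home/clawdie \
#       | xargs -n 1 zfs destroy -nv
# Re-run without -n only after reviewing the list.
```

Filtering on the full `dataset@prefix` string keeps manually named snapshots and child datasets out of the destroy list.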

6. Add missing dataset to sanoid config

Append to /usr/local/etc/sanoid/sanoid.conf (root-owned, use sudo tee -a):

[zroot/home/clawdie]
        use_template = operator_home_minimal
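
Appending blindly risks a duplicate section if the step is repeated. One way to guard against that is sketched below. `stanza_if_missing` is a hypothetical helper, and it assumes the section header sits alone on its line with no surrounding whitespace.

```shell
# Hypothetical helper: print a sanoid stanza for DATASET unless CONF already
# contains a [DATASET] section. Prints nothing if the section exists.
stanza_if_missing() {
    conf=$1 dataset=$2 template=$3
    # -x matches the whole line, -F treats the brackets literally.
    if grep -qxF "[$dataset]" "$conf" 2>/dev/null; then
        return 0
    fi
    printf '\n[%s]\n\tuse_template = %s\n' "$dataset" "$template"
}

# Usage on a live system (tee -a appends, it does not truncate the file):
#   stanza_if_missing /usr/local/etc/sanoid/sanoid.conf \
#       zroot/home/clawdie operator_home_minimal \
#       | sudo tee -a /usr/local/etc/sanoid/sanoid.conf
```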

Verify it takes effect (next sanoid --prune-snapshots cron run, or manually):

sudo sanoid --prune-snapshots --verbose 2>&1 | grep home/clawdie

7. Verify after cleanup

zfs get -H -o value usedbysnapshots zroot/home/clawdie
# Should be 0B (or low if new autosnaps have been taken)
zpool list zroot
df -h /

Pitfalls

  • Do NOT delete sanoid-managed snapshots by hand. With autoprune=yes, sanoid handles retention, and manually deleting recent snapshots can confuse the policy. This workflow is for orphaned snapshots only: those with no matching sanoid [dataset] entry.
  • If a managed dataset keeps more snapshots than you want, fix the policy (template or retention counts) instead of destroying them. Manual cleanup on an autoprune=yes dataset just fights the cron job.
  • The config header says "do not edit by hand" but the dataset list is the operator's domain — adding a dataset is safe.

Discovery log

2026-06-22: zroot/home/clawdie was missing from sanoid config. 10 orphaned snapshots from April 20-22 held 23.6G of dead weight (usedbysnapshots). Added operator_home_minimal template and destroyed all 10. Freed 23.5G.