From e013f32145de3450bdbc0cb5c5257f52859d9546 Mon Sep 17 00:00:00 2001
From: Sam & Claude
Date: Mon, 22 Jun 2026 06:59:42 +0200
Subject: [PATCH] =?UTF-8?q?skill:=20add=20zfs-snapshot-audit=20=E2=80=94?=
 =?UTF-8?q?=20detect=20orphaned=20ZFS=20snapshots=20and=20sanoid=20config?=
 =?UTF-8?q?=20gaps?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Discovered 2026-06-22: zroot/home/clawdie was missing from sanoid config,
allowing 10 autosnaps from April to accumulate 23.6G of dead weight.

Skill covers: pool/dataset audit, sanoid coverage check, safe destroy
of orphaned snapshots, template reference, and pitfall avoidance.
---
 skills/zfs-snapshot-audit/SKILL.md | 116 +++++++++++++++++++++++++++++
 1 file changed, 116 insertions(+)
 create mode 100644 skills/zfs-snapshot-audit/SKILL.md

diff --git a/skills/zfs-snapshot-audit/SKILL.md b/skills/zfs-snapshot-audit/SKILL.md
new file mode 100644
index 0000000..59a2b75
--- /dev/null
+++ b/skills/zfs-snapshot-audit/SKILL.md
@@ -0,0 +1,116 @@
+---
+name: zfs-snapshot-audit
+description: Audit ZFS snapshots and sanoid config. Find orphaned/leaked snapshots, verify dataset coverage, safely destroy dead weight.
+category: freebsd
+---
+
+# ZFS Snapshot Audit & Sanoid Coverage
+
+Use this when disk space is tight and you suspect ZFS snapshots are holding dead weight, or when you need to verify sanoid coverage across datasets.
+
+## 1. Quick check — pool and dataset overview
+
+```bash
+zpool list zroot
+zfs get -H -o property,value used,refer,usedbysnapshots,available zroot/home/clawdie
+df -h /
+```
+
+If `usedbysnapshots` is high (>1G), investigate.
+
+## 2. List all snapshots with size
+
+```bash
+zfs list -t snapshot -o name,used,creation -S creation -r zroot 2>/dev/null
+```
+
+Look for snapshots from dates that exceed the sanoid retention policy.
+
+## 3. Check sanoid coverage
+
+Config lives at `/usr/local/etc/sanoid/sanoid.conf`. Every dataset that has
+`autosnap` snapshots should have a matching `[dataset]` entry with a template.
+Without one, `sanoid --prune-snapshots` won't touch them, so they accumulate
+forever.
+
+```bash
+# List all datasets with snapshots (-H -o name: no header, names only)
+zfs list -H -o name -t snapshot -r zroot 2>/dev/null | awk -F@ '{print $1}' | sort -u
+
+# Cross-reference against sanoid config
+grep '^\[' /usr/local/etc/sanoid/sanoid.conf
+```
+
+Any dataset with snapshots but no sanoid entry = **orphaned**.
+
+## 4. Available templates (sanoid.conf)
+
+| Template | hourly | daily | monthly | Use case |
+|----------|--------|-------|---------|----------|
+| `operator_home_minimal` | 6 | 3 | 0 | Operator home dir |
+| `operator_home_full` | 24 | 14 | 3 | Full home retention |
+| `persistent_service` | 12 | 7 | 2 | Jails (cms, git, etc.) |
+| `critical_data` | 24 | 14 | 3 | Databases (pgdata, pgwal) |
+
+## 5. Destroy orphaned snapshots
+
+**Safe:** destroying ZFS snapshots does NOT touch the live filesystem. Only the
+unique blocks held by that point-in-time copy are freed. Verify first:
+
+```bash
+# Confirm: usedbysnapshots = dead weight, referenced = live data
+zfs get -H -o property,value used,refer,usedbysnapshots zroot/home/clawdie
+```
+
+Destroy individual snapshots:
+
+```bash
+zfs destroy zroot/home/clawdie@autosnap_2026-04-20_00:15:00_daily
+```
+
+Or destroy a range: `first%last` is inclusive (`%` is a range operator, not a glob), and an empty side means oldest or newest. Preview with `-nv`, then rerun without it:
+
+```bash
+zfs destroy -nv zroot/home/clawdie@autosnap_2026-04-20_00:15:00_daily%
+```
+
+## 6. Add missing dataset to sanoid config
+
+Append to `/usr/local/etc/sanoid/sanoid.conf` (root-owned, use `sudo tee -a`):
+
+```
+[zroot/home/clawdie]
+  use_template = operator_home_minimal
+```
+
+Verify it takes effect (next `sanoid --prune-snapshots` cron run, or manually):
+
+```bash
+sudo sanoid --prune-snapshots --verbose 2>&1 | grep home/clawdie
+```
+
+## 7. Verify after cleanup
+
+```bash
+zfs get -H -o value usedbysnapshots zroot/home/clawdie
+# Should be 0B (or low if new autosnaps have been taken)
+zpool list zroot
+df -h /
+```
+
+## Pitfalls
+
+- **Do NOT delete sanoid-managed snapshots by hand.** If `autoprune=yes`, sanoid
+  handles retention. Manual deletion of recent snapshots can confuse the policy.
+  This workflow is for **orphaned** snapshots only: those with no matching
+  sanoid `[dataset]` entry.
+- **Don't destroy snapshots on datasets with `autoprune=yes`** thinking you're
+  helping; you'll just fight the cron job. Fix the policy instead.
+- The config header says "do not edit by hand" but the dataset list is the
+  operator's domain; adding a dataset is safe.
+
+## Discovery log
+
+2026-06-22: `zroot/home/clawdie` was missing from sanoid config. 10 orphaned
+snapshots from April 20-22 held 23.6G of dead weight (`usedbysnapshots`).
+Added `operator_home_minimal` template and destroyed all 10. Freed 23.5G.
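
The cross-reference in section 3 of the skill is done by eye. It could be scripted; the following is a minimal sketch, not part of the patch. `find_orphans` is a hypothetical helper name, and the sketch assumes section headers in `sanoid.conf` sit alone on their line with no surrounding whitespace. It also ignores coverage inherited from a parent section that sets `recursive = yes`, so treat its output as candidates to review, not as a destroy list.

```shell
#!/bin/sh
# find_orphans CONF
# Reads snapshot names (dataset@snapshot, one per line) on stdin and prints
# every dataset that has autosnap_ snapshots but no [dataset] section in CONF.
# Hypothetical helper; does not account for "recursive = yes" on a parent.
find_orphans() {
    conf=$1
    # Keep the dataset part of autosnap snapshots only, deduplicated.
    awk -F@ '$2 ~ /^autosnap_/ { print $1 }' | sort -u |
    while IFS= read -r ds; do
        # -F fixed string, -x whole line: "[zroot/home]" must not match
        # "[zroot/home/clawdie]".
        grep -qxF "[$ds]" "$conf" || printf '%s\n' "$ds"
    done
}

# Typical read-only call:
#   zfs list -H -o name -t snapshot -r zroot |
#       find_orphans /usr/local/etc/sanoid/sanoid.conf
```

The helper only reads; any destroy step stays manual, which matches the skill's advice to verify before deleting.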