Covers the case where df is unchanged after rm -rf or cargo clean because sanoid snapshots captured the deleted files. Documents the vacuum procedure: identify the holding snapshots, then either destroy them to reclaim space immediately or use sanoid --prune-snapshots for the gentler path. Updates Pitfalls to acknowledge this as the exception to "never touch sanoid-managed snaps." Discovered 2026-06-24: cargo clean freed 5.5G but df still showed 16G free. usedbysnapshots = 26.6G across 9 sanoid snapshots. Full vacuum freed 13G (16G → 29G free, pool 80% → 72%).
---
name: zfs-snapshot-audit
description: Audit ZFS snapshots and sanoid config. Find orphaned/leaked snapshots, verify dataset coverage, safely destroy dead weight.
category: freebsd
---

# ZFS Snapshot Audit & Sanoid Coverage

Use this when disk space is tight and you suspect ZFS snapshots are holding dead weight, or when you need to verify sanoid coverage across datasets.

## 1. Quick check: pool and dataset overview

```bash
zpool list zroot
zfs get -H -o property,value used,refer,usedbysnapshots,available zroot/home/clawdie
df -h /
```

If `usedbysnapshots` is high (>1G), investigate.

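The ">1G" rule of thumb can be checked mechanically. The sketch below is a hypothetical helper (`snap_weight_check` is not part of ZFS or sanoid) that classifies a byte count; it assumes the reading comes from `zfs get -p`, which prints exact byte values instead of human-readable sizes.

```bash
# Hypothetical helper: classify a usedbysnapshots reading given in exact bytes.
# Usage: snap_weight_check BYTES [LIMIT_BYTES]   (LIMIT defaults to 1 GiB)
snap_weight_check() {
    local bytes=$1
    local limit=${2:-1073741824}   # 1 GiB, the ">1G" rule of thumb
    if [ "$bytes" -gt "$limit" ]; then
        echo "investigate"
    else
        echo "ok"
    fi
}

# Live use (needs a host that has the dataset):
#   snap_weight_check "$(zfs get -Hp -o value usedbysnapshots zroot/home/clawdie)"
```

Passing a second argument raises the limit, for example `5368709120` for the 5G threshold used in the vacuum check.
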
## 2. List all snapshots with size

```bash
zfs list -t snapshot -o name,used,creation -S creation -r zroot 2>/dev/null
```

Look for snapshots older than the sanoid retention policy allows.

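Spotting over-age snapshots by eye gets tedious on a busy pool. A minimal sketch of an age filter follows; `stale_snapshots` is a hypothetical helper, and it assumes the input comes from `zfs list -Hp -t snapshot -o name,creation`, where `-p` should print `creation` as epoch seconds.

```bash
# Hypothetical helper: print snapshots older than MAX_DAYS.
# Reads "name<TAB>creation_epoch" lines on stdin.
# Usage: ... | stale_snapshots MAX_DAYS [NOW_EPOCH]
# NOW_EPOCH defaults to the current time; pass it explicitly for repeatable runs.
stale_snapshots() {
    local max_days=$1
    local now=${2:-$(date +%s)}
    awk -F '\t' -v now="$now" -v max="$max_days" '
        # Age in whole days; print only snapshots past the cutoff.
        { age = int((now - $2) / 86400) }
        age > max { printf "%s\t%dd\n", $1, age }
    '
}

# Live use (3 days matches the three dailies kept by operator_home_minimal):
#   zfs list -Hp -t snapshot -o name,creation -r zroot | stale_snapshots 3
```
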
## 3. Check sanoid coverage

Config lives at `/usr/local/etc/sanoid/sanoid.conf`. Every dataset that has
`autosnap` snapshots should have a matching `[dataset]` entry with a template.
Without one, `sanoid --prune-snapshots` won't touch them, so they accumulate
forever.

```bash
# List all datasets with snapshots (-H drops the header, -o name keeps only names)
zfs list -H -t snapshot -o name -r zroot 2>/dev/null | awk -F@ '{print $1}' | sort -u

# Cross-reference against sanoid config
grep '^\[' /usr/local/etc/sanoid/sanoid.conf
```

Any dataset with snapshots but no sanoid entry is **orphaned**.

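The cross-reference can be automated. The following is a sketch under stated assumptions: `find_orphans` is a hypothetical helper, it takes a file of snapshot names (one per line) and a config path, and it matches section names exactly. It does not understand inheritance, so a child dataset covered by a parent section with `recursive = yes` would be reported as orphaned and needs a manual check.

```bash
# Hypothetical helper: list datasets that have snapshots but no [section]
# in the sanoid config.
# Usage: find_orphans SNAPSHOT_NAMES_FILE SANOID_CONF
find_orphans() {
    local snaps=$1
    local conf=$2
    awk -v conf="$conf" '
        # Config file: remember every [section] name without the brackets.
        FILENAME == conf {
            if ($0 ~ /^\[.*\]/) {
                name = $0
                sub(/^\[/, "", name); sub(/\].*$/, "", name)
                covered[name] = 1
            }
            next
        }
        # Snapshot names: the dataset is the part before "@".
        {
            split($0, part, "@")
            if (!(part[1] in covered) && !(part[1] in seen)) {
                seen[part[1]] = 1
                print part[1]
            }
        }
    ' "$conf" "$snaps"
}

# Live use:
#   zfs list -H -t snapshot -o name -r zroot > /tmp/snaps.txt
#   find_orphans /tmp/snaps.txt /usr/local/etc/sanoid/sanoid.conf
```
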
## 4. Available templates (sanoid.conf)

| Template | hourly | daily | monthly | Use case |
|----------|--------|-------|---------|----------|
| `operator_home_minimal` | 6 | 3 | 0 | Operator home dir |
| `operator_home_full` | 24 | 14 | 3 | Full home retention |
| `persistent_service` | 12 | 7 | 2 | Jails (cms, git, etc.) |
| `critical_data` | 24 | 14 | 3 | Databases (pgdata, pgwal) |

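The retention numbers in the table also give a ceiling to compare against: `operator_home_minimal` should hold about 6 + 3 + 0 = 9 autosnaps per dataset. A rough sketch of that comparison is below. `retention_check` is a hypothetical helper, and it assumes snapshot names end in `_hourly`, `_daily`, or `_monthly`, as in the `autosnap_..._daily` names used in this document. A count one above the limit can be transient if a snapshot was taken before the next prune ran.

```bash
# Hypothetical helper: compare autosnap counts (names on stdin) with a
# template's retention numbers.
# Usage: ... | retention_check HOURLY DAILY MONTHLY
retention_check() {
    awk -v h="$1" -v d="$2" -v m="$3" '
        /@autosnap_.*_hourly$/  { n["hourly"]++ }
        /@autosnap_.*_daily$/   { n["daily"]++ }
        /@autosnap_.*_monthly$/ { n["monthly"]++ }
        END {
            keep["hourly"] = h; keep["daily"] = d; keep["monthly"] = m
            split("hourly daily monthly", order, " ")
            for (i = 1; i <= 3; i++) {
                t = order[i]
                verdict = (n[t] + 0 > keep[t] + 0) ? "over" : "ok"
                printf "%s %d/%d %s\n", t, n[t], keep[t], verdict
            }
        }
    '
}

# Live use with the operator_home_minimal numbers from the table:
#   zfs list -H -t snapshot -o name -r zroot/home/clawdie | retention_check 6 3 0
```
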
## 5. Destroy orphaned snapshots

**Safe:** destroying a ZFS snapshot does NOT touch the live filesystem. Only the
blocks held exclusively by that point-in-time copy are freed. Verify first:

```bash
# Confirm: usedbysnapshots = dead weight, referenced = live data
zfs get -H -o property,value used,refer,usedbysnapshots zroot/home/clawdie
```

Destroy individual snapshots:

```bash
zfs destroy zroot/home/clawdie@autosnap_2026-04-20_00:15:00_daily
```

Or batch by range. In `zfs destroy`, `%` is a range separator, not a glob:
`@first%last` is an inclusive range, either end may be omitted, and a bare `@%`
selects every snapshot on the dataset. A prefix such as `@autosnap_%` is read as
a range starting at a snapshot literally named `autosnap_`, so it fails when no
such snapshot exists.

```bash
zfs destroy -nv zroot/home/clawdie@%   # dry run; drop -n to execute
```

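To limit a batch to sanoid's own snapshots and leave manual ones alone, one option is to filter by name and generate the commands for review. This is a sketch, not a tested procedure: `autosnap_destroy_plan` is a hypothetical helper, and it prints `zfs destroy -nv` (dry-run) commands instead of running anything.

```bash
# Hypothetical helper: turn snapshot names on stdin into dry-run destroy
# commands for one dataset, keeping only names with sanoid's autosnap_ prefix.
# Usage: ... | autosnap_destroy_plan DATASET
autosnap_destroy_plan() {
    local dataset=$1
    awk -F@ -v ds="$dataset" '
        $1 == ds && $2 ~ /^autosnap_/ {
            printf "zfs destroy -nv %s@%s\n", $1, $2
        }
    '
}

# Live use: review the printed commands, then rerun them without -n.
#   zfs list -H -t snapshot -o name -r zroot/home/clawdie \
#       | autosnap_destroy_plan zroot/home/clawdie
```
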
## 8. Disk-pressure after large deletions (snapshot vacuum)

When you delete a large directory (e.g. `cargo clean` freeing 5.5G of
`target/`), the space is NOT freed if sanoid-managed snapshots captured the
files. `df` shows no change. The deleted blocks stay referenced by the
snapshots until every snapshot that captured them is destroyed or rotated out.

**Symptom:** `df` unchanged after `cargo clean` or `rm -rf large-dir`.

**Check:**

```bash
zfs get -H -o value usedbysnapshots zroot/home/clawdie
# If high (>5G) and you just did a large deletion → vacuum needed
zfs list -t snapshot -o name,used,creation -r zroot/home/clawdie
# Find the snapshots taken while the deleted files still existed
```

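The per-snapshot `used` column counts only blocks unique to that snapshot, so data shared by several snapshots does not appear in any single row and the column can under-report what a vacuum would free. A dry-run destroy over the whole candidate set reports the combined figure. The sketch below parses that output; `projected_reclaim_gib` is a hypothetical helper, and it assumes `zfs destroy -nvp` emits a `reclaim<TAB><bytes>` line, which should be checked against the installed ZFS version.

```bash
# Hypothetical helper: read `zfs destroy -nvp ...` output on stdin and print
# the projected reclaim in GiB.
projected_reclaim_gib() {
    awk -F '\t' '
        $1 == "reclaim" { bytes += $2; found = 1 }
        END {
            if (!found) { print "no reclaim line found"; exit 1 }
            printf "%.1f GiB\n", bytes / (1024 * 1024 * 1024)
        }
    '
}

# Live use (dry run only, nothing is destroyed):
#   zfs destroy -nvp zroot/home/clawdie@% | projected_reclaim_gib
```
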
**Fix (reclaim space immediately):**

```bash
# Destroy ALL snapshots on the dataset (aggressive, zero history retained).
# `@%` is the open-ended range "oldest through newest", manual snapshots included:
sudo zfs destroy zroot/home/clawdie@%

# Or: destroy specific snapshots that captured pre-deletion state:
sudo zfs destroy zroot/home/clawdie@autosnap_2026-06-24_14:00:00_hourly
# Repeat for each snapshot, then verify:
zfs get -H -o value usedbysnapshots zroot/home/clawdie
# Should approach 0B once every holding snapshot is gone
```

**After reclaim:**

```bash
df -h /home/clawdie    # free space should jump
zpool list zroot       # pool capacity drops
```

Sanoid will begin taking fresh snapshots on its next cron tick.

**Prevent next time:** if the space is not needed right away, run
`sudo sanoid --prune-snapshots` after a large deletion and wait. Pruning removes
only snapshots that already exceed the template's retention counts, so the
deleted data is released as the snapshots holding it age out (roughly 6 hours
for hourlies and 3 days for dailies under `operator_home_minimal`), without
losing the daily safety net.

## 9. Add missing dataset to sanoid config

Append to `/usr/local/etc/sanoid/sanoid.conf` (root-owned, use `sudo tee -a`):

```
[zroot/home/clawdie]
use_template = operator_home_minimal
```

Verify it takes effect (next `sanoid --prune-snapshots` cron run, or manually):

```bash
sudo sanoid --prune-snapshots --verbose 2>&1 | grep home/clawdie
```

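Appending by hand makes it easy to add the same section twice. A minimal sketch of an idempotent append follows; `add_sanoid_dataset` is a hypothetical helper that writes directly to the file, so on the real config it would need to run as root or be adapted to `sudo tee -a`.

```bash
# Hypothetical helper: append a dataset section to a sanoid config only if
# it is not already present.
# Usage: add_sanoid_dataset CONF DATASET TEMPLATE
add_sanoid_dataset() {
    local conf=$1
    local dataset=$2
    local template=$3
    # -x: whole-line match, -F: treat the brackets literally
    if grep -qxF "[$dataset]" "$conf" 2>/dev/null; then
        echo "already present: $dataset"
        return 0
    fi
    printf '\n[%s]\nuse_template = %s\n' "$dataset" "$template" >> "$conf"
    echo "added: $dataset"
}

# Example against a scratch copy, not the live config:
#   cp /usr/local/etc/sanoid/sanoid.conf /tmp/sanoid.conf
#   add_sanoid_dataset /tmp/sanoid.conf zroot/home/clawdie operator_home_minimal
```
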
## 10. Verify after cleanup

```bash
zfs get -H -o value usedbysnapshots zroot/home/clawdie
# Should be 0B (or low if new autosnaps have been taken)
zpool list zroot
df -h /
```

## Pitfalls

- **Prefer sanoid prune over manual destroy.** If disk pressure is not urgent,
  run `sudo sanoid --prune-snapshots` and let the retention policy rotate out old
  snapshots. This preserves the daily safety net.
- **Exception: snapshot vacuum after large deletions.** When you need the space
  NOW (e.g. `cargo clean` freed 5.5G but `df` shows no change), manual destroy
  of sanoid-managed snapshots is warranted. See §8 above. Destroy all snapshots
  on the target dataset, then let sanoid rebuild them.
- **Don't destroy snapshots on datasets with `autoprune=yes`** during normal
  operations, because you'll be fighting the cron job. The vacuum case is the
  only exception.
- The config header says "do not edit by hand", but the dataset list is the
  operator's domain, so adding a dataset is safe.

## Discovery log

2026-06-24: Hit "snapshot vacuum": `cargo clean` freed 5.5G but `df` still
showed 16G free. `usedbysnapshots` = 26.6G. Sanoid's hourly snapshots
(14:00-20:00) captured the `target/` directory before deletion. Destroyed a May
boot environment (3.5G), old checkpoints (14M), pre-reinstall snaps (3.2M), then
all 9 sanoid hourly+daily snaps on home/clawdie. Final: 16G → 29G free, pool
80% → 72%. Lesson: after any large deletion, either run
`sanoid --prune-snapshots` and wait for retention to rotate the snapshots out,
or destroy manually if desperate. Added §8 to this skill.

2026-06-22: `zroot/home/clawdie` was missing from sanoid config. 10 orphaned
snapshots from April 20-22 held 23.6G of dead weight (`usedbysnapshots`).
Added `operator_home_minimal` template and destroyed all 10. Freed 23.5G.