From 43c43a4848232ff4614eae879fbc1a7251cbd806 Mon Sep 17 00:00:00 2001
From: Sam & Claude
Date: Sun, 28 Jun 2026 00:15:44 +0200
Subject: [PATCH 1/2] feat(skills): fail2ban-tailscale + freebsd-admin PF rate limiting
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

fail2ban-tailscale (new skill):
  Root cause: key negotiation triggers password-fallback, fail2ban bans IP
  Path A: PasswordAuthentication no (one line, zero maintenance)
  Path B: Specific fleet IP whitelist (if passwords must stay on)
  Path C: Both (production hardening)
  Security: do NOT whitelist 100.64.0.0/10 (trusts every tailnet)
  FreeBSD PF equivalent: max-src-conn-rate + overload table
  Platform table: Linux fail2ban / FreeBSD PF / Mother PF

freebsd-admin (PF SSH rate limiting):
  max-src-conn-rate 5/60 + overload table
  Manual operations: show, delete specific IP, flush
  Cross-reference to fail2ban-tailscale skill
  Rule placement guidance (block drop all last, pass out first)

Wiki-lint: 187 refs, 0 failures. Prettier 3.8.4: clean.
---
 .agent/skills/fail2ban-tailscale/SKILL.md | 123 ++++++++++++++++++++++
 .agent/skills/freebsd-admin/SKILL.md      |  40 ++++++-
 2 files changed, 161 insertions(+), 2 deletions(-)
 create mode 100644 .agent/skills/fail2ban-tailscale/SKILL.md

diff --git a/.agent/skills/fail2ban-tailscale/SKILL.md b/.agent/skills/fail2ban-tailscale/SKILL.md
new file mode 100644
index 0000000..4175689
--- /dev/null
+++ b/.agent/skills/fail2ban-tailscale/SKILL.md
@@ -0,0 +1,123 @@
+---
+name: fail2ban-tailscale
+description: "Prevent fail2ban from banning fleet SSH traffic. Root cause: password auth enabled triggers password-fallback failures during key negotiation. Fix: disable password auth or whitelist fleet IPs."
+platforms: [linux]
+---
+
+# fail2ban & Fleet SSH Reliability
+
+## Root cause
+
+When a fleet node connects via SSH and its key is not accepted on the
+first attempt, the session falls back to password authentication. Those
+password failures accumulate in fail2ban's counters. After `maxretry = 5`,
+the source Tailscale IP is banned, breaking all fleet SSH to that node.
+
+The trigger is NOT a brute-force attack. It's the key negotiation
+sequence between trusted nodes during normal fleet operation.
+
+## Fix: choose one path
+
+### Path A: Disable password auth (recommended if key-only)
+
+One line, permanent. It removes the password attack surface entirely, and
+with no password attempts there are no fallback failures for fail2ban to count:
+
+```sh
+sudo sed -i 's/^#*PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
+sudo systemctl reload sshd
+```
+
+Pros: Zero ongoing maintenance. Works for all hosts, known or unknown.
+No IP lists to update. fail2ban stops seeing password failures from the fleet.
+
+Cons: Password login is disabled. If a node loses its private key,
+physical/console access is needed. Not suitable for OOTB setups that
+need password auth.
+
+Verification:
+
+```sh
+ssh -o PreferredAuthentications=password localhost
+# Should fail: "Permission denied (publickey)"
+```
+
+### Path B: Whitelist specific fleet IPs (if password auth must stay on)
+
+Use this for nodes that need password auth (OOTB state, temporary access,
+shared machines). Whitelist only known fleet nodes. Do NOT whitelist the
+entire `100.64.0.0/10`, which would trust every Tailscale device on any
+tailnet:
+
+```sh
+# Get fleet IPs from any node:
+tailscale status | awk '/active|idle/{print $1}'
+
+echo '[DEFAULT]
+ignoreip = 127.0.0.1/8 ::1 100.72.229.63 100.103.255.41 100.73.44.93 100.108.235.54
+
+[sshd]
+enabled = true' | sudo tee /etc/fail2ban/jail.local && sudo systemctl reload fail2ban
+```
+
+Pros: Password auth stays usable for operators.
+
+Cons: Manual maintenance. Add each new node's IP when it joins, and update
+the list when an IP changes. A forgotten update means the ban returns.
+
+### Path C: Both (production hardening)
+
+Two independent controls: if someone accidentally re-enables passwords,
+the whitelist still protects; if the whitelist misses a node, key-only
+auth still blocks password brute-force. Apply both Path A and Path B.
+
+## What happens without this
+
+The symptom is `Connection refused` on port 22, even when:
+
+- `sshd` is running and listening on `0.0.0.0:22`
+- `ufw`/`iptables` allows port 22
+- `tailscale ping` works (35ms pong)
+
+The fail2ban ban targets the Tailscale IP, so the node appears reachable
+but its SSH packets are rejected by the firewall rule fail2ban installed.
+
+## FreeBSD equivalent: PF rate limiting
+
+FreeBSD nodes don't use fail2ban. The equivalent is PF SSH rate limiting
+with `max-src-conn-rate` and an overload table:
+
+```pf
+# /etc/pf.conf
+table <ssh_brutes> persist
+
+block quick from <ssh_brutes>  # must precede the quick pass rule below
+
+pass in quick on tailscale0 proto tcp from any to any port = ssh \
+  flags S/SA keep state \
+  (max-src-conn-rate 5/60, overload <ssh_brutes> flush global)
+```
+
+5 new connections per 60 seconds per source IP. Exceeding that adds the source to
+`<ssh_brutes>`, where it stays until removed; a 10-minute ban needs a scheduled
+`pfctl -t ssh_brutes -T expire 600`. Only new TCP handshakes are counted.
+
+Manual unban:
+
+```sh
+sudo pfctl -t ssh_brutes -T delete 100.72.229.63
+```
+
+## Platform summary
+
+| Platform     | Tool     | Fix                                            |
+| ------------ | -------- | ---------------------------------------------- |
+| Linux        | fail2ban | Path A (password off) or Path B (IP whitelist) |
+| FreeBSD      | PF       | `max-src-conn-rate` + overload table           |
+| Mother (osa) | PF       | `max-src-conn-rate` on tailscale0 SSH rule     |
+
+## Related
+
+- `freebsd-admin`: PF rule management, `max-src-conn-rate` SSH rate limiting
+- `mother-hive` wiki: per-node SSH key strategy, forced-command confinement
+- `hive-routing` wiki: fleet communication reliability
diff --git a/.agent/skills/freebsd-admin/SKILL.md b/.agent/skills/freebsd-admin/SKILL.md
index b37cc77..aba9d5a 100644
--- a/.agent/skills/freebsd-admin/SKILL.md
+++ b/.agent/skills/freebsd-admin/SKILL.md
@@ -56,11 +56,47 @@ For update-status questions, use the existing read-only hostd audit ops
 the sysadmin update-report path. Do not expose `freebsd-update fetch` or run
 mutating update commands for a status report.
 
-## Tailscale controlplane exposure
+## SSH & service exposure (PF rules)
 
-When the controlplane API/dashboard is only exposed on Tailscale:
+### Controlplane service ports
+
+When the controlplane API/dashboard is exposed on Tailscale:
 
 - allow `tailscale0` ingress to ports `3100` (direct API) and `443` (nginx proxy)
+- validate PF before reload (`sudo pfctl -nf /etc/pf.conf`) and then `sudo pfctl -f /etc/pf.conf`
+
+### SSH rate limiting (FreeBSD equivalent of fail2ban)
+
+FreeBSD doesn't use fail2ban. PF handles SSH brute-force protection with
+`max-src-conn-rate` and an overload table:
+
+```pf
+# /etc/pf.conf
+table <ssh_brutes> persist
+
+block quick from <ssh_brutes>  # must precede the quick pass rule below
+
+pass in quick on tailscale0 proto tcp from any to any port = ssh \
+  flags S/SA keep state \
+  (max-src-conn-rate 5/60, overload <ssh_brutes> flush global)
+```
+
+- `5/60`: 5 new connections per 60 seconds per source IP
+- `overload`: source is added to the `<ssh_brutes>` table when the limit is exceeded
+- `flush global`: kills all existing states from the offending source; entries stay in the table until removed (for a 10-minute ban, schedule `pfctl -t ssh_brutes -T expire 600`)
+- `keep state`: stateful tracking, so only new connections count toward the limit; established sessions are unaffected
+
+Manual operations:
+
+```sh
+sudo pfctl -t ssh_brutes -T show                  # list banned IPs
+sudo pfctl -t ssh_brutes -T delete 100.72.229.63  # unban specific IP
+sudo pfctl -t ssh_brutes -T flush                 # clear all bans
+```
+
+For the Linux fleet fail2ban equivalent, see
+[fail2ban-tailscale skill](../fail2ban-tailscale/SKILL.md).
+
 - validate PF before reload (`sudo pfctl -nf /etc/pf.conf`) and then `sudo service pf reload`
 
 ## Workflow
-- 
2.45.3

From 40f091135d1dac787a6126f5e82ce979951142c6 Mon Sep 17 00:00:00 2001
From: Sam & Claude
Date: Sun, 28 Jun 2026 00:20:33 +0200
Subject: [PATCH 2/2] fix(skills): remove duplicate PF validate line in freebsd-admin SKILL
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The PR added a 'validate PF before reload' bullet in the Controlplane
service ports subsection, but the original file already had one at the
end using the FreeBSD-native 'service pf reload'. Keep only the one at
the bottom to avoid confusing operators with two different reload commands.

Sam & Claude
---
 .agent/skills/freebsd-admin/SKILL.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/.agent/skills/freebsd-admin/SKILL.md b/.agent/skills/freebsd-admin/SKILL.md
index aba9d5a..1bc74a0 100644
--- a/.agent/skills/freebsd-admin/SKILL.md
+++ b/.agent/skills/freebsd-admin/SKILL.md
@@ -63,7 +63,6 @@ mutating update commands for a status report.
 When the controlplane API/dashboard is exposed on Tailscale:
 
 - allow `tailscale0` ingress to ports `3100` (direct API) and `443` (nginx proxy)
-- validate PF before reload (`sudo pfctl -nf /etc/pf.conf`) and then `sudo pfctl -f /etc/pf.conf`
 
 ### SSH rate limiting (FreeBSD equivalent of fail2ban)
 
-- 
2.45.3
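
A possible follow-up for Path B of the fail2ban-tailscale skill, not part of either patch: the `ignoreip` line is maintained by hand, so a small helper could regenerate it from `tailscale status` output. The sketch below is hypothetical. `render_jail_local` is an invented name, and it assumes the first column of `tailscale status` is each node's IPv4 tailnet address. It only prints the file body; installing it and reloading fail2ban stay manual.

```sh
#!/bin/sh
# Hypothetical helper (not in the patch): read `tailscale status` output on
# stdin and print a jail.local body whose ignoreip lists the IPv4 addresses
# found in column 1.
render_jail_local() {
    # Assumption: column 1 holds the node's tailnet IPv4 address.
    # Lines whose first field is not IPv4-shaped (comments, blanks) are skipped.
    ips=$(awk '$1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { printf "%s%s", sep, $1; sep = " " }')
    # Loopback entries are always kept; fleet IPs are appended only if any were found.
    printf '[DEFAULT]\nignoreip = 127.0.0.1/8 ::1%s\n\n[sshd]\nenabled = true\n' "${ips:+ $ips}"
}
```

Usage would look like `tailscale status | render_jail_local`, with the output reviewed before it replaces `/etc/fail2ban/jail.local`: the list covers every peer the node can see, which may be wider than the fleet nodes the skill says to whitelist.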