feat/wiki-astro #214

Merged
clawdie merged 19 commits from feat/wiki-astro into main 2026-06-26 14:16:50 +02:00
Showing only changes of commit 9643790739

.agent/nginx/SKILL.md Normal file

@ -0,0 +1,677 @@
---
name: nginx
description: Manage nginx web serving for the cms jail on FreeBSD. Host nginx terminates SSL and proxies to jail nginx at ${AGENT_SUBNET_BASE}.4. Use when configuring vhosts, SSL, static site serving, or Strapi reverse proxy. Triggers on "nginx", "website", "vhost", "update site", "add vhost", "ssl".
---
# Nginx
Use this skill for nginx decisions on the **cms jail** (`${AGENT_SUBNET_BASE}.4`) and the **host nginx** SSL proxy layer.
## Current surface model
- Operator app: `ai.<base>` — not served by cms jail nginx
- Shared CMS admin/API: `cms.<base>` — shared service surface
- Shared code admin: `git.<base>` — separate service surface, not cms nginx
- Tenant home: `<tenant>.<base>` — served by cms jail nginx
- Tenant site: `<site>.<tenant>.<base>` — served by cms jail nginx
For the internal default install, `<base>` is `home.arpa`.
## Actual architecture (as deployed)
```text
operator at ai.<base>
→ controlplane app (not cms nginx)
tenant/browser at <tenant>.<base> or <site>.<tenant>.<base>
→ host nginx (optional public SSL terminator / proxy)
→ cms jail nginx at ${AGENT_SUBNET_BASE}.4
→ tenant home + tenant sites (static output)
operator/editor at cms.<base>
→ host nginx (optional public/internal proxy)
→ cms jail nginx
→ Strapi admin + CMS API
```
**Host nginx** terminates SSL and proxies to the jail when public exposure exists.
**Jail nginx** is the real web server for tenant homes, tenant sites, and the shared CMS surface.
This diverges from the original design (PF RDR → jail nginx handling SSL itself). The reason:
host nginx was already running for clawdie.si, so PF RDR to the jail would break the existing
site. Host-proxy is the correct pattern when multiple domains share the same host.
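A minimal sketch of the host-side HTTPS proxy vhost described above, assuming the cert layout under `/usr/local/etc/nginx/ssl/<name>/` used elsewhere in this skill. `render_proxy_vhost` is a hypothetical helper, and the forwarded headers are a common choice rather than a copy of the deployed config.

```sh
# Hypothetical helper: print a host nginx vhost that terminates TLS and
# proxies everything to the cms jail. Arguments: domain, cert dir name, jail IP.
render_proxy_vhost() {
  domain=$1
  name=$2
  jail_ip=$3
  # Unquoted heredoc: shell variables expand, nginx variables are escaped with \$
  cat << EOF
server {
    listen 443 ssl; listen [::]:443 ssl;
    server_name ${domain};
    ssl_certificate /usr/local/etc/nginx/ssl/${name}/fullchain.cer;
    ssl_certificate_key /usr/local/etc/nginx/ssl/${name}/${name}.key;
    location / {
        proxy_pass http://${jail_ip};
        # Pass the original host so the jail nginx can route by server_name
        proxy_set_header Host \$host;
        proxy_set_header X-Real-IP \$remote_addr;
        proxy_set_header X-Forwarded-For \$proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto \$scheme;
    }
}
EOF
}
```

One way to use it: `render_proxy_vhost <domain> <name> "${AGENT_SUBNET_BASE}.4" > /usr/local/etc/nginx/vhosts/<domain>.conf`, followed by `nginx -t && service nginx reload`. Passing `Host` through matters because the jail nginx selects tenant vhosts by `server_name`.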
## Scope
This skill covers:
- Host nginx SSL proxy vhosts (SSL termination, proxy_pass to jail)
- Jail nginx server_name routing for `cms.<base>`, `<tenant>.<base>`, and `<site>.<tenant>.<base>`
- SSL certificate management for public-exposed surfaces
- ACME challenge pattern when host nginx proxies to jail
- Strapi admin/API reverse proxy on the shared CMS surface
- Tenant home and tenant-site static serving
This skill does not replace:
- `warden-pf` for PF firewall rules
- `freebsd-admin` for host-level system changes
## SSL certificate tools (two tools on this host)
| Domain | Tool | Location |
| ------------------------------------------------------- | ------- | ------------------------------------------- |
| `clawdie.si`, `docs.clawdie.si`, `osa.smilepowered.org` | acme.sh | `/root/.acme.sh/<domain>_ecc/` |
| `samob.smilepowered.org` | certbot | `/usr/local/etc/letsencrypt/live/<domain>/` |
Keep the tools separate. Do not migrate certbot domains to acme.sh without a plan.
`just doctor` audits the canonical acme.sh-backed public cert paths and reports `TLS_<LABEL>` expiry plus `ACME_RENEWAL_CRON` presence. Treat missing renewal cron as an operational warning; do not renew or reinstall cert tooling from the doctor path.
## ACME challenge pattern (when host proxies to jail)
When host nginx is proxying a domain to the jail, certbot HTTP-01 renewal must
be served from the **host**, not the jail. The jail doesn't know about certbot.
Pattern in every proxied host vhost:
```nginx
server {
listen 80; listen [::]:80;
server_name <domain>;
# certbot challenge served from host — do NOT proxy this
location /.well-known/acme-challenge/ {
root /var/www/certbot-challenge;
}
location / {
return 301 https://$server_name$request_uri;
}
}
```
The directory `/var/www/certbot-challenge` must exist on the host. certbot writes
challenge files there; nginx serves them before the HTTPS redirect fires.
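Before running a renewal, it can help to confirm that the host really serves the challenge path instead of redirecting it. The sketch below is an assumption-labelled aid, not part of certbot: `challenge_probe` is a hypothetical helper that drops a probe file into the webroot and prints the URL to fetch.

```sh
# Hypothetical helper: create a probe file under the ACME challenge path
# and print the HTTP URL that should serve it. Arguments: webroot, domain.
challenge_probe() {
  webroot=$1
  domain=$2
  token="probe-$$"
  dir="$webroot/.well-known/acme-challenge"
  mkdir -p "$dir" || return 1
  printf 'ok\n' > "$dir/$token" || return 1
  printf 'http://%s/.well-known/acme-challenge/%s\n' "$domain" "$token"
}
```

For example, `curl -s "$(challenge_probe /var/www/certbot-challenge <domain>)"` should print `ok`. A `301` or `404` means the challenge location is not being served from the host. Remove the probe file afterwards.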
## Paths
**Host nginx:**
- vhosts: `/usr/local/etc/nginx/vhosts/*.conf`
- certbot certs: `/usr/local/etc/letsencrypt/live/<domain>/`
- acme.sh certs: `/usr/local/etc/nginx/ssl/<name>/`
- ACME challenge webroot: `/var/www/certbot-challenge/`
**Jail nginx** — commands via `bastille cmd cms <cmd>` or `bastille console cms`:
- nginx config: `/usr/local/etc/nginx/nginx.conf`
- vhosts: `/usr/local/etc/nginx/vhosts/*.conf`
- webroots: `/var/www/<site>/dist/`
## Hosted surfaces
| Surface | Typical host | Proxied from host | Served by cms jail nginx |
| ------------- | ------------------------ | ----------------- | ------------------------ |
| Tenant home | `<tenant>.<base>` | maybe | yes |
| Tenant site | `<site>.<tenant>.<base>` | maybe | yes |
| CMS admin/API | `cms.<base>` | maybe | yes |
| Operator app | `ai.<base>` | separate stack | no |
**acme.sh webroot for host-proxied domains:** keep a real directory at
`/usr/local/www/<domain>/` on the host even when the domain proxies to the jail.
acme.sh --webroot renewal writes challenge files there; the HTTP vhost serves
`/.well-known/acme-challenge/` from it before the HTTPS redirect fires.
## docs.clawdie.si shape
`docs.clawdie.si` is a public static documentation site, not a reverse proxy app.
Recommended site structure (inside cms jail):
```text
/usr/local/www/docs.clawdie.si/
index.html
css/
shared.css
docs/
index.html
split-brain.html
```
Recommended vhost:
```nginx
server {
listen 80;
listen [::]:80;
server_name docs.clawdie.si;
return 301 https://docs.clawdie.si$request_uri;
}
server {
listen 443 ssl;
listen [::]:443 ssl;
server_name docs.clawdie.si;
root /usr/local/www/docs.clawdie.si;
index index.html;
ssl_certificate /usr/local/etc/nginx/ssl/docs/fullchain.cer;
ssl_certificate_key /usr/local/etc/nginx/ssl/docs/docs.key;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers HIGH:!aNULL:!MD5;
add_header X-Content-Type-Options "nosniff" always;
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-XSS-Protection "1; mode=block" always;
add_header Referrer-Policy "strict-origin-when-cross-origin" always;
location /docs/ {
try_files $uri $uri/ /docs/index.html =404;
}
location / {
try_files $uri $uri/ =404;
}
}
```
Use this site to explain:
- FreeBSD-first deployment
- split-brain memory
- local built-in knowledge in the `db` jail
- upstream-aware relationship to NanoClaw
Recommended baseline for all public vhosts:
- `add_header X-Content-Type-Options "nosniff" always;`
- `add_header X-Frame-Options "SAMEORIGIN" always;`
- `add_header X-XSS-Protection "1; mode=block" always;`
- `add_header Referrer-Policy "strict-origin-when-cross-origin" always;`
## Site structure: public Clawdie site
Inside the cms jail at `/usr/local/www/clawdie/`:
```text
/usr/local/www/clawdie/
index.html # Main landing page
css/shared.css # Shared styles
docs/index.html # Documentation page
img/ # Public images (used on landing page)
guides/
freebsd-setup.html # FreeBSD setup guide
nginx-ssl.html # Nginx + SSL guide
stripe-agents.html # Stripe agents guide
tailscale-vpn.html # Tailscale VPN guide
screenshots/ # Diagnostic screenshots (basic auth protected)
```
## Public vs protected paths
**Pattern:** Use `/img/` for public images, `/screenshots/` for protected content.
| Path | Purpose | Auth |
| --------------- | -------------------------------------- | ---------- |
| `/img/` | Public images used on landing page | None |
| `/screenshots/` | Diagnostic screenshots, wizard gallery | Basic auth |
**Why separate directories?**
If you include an image from `/screenshots/` on the landing page, the browser triggers an auth prompt when loading the page. This creates a bad user experience.
**Solution:**
1. Public images → `/usr/local/www/clawdie/img/` inside the `cms` jail
2. Protected galleries → `/usr/local/www/clawdie/screenshots/` inside the `cms` jail
```nginx
# nginx config pattern
location /img/ {
try_files $uri $uri/ =404; # public, no auth
}
location /screenshots/ {
auth_basic "Diagnostics";
auth_basic_user_file /usr/local/etc/nginx/screenshots.htpasswd;
try_files $uri $uri/ =404;
}
```
**HTML usage:**
```html
<!-- Landing page - use public /img/ -->
<img src="/img/screenshot.png" />
<!-- Protected gallery - link to /screenshots/ -->
<a href="/screenshots/wizard.html">View gallery</a>
```
## Controlplane dashboard (Tailscale)
When exposing the operator dashboard over Tailscale, host nginx serves TLS for the MagicDNS hostname and proxies to the local controlplane API.
Recommended pattern:
- vhost: `/usr/local/etc/nginx/vhosts/controlplane-tailscale.conf`
- certs: `/usr/local/etc/nginx/ssl/tailscale/<magicdns>.crt` + `.key` (from `tailscale cert`)
- proxy: `http://127.0.0.1:3100`
- header: `Authorization: Bearer op:clawdie:${OPERATOR_PASSWORD}` when running `CONTROLPLANE_AUTH_MODE=local_trusted`
This is the `ai.<base>` surface, not the tenant home app.
## Protected paths
| Path | Auth | htpasswd file | Credentials |
| --------------- | ---------- | ------------------------------------------- | ------------------------------------------------------ |
| `/screenshots/` | basic auth | `/usr/local/etc/nginx/screenshots.htpasswd` | in `.env` (`SCREENSHOTS_USER`, `SCREENSHOTS_PASSWORD`) |
### Adding basic auth to a new path
Run inside the cms jail (`bastille console cms`):
```sh
# 1. generate password hash
openssl passwd -apr1 'your-password'
# 2. write htpasswd file (quoted 'EOF' stops the shell from expanding $apr1$...)
cat > /usr/local/etc/nginx/newpath.htpasswd << 'EOF'
username:$apr1$hash...
EOF
chmod 640 /usr/local/etc/nginx/newpath.htpasswd
chown root:www /usr/local/etc/nginx/newpath.htpasswd
# 3. add location block to vhost
# location /newpath/ {
# auth_basic "Description";
# auth_basic_user_file /usr/local/etc/nginx/newpath.htpasswd;
# try_files $uri $uri/ =404;
# }
# 4. test and reload
nginx -t && service nginx reload
```
Store credentials in `/home/clawdie/clawdie-ai/.env`. Never in skill files.
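To avoid pasting a hash by hand, the entry can be generated in one step. This is a sketch under assumptions: `htpasswd_line` is a hypothetical helper, and it relies on `openssl passwd -apr1 -stdin` so the password never appears in the process list.

```sh
# Hypothetical helper: print one htpasswd entry for the given user.
# The password is read from stdin, not passed as an argument.
htpasswd_line() {
  user=$1
  hash=$(openssl passwd -apr1 -stdin) || return 1
  printf '%s:%s\n' "$user" "$hash"
}
```

With the `.env` values loaded, `printf '%s\n' "$SCREENSHOTS_PASSWORD" | htpasswd_line "$SCREENSHOTS_USER" > /usr/local/etc/nginx/screenshots.htpasswd` writes the file. Apply the same `chmod 640` and `chown root:www` as above.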
## Safe defaults
- Always run `nginx -t` before reloading
- Never reload nginx with a broken config
- Back up vhost configs before modifying
- Keep CSS in shared files, not inline (except index.html which is self-contained)
- Test changes locally before pushing to production
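The backup and test-before-reload defaults can be wrapped in two small functions. This is a minimal sketch; `backup_vhost` and `safe_reload` are hypothetical names, not existing tooling in this repo.

```sh
# Hypothetical helper: copy a vhost file to a timestamped backup next to it
# and print the backup path. The .bak.<timestamp> suffix does not match a
# vhosts/*.conf include glob, so nginx will not load the backup.
backup_vhost() {
  src=$1
  dest="$src.bak.$(date +%Y%m%d%H%M%S)"
  cp -p "$src" "$dest" || return 1
  printf '%s\n' "$dest"
}

# Reload only when the config test passes (same commands used in this skill)
safe_reload() {
  nginx -t && service nginx reload
}
```

A typical sequence is `backup_vhost /usr/local/etc/nginx/vhosts/<domain>.conf`, edit the file, then `safe_reload`. If the test fails, copy the printed backup path back over the vhost.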
## Workflow
Be explicit about which nginx owns the surface:
- **Host nginx** terminates public TLS for `clawdie.si`, `docs.clawdie.si`, and other host-level domains.
- **cms jail nginx** serves tenant/static content behind the host proxy or internal routes.
Run host nginx/acme.sh commands on the host. Run jail webroot and jail-vhost commands inside the cms jail:
```sh
bastille console cms
# or
bastille cmd cms sh
```
### Updating site content
1. Edit the HTML file directly in the webroot (inside the jail)
2. Changes are served immediately (static files, no build step)
3. For structural changes, run `nginx -t` then `service nginx reload`
### Adding a new public static HTTPS site — full flow (wiki.clawdie.si pattern)
This is the canonical pattern for deploying a static HTTPS site on the host
nginx, drawn from the `wiki.clawdie.si` deployment (2026-06-26). It covers the
three hiccups that reliably trip first-time deploys and how to avoid each one.
### 0. DNS first
Verify the A/AAAA record resolves before touching nginx or acme.sh. The server
cannot reach its own public IP (PF blocks loopback), so query the authoritative
nameserver directly:
```sh
drill NS clawdie.si | grep "ANSWER" -A5
drill wiki.clawdie.si A @x1.si.
```
Do not proceed until the authoritative NS returns the correct IP.
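The "do not proceed" gate can be scripted by parsing the drill output. The sketch below assumes the usual dig-style record lines (`name TTL IN A address`); `answer_ips` and `dns_ready` are hypothetical helper names.

```sh
# Hypothetical helper: read drill output on stdin and print the A-record
# addresses for the given name. Comment lines (starting with ;) are skipped.
answer_ips() {
  awk -v n="$1." '$1 == n && $3 == "IN" && $4 == "A" { print $5 }'
}

# Succeeds only when the expected address is among the answers for the name
dns_ready() {
  name=$1
  expected=$2
  answer_ips "$name" | grep -Fxq "$expected"
}
```

For example: `drill wiki.clawdie.si A @x1.si. | dns_ready wiki.clawdie.si <expected-ip> && echo "DNS ok"`. Filtering on the record name keeps nameserver glue records in the additional section from counting as a match.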
### 1. Create webroot + placeholder cert
**Hiccup #1: nginx refuses to start when the SSL cert file doesn't exist.**
The vhost references `ssl_certificate` and `ssl_certificate_key` — if those
files are absent, `nginx -t` fails and you can't even start the HTTP server for
ACME validation. Fix: create a **temporary self-signed cert** first:
```sh
mkdir -p /usr/local/www/wiki.clawdie.si
mkdir -p /usr/local/etc/nginx/ssl/wiki
# Placeholder cert — lets nginx start so ACME can validate
openssl req -x509 -nodes -days 1 -newkey ec \
-pkeyopt ec_paramgen_curve:prime256v1 \
-keyout /usr/local/etc/nginx/ssl/wiki/wiki.key \
-out /usr/local/etc/nginx/ssl/wiki/fullchain.cer \
-subj "/CN=wiki.clawdie.si"
```
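### 2. Write the vhost with the ACME challenge outside the redirect
**Hiccup #2: the HTTP→HTTPS redirect swallows the ACME challenge when it is
written as a server-level `return 301`.** A `return` placed directly in the
`server` block runs before nginx selects a location, so the challenge path is
redirected too. Keep the redirect inside `location /` and give
`/.well-known/acme-challenge/` its own location. nginx picks the longest
matching prefix location, so the order of these two blocks in the file does not
matter: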
```nginx
server {
listen 80; listen [::]:80;
server_name wiki.clawdie.si;
# ACME challenge: own location, so the redirect in `location /` never catches it
location /.well-known/acme-challenge/ {
root /usr/local/www/wiki.clawdie.si;
}
location / {
return 301 https://wiki.clawdie.si$request_uri;
}
}
server {
listen 443 ssl; listen [::]:443 ssl;
server_name wiki.clawdie.si;
root /usr/local/www/wiki.clawdie.si;
index index.html;
ssl_certificate /usr/local/etc/nginx/ssl/wiki/fullchain.cer;
ssl_certificate_key /usr/local/etc/nginx/ssl/wiki/wiki.key;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers HIGH:!aNULL:!MD5;
# Security headers — baseline for all public vhosts
add_header X-Content-Type-Options nosniff always;
add_header X-Frame-Options SAMEORIGIN always;
add_header X-XSS-Protection "1; mode=block" always;
add_header Referrer-Policy strict-origin-when-cross-origin always;
location /.well-known/acme-challenge/ {
root /usr/local/www/wiki.clawdie.si;
}
location / {
try_files $uri $uri/ =404;
}
}
```
Write to `/usr/local/etc/nginx/vhosts/<domain>.conf`, then:
```sh
nginx -t && service nginx reload
```
### 3. Issue the real cert (replaces placeholder)
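**Hiccup #3: the real cert must land on the exact paths nginx already
references.** Passing `--key-file` and `--fullchain-file` to `acme.sh --issue`
makes acme.sh write directly to the nginx SSL paths, overwriting the
placeholder. If those paths differ from the vhost's `ssl_certificate` and
`ssl_certificate_key`, nginx keeps serving the placeholder: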
```sh
# --reloadcmd runs now and is stored, so every renewal also reloads nginx
acme.sh --issue -d wiki.clawdie.si -w /usr/local/www/wiki.clawdie.si \
--key-file /usr/local/etc/nginx/ssl/wiki/wiki.key \
--fullchain-file /usr/local/etc/nginx/ssl/wiki/fullchain.cer \
--reloadcmd "service nginx reload"
```
Verify the cert replaced the placeholder:
```sh
openssl x509 -in /usr/local/etc/nginx/ssl/wiki/fullchain.cer -noout -issuer
# Should show a Let's Encrypt issuer (e.g. "O = Let's Encrypt, CN = R11"; the intermediate name varies), not the placeholder CN
```
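The same check can be made without reading the issuer by eye: a placeholder created with `openssl req -x509` is self-signed, so its subject equals its issuer. A minimal sketch, with `is_self_signed` as a hypothetical helper name:

```sh
# Hypothetical helper: exit 0 if the cert is self-signed (subject == issuer),
# 1 if it was issued by another CA, 2 if the file cannot be read.
is_self_signed() {
  cert=$1
  [ -r "$cert" ] || return 2
  subject=$(openssl x509 -in "$cert" -noout -subject | sed 's/^subject= *//')
  issuer=$(openssl x509 -in "$cert" -noout -issuer | sed 's/^issuer= *//')
  [ -n "$subject" ] && [ "$subject" = "$issuer" ]
}
```

Running `is_self_signed /usr/local/etc/nginx/ssl/wiki/fullchain.cer && echo "still the placeholder"` after issuance should print nothing once the real cert is in place.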
### 4. Content deployment
Static content goes to the webroot. For content built inside the CMS jail, use
tar to cross the jail boundary:
```sh
# Build inside jail, tar to host
bastille cmd cms sh -c 'tar -czf /tmp/wiki-dist.tar.gz -C /usr/local/www/wiki.clawdie.si .'
cp /usr/local/bastille/jails/cms/root/tmp/wiki-dist.tar.gz /tmp/
tar -xzf /tmp/wiki-dist.tar.gz -C /usr/local/www/wiki.clawdie.si/
```
### 5. Renewal
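acme.sh auto-renews via cron (check with `crontab -l`). The cert paths match
between acme.sh and nginx, so renewal needs no manual copying. nginx only picks
up a renewed cert after a reload, so confirm that a `--reloadcmd` was recorded
at issue time or reload manually after renewal. Verify: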
```sh
openssl x509 -in /usr/local/etc/nginx/ssl/wiki/fullchain.cer -noout -enddate
```
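For an unattended check, `openssl x509 -checkend` turns the expiry date into an exit status. The sketch below is illustrative; `cert_expires_within` is a hypothetical helper and the 30-day threshold in the usage line is an assumption, not a project policy.

```sh
# Hypothetical helper: exit 0 if the cert expires within the given number
# of days (or cannot be read), exit 1 if it stays valid beyond that window.
cert_expires_within() {
  cert=$1
  days=$2
  # -checkend exits 0 when the cert is still valid after N seconds
  if openssl x509 -in "$cert" -noout -checkend $((days * 86400)) >/dev/null 2>&1; then
    return 1
  fi
  return 0
}
```

For example: `cert_expires_within /usr/local/etc/nginx/ssl/wiki/fullchain.cer 30 && echo "renewal overdue"`. An unreadable file is treated as expiring so the problem is reported, not hidden.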
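### Adding a new public HTTPS site: temporary HTTP-only vhost variant
This variant skips the placeholder cert and starts from an HTTP-only vhost instead. The public TLS/certificate steps below run on the **host**. Only jail webroot/content steps run inside the cms jail.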
1. **DNS check first** — verify the A record resolves before touching nginx or acme.sh.
The server cannot reach its own public IP (PF), so query the authoritative nameserver directly:
```sh
# find NS
drill NS clawdie.si | grep "ANSWER" -A5
# query it directly
drill docs.clawdie.si A @x1.si.
```
Do not proceed until the authoritative NS returns the correct IP.
2. **Create the ACME challenge webroot on the host**:
```sh
mkdir -p /usr/local/www/<domain>
```
If the served content lives in the cms jail, keep this host path for HTTP-01 challenges and proxy normal traffic to the jail.
3. **Fix symlink traversal permissions** — nginx worker runs as `www` and must be able
to stat through every directory in the symlink path. If any parent dir is `drwxrwx---`
(no world-execute), nginx will get `Permission denied (13)` even though the files
are world-readable. The site will serve 404/403 to Let's Encrypt and visitors.
4. **Install a temporary HTTP-only vhost** before issuing the cert. Let's Encrypt needs
to reach `/.well-known/acme-challenge/` over HTTP. Without a matching server block,
nginx falls through to the default server and returns 404.
```sh
cat > /usr/local/etc/nginx/vhosts/<domain>.conf << 'EOF'
server {
listen 80; listen [::]:80;
server_name <domain>;
root /usr/local/www/<domain>;
location /.well-known/acme-challenge/ { try_files $uri =404; }
location / { return 301 https://<domain>$request_uri; }
}
EOF
nginx -t && service nginx reload
```
5. **Issue and install the cert**:
```sh
mkdir -p /usr/local/etc/nginx/ssl/<name>
acme.sh --issue -d <domain> --webroot /usr/local/www/<domain>
acme.sh --install-cert -d <domain> \
--cert-file /usr/local/etc/nginx/ssl/<name>/cert.cer \
--key-file /usr/local/etc/nginx/ssl/<name>/<name>.key \
--fullchain-file /usr/local/etc/nginx/ssl/<name>/fullchain.cer \
--reloadcmd "service nginx reload"
```
6. **Replace temp vhost with full HTTPS vhost**, then test and reload:
```sh
nginx -t && service nginx reload
```
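Step 3 describes the traversal requirement without a command to check it. A minimal sketch, assuming the worker reaches the path only through "other" permissions as that step describes; `check_traverse` is a hypothetical helper.

```sh
# Hypothetical helper: walk from a directory up to /, printing every
# directory that lacks the world-execute (search) bit.
check_traverse() {
  dir=$1
  while :; do
    # Character 10 of the ls mode string is the "other" execute bit
    mode=$(ls -ld "$dir" | cut -c10)
    case $mode in
      x|t) ;;                      # searchable by others
      *) printf '%s\n' "$dir" ;;   # blocks traversal for the www user
    esac
    [ "$dir" = "/" ] && break
    parent=$(dirname "$dir")
    [ "$parent" = "$dir" ] && break
    dir=$parent
  done
}
```

`check_traverse /usr/local/www/<domain>` prints nothing when every parent is searchable. `ls -ld` does not follow symlinks, so for a symlinked webroot run it on the resolved target as well (for example via `realpath`).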
### Strapi admin reverse proxy (optional)
To expose Strapi admin via HTTPS on an internal subdomain (e.g., `cms.<agent>.home.arpa`),
run inside the cms jail:
1. Create `/usr/local/etc/nginx/vhosts/cms.conf` using `references/strapi-proxy.md`
2. Strapi runs at `http://127.0.0.1:1337` inside the cms jail
3. Restrict access to Tailscale IPs only
4. Run `nginx -t` then `service nginx reload`
Do not expose Strapi admin publicly by default. Keep public traffic on the
static Astro output unless there is a strong reason to do otherwise.
### Strapi API microcache (nginx, recommended)
To protect Strapi from bursts (agents, crawlers, SSR spikes), enable a short
microcache in the **cms jail nginx**. This keeps content updates fast while
avoiding rebuilds.
See `references/strapi-cache.md` for the exact config and validation steps.
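For orientation only, a microcache usually looks like the fragment printed below. This is a sketch under assumptions, not the contents of `references/strapi-cache.md`: the zone name, cache path, sizes, one-second validity, and the `/api/` prefix are all illustrative, and `render_microcache` is a hypothetical helper.

```sh
# Hypothetical helper: print an illustrative nginx microcache fragment
# for the given upstream URL (e.g. the Strapi address inside the cms jail).
render_microcache() {
  upstream=$1
  cat << EOF
# http {} context: cache storage (zone name and sizes are illustrative)
proxy_cache_path /var/cache/nginx/strapi levels=1:2 keys_zone=strapi_micro:10m max_size=100m inactive=10m;

# server {} context, inside the cms vhost
location /api/ {
    proxy_pass ${upstream};
    proxy_cache strapi_micro;
    proxy_cache_valid 200 1s;
    proxy_cache_lock on;
    proxy_cache_use_stale updating error timeout;
    # Never cache or serve cached responses for authenticated requests
    proxy_cache_bypass \$http_authorization;
    proxy_no_cache \$http_authorization;
    add_header X-Cache-Status \$upstream_cache_status always;
}
EOF
}
```

`render_microcache http://127.0.0.1:1337` prints the fragment for review. The `X-Cache-Status` header is what the validation step greps for; a short validity window with `proxy_cache_lock` collapses bursts into one upstream request while content edits still show up within about a second.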
### Astro deploy
When Astro output is ready for a public site:
1. Build in the `cms` jail or build locally and copy only `dist/`.
2. Back up the current webroot unless the operator explicitly says to skip backup because of disk pressure.
3. Sync generated output to the right webroot:
- landing `clawdie.si`: `/usr/local/www/clawdie-si/`
- docs `docs.clawdie.si`: `/usr/local/www/clawdie/`
4. Validate the jail-local vhost with the correct `Host:` header.
5. Validate public HTTPS.
6. Roll back from backup if needed and available.
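Steps 2, 3, and 6 can be sketched as one function. This is an illustration, not the project's deploy script: `deploy_dist` is a hypothetical helper, and it copies over the webroot without deleting stale files.

```sh
# Hypothetical helper: back up the webroot to a timestamped tarball, then
# copy the built output over it. Prints the backup archive path.
# Arguments: dist dir, webroot, backup dir (keep it outside the webroot).
deploy_dist() {
  dist=$1
  webroot=$2
  backups=$3
  stamp=$(date +%Y%m%d%H%M%S)
  mkdir -p "$backups" || return 1
  archive="$backups/$(basename "$webroot")-$stamp.tar.gz"
  tar -czf "$archive" -C "$webroot" . || return 1
  # "dist/." copies the directory contents, including dotfiles
  cp -R "$dist"/. "$webroot"/ || return 1
  printf '%s\n' "$archive"
}
```

To roll back, extract the printed archive over the webroot with `tar -xzf <archive> -C <webroot>`. Files that exist only in the new build stay in place after a rollback, so clear the webroot first when that matters.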
### SSL certificate management
1. Public edge certificates are managed via acme.sh on the host.
2. Certificate paths follow: `/usr/local/etc/nginx/ssl/<name>/`
3. Each domain has: `fullchain.cer` and `<name>.key`
4. `just doctor` reports certificate expiry and `ACME_RENEWAL_CRON` renewal-cron presence.
## FreeBSD assumption
This skill assumes the target runtime is FreeBSD inside the cms jail.
Canonical paths inside the cms jail:
- `/usr/local/etc/nginx/nginx.conf`
- `/usr/local/etc/nginx/vhosts/*.conf`
- `/usr/local/etc/nginx/ssl/`
- `/usr/local/www/`
## Validation
Run inside the cms jail:
```sh
# test config
nginx -t
# reload after changes
service nginx reload
# check status
service nginx status
```
When testing a named vhost through `127.0.0.1`, include the expected `Host:`
header. The `cms` jail default server returns `404`, so a plain direct URL can
be a false negative even when the named vhost works.
```sh
# clawdie.si landing vhost
curl -sI -H 'Host: clawdie.si' http://127.0.0.1/sl/
curl -sI -H 'Host: clawdie.si' http://127.0.0.1/en/
# docs.clawdie.si vhost
curl -sI -H 'Host: docs.clawdie.si' http://127.0.0.1/architecture/colibri/
```
Use `curl` for these checks; FreeBSD `fetch` does not provide a simple
`--header` flag.
If the Strapi cache is enabled, verify cache headers on a known public
endpoint:
```sh
curl -sI http://127.0.0.1/<public-api-path> | grep -i x-cache-status
```
From the host, verify public edge traffic:
```sh
curl -sI https://clawdie.si/ | head -5
curl -sI https://clawdie.si/sl/ | head -5
curl -sI https://docs.clawdie.si/architecture/colibri/ | head -5
```
## Troubleshooting
### nginx won't start
- Check config: `nginx -t`
- Check logs: `tail /var/log/nginx/error.log`
- Check port conflicts: `sockstat -l | grep :80`
### nginx config test fails: "cannot load certificate — BIO_new_file() failed"
This means the `ssl_certificate` path in a vhost references a file that doesn't
exist yet. **Do not remove the SSL block** — create a placeholder cert first:
```sh
openssl req -x509 -nodes -days 1 -newkey ec \
-pkeyopt ec_paramgen_curve:prime256v1 \
-keyout /usr/local/etc/nginx/ssl/<name>/<name>.key \
-out /usr/local/etc/nginx/ssl/<name>/fullchain.cer \
-subj "/CN=<domain>"
```
Then `nginx -t && service nginx reload` will succeed. Issue the real cert
with acme.sh — it replaces the placeholder. See §"Adding a new public static
HTTPS site" for the full flow.
### ACME challenge returns 404 or 301
Two common causes:
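1. **The redirect is a server-level `return 301` instead of living inside
`location /`.** A server-level `return` fires before any location is selected,
so the challenge request is redirected. nginx chooses the longest matching
prefix location regardless of file order, so reordering blocks does not help;
put the redirect inside `location /` and keep a dedicated
`location /.well-known/acme-challenge/` block.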
2. **The well-known directory doesn't exist on the host.** acme.sh writes
challenge files there; if it's missing, nginx returns 404. Create it:
`mkdir -p /usr/local/www/<domain>/.well-known/acme-challenge`
### SSL certificate expired or near expiry
- Check cert dates: `openssl x509 -in /usr/local/etc/nginx/ssl/clawdie/fullchain.cer -noout -dates`
- Run `just doctor` and inspect `TLS_<LABEL>` plus `ACME_RENEWAL_CRON`
- Renew or repair acme.sh on the host; do not run public edge renewal inside the cms jail
### Changes not visible
- Static files are served immediately — hard refresh the browser
- Check you edited the correct webroot path inside the jail
- Landing changes should land in `/usr/local/www/clawdie-si/`
- Docs changes should land in `/usr/local/www/clawdie/`
- For jail-local HTTP checks, include `-H 'Host: clawdie.si'` or the request may hit the default 404 vhost
### Host nginx not proxying to jail
- Verify jail is up: `sudo bastille list | grep cms`
- Verify jail nginx is running: `sudo bastille cmd cms service nginx status`
- Test direct jail HTTP: `curl -s -o /dev/null -w "%{http_code}" http://${AGENT_SUBNET_BASE}.4/`
- Check host nginx proxy config: `nginx -t`
- Check jail nginx logs: `sudo bastille cmd cms tail /var/log/nginx/error.log`