docs: add fresh-install checklist with timing, IPC, and screenshot checks (Sam & Claude)
10-section verification checklist for new installations covering: timing milestones, jail status, .env correctness, watchdog IPC, database, LLM provider, Telegram, Lumina, network/PF, ZFS health, and screenshot smoke test. Includes all log paths and preflight integration. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --- Build: pass | Tests: pass — Tests 605 passed (605)
This commit is contained in:
parent
86da1fcb5c
commit
013f5fb40d
1 changed files with 213 additions and 0 deletions
213
docs/FRESH-INSTALL-CHECKLIST.md
Normal file
213
docs/FRESH-INSTALL-CHECKLIST.md
Normal file
|
|
@ -0,0 +1,213 @@
|
|||
# Fresh Install Checklist
|
||||
|
||||
Verification checklist for new Clawdie-AI installations (bare metal, bhyve VM,
|
||||
or jail-based). Run after firstboot completes. Each check includes the exact
|
||||
command and expected result.
|
||||
|
||||
Designed to work with the tmux-screenshot skill — capture each section for the
|
||||
installation record.
|
||||
|
||||
## Timing milestones
|
||||
|
||||
Record wall-clock timestamps at each stage. On bhyve, the serial console
|
||||
shows boot messages with timestamps.
|
||||
|
||||
| Milestone | Command / event | Record |
|
||||
|-----------|----------------|--------|
|
||||
| Boot start | First kernel message | `T0` |
|
||||
| Firstboot wizard shown | `bsddialog` prompt appears | `T1 = T1 - T0` |
|
||||
| Wizard complete | `[firstboot] Complete.` in log | `T2 = T2 - T0` |
|
||||
| Desktop ready (Lumina) | `lightdm` login screen visible | `T3 = T3 - T0` |
|
||||
| Agent responding | `/ping` on Telegram returns pong | `T4 = T4 - T0` |
|
||||
|
||||
Check firstboot log for exact timestamps:
|
||||
|
||||
```sh
|
||||
head -5 /var/log/clawdie-firstboot.log
|
||||
tail -5 /var/log/clawdie-firstboot.log
|
||||
```
|
||||
|
||||
## 1. Jails running
|
||||
|
||||
```sh
|
||||
jls -N
|
||||
```
|
||||
|
||||
Expected (agent name may vary):
|
||||
|
||||
```
|
||||
JID IP Address Hostname Name
|
||||
1 10.0.X.2 {agent}-controlplane {agent}-controlplane
|
||||
2 10.0.X.3 db db
|
||||
3 10.0.X.4 cms cms
|
||||
4 10.0.X.5 llamacpp llamacpp
|
||||
```
|
||||
|
||||
All four jails must be present and running. If any are missing:
|
||||
|
||||
```sh
|
||||
cat /var/log/clawdie-firstboot.log | grep -i 'fail\|error'
|
||||
```
|
||||
|
||||
## 2. .env correctness
|
||||
|
||||
```sh
|
||||
grep -E '^(AGENT_NAME|AGENT_GENDER|AGENT_DOMAIN|AGENT_INTERNAL_DOMAIN|AGENT_TMP_DIR|PI_TUI_PROVIDER|PI_TUI_MODEL|EMBED_BASE_URL|TELEGRAM_BOT_TOKEN)=' .env
|
||||
```
|
||||
|
||||
Verify:
|
||||
|
||||
| Key | Expected |
|
||||
|-----|----------|
|
||||
| `AGENT_NAME` | Lowercase, no spaces (e.g. `clawdie`, `mevy`) |
|
||||
| `AGENT_GENDER` | `f`, `m`, or `n` |
|
||||
| `AGENT_DOMAIN` | Valid domain or `.internal` |
|
||||
| `AGENT_INTERNAL_DOMAIN` | `{agent}.home.arpa` |
|
||||
| `AGENT_TMP_DIR` | Writable path, not `/tmp` |
|
||||
| `PI_TUI_PROVIDER` | `zai`, `openrouter`, `anthropic`, etc. |
|
||||
| `PI_TUI_MODEL` | Valid model for the provider |
|
||||
| `EMBED_BASE_URL` | URL ending in `/v1` |
|
||||
| `TELEGRAM_BOT_TOKEN` | Non-empty if `FEATURE_TELEGRAM=true` |
|
||||
|
||||
## 3. Watchdog IPC status
|
||||
|
||||
```sh
|
||||
# Check socket exists
|
||||
ls -la "${AGENT_TMP_DIR:-tmp}/ipc/"
|
||||
|
||||
# Query watchdog status
|
||||
echo '{"cmd":"status"}' | nc -U "${AGENT_TMP_DIR:-tmp}/ipc/${AGENT_NAME}-watchdog.sock"
|
||||
```
|
||||
|
||||
Expected: JSON response with `mode`, `throttle`, `memory`, `activeJails`.
|
||||
|
||||
If socket is missing, check if the agent process is running:
|
||||
|
||||
```sh
|
||||
pgrep -f 'node.*dist/index.js'
|
||||
```
|
||||
|
||||
## 4. Database connectivity
|
||||
|
||||
```sh
|
||||
# From host — test PostgreSQL in db jail
|
||||
sudo bastille cmd db service postgresql status
|
||||
|
||||
# Test connection (uses .env credentials)
|
||||
npm run setup -- --step verify
|
||||
```
|
||||
|
||||
Expected: `postgresql is running` and verify step exits 0.
|
||||
|
||||
## 5. LLM provider connectivity
|
||||
|
||||
```sh
|
||||
# Quick inference test via pi
|
||||
pi --provider "${PI_TUI_PROVIDER}" --model "${PI_TUI_MODEL}" -e "reply with OK"
|
||||
```
|
||||
|
||||
Expected: Model responds. If using ZAI (GLM), verify the API key:
|
||||
|
||||
```sh
|
||||
grep '^ZAI_API_KEY=' .env | cut -c1-20
|
||||
```
|
||||
|
||||
## 6. Telegram bot
|
||||
|
||||
```sh
|
||||
# Check bot token is valid (should return bot info)
|
||||
curl -s "https://api.telegram.org/bot$(grep '^TELEGRAM_BOT_TOKEN=' .env | cut -d= -f2)/getMe" | python3 -m json.tool
|
||||
```
|
||||
|
||||
Expected: `"ok": true` with the bot username.
|
||||
|
||||
## 7. Lumina desktop (baremetal only)
|
||||
|
||||
```sh
|
||||
service lightdm status
|
||||
service dbus status
|
||||
```
|
||||
|
||||
If Lumina fails to start, check:
|
||||
|
||||
```sh
|
||||
# X11 log
|
||||
cat /var/log/Xorg.0.log | tail -30
|
||||
|
||||
# LightDM log
|
||||
cat /var/log/lightdm/lightdm.log | tail -30
|
||||
|
||||
# GPU driver loaded?
|
||||
sysctl kern.conftxt | grep -i gpu
|
||||
pciconf -lv | grep -B3 'VGA'
|
||||
```
|
||||
|
||||
## 8. Network and firewall
|
||||
|
||||
```sh
|
||||
# PF rules loaded
|
||||
sudo pfctl -sr | head -10
|
||||
|
||||
# NAT working (from inside a jail)
|
||||
sudo bastille cmd db ping -c1 1.1.1.1
|
||||
|
||||
# Bridge healthy
|
||||
ifconfig warden0 | grep 'inet '
|
||||
```
|
||||
|
||||
## 9. ZFS health
|
||||
|
||||
```sh
|
||||
zpool status -x
|
||||
zfs list -o name,used,avail -t filesystem | head -20
|
||||
```
|
||||
|
||||
Expected: `all pools are healthy`.
|
||||
|
||||
## 10. Screenshot smoke test
|
||||
|
||||
Capture the final state as proof of successful install:
|
||||
|
||||
```sh
|
||||
python3 .agent/skills/tmux-screenshot/tmux-screenshot.py \
|
||||
--session "${AGENT_NAME}" \
|
||||
--base-url "https://${AGENT_DOMAIN}/screenshots" \
|
||||
--publish
|
||||
```
|
||||
|
||||
Verify the capture landed:
|
||||
|
||||
```sh
|
||||
ls -la /usr/local/www/clawdie/screenshots/*.png | tail -3
|
||||
```
|
||||
|
||||
## Log paths reference
|
||||
|
||||
| Log | Path |
|
||||
|-----|------|
|
||||
| Firstboot orchestrator | `/var/log/clawdie-firstboot.log` |
|
||||
| Firstboot progress | `/var/log/clawdie-firstboot.progress` |
|
||||
| Agent (production) | `logs/klavdija.log` (relative to project) |
|
||||
| Watchdog | Same as agent log (pino structured) |
|
||||
| Preflight run | `logs/preflight-{runstamp}/` |
|
||||
| LightDM | `/var/log/lightdm/lightdm.log` |
|
||||
| X11 | `/var/log/Xorg.0.log` |
|
||||
| PostgreSQL | `/var/log/postgresql.log` (inside db jail) |
|
||||
| nginx | `/var/log/nginx/error.log` |
|
||||
|
||||
## Running the full preflight
|
||||
|
||||
The automated version of this checklist:
|
||||
|
||||
```sh
|
||||
# As root (for jail and firewall steps)
|
||||
sudo npm run preflight
|
||||
|
||||
# With onboarding wizard
|
||||
sudo npm run preflight -- --with-onboarding
|
||||
|
||||
# Stop on first failure
|
||||
sudo npm run preflight -- --fail-fast
|
||||
```
|
||||
|
||||
Results are written to `logs/preflight-{timestamp}/summary.json`.
|
||||
Loading…
Add table
Reference in a new issue