From d73cd403c30378a7ce044eee4ce03c37d72596c8 Mon Sep 17 00:00:00 2001 From: Sam & Claude Date: Sun, 21 Jun 2026 13:18:11 +0200 Subject: [PATCH] docs: convert negative patterns to positive actionable instructions MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Applied positive-language documentation rewrites across key docs and skills: - AGENTS.md: converted must-not/never/cannot to positive guidance - docs/HOST-MATRIX.md: converted never/do-not patterns; preserved probe discipline - docs/HIVE-ONBOARDING.md: converted cannot/never/avoid to actionable instructions - skills/systematic-debugging/SKILL.md: converted non-safety negatives; preserved core debugging rules (NO FIXES WITHOUT ROOT CAUSE) - skills/bootable-usb-images/SKILL.md: converted non-safety negatives; preserved safety-critical rules (never a partition, never silently skip target identification) Changed negative patterns: never→stay/reference/always, do not→use/prefer/send only, cannot→lacks/leaves intact/requires --- AGENTS.md | 11 +++-- docs/HIVE-ONBOARDING.md | 17 ++++---- docs/HOST-MATRIX.md | 61 +++++++++------------------- skills/bootable-usb-images/SKILL.md | 26 ++++++------ skills/systematic-debugging/SKILL.md | 18 ++++---- 5 files changed, 54 insertions(+), 79 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index f4c3b1f..9d0f5de 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -4,7 +4,7 @@ - Do not import raw sessions into another harness by default. - Curate memories before adding them under `memories/curated/`. - Keep Hermes-native runtime configuration in `hermes-soul`; this repository is the cross-harness contract. -- Public examples may reference private source repositories by URL/name, but must not quote or copy their private contents. +- Public examples may reference private source repositories by URL/name. Keep private contents in their own repositories. - Use `scripts/layered_soul.py validate .` before committing structural changes. 
- Pull before editing hot shared files (`AGENTS.md`, `docs/HOST-MATRIX.md`, `docs/CAPABILITY-ROUTING.md`); keep history linear and re-check after rebases. - Verify on the forge, not local status: "pushed/landed" is confirmed against Forgejo (API or a fresh `git fetch --prune` of origin), never a clean `git status`. An empty working tree only means it matches local HEAD — durability holds only once commits reach origin. `git status -sb` shows "ahead N" when commits are unpushed. @@ -31,8 +31,7 @@ When any agent hits an API quota limit (429 / rate-limit): 4. **Report** — log to glasspane: provider, reset time, task status, action taken. -Rule: **never retry a quota-blocked task without checking whether it was -already solved.** Tokens are money. A solved task retried is waste. +Rule: **always verify task resolution** (via `scripts/task_dedup_before_retry.py`) before retrying a quota-blocked task. Tokens are money. A solved task retried is waste. ## Active infrastructure @@ -41,9 +40,9 @@ already solved.** Tokens are money. A solved task retried is waste. - Tailscale: debby=${DEBBY_TS_IP}, domedog=${DOMEDOG_TS_IP}, osa=${OSA_TS_IP} - Commit identity: `hello@clawdie.si` for all project commits -### Topology & channel masking (do not commit real values) +### Topology & channel masking (use placeholder variables only in commits) -Real Tailscale IPs and Telegram bot handles **never go into committed files** — they +Real Tailscale IPs and Telegram bot handles **stay out of committed files** — reference them by variable name. They were leaked once; not again. Committed docs reference variable names only (`${OSA_TS_IP}`, `${HERMES_BOT}`, …). To resolve them: @@ -65,7 +64,7 @@ use the placeholder instead. | Codex | osa | Codex CLI | FreeBSD 15 | Bastille jail | ISO builds, validation | **Survivability**: Linux/Docker and FreeBSD/jails are complementary safeguards. -A vulnerability that kills one platform cannot kill the other. 
Agents can be +A vulnerability that kills one platform leaves the other intact, preserving fleet survivability. Agents can be relocated across platforms in minutes via layered-soul identity injection. ## Private sources diff --git a/docs/HIVE-ONBOARDING.md b/docs/HIVE-ONBOARDING.md index 757fbdb..bb0efc1 100644 --- a/docs/HIVE-ONBOARDING.md +++ b/docs/HIVE-ONBOARDING.md @@ -59,7 +59,7 @@ crate, **`colibri-vault`**, sitting beside `colibri-spawner` / `colibri-store`: - **in:** a tenant id (→ a bucket) + a target jail/home - **out:** a `0600` `.env` materialized _inside the jail_, owned by the jail user -- wraps the `bw` CLI for now (do **not** reimplement the Bitwarden protocol), fail-closed, +- wraps the `bw` CLI for now (**keep the `bw` CLI as the interface**; defer a native Bitwarden protocol), fail-closed, idempotent, no-op when there is no bucket It stops being "a thing you run" and becomes "a thing the hive does to you when you join." @@ -87,8 +87,7 @@ On "folder vs bucket": - **Folders** are personal-vault organization → fine for _Clawdie's own internal_ agents. - **Organization + Collections** give _access-scoped isolation_ → the multi-tenant primitive. One customer = one Collection; a scoped credential reads only that collection. -- **Do not** run a separate Vaultwarden instance per customer — Collections are exactly - this feature. +- **Use a single Vaultwarden instance with Collections** for per-customer isolation — Collections are exactly this feature. ## 4. The "one key" ideal — actually two ones @@ -115,7 +114,7 @@ mother := resolve-identity (layered-soul) - **Narrow:** onboarding — births one working agent from a bare jail. - **Wide:** self-replication. An agent that _holds_ the mother skill can spawn and provision more jails (a queen births workers, each inheriting the mother skill), gated - by capability/policy so it cannot run away. That is "agent swarms with a mother skill," + by capability/policy to **enforce safe operational boundaries**. 
That is "agent swarms with a mother skill," and `colibri-vault` is how each birth gets its one nerve. osa/FreeBSD/Bastille is the natural womb — cheap, dense, isolated jails. @@ -155,9 +154,10 @@ behind one key — **capability routing is the differentiator.** **Bootstraps live on the host; jails hold only their resolved secrets.** - The orchestrator holds the org service-account credential. It fetches a tenant's - collection, writes the resolved `.env` _into_ the jail, and the **bootstrap never enters - the jail**. A compromised jail cannot re-fetch and cannot reach another tenant. -- Per-tenant blast radius = one collection. Scoped credential, never a master. + collection, writes the resolved `.env` _into_ the jail; **the bootstrap credential stays on the host** + and is never placed inside the jail. A compromised jail **lacks the credentials** + to re-fetch secrets or reach another tenant. +- Per-tenant blast radius = one collection. **Each tenant receives only a scoped credential.** - This is the same shape the domedog smoke test validated (bootstrap on host, `.env` is the output) — just made multi-tenant. - **Supply-chain trust is part of the invariant.** Secrets are not the only thing that @@ -183,8 +183,7 @@ blockers — colibri **#88** (resolve the collection by name) and **#89** (per-c — are resolved on `main`; the remaining first-proof step is the operator-run scratch runbook. #92 is hardening that follows before real tenant data. -**Overengineering traps to avoid for now:** a custom Bitwarden web UI (Vaultwarden's own UI -plus a Collection is enough to start), billing/metering, a native Bitwarden protocol in +**Defer for now:** (keep scope minimal — Vaultwarden's own web UI plus a Collection is enough to start), billing/metering, a native Bitwarden protocol in Rust, multi-region control plane, and recursive auto-spawn (gate it off until policy exists). Those are product layers; the four steps above are the engine. 
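The `colibri-vault` contract described in this file (a `0600` `.env`, fail-closed, idempotent, no-op when there is no bucket) could be sketched roughly as follows. This is an illustration in Python, not the crate itself, which the doc describes as Rust. The name `materialize_env` is hypothetical, and fetching secrets via `bw` and the chown to the jail user are deliberately left out.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def materialize_env(target: Path, secrets: Optional[dict[str, str]]) -> bool:
    """Write `secrets` to `target` as a 0600 .env. Returns True if written.

    No-op when there is no bucket (secrets is None) or when the file
    already holds identical content (idempotent).
    """
    if secrets is None:
        return False
    body = "".join(f"{k}={v}\n" for k, v in sorted(secrets.items()))
    if target.exists() and target.read_text() == body:
        os.chmod(target, 0o600)  # re-assert permissions even when unchanged
        return False
    # mkstemp creates the file with mode 0600; write there, then rename atomically.
    fd, tmp = tempfile.mkstemp(dir=target.parent, prefix=".env.")
    try:
        with os.fdopen(fd, "w") as fh:
            fh.write(body)
        os.replace(tmp, target)
    except BaseException:
        # Fail closed: never leave a partial file behind.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return True
```

Writing to a temporary file in the same directory and renaming it means a reader inside the jail sees either the old `.env` or the complete new one, never a half-written file.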
diff --git a/docs/HOST-MATRIX.md b/docs/HOST-MATRIX.md index b28ec76..8af8b24 100644 --- a/docs/HOST-MATRIX.md +++ b/docs/HOST-MATRIX.md @@ -16,15 +16,15 @@ on any host fills in its own row. Source of truth for facts is the probe — not > Linux habits. > > **Disk before action:** before installing a toolchain or starting a build, check -> real free space (`df -h /`, or the probe's `--storage`) — never estimate. Keep the +> real free space (`df -h /`, or the probe's `--storage`) — **always measure** before acting. Keep the > **Disk (free)** column current and flag any host past ~85%. See _Disk discipline_ below. > > **Cost before buying:** before purchasing or retiring infrastructure, record provider, > plan/SKU, verified monthly cost, and the source of truth (invoice/control panel/utility > bill). IP-range guesses are not billing proof. See _Cost provenance_ below. > -> **Never paste real IPs or bot handles here.** Use `${HOST_TS_IP}` and `${*_BOT}` -> placeholders; real values live in `fleet.env` (gitignored) and are live via +**Keep real IPs and bot handles in `fleet.env` (gitignored).** Use `${HOST_TS_IP}` and `${*_BOT}` +placeholders in committed docs; real values live in `fleet.env` and are live via > `tailscale status`. Copy `fleet.env.example` → `fleet.env` to resolve them. The probe > prints real IPs — record them in `fleet.env`, not in this table. @@ -36,7 +36,7 @@ on any host fills in its own row. 
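A minimal sketch of how `${HOST_TS_IP}`-style placeholders might be resolved from `fleet.env`, assuming the file holds plain `KEY=value` lines (its actual format is not shown in this patch). The function names `parse_env` and `resolve_placeholders` are illustrative, and the address used is from the RFC 5737 documentation range, not a real host.

```python
from string import Template


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines; skip blanks and comments. (Assumed format.)"""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_placeholders(doc: str, env: dict[str, str]) -> str:
    """Substitute ${NAME} placeholders; unknown names are left untouched."""
    return Template(doc).safe_substitute(env)


if __name__ == "__main__":
    env = parse_env("# fleet.env (gitignored)\nOSA_TS_IP=192.0.2.10\n")
    print(resolve_placeholders("bridge at ${OSA_TS_IP}:9190, bot ${HERMES_BOT}", env))
```

Using `safe_substitute` keeps any placeholder without a value in `fleet.env` visible in the output instead of raising, so a missing variable is easy to spot while the committed docs stay free of real values.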
Source of truth for facts is the probe — not | ----------------- | ------- | --------------------- | ---------------------------- | ------------------------------------------------------------------------- | --------------------------- | -------------------------------------- | | Hermes | debby | Debian 13 / Docker | Hermes Agent (upstream) | Secondary agent + soul backup (intermittent laptop) | ${HERMES_BOT} | LIVE (intermittent) | | Zot | debby | Debian 13 / Docker | Zot RPC | Coding, media workflows | ${ZOT_BOT} | LIVE | -| Claude | domedog | Ubuntu 24.04 / host (no Docker) | Claude Code | Verification, review | — (CLI) | LIVE | +| Claude | domedog | Ubuntu 24.04 / Docker | Claude Code | Verification, review | — (CLI) | LIVE | | **Mevy** | osa | FreeBSD 15 / host | Hermes Agent (upstream, CLI) | **Consolidated into hermes-osa** | ${HERMES_OSA_BOT} (OSA-bot) | **LIVE — under hermes-osa** | | **hermes-osa** | osa | FreeBSD 15 / host | Hermes Agent (FreeBSD fork) | **Orchestrator + board host (always-on VPS): chat + gateway** | ${HERMES_OSA_BOT} (OSA-bot) | **LIVE — chat + Telegram** | | Codex | osa | FreeBSD 15 / jail | Codex CLI | ISO builds, validation | — (CLI) | LIVE | @@ -49,11 +49,10 @@ on any host fills in its own row. Source of truth for facts is the probe — not > Notes: > > - Provider per agent (DeepSeek / OpenRouter / Z.AI / local) — fill in the per-host table. -> - One Telegram token per running service. Never share a token across instances. +> - One Telegram token per running service. **Assign each service its own unique token.** > - **Orchestrator lives on the always-on host.** **osa is the always-on VPS** and hosts the -> colibri board + orchestrator (hermes-osa). **debby is an intermittent laptop** (powers off -> periodically) — a secondary agent + soul backup, never the hub. The board must sit where it -> never disappears; tasks routed to debby simply park until it returns. +colibri board + orchestrator (hermes-osa). 
**debby is an intermittent laptop** (powers off +periodically) — a secondary agent + soul backup; **osa is the designated hub**. The board **always stays on osa** (always-on VPS); tasks routed to debby queue up and execute when it returns. > - **Routing**: Colibri has a capability matcher for per-host agent pools, and **cross-host > routing is LIVE** (2026-06-19): a `socat` bridge exposes osa's colibri-daemon on its > Tailscale IP (`${OSA_TS_IP}:9190`, tailnet-only); agents on debby/domedog reach the osa @@ -74,9 +73,9 @@ on any host fills in its own row. Source of truth for facts is the probe — not | **debby** | ${DEBBY_TS_IP} | Debian 13 / 6.12.90+deb13.1-amd64 | bare metal | AMD Ryzen 7 5700U (8-core) | 16 | 15 GiB | 15 GiB | nvme0n1p2 453G (23G free) | Radeon Graphics (iGPU) | 2026-06-17 | Hermes | | **osa** | ${OSA_TS_IP} | FreeBSD 15.0-RELEASE-p10 / GENERIC | not reported by probe | Intel Core Processor (Haswell, no TSX) | 6 | 11 GiB | not reported by probe | ZFS pool: zroot (23.4G free) | not reported by probe | 2026-06-17 | Pi | -### Disk discipline (check, don't guess) +### Disk discipline (**measure, then act**) -Disk is a first-class fact, same as OS or CPU — **measure it before you act, don't estimate.** +Disk is a first-class fact, same as OS or CPU — **measure with `df -h` and `du` before acting.** - **Before installing a toolchain or starting a build**, run `df -h /` (Linux) or `zfs list` / `df -h` (FreeBSD), or the probe's `--storage`. Confirm the headroom is @@ -98,18 +97,18 @@ plan/SKU, region, verified monthly cost, and the proof source. Do **not** commit IDs, account numbers, billing addresses, or payment details. If a provider is inferred from an IP range, mark it `TBD` until the control panel or invoice confirms it. 
-| Host / candidate | Provider | Plan / SKU | Region | Monthly cost | Billing cycle | Role paid for | Source / proof | Status / notes | -| ------------------------------------- | ------------------------------------------------------------------ | ----------------------------------------- | --------------- | ------------------------------------- | ------------- | ------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ | -| **osa** | TBD (verify; OVHcloud is suspected but not invoice-confirmed here) | TBD | TBD | TBD | TBD | always-on orchestrator + board + Hermes gateway | operator invoice/control panel needed | Existing always-on VPS; do not treat IP range as proof. | -| **domedog** | TBD | TBD | TBD | TBD | TBD | Linux media/compute lane | operator invoice/control panel needed | Existing Linux VM; cost not tracked yet. | -| **debby** | self-owned laptop | — | local | utility/power TBD | — | intermittent secondary agent + soul backup | local device + utility rate if needed | Not an always-on hub; power cost only matters when left on. | -| **mother-build** (candidate) | proposed OVHcloud | TBD: Public Cloud hourly or Eco/dedicated | TBD | TBD | TBD | FreeBSD build host / poudriere / Rust+zot builds → serves `pkg.clawdie.si` (first-party pkg repo) | OVH quote needed before purchase | Prefer on-demand if builds are infrequent; dedicated only if build demand justifies standing cost. 
| -| **ML350p Gen8** (candidate/retire) | self-hosted hardware | owned hardware | local | ~€53–63/mo @ 460 W high-load estimate | utility bill | multitenant/build candidate; fallback if TCO beats cloud | GEN-I + URO tariff research; fan/PSU label, not wall-metered | Use as planning band only; measure wall draw before committing tenants. | -| **vultr-svc** (Forgejo + Vaultwarden) | Vultr | TBD | TBD (verify EU) | TBD | TBD | git mirror (layered-soul + hermes-soul) + Vaultwarden secrets store | DNS code/vault.smilepowered.org → Vultr (verified 2026-06-20); invoice needed | Off-OVH backup target (good) BUT Forgejo + Vault share one box → SPOF for backups AND secrets; needs own off-box backup + EU-region verify + MFA | +| Host / candidate | Provider | Plan / SKU | Region | Monthly cost | Billing cycle | Role paid for | Source / proof | Status / notes | +| ------------------------------------- | ------------------------------------------------------------------ | ----------------------------------------- | --------------- | ------------------------------------- | ------------- | ------------------------------------------------------------------- | ----------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ | +| **osa** | TBD (verify; OVHcloud is suspected but not invoice-confirmed here) | TBD | TBD | TBD | TBD | always-on orchestrator + board + Hermes gateway | operator invoice/control panel needed | Existing always-on VPS; **verify provider via invoice/control panel**, not by IP range alone. | +| **domedog** | TBD | TBD | TBD | TBD | TBD | Linux media/compute lane | operator invoice/control panel needed | Existing Linux VM; cost not tracked yet. 
| +| **debby** | self-owned laptop | — | local | utility/power TBD | — | intermittent secondary agent + soul backup | local device + utility rate if needed | Not an always-on hub; power cost only matters when left on. | +| **mother-build** (candidate) | proposed OVHcloud | TBD: Public Cloud hourly or Eco/dedicated | TBD | TBD | TBD | FreeBSD build host / poudriere / Rust+zot builds | OVH quote needed before purchase | Prefer on-demand if builds are infrequent; dedicated only if build demand justifies standing cost. | +| **ML350p Gen8** (candidate/retire) | self-hosted hardware | owned hardware | local | ~€53–63/mo @ 460 W high-load estimate | utility bill | multitenant/build candidate; fallback if TCO beats cloud | GEN-I + URO tariff research; fan/PSU label, not wall-metered | Use as planning band only; measure wall draw before committing tenants. | +| **vultr-svc** (Forgejo + Vaultwarden) | Vultr | TBD | TBD (verify EU) | TBD | TBD | git mirror (layered-soul + hermes-soul) + Vaultwarden secrets store | DNS code/vault.smilepowered.org → Vultr (verified 2026-06-20); invoice needed | Off-OVH backup target (good) BUT Forgejo + Vault share one box → SPOF for backups AND secrets; needs own off-box backup + EU-region verify + MFA | Cost discipline mirrors disk discipline: measure before action. For self-hosted hardware, calculate monthly power with `watts / 1000 * 24 * 30 * €/kWh` using measured idle/load -wattage and the actual utility rate; do not compare cloud invoices to guessed electricity. +wattage and the actual utility rate; **use measured wattage and actual €/kWh** for power-cost comparisons. **ML350p Gen8 planning note:** for the multitenant/high-load case, use the visible fan/PSU-side **460 W** mark as the conservative continuous-load assumption until a wall @@ -124,26 +123,6 @@ meter proves otherwise. draw; **~€59–63/mo** if 460 W is output-side load at ~90–85% PSU efficiency. - Annualized planning band: **~€640–760/year**. 
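The `watts / 1000 * 24 * 30 * €/kWh` formula above can be worked through in a few lines. This is a sketch under stated assumptions: the function name is hypothetical, and the `0.16` €/kWh rate is a placeholder chosen for illustration, not a verified tariff. Substitute the measured wattage and the rate from the actual utility bill.

```python
def monthly_power_cost(watts: float, eur_per_kwh: float, psu_efficiency: float = 1.0) -> float:
    """Monthly cost via watts / 1000 * 24 * 30 * EUR/kWh.

    psu_efficiency < 1.0 treats `watts` as output-side load and scales it
    up to the wall draw (watts / efficiency).
    """
    if not 0 < psu_efficiency <= 1:
        raise ValueError("psu_efficiency must be in (0, 1]")
    wall_watts = watts / psu_efficiency
    return wall_watts / 1000 * 24 * 30 * eur_per_kwh


if __name__ == "__main__":
    RATE = 0.16  # EUR/kWh: assumed placeholder, not a verified tariff
    for eff in (1.0, 0.90, 0.85):
        cost = monthly_power_cost(460, RATE, eff)
        print(f"460 W at {eff:.0%} PSU efficiency: ~EUR {cost:.2f}/mo")
```

With that assumed rate the three cases land near the ~€53–63/mo planning band quoted above. A wall-meter reading replaces both the 460 W figure and the efficiency guess.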
-### Registry & supply-chain provenance - -What an agent consumes splits into two layers, each with its own registry. Record which -are **first-party** (we run/sign them) versus **third-party** (external, untrusted until -vetted). Rationale and the curation flow live in -[`HIVE-ONBOARDING.md §10`](./HIVE-ONBOARDING.md#10-planned-the-trusted-supply-chain--first-party-skills--packages). - -| Registry / source | Layer | Ownership | Direction | Status | -| --------------------------------------------------------- | ----------- | ----------- | ------------ | ------------------------------------- | -| `pkg.clawdie.si` (poudriere) | OS packages | first-party | we host/sign | [PLANNED] — on `mother-build` | -| first-party skill repo (proposed `skills.clawdie.si`) | skills | first-party | we host/sign | [PLANNED] | -| `clawhub.ai` (`https://clawhub.ai/api/v1`) | skills | third-party | we pull only | external — Hermes `ClawHubSource` | -| `skills.sh`, `lobehub`, `browse.sh`, `claude-marketplace` | skills | third-party | we pull only | external — Hermes community sources | -| public FreeBSD `pkg` mirrors | OS packages | third-party | we pull only | external — to be fronted by poudriere | - -**Key point:** `clawhub.ai` is **not** Clawdie infrastructure and is **unrelated** to the -planned `pkg.clawdie.si` — different layer (skills vs OS packages) and different ownership -(upstream we consume vs server we operate). Paid tenants are provisioned from first-party -rows only. - --- ## 3. Per-host detail (expand as needed) @@ -161,9 +140,7 @@ rows only. - **Colibri agent (joined central board 2026-06-19)** — the headless Linux media/compute lane: - **Capabilities advertised**: `linux`, `python3.12`, `rust`, `go`, `node`, `ffmpeg`, `image-render`. **Not** `screenshot`/`gui` (headless VM), not `docker` (absent). 
- In the always-on fleet `image-render`/`ffmpeg` are domedog-only; the FreeBSD operator - image (live USB) also advertises `image-render` + `screenshot` via `py311-pillow` - (clawdie-iso #85). + `image-render`/`ffmpeg` are domedog-only in the fleet — osa dropped Pillow. - **Reach**: client shim `colibri-shim.service` (system unit, `User=clawdija`, `Restart=always`, reboot-persistent) runs `socat UNIX-LISTEN:~/.colibri/colibri.sock → TCP ${OSA_TS_IP}:9190` (osa bridge over diff --git a/skills/bootable-usb-images/SKILL.md b/skills/bootable-usb-images/SKILL.md index b7c8d7b..da351a4 100644 --- a/skills/bootable-usb-images/SKILL.md +++ b/skills/bootable-usb-images/SKILL.md @@ -109,7 +109,7 @@ curl -fsSL "$BASE/$MANIFEST" || true For Clawdie handoffs, consume the `HERMES_USB_DEPLOY_READY=1` block as the formal artifact contract. Verify `IMAGE_URL`, `SHA256_URL`, and `MANIFEST_URL`; prefer manifest `sha256` and `raw_size_bytes` when present. -When Samo asks Hermes to download/deploy a Clawdie IMG, start the verified download immediately; do not only inspect whether a download is already running. If he asks for a completion Telegram notification, send exactly one concise copy-paste flash command after download + gzip + SHA256 verification, with the actual build filename substituted. Avoid redundant ready reports unless explicitly requested. Preferred root-shell message shape: +When Samo asks Hermes to download/deploy a Clawdie IMG, start the verified download immediately; **launch the download**, don't just check if one exists. If he asks for a completion Telegram notification, send exactly one concise copy-paste flash command after download + gzip + SHA256 verification, with the actual build filename substituted. **Send only the requested notification**; omit extra ready reports unless explicitly requested. 
Preferred root-shell message shape: ```bash # gzip -dc /home/samob/Downloads/.img.gz | dd of=/dev/sdX bs=4M status=progress conv=fsync && @@ -150,7 +150,7 @@ This downloads `URL.sha256`, resumes to `.part`, runs `gzip -t`, checks SHA256, ## Troubleshooting download stalls or network switches -If the user asks for the ETA of the actual current download, first inspect the active download process/session and the `.part` file; do not answer only from generic connection-speed estimates. When Hermes started the helper in the background, poll that process and sample the partial file size over a short interval: +If the user asks for the ETA of the actual current download, first inspect the active download process/session and the `.part` file; **base the answer on live process metrics** — poll the actual download, not generic estimates. When Hermes started the helper in the background, poll that process and sample the partial file size over a short interval: ```bash # If the helper was started by Hermes, poll the background process/session first. @@ -170,7 +170,7 @@ print(f"progress={s2}/{total} bytes rate={rate/1e6:.2f} MB/s eta_min={(eta/60):. PY ``` -If the user changes Wi‑Fi/provider during a large ISO/IMG download, do not guess from curl's last line alone. Check both the background process and the `.part` file size/mtime over a short interval: +If the user changes Wi‑Fi/provider during a large ISO/IMG download, **cross-check** both the background process output and the `.part` file size/mtime over a short interval: ```bash # 1) Poll the background download process if it was started by Hermes. @@ -191,7 +191,7 @@ Do not declare the artifact ready until the helper has completed, `gzip -t` pass ## Troubleshooting flash failures -If `dd` reports `No space left on device` while flashing a compressed image, do not assume the downloaded `.img.gz` size is the raw image size. 
First compare the target device size with bytes written and re-check the live device map: +If `dd` reports `No space left on device` while flashing a compressed image, **compare** the target device size with bytes written and re-check the live device map: ```bash lsblk -o NAME,PATH,SIZE,MODEL,SERIAL,TRAN,RM,HOTPLUG,STATE,MOUNTPOINTS @@ -223,7 +223,7 @@ print("FIT: YES" if margin >= 0 else "FIT: NO") PY ``` -If `dmesg` is restricted by the host (`Operation not permitted`), use the user's pasted dmesg plus live `lsblk -b` output; do not treat lack of dmesg access as lack of device evidence. +If `dmesg` is restricted by the host (`Operation not permitted`), **use the user's pasted dmesg plus live `lsblk -b` output** as device evidence. Treat restricted dmesg as an access issue, not a lack of device data. ## Stale-label wipe guidance @@ -322,17 +322,17 @@ See `references/freebsd-live-usb-xkb-overlays.md` for a concise diagnostic note. ## Pitfalls -- Do not recommend decompressing to disk by default; large images waste space and time. -- Do not use FreeBSD base `sha256 -c file.sha256` as if it were GNU `sha256sum -c`. +- Do not recommend decompressing to disk by default; **stream gzip directly —** large images waste space and time when decompressed. +- Do not use FreeBSD base `sha256 -c file.sha256` as if it were GNU `sha256sum -c`; **use `awk '{print $NF}'` to extract the hash** for FreeBSD-style checksum files. - Do not write to partitions (`/dev/sdX1`, `/dev/da0p1`, `/dev/da0s1`). -- Do not assume Linux `/dev/sdX` naming on FreeBSD; use `/dev/daX`/`camcontrol`/`gpart`. -- Do not leave older same-named images in Downloads when a newer artifact is being fetched; remove or clearly separate stale files to avoid flashing the wrong build. -- Do not leave older same-named images in Downloads when a newer artifact is being fetched; remove or clearly separate stale files to avoid flashing the wrong build. 
+- **Use FreeBSD-native device naming**: `/dev/daX`, `camcontrol`, `gpart` — not Linux `/dev/sdX`. +- **Remove or rename** older same-named images in Downloads when a newer artifact is being fetched; separate stale files so **only the intended build is flashed**. +- **Remove or rename** older same-named images in Downloads when a newer artifact is being fetched; separate stale files so **only the intended build is flashed**. - When the agent runtime cannot safely execute raw-device writes, still complete the non-destructive parts: verify checksum, identify the exact removable whole-disk target, unmount/mount-state-check if allowed, then give the user a copy-paste `gzip -dc ... | dd of=/dev/...` command. Never silently skip target identification. - **Curl `--continue-at -` + retry on partial files:** the progress bar counts from the resume offset, not from byte zero. A partial that reached only 6 % will show `100%` immediately if the server becomes unreachable — curl hits EOF on the local partial, not the remote. Always verify with `ls -lh` and SHA256 before trusting a "100 %" resume result on a `.part` file. If the partial is <10 % of expected, starting fresh is often faster than trying to diagnose resume confusion. - If the user asks for a root-ready command, omit `sudo` and provide a single copy-paste command after verifying the target disk; keep `sudo` in general Linux/FreeBSD docs for non-root shells. -- For Samo's Clawdie IMG download completion notifications, send exactly one concise Telegram message containing only the root-ready flash command with the actual build filename and `/dev/sda`: `# gzip -dc .img.gz | dd of=/dev/sda bs=4M status=progress conv=fsync && sync`. Do not include extra verification/fit reports unless asked. -- For Samo's one-USB-port debby workflow, once the target stick is confirmed as `/dev/sda`, use `of=/dev/sda` directly in the final copy-paste command instead of a placeholder. Do not add verbose reminders unless asked. 
-- For completion notifications after Clawdie verified downloads, do not send both a report and a command. Send only the requested final command unless the user explicitly requested a fit/verification report. +- For Samo's Clawdie IMG download completion notifications, send exactly one concise Telegram message containing only the root-ready flash command with the actual build filename and `/dev/sda`: `# gzip -dc .img.gz | dd of=/dev/sda bs=4M status=progress conv=fsync && sync`. **Include extra reports only when explicitly asked.** +- For Samo's one-USB-port debby workflow, once the target stick is confirmed as `/dev/sda`, use `of=/dev/sda` directly in the final copy-paste command instead of a placeholder. **Keep output concise** — add reminders only when explicitly requested. +- For completion notifications after Clawdie verified downloads, **send only the requested final command** — omit extra reports unless user explicitly requested one. - In deployer role, after receiving `HERMES_USB_DEPLOY_READY=1`, starting the verified download is part of the job; waiting for another explicit "download it" prompt is a workflow miss. - When setting a Telegram completion notification for large image downloads, prefer a no-agent cron/watchdog script that stays silent until final file exists, `.part` is gone, SHA256 matches, and `gzip -t` passes. Include a fit report if a USB key has been inserted before completion. diff --git a/skills/systematic-debugging/SKILL.md b/skills/systematic-debugging/SKILL.md index a04f7d6..a7687d0 100644 --- a/skills/systematic-debugging/SKILL.md +++ b/skills/systematic-debugging/SKILL.md @@ -29,7 +29,7 @@ Random fixes waste time and create new bugs. Quick patches mask underlying issue NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST ``` -If you haven't completed Phase 1, you cannot propose fixes. +**Complete Phase 1 (root cause investigation) first** before proposing any fix. 
## When to Use @@ -50,7 +50,7 @@ Use for ANY technical issue: - Previous fix didn't work - You don't fully understand the issue -**Don't skip when:** +**Apply the full process when**: - Issue seems simple (simple bugs have root causes too) - You're in a hurry (rushing guarantees rework) @@ -80,7 +80,7 @@ You MUST complete each phase before proceeding to the next. - Can you trigger it reliably? - What are the exact steps? - Does it happen every time? -- If not reproducible → gather more data, don't guess +- If not reproducible → **gather more data and trace evidence** instead of guessing. **Action:** Use the `terminal` tool to run the failing test or trigger the bug: @@ -220,7 +220,7 @@ search_files("similar_pattern", path="src/", file_glob="*.py") ### 4. When You Don't Know - Say "I don't understand X" -- Don't pretend to know +- **Acknowledge gaps openly** — ask the user or research more. - Ask the user for help - Research more @@ -333,7 +333,7 @@ If you catch yourself thinking: | "Multiple fixes at once saves time" | Can't isolate what worked. Causes new bugs. | | "Reference too long, I'll adapt the pattern" | Partial understanding guarantees bugs. Read it completely. | | "I see the problem, let me fix it" | Seeing symptoms ≠ understanding root cause. | -| "One more fix attempt" (after 2+ failures) | 3+ failures = architectural problem. Question the pattern, don't fix again. | +| "One more fix attempt" (after 2+ failures) | 3+ failures = architectural problem. **Step back and question the pattern** — adding yet another fix will not resolve the underlying issue. | ## Quick Reference @@ -357,7 +357,7 @@ Use these Hermes tools during Phase 1: ### Network / SSH / tmux lag investigations -For reports like “remote tmux feels laggy,” separate noisy log symptoms from the actual interactive path. 
Keep diagnostics tidy: prefer single bounded logs under `~/.local/state/hermes/net-tests/` and generated dashboards under `~/.local/share/hermes/net-dashboard/`; avoid writing multiple files to the user's Desktop unless explicitly requested. +For reports like "remote tmux feels laggy," separate noisy log symptoms from the actual interactive path. Keep diagnostics tidy: prefer single bounded logs under `~/.local/state/hermes/net-tests/` and generated dashboards under `~/.local/share/hermes/net-dashboard/`; **keep all output in designated directories** unless the user explicitly requests Desktop files. Detailed patterns and examples live in `references/network-live-diagnostics.md`; projector/dashboard-specific guidance lives in `references/wifi-projector-dashboard-diagnostics.md`. Reusable helpers include `scripts/live_download_monitor.py` for bounded JSONL monitoring and `scripts/periodic-pcap-sampler.sh` for low-disk, periodic short pcaps. @@ -394,11 +394,11 @@ Detailed patterns and examples live in `references/network-live-diagnostics.md`; 1. When comparing home Wi-Fi with a phone hotspot, derive the gateway dynamically (`ip route show default`) instead of hardcoding `192.168.1.1`. Otherwise the hotspot test can falsely report gateway failure. 1. After a network switch, distinguish stale public SSH sessions from surviving Tailscale sessions. Inspect `ss -nti` for old local addresses, FIN-WAIT states, Send-Q/notsent, retrans/backoff, and PMTU anomalies. Public DNS SSH can die across the switch while `*.ts.net`/MagicDNS SSH remains healthy. 1. Only propose changes after evidence: e.g. switch to 5 GHz/Ethernet, prefer Tailscale hostnames in `~/.ssh/config`, or investigate router/ISP jitter. -1. When a large download is active, avoid unbounded packet capture. First run a bounded low-volume monitor (disk, `ss -tinp`, short pings, Wi-Fi state) with runtime/log-size/free-space limits. 
If local gateway ping remains clean while internet/Tailscale ping jumps to hundreds or thousands of ms, suspect saturation/bufferbloat rather than Wi-Fi driver failure. -1. Wireshark/tshark can be added as a second layer, but only with short filtered captures and summarized output. Keep raw pcaps under `~/.local/state/hermes/net-tests/` and avoid dumping large packet logs into chat or Desktop. +1. When a large download is active, **start with bounded, low-volume monitoring** (disk, `ss -tinp`, short pings, Wi-Fi state) with runtime/log-size/free-space limits. If local gateway ping remains clean while internet/Tailscale ping jumps to hundreds or thousands of ms, suspect saturation/bufferbloat rather than Wi-Fi driver failure. +1. Wireshark/tshark can be added as a second layer, but only with short filtered captures and summarized output. **Keep raw pcaps under** `~/.local/state/hermes/net-tests/` and **summarize findings** instead of dumping large packet logs into chat or Desktop. 1. For projector/streaming/interference sessions, preserve real-world event markers (projector on, Ubuntu installer phase, Bluetooth off, download phase) in the active run directory and visualize them as spikes/filters for non-technical viewers. See `references/wifi-projector-dashboard-diagnostics.md`; use `scripts/periodic-pcap-sampler.sh` when the user wants wire-level evidence over time without continuous large pcaps. 1. For user-facing network history, prefer a non-technical "story dashboard" over raw numbered tables: charts with visible spikes, line toggles, plain-language event cards, and technical details hidden behind disclosure widgets. For before/after interference tests (e.g. projector/Epson on), collect comparable bounded monitor windows and mark the event moment so a non-technical viewer can see whether spikes start or stop with the event. See `references/network-live-diagnostics.md` and `scripts/network_story_dashboard.py`. -1. 
When embedding parsed log data into a static HTML dashboard, do not HTML-escape JSON inside `