layered-soul/docs/HOST-MATRIX.md

# Host & Agent Matrix (shared, fill-as-you-go)

A living inventory of **who runs where** and **what each host actually is**. Any agent
on any host fills in its own row. Source of truth for facts is the probe — not memory.

> **How to fill your row**
>
> ```sh
> cd ~/layered-soul
> python3 scripts/verify_facts_probe.py --os --hardware --storage --network --text
> ```
>
> Copy the verified values into the tables below, set `Probed` to today's UTC date,
> and commit. **Never guess hardware, OS, or IPs** — paste what the probe reports.
> On FreeBSD the probe synthesizes an OS-specific command map; trust its output over
> Linux habits.
>
> **Disk before action:** before installing a toolchain or starting a build, measure
> real free space (`df -h /`, or the probe's `--storage`). Keep the
> **Disk (free)** column current and flag any host past ~85% used. See _Disk discipline_ below.
>
> **Cost before buying:** before purchasing or retiring infrastructure, record provider,
> plan/SKU, verified monthly cost, and the source of truth (invoice/control panel/utility
> bill). IP-range guesses are not billing proof. See _Cost provenance_ below.
>
> **Keep real IPs and bot handles in `fleet.env` (gitignored).** Use `${HOST_TS_IP}` and `${*_BOT}`
> placeholders in committed docs; the real values live in `fleet.env`, and the IPs can also be
> read live from `tailscale status`. Copy `fleet.env.example` → `fleet.env` to resolve them. The probe
> prints real IPs: record them in `fleet.env`, not in this table.
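
The placeholder convention above can be sketched in a few lines of Python. This is a minimal illustration, not a script that ships with the repo: it assumes `fleet.env` holds plain `KEY=VALUE` lines (the actual file format is not shown here), and the names `parse_env` and `resolve` as well as the example values are made up.

```python
from string import Template


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; skip blanks and # comments (assumed format)."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve(doc: str, env: dict[str, str]) -> str:
    """Fill ${NAME} placeholders; names missing from env are left untouched."""
    return Template(doc).safe_substitute(env)


# Made-up example values, not real fleet data.
example_env = parse_env('OSA_TS_IP=100.64.0.10\nHERMES_BOT="@example_bot"\n')
print(resolve("bridge at ${OSA_TS_IP}:9190, bot ${HERMES_BOT}", example_env))
```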

---

## 1. Agent placement (who runs where)

| Agent             | Host    | OS / Isolation        | Harness                      | Role                                                                      | Bot / channel               | Status                                 |
| ----------------- | ------- | --------------------- | ---------------------------- | ------------------------------------------------------------------------- | --------------------------- | -------------------------------------- |
| Hermes            | debby   | Debian 13 / Docker    | Hermes Agent (upstream)      | Secondary agent + soul backup (intermittent laptop)                       | ${HERMES_BOT}               | LIVE (intermittent)                    |
| Zot               | debby   | Debian 13 / Docker    | Zot RPC                      | Coding, media workflows                                                   | ${ZOT_BOT}                  | LIVE                                   |
| Claude            | domedog | Ubuntu 24.04 / Docker | Claude Code                  | Verification, review                                                      | — (CLI)                     | LIVE                                   |
| **Mevy**          | osa     | FreeBSD 15 / host     | Hermes Agent (upstream, CLI) | **Consolidated into hermes-osa**                                          | ${HERMES_OSA_BOT} (OSA-bot) | **LIVE — under hermes-osa**            |
| **hermes-osa**    | osa     | FreeBSD 15 / host     | Hermes Agent (FreeBSD fork)  | **Orchestrator + board host (always-on VPS): chat + gateway**             | ${HERMES_OSA_BOT} (OSA-bot) | **LIVE — chat + Telegram**             |
| Codex             | osa     | FreeBSD 15 / jail     | Codex CLI                    | ISO builds, validation                                                    | — (CLI)                     | LIVE                                   |
| **domedog-agent** | domedog | Ubuntu 24.04 / host   | Colibri board agent          | Headless Linux media/compute lane (image-render, ffmpeg, rust/go/py/node) | —                           | **LIVE — on central board 2026-06-19** |

> **Mevy vs hermes-osa distinction**: Mevy (${HERMES_OSA_BOT} / OSA-bot) has been consolidated into hermes-osa as of 2026-06-17. The Telegram bot token was migrated from the old backup .env. hermes-osa now runs both the local CLI chat and the Telegram gateway (polling mode, tmux session `hermes-gateway`).
>
> **Status key**: `LIVE` = running and validated right now. `INSTALLED` = binary present, not yet validated in role. `PLANNED` = not yet set up. No guessing.

> Notes:
>
> - Provider per agent (DeepSeek / OpenRouter / Z.AI / local) — fill in the per-host table.
> - One Telegram token per running service. **Assign each service its own unique token.**
> - **Orchestrator lives on the always-on host.** **osa is the always-on VPS** and the
>   designated hub: it hosts the colibri board + orchestrator (hermes-osa), and the board
>   always stays there. **debby is an intermittent laptop** (powers off periodically), a
>   secondary agent + soul backup; tasks routed to debby queue up and execute when it returns.
> - **Routing**: Colibri has a capability matcher for per-host agent pools, and **cross-host
>   routing is LIVE** (2026-06-19): a `socat` bridge exposes osa's colibri-daemon on its
>   Tailscale IP (`${OSA_TS_IP}:9190`, tailnet-only); agents on debby/domedog reach the osa
>   board over the tailnet, and a poller (2 min) / worker (5 min) loop executes assigned tasks.
>   Validated on the debby↔osa lane; colibri PR #83. See [`CAPABILITY-ROUTING.md`](./CAPABILITY-ROUTING.md).
> - **Probe vs identity**: `verify_facts_probe.py` is a required discipline/tool,
>   not an automatic startup hook — agents run it when grounding host facts, and HOST-MATRIX
>   records the result. OS/hardware facts come from probes and the matrix, not from SOUL.md
>   (which carries identity and values).

---

## 2. Host hardware & facts (one row per host)

| Host        | Tailscale IP     | OS / Kernel                        | Virt                  | CPU                                    | vCPU | RAM     | Swap                  | Disk (free)                  | GPU                    | Probed     | By     |
| ----------- | ---------------- | ---------------------------------- | --------------------- | -------------------------------------- | ---- | ------- | --------------------- | ---------------------------- | ---------------------- | ---------- | ------ |
| **domedog** | ${DOMEDOG_TS_IP} | Ubuntu 24.04.4 / 6.8.0-117         | KVM                   | AMD EPYC 7543P (32-core host)          | 2    | 7.8 GiB | 2.0 GiB               | 100 GB QEMU (51G free)       | none (headless)        | 2026-06-17 | Claude |
| **debby**   | ${DEBBY_TS_IP}   | Debian 13 / 6.12.90+deb13.1-amd64  | bare metal            | AMD Ryzen 7 5700U (8-core)             | 16   | 15 GiB  | 15 GiB                | nvme0n1p2 453G (23G free)    | Radeon Graphics (iGPU) | 2026-06-17 | Hermes |
| **osa**     | ${OSA_TS_IP}     | FreeBSD 15.0-RELEASE-p10 / GENERIC | not reported by probe | Intel Core Processor (Haswell, no TSX) | 6    | 11 GiB  | not reported by probe | ZFS pool: zroot (23.4G free) | not reported by probe  | 2026-06-17 | Pi     |

### Disk discipline (**measure, then act**)

Disk is a first-class fact, same as OS or CPU — **measure with `df -h` and `du` before acting.**

- **Before installing a toolchain or starting a build**, run `df -h /` (Linux) or
  `zfs list` / `df -h` (FreeBSD), or the probe's `--storage`. Confirm the headroom is
  really there.
- **Keep the `Disk (free)` column above current** when you add or remove anything large.
- **Flag any host past ~85% used.** Reference footprints to budget with: Go SDK ≈ 290 MB,
  Rust toolchain (`~/.rustup` + `~/.cargo`) ≈ 1.8 GB, a Node version ≈ 150 MB; build/module
  caches grow on top of these.
- **Standing watch:** `debby` runs ~95% full (23 GB free). Treat new installs/builds there
  as a deliberate decision, not a default — prefer the host with real headroom.

This is the survivability principle applied to storage: a host that silently fills up is a
host that fails. What you guess will be wrong; what you probe will be right.
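
The ~85% rule can be checked with a short script when the probe is not at hand. The sketch below is an illustration only, not part of `verify_facts_probe.py`; the helper names are made up, and it uses the same used / (used + available) ratio that `df` reports. On ZFS hosts, `zfs list` remains the better source for per-dataset figures.

```python
import shutil

FLAG_ABOVE_PCT = 85.0  # the ~85% line used in this document


def used_percent(used: float, free: float) -> float:
    """Percent in use, as the ratio used / (used + available)."""
    return 100.0 * used / (used + free)


def is_flagged(used: float, free: float) -> bool:
    """True when a filesystem is past the flag line."""
    return used_percent(used, free) > FLAG_ABOVE_PCT


# Live measurement of the root filesystem on the current host.
usage = shutil.disk_usage("/")
print(f"/: {used_percent(usage.used, usage.free):.1f}% used, "
      f"{usage.free / 1e9:.1f} GB free, "
      f"flag={is_flagged(usage.used, usage.free)}")

# Cross-check against debby's recorded row (453 GB root, 23 GB free).
print(round(used_percent(453 - 23, 23)))  # 95
```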

### Cost provenance (invoice/control-panel facts, not guesses)

Hosting spend is a first-class fleet fact, but it must stay non-secret: record provider,
plan/SKU, region, verified monthly cost, and the proof source. Do **not** commit invoice
IDs, account numbers, billing addresses, or payment details. If a provider is inferred from
an IP range, mark it `TBD` until the control panel or invoice confirms it.

| Host / candidate                      | Provider                                                           | Plan / SKU                                | Region          | Monthly cost                          | Billing cycle | Role paid for                                                       | Source / proof                                                                | Status / notes                                                                                                                                   |
| ------------------------------------- | ------------------------------------------------------------------ | ----------------------------------------- | --------------- | ------------------------------------- | ------------- | ------------------------------------------------------------------- | ----------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ |
| **osa**                               | TBD (verify; OVHcloud is suspected but not invoice-confirmed here) | TBD                                       | TBD             | TBD                                   | TBD           | always-on orchestrator + board + Hermes gateway                     | operator invoice/control panel needed                                         | Existing always-on VPS; **verify provider via invoice/control panel**, not by IP range alone.                                                                                          |
| **domedog**                           | TBD                                                                | TBD                                       | TBD             | TBD                                   | TBD           | Linux media/compute lane                                            | operator invoice/control panel needed                                         | Existing Linux VM; cost not tracked yet.                                                                                                         |
| **debby**                             | self-owned laptop                                                  | —                                         | local           | utility/power TBD                     | —             | intermittent secondary agent + soul backup                          | local device + utility rate if needed                                         | Not an always-on hub; power cost only matters when left on.                                                                                      |
| **mother-build** (candidate)          | proposed OVHcloud                                                  | TBD: Public Cloud hourly or Eco/dedicated | TBD             | TBD                                   | TBD           | FreeBSD build host / poudriere / Rust+zot builds                    | OVH quote needed before purchase                                              | Prefer on-demand if builds are infrequent; dedicated only if build demand justifies standing cost.                                               |
| **ML350p Gen8** (candidate/retire)    | self-hosted hardware                                               | owned hardware                            | local           | ~€53–63/mo @ 460 W high-load estimate | utility bill  | multitenant/build candidate; fallback if TCO beats cloud            | GEN-I + URO tariff research; fan/PSU label, not wall-metered                  | Use as planning band only; measure wall draw before committing tenants.                                                                          |
| **vultr-svc** (Forgejo + Vaultwarden) | Vultr                                                              | TBD                                       | TBD (verify EU) | TBD                                   | TBD           | git mirror (layered-soul + hermes-soul) + Vaultwarden secrets store | DNS code/vault.smilepowered.org → Vultr (verified 2026-06-20); invoice needed | Off-OVH backup target (good) BUT Forgejo + Vault share one box → SPOF for backups AND secrets; needs own off-box backup + EU-region verify + MFA |

Cost discipline mirrors disk discipline: measure before action. For self-hosted hardware,
calculate monthly power cost as `watts / 1000 * 24 * 30 * €/kWh`, using measured idle/load
wattage and the actual utility rate.

**ML350p Gen8 planning note:** for the multitenant/high-load case, use the visible
fan/PSU-side **460 W** mark as the conservative continuous-load assumption until a wall
meter proves otherwise.

- Monthly energy: `0.460 kW * 24 h * 30.4375 d = ~336 kWh/month`.
- GEN-I regular household ET energy price: `0.13286 EUR/kWh` with VAT → **~€44.6/mo**
  energy-only.
- Add URO network-energy ET estimate (`0.01864 EUR/kWh` before VAT, ~`0.02274 EUR/kWh`
  with VAT) → **~€52.3/mo** variable electricity + network-energy estimate.
- Practical planning band with smaller per-kWh state charges: **~€53/mo** if 460 W is wall
  draw; **~€59–63/mo** if 460 W is output-side load at ~90–85% PSU efficiency.
- Annualized planning band: **~€640–760/year**.
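
The arithmetic above can be reproduced with a small helper. This is a sketch under the same assumptions as the planning note (460 W continuous, the quoted per-kWh rates with VAT); the function names are illustrative, and the smaller per-kWh state charges are left out.

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30.4375  # 365.25 / 12, as in the ML350p estimate


def monthly_kwh(watts: float) -> float:
    """Energy drawn in an average month at a constant load."""
    return watts / 1000 * HOURS_PER_DAY * DAYS_PER_MONTH


def monthly_cost_eur(watts: float, eur_per_kwh: float,
                     psu_efficiency: float = 1.0) -> float:
    """Monthly cost; psu_efficiency < 1.0 treats `watts` as output-side load."""
    return monthly_kwh(watts / psu_efficiency) * eur_per_kwh


ENERGY = 0.13286   # EUR/kWh with VAT, from the note above
NETWORK = 0.02274  # EUR/kWh with VAT, from the note above

print(round(monthly_kwh(460)))                            # 336
print(round(monthly_cost_eur(460, ENERGY), 1))            # 44.6
print(round(monthly_cost_eur(460, ENERGY + NETWORK), 1))  # 52.3
```

Passing `psu_efficiency=0.90` or `0.85` scales the result up for the output-side case. Swap in wall-metered wattage as soon as it exists.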

---

## 3. Per-host detail (expand as needed)

### domedog (Claude / verification) — probed 2026-06-17 by Claude

- **Identity**: hostname `domedog.pro`, Tailscale `${DOMEDOG_TS_IP}`
- **OS**: Ubuntu 24.04.4 LTS, kernel `6.8.0-117-generic`, x86_64, KVM guest
- **CPU**: AMD EPYC 7543P 32-Core (2 vCPU exposed to guest)
- **Memory**: 7.8 GiB RAM, 2.0 GiB swap
- **Storage**: `/dev/sda1` 96 GB ext4 root, 51 GB free (QEMU HARDDISK). No ZFS.
- **GPU**: none (headless VM)
- **Uptime at probe**: ~3.5 weeks
- **Role here**: Claude Code — verification & review lane. No Telegram bot.
- **Colibri agent (joined central board 2026-06-19)** — the headless Linux media/compute lane:
  - **Capabilities advertised**: `linux`, `python3.12`, `rust`, `go`, `node`, `ffmpeg`,
    `image-render`. **Not** `screenshot`/`gui` (headless VM), not `docker` (absent).
    `image-render`/`ffmpeg` are domedog-only in the fleet — osa dropped Pillow.
  - **Reach**: client shim `colibri-shim.service` (system unit, `User=clawdija`,
    `Restart=always`, reboot-persistent) runs
    `socat UNIX-LISTEN:~/.colibri/colibri.sock → TCP ${OSA_TS_IP}:9190` (osa bridge over
    Tailscale). A system unit, not `--user`: `systemctl --user` has no bus on this host.
  - **Operate**: `~/.colibri/agent.env` holds `COLIBRI_AGENT_ID` + `COLIBRI_SOCKET`; helpers
    in `~/.colibri/` — `colibri_cmd.py` (raw JSON), `colibri_poll.py`, `colibri_task_done.py`.
  - **Validated**: register → scheduler routed an `image-render` task to domedog → poller saw
    it → worker marked it `done` (2026-06-19).
  - **Executor pending (decision required)**: domedog _receives_ capability-matched tasks, but
    no persistent execution loop runs yet — until one does, routed tasks sit `started` (no
    lease/reaper). Decide what executes (Claude Code worker / script) and with what authority
    before relying on autonomous domedog task completion.
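
Capability matching of the kind described above can be pictured as a subset test. The sketch below is a hypothetical illustration, not Colibri's actual matcher: the function name and the `example-agent` entry are made up, while domedog's capability set is copied from the list above.

```python
# domedog's set comes from the notes above; "example-agent" is a placeholder.
AGENTS = {
    "domedog-agent": {"linux", "python3.12", "rust", "go", "node",
                      "ffmpeg", "image-render"},
    "example-agent": {"freebsd"},
}


def eligible_agents(required: set[str], agents: dict[str, set[str]]) -> list[str]:
    """Agents whose advertised capabilities cover every required one."""
    return sorted(name for name, caps in agents.items() if required <= caps)


print(eligible_agents({"image-render"}, AGENTS))  # ['domedog-agent']
print(eligible_agents({"screenshot"}, AGENTS))    # []
```

A task that needs `screenshot` matches no agent here, which is why domedog must not advertise it.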

### debby (Hermes secondary + Zot — intermittent laptop) — probed 2026-06-17 by Hermes

- **Identity**: hostname `debby`, Tailscale `${DEBBY_TS_IP}`
- **OS**: Debian 13 (Trixie), kernel `6.12.90+deb13.1-amd64`, bare metal (KDE Plasma desktop)
- **CPU**: AMD Ryzen 7 5700U with Radeon Graphics, 8 physical cores, 16 threads
- **Memory**: 15 GiB RAM, 15 GiB swap
- **Storage**: `/dev/nvme0n1p2` 453 GB ext4 root, 23 GB free (95% full). No ZFS.
- **GPU**: AMD Radeon Graphics (integrated, Lucienne)
- **Containers**: Docker 29.5.3 installed (daemon not currently running)
- **Hermes Agent**: v0.16.0 (upstream f9c8d95e), DeepSeek v4 Pro primary provider, OpenRouter for vision/fallback, Z.AI/GLM available
- **Zot RPC**: Go binary at `~/.local/bin/zot`, GLM-5.1 model
- **Telegram**: ${HERMES_BOT} + ${ZOT_BOT} in "My Debby" group
- **Layered soul**: commit `817624c`, 6 curated memories, 9 cross-harness skills

### osa (FreeBSD: hermes-osa orchestrator + board host, always-on VPS; + Mevy + Codex) — probed 2026-06-17 by hermes-osa

- **Identity**: hostname `osa.smilepowered.org`, Tailscale `${OSA_TS_IP}`
- **OS**: FreeBSD `15.0-RELEASE-p10`, kernel `FreeBSD osa.smilepowered.org 15.0-RELEASE-p10 FreeBSD 15.0-RELEASE-p10 releng/15.0-n281064-98258a339269 GENERIC amd64`
- **CPU**: Intel Core Processor (Haswell, no TSX), 6 vCPU
- **Memory**: 11 GiB RAM
- **Storage**: ZFS pool `zroot`, 98.5G ONLINE, 23.4G available
- **Jails**: `cms` and `worker` (Bastille jails); Docker not installed
- **Agents on host**:
  - **hermes-osa**: Hermes Agent v0.16.0 (`hermes-bsd` clean-room MIT fork), FreeBSD local CLI runtime + Telegram gateway.
    - **Status**: LIVE, validated local chat + Telegram.
    - **Default provider**: DeepSeek direct (`provider: deepseek`, `default: deepseek-chat`). OpenRouter available as fallback/manual lane.
    - **Telegram/gateway**: LIVE, ${HERMES_OSA_BOT} (Mevy/OSA-bot), polling mode, tmux session `hermes-gateway` on osa.
    - **Daemon/rc.d**: deferred (Track A).
  - **Mevy** — ${HERMES_OSA_BOT} (OSA-bot) — now consolidated under hermes-osa gateway. Token migrated from old backup .env.
  - **Codex** — `codex-cli 0.117.0`, ISO builds and validation. Runs in a Bastille jail.
  - **Claude Code** — installed (path: `/home/clawdie/.npm-global/bin/claude`), no dedicated role yet.
- **Provider stack** (hermes-osa):
  ```yaml
  provider: deepseek # primary — direct credits, proven DEEPSEEK_OK
  default: deepseek-chat
  fallback: openrouter # available manually; auto-fallback not configured yet
  ```
- **Z.AI**: deferred (not configured for hermes-osa; available via OpenRouter if needed)
- **Telegram**: LIVE — ${HERMES_OSA_BOT}, polling mode, connected 2026-06-17
- **Gateway**: LIVE — running in tmux session `hermes-gateway`, manual start (no rc.d yet)
- **Launch command**:
  ```sh
  tmux new -s hermes-osa          # start and attach; run the remaining lines inside the session
  cd /home/clawdie/ai/hermes-bsd
  export HERMES_HOME=/home/clawdie/.hermes
  . venv/bin/activate             # or: . .venv/bin/activate (FreeBSD /bin/sh has no `source`)
  hermes chat
  ```
- **Layered soul**: commit `c9c88fd`, 10 skills, 7 curated memories
- **Future tracks (separate, none blocking)**:
  - Track A: daemon/rc.d promotion (hermes_daemon service, dedicated user)
  - Track B: ~~Telegram/gateway integration~~ DONE (2026-06-17) — gateway daemonization (rc.d) still deferred
  - Track C: ~~Colibri cross-host routing~~ **DONE (2026-06-19)** — `socat` bridge on osa `:9190` (Tailscale-only) + poller/worker loop; colibri PR #83 merged. See CAPABILITY-ROUTING.md
  - Track D: old clawdie_glass cleanup

_See [`../AGENTS.md`](../AGENTS.md) for the canonical agent matrix and operating rules._

---

## 4. Compliance standing constraints

- **EU region only**: All OVHcloud resources in FR/DE/PL. Sidesteps non-EU transfer/SCC burden under GDPR.
- **Off-box backup before any reinstall**: OVH DPA §10 + GTS §6.3/6.5/10.6 — reinstall/termination = irreversible deletion including OVH-side backups, no recovery, OVH not liable. Identity/skills covered by git (layered-soul + hermes-soul on Forgejo). Runtime state (ZFS snapshots, Vaultwarden DB) must be verified backed up outside OVH.
  - **Backup independence (verified 2026-06-20):** Forgejo **and** Vaultwarden both run on **Vultr** (the `code` / `vault.smilepowered.org` host) — a _different provider_ than osa/OVH, so an OVH loss does not take the git backup (good). **But Forgejo and Vaultwarden share that one Vultr box**, making it a single point of failure for _both_ the backups _and_ all secrets. → that box needs its _own_ off-box backup (Vaultwarden DB export + Forgejo data to a third location), and **backups are unverified until test-restored** (cost-discipline applies to backups: check, don't assume). Add the Vultr host to the provenance table; apply EU-region (verify) + MFA to it too.
- **MFA on every master-key account**: GTS §2.3/2.4 — operator is liable for fraudulent account use. Enable MFA on **OVH, Vultr, the domain registrar (clawdie.si / smilepowered.org), Forgejo admin, and Vaultwarden** — each is a master key to the fleet. **Auto-renew the domains**: a lapsed domain silently kills `pkg.clawdie.si`, ACME certs, and SSH-by-hostname.
- **Billing hygiene**: provider **auto-renew is on by default** (OVH/Vultr) — disable before the 19th of the month if not renewing. **Commitment Periods lock you in** (full term due, no refund for early cancel/non-use). Act on **price-increase / end-of-life** notices within the 30-day cancel window. Track renewal dates per provider in the provenance table.
- **Continuity plan (contractually required)**: OVH GTS §6.3 makes a recovery plan the Client's obligation, and §4/§10 cap provider liability at service credits — no data-loss or downtime damages. The fleet's **multi-host survivability** (Linux/Docker + FreeBSD/jails, relocatable via layered-soul) **is** the recovery plan; pair it with the off-box backups above.
- **Do not commit OVH contracts/credentials**: GTS §13 makes contract terms confidential. A compliance summary only in public repos — no verbatim DPA/GTS text, no NIC handles or login credentials.
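
One way to start a test-restore check is to compare checksums of an exported file and its restored copy. The sketch below is an assumption-laden illustration, not an existing fleet script: the function names are made up, and a matching hash only shows the copy is byte-identical, not that Forgejo or Vaultwarden would start from it.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 so large exports need little memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original: Path, restored: Path) -> bool:
    """True when the restored copy is byte-identical to the original export."""
    return sha256_of(original) == sha256_of(restored)
```

Run `restore_matches` on the export taken before upload and on the file pulled back from the third location; record the date of the last successful check next to the backup entry.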

### Multi-tenant GDPR gates (administrative, not technical)

These switch on when the hive goes multi-tenant. None block current internal use:

- [ ] GDPR controller docs (privacy notice, legal basis for processing, ROPA)
- [ ] DPIA only if agents make automated decisions about _individuals_ with legal/significant effect (GDPR Art. 35/22) — the internal agent task scheduler (routing work to machines) does **not** trigger this
- [ ] Pass OVH terms down to customers (GTS §10.6 — sub-licensing)
- [ ] Third-party / "AAA" professional indemnity insurance (GTS §10.6)
- [ ] Customer sanctions screening (GTS §14.3 — denied parties / export controls)
- [ ] Data Processing Agreement with each tenant (DPA §12 — controller→processor chain)

See [`HIVE-ONBOARDING.md §9`](./HIVE-ONBOARDING.md) for the integration checklist.