# Host & Agent Matrix (shared, fill-as-you-go)

A living inventory of **who runs where** and **what each host actually is**. Any agent
on any host fills in its own row. Source of truth for facts is the probe — not memory.

> **How to fill your row**
>
> ```sh
> cd ~/layered-soul
> python3 scripts/verify_facts_probe.py --os --hardware --storage --network --text
> ```
>
> Copy the verified values into the tables below, set `Probed` to today's UTC date,
> and commit. **Never guess hardware, OS, or IPs** — paste what the probe reports.
> On FreeBSD the probe synthesizes an OS-specific command map; trust its output over
> Linux habits.
>
> **Disk before action:** before installing a toolchain or starting a build, check
> real free space (`df -h /`, or the probe's `--storage`) — never estimate. Keep the
> **Disk (free)** column current and flag any host past ~85%. See _Disk discipline_ below.
>
> **Cost before buying:** before purchasing or retiring infrastructure, record provider,
> plan/SKU, verified monthly cost, and the source of truth (invoice/control panel/utility
> bill). IP-range guesses are not billing proof. See _Cost provenance_ below.
>
> **Never paste real IPs or bot handles here.** Use `${HOST_TS_IP}` and `${*_BOT}`
> placeholders; real values live in `fleet.env` (gitignored) and are live via
> `tailscale status`. Copy `fleet.env.example` → `fleet.env` to resolve them. The probe
> prints real IPs — record them in `fleet.env`, not in this table.

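The placeholder convention above can be illustrated with a short Python sketch. It assumes `fleet.env` holds plain `KEY=VALUE` lines, which this document does not confirm; the helper names `parse_env` and `resolve` are hypothetical, and the address and bot handle are made-up sample values, not fleet data. Unknown placeholders are deliberately left untouched so that a missing entry stays visible instead of silently becoming an empty string.

```python
from string import Template


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments (assumed format)."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve(text: str, values: dict[str, str]) -> str:
    """Substitute ${NAME} placeholders; names missing from values are kept as-is."""
    return Template(text).safe_substitute(values)


# Hypothetical fleet.env contents; both values are invented for illustration.
sample_env = 'OSA_TS_IP=100.64.0.10\nHERMES_BOT="@example_bot"\n'
values = parse_env(sample_env)

print(resolve("board at ${OSA_TS_IP}:9190, bot ${HERMES_BOT}, ${UNKNOWN_IP}", values))
```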
---

## 1. Agent placement (who runs where)

| Agent             | Host    | OS / Isolation        | Harness                      | Role                                                                      | Bot / channel               | Status                                 |
| ----------------- | ------- | --------------------- | ---------------------------- | ------------------------------------------------------------------------- | --------------------------- | -------------------------------------- |
| Hermes            | debby   | Debian 13 / Docker    | Hermes Agent (upstream)      | Secondary agent + soul backup (intermittent laptop)                       | ${HERMES_BOT}               | LIVE (intermittent)                    |
| Zot               | debby   | Debian 13 / Docker    | Zot RPC                      | Coding, media workflows                                                   | ${ZOT_BOT}                  | LIVE                                   |
| Claude            | domedog | Ubuntu 24.04 / Docker | Claude Code                  | Verification, review                                                      | — (CLI)                     | LIVE                                   |
| **Mevy**          | osa     | FreeBSD 15 / host     | Hermes Agent (upstream, CLI) | **Consolidated into hermes-osa**                                          | ${HERMES_OSA_BOT} (OSA-bot) | **LIVE — under hermes-osa**            |
| **hermes-osa**    | osa     | FreeBSD 15 / host     | Hermes Agent (FreeBSD fork)  | **Orchestrator + board host (always-on VPS): chat + gateway**             | ${HERMES_OSA_BOT} (OSA-bot) | **LIVE — chat + Telegram**             |
| Codex             | osa     | FreeBSD 15 / jail     | Codex CLI                    | ISO builds, validation                                                    | — (CLI)                     | LIVE                                   |
| **domedog-agent** | domedog | Ubuntu 24.04 / host   | Colibri board agent          | Headless Linux media/compute lane (image-render, ffmpeg, rust/go/py/node) | —                           | **LIVE — on central board 2026-06-19** |

> **Mevy vs hermes-osa distinction**: Mevy (${HERMES_OSA_BOT} / OSA-bot) has been consolidated into hermes-osa as of 2026-06-17. The Telegram bot token was migrated from the old backup `.env`. hermes-osa now runs both the local CLI chat and the Telegram gateway (polling mode, tmux session `hermes-gateway`).
>
> **Status key**: `LIVE` = running and validated right now. `INSTALLED` = binary present, not yet validated in role. `PLANNED` = not yet set up. No guessing.

> Notes:
>
> - Provider per agent (DeepSeek / OpenRouter / Z.AI / local): record it in the per-host detail (section 3).
> - One Telegram token per running service. Never share a token across instances.
> - **Orchestrator lives on the always-on host.** **osa is the always-on VPS** and hosts the
>   colibri board + orchestrator (hermes-osa). **debby is an intermittent laptop** (powers off
>   periodically) — a secondary agent + soul backup, never the hub. The board must sit where it
>   never disappears; tasks routed to debby simply park until it returns.
> - **Routing**: Colibri has a capability matcher for per-host agent pools, and **cross-host
>   routing is LIVE** (2026-06-19): a `socat` bridge exposes osa's colibri-daemon on its
>   Tailscale IP (`${OSA_TS_IP}:9190`, tailnet-only); agents on debby/domedog reach the osa
>   board over the tailnet, and a poller (2 min) / worker (5 min) loop executes assigned tasks.
>   Validated on the debby↔osa lane; colibri PR #83. See [`CAPABILITY-ROUTING.md`](./CAPABILITY-ROUTING.md).
> - **Probe vs identity**: `verify_facts_probe.py` is a required discipline/tool,
>   not an automatic startup hook — agents run it when grounding host facts, and HOST-MATRIX
>   records the result. OS/hardware facts come from probes and the matrix, not from SOUL.md
>   (which carries identity and values).

---

## 2. Host hardware & facts (one row per host)

| Host        | Tailscale IP     | OS / Kernel                        | Virt                  | CPU                                    | vCPU | RAM     | Swap                  | Disk (free)                  | GPU                    | Probed     | By     |
| ----------- | ---------------- | ---------------------------------- | --------------------- | -------------------------------------- | ---- | ------- | --------------------- | ---------------------------- | ---------------------- | ---------- | ------ |
| **domedog** | ${DOMEDOG_TS_IP} | Ubuntu 24.04.4 / 6.8.0-117         | KVM                   | AMD EPYC 7543P (32-core host)          | 2    | 7.8 GiB | 2.0 GiB               | 100 GB QEMU (51G free)       | none (headless)        | 2026-06-17 | Claude |
| **debby**   | ${DEBBY_TS_IP}   | Debian 13 / 6.12.90+deb13.1-amd64  | bare metal            | AMD Ryzen 7 5700U (8-core)             | 16   | 15 GiB  | 15 GiB                | nvme0n1p2 453G (23G free)    | Radeon Graphics (iGPU) | 2026-06-17 | Hermes |
| **osa**     | ${OSA_TS_IP}     | FreeBSD 15.0-RELEASE-p10 / GENERIC | not reported by probe | Intel Core Processor (Haswell, no TSX) | 6    | 11 GiB  | not reported by probe | ZFS pool: zroot (23.4G free) | not reported by probe  | 2026-06-17 | Pi     |

### Disk discipline (check, don't guess)

Disk is a first-class fact, same as OS or CPU — **measure it before you act, don't estimate.**

- **Before installing a toolchain or starting a build**, run `df -h /` (Linux) or
  `zfs list` / `df -h` (FreeBSD), or the probe's `--storage`. Confirm the headroom is
  really there.
- **Keep the `Disk (free)` column above current** when you add or remove anything large.
- **Flag any host past ~85% used.** Reference footprints to budget with: Go SDK ≈ 290 MB,
  Rust toolchain (`~/.rustup` + `~/.cargo`) ≈ 1.8 GB, a Node version ≈ 150 MB; build/module
  caches grow on top of these.
- **Standing watch:** `debby` runs ~95% full (23 GB free). Treat new installs/builds there
  as a deliberate decision, not a default — prefer the host with real headroom.

This is the survivability principle applied to storage: a host that silently fills up is a
host that fails. What you guess will be wrong; what you probe will be right.

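As a rough illustration of the "measure before you act" rule, the sketch below reads real usage with Python's standard `shutil.disk_usage` and applies the ~85% flag threshold. It is not part of `verify_facts_probe.py`; the function names are hypothetical, and the percentage is computed as `(total - free) / total`, which can differ slightly from the `Use%` column of `df` (reserved blocks) and from ZFS pool accounting, so treat `df -h` / `zfs list` / the probe as authoritative.

```python
import shutil

WARN_USED_FRACTION = 0.85  # the "~85% used" flag threshold from this section


def used_fraction(total: float, free: float) -> float:
    """Fraction of the filesystem in use, from total and free space (same units)."""
    return (total - free) / total


def fits(total: float, free: float, footprint: float) -> bool:
    """True if installing `footprint` keeps the host at or below the threshold."""
    return used_fraction(total, free - footprint) <= WARN_USED_FRACTION


def disk_report(path: str = "/") -> str:
    """Measure the filesystem holding `path` and flag it if past the threshold."""
    usage = shutil.disk_usage(path)
    frac = used_fraction(usage.total, usage.free)
    flag = "FLAG" if frac > WARN_USED_FRACTION else "ok"
    return f"{path}: {usage.free / 1e9:.1f} GB free, {frac:.0%} used [{flag}]"


# Live measurement on whatever host runs this.
print(disk_report("/"))

# Figures from the matrix above (GB): debby 453 total / 23 free, domedog 96 / 51.
# 1.8 is the Rust toolchain reference footprint.
print(fits(453, 23, 1.8))  # debby is already past the threshold
print(fits(96, 51, 1.8))   # domedog has headroom
```

The two `fits` calls reuse numbers already recorded in this document; on a real host, pass freshly measured values rather than the table's.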
### Cost provenance (invoice/control-panel facts, not guesses)

Hosting spend is a first-class fleet fact, but it must stay non-secret: record provider,
plan/SKU, region, verified monthly cost, and the proof source. Do **not** commit invoice
IDs, account numbers, billing addresses, or payment details. If a provider is inferred from
an IP range, mark it `TBD` until the control panel or invoice confirms it.

| Host / candidate                   | Provider                                                           | Plan / SKU                                | Region | Monthly cost      | Billing cycle | Role paid for                                    | Source / proof                        | Status / notes                                                                                     |
| ---------------------------------- | ------------------------------------------------------------------ | ----------------------------------------- | ------ | ----------------- | ------------- | ------------------------------------------------ | ------------------------------------- | -------------------------------------------------------------------------------------------------- |
| **osa**                            | TBD (verify; OVHcloud is suspected but not invoice-confirmed here) | TBD                                       | TBD    | TBD               | TBD           | always-on orchestrator + board + Hermes gateway  | operator invoice/control panel needed | Existing always-on VPS; do not treat IP range as proof.                                            |
| **domedog**                        | TBD                                                                | TBD                                       | TBD    | TBD               | TBD           | Linux media/compute lane                         | operator invoice/control panel needed | Existing Linux VM; cost not tracked yet.                                                           |
| **debby**                          | self-owned laptop                                                  | —                                         | local  | utility/power TBD | —             | intermittent secondary agent + soul backup       | local device + utility rate if needed | Not an always-on hub; power cost only matters when left on.                                        |
| **mother-build** (candidate)       | proposed OVHcloud                                                  | TBD: Public Cloud hourly or Eco/dedicated | TBD    | TBD               | TBD           | FreeBSD build host / poudriere / Rust+zot builds | OVH quote needed before purchase      | Prefer on-demand if builds are infrequent; dedicated only if build demand justifies standing cost. |
| **ML350p Gen8** (candidate/retire) | self-hosted hardware                                               | owned hardware                            | local  | power TBD         | utility bill  | fallback build host only                         | measured watts + actual €/kWh needed  | Do not make critical paths depend on it until reliability and TCO beat cloud.                      |

Cost discipline mirrors disk discipline: measure before action. For self-hosted hardware,
calculate monthly power with `watts / 1000 * 24 * 30 * €/kWh` using measured idle/load
wattage and the actual utility rate; do not compare cloud invoices to guessed electricity.

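The power formula above can be worked through in a few lines of Python. The wattage and tariff below are placeholder inputs chosen only to make the arithmetic easy to follow; they are not measurements of the ML350p or any other host, and the function name is hypothetical. Replace them with metered watts and the rate from the actual utility bill before recording anything in the table.

```python
HOURS_PER_MONTH = 24 * 30  # the same 30-day month the formula above assumes


def monthly_power_cost(watts: float, eur_per_kwh: float) -> float:
    """Monthly electricity cost in EUR for a constant draw of `watts`."""
    kilowatts = watts / 1000
    return kilowatts * HOURS_PER_MONTH * eur_per_kwh


# Placeholder inputs (NOT measured): 100 W constant draw at 0.25 EUR/kWh.
# 100 W -> 0.1 kW; 0.1 kW * 720 h = 72 kWh; 72 kWh * 0.25 EUR/kWh = 18 EUR.
cost = monthly_power_cost(watts=100, eur_per_kwh=0.25)
print(f"{cost:.2f} EUR/month")
```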
---

## 3. Per-host detail (expand as needed)

### domedog (Claude / verification) — probed 2026-06-17 by Claude

- **Identity**: hostname `domedog.pro`, Tailscale `${DOMEDOG_TS_IP}`
- **OS**: Ubuntu 24.04.4 LTS, kernel `6.8.0-117-generic`, x86_64, KVM guest
- **CPU**: AMD EPYC 7543P 32-Core (2 vCPU exposed to guest)
- **Memory**: 7.8 GiB RAM, 2.0 GiB swap
- **Storage**: `/dev/sda1` 96 GB ext4 root, 51 GB free (QEMU HARDDISK). No ZFS.
- **GPU**: none (headless VM)
- **Uptime at probe**: ~3.5 weeks
- **Role here**: Claude Code — verification & review lane. No Telegram bot.
- **Colibri agent (joined central board 2026-06-19)** — the headless Linux media/compute lane:
  - **Capabilities advertised**: `linux`, `python3.12`, `rust`, `go`, `node`, `ffmpeg`,
    `image-render`. **Not** `screenshot`/`gui` (headless VM), not `docker` (absent).
    `image-render`/`ffmpeg` are domedog-only in the fleet — osa dropped Pillow.
  - **Reach**: client shim `colibri-shim.service` (system unit, `User=clawdija`,
    `Restart=always`, reboot-persistent) runs
    `socat UNIX-LISTEN:~/.colibri/colibri.sock → TCP ${OSA_TS_IP}:9190` (osa bridge over
    Tailscale). A system unit, not `--user`: `systemctl --user` has no bus on this host.
  - **Operate**: `~/.colibri/agent.env` holds `COLIBRI_AGENT_ID` + `COLIBRI_SOCKET`; helpers
    in `~/.colibri/` — `colibri_cmd.py` (raw JSON), `colibri_poll.py`, `colibri_task_done.py`.
  - **Validated**: register → scheduler routed an `image-render` task to domedog → poller saw
    it → worker marked it `done` (2026-06-19).
  - **Executor pending (decision required)**: domedog _receives_ capability-matched tasks, but
    no persistent execution loop runs yet — until one does, routed tasks sit `started` (no
    lease/reaper). Decide what executes (Claude Code worker / script) and with what authority
    before relying on autonomous domedog task completion.

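To make "capability-matched" concrete, here is a minimal sketch of one plausible matching rule: an agent is eligible when every capability a task requires is in the set the agent advertises. This is an assumption for illustration, not Colibri's actual matcher (see `CAPABILITY-ROUTING.md` for the real behaviour); the function names and the `pools` structure are hypothetical. Only domedog's advertised set is taken from this document.

```python
# Capabilities domedog advertises, copied from the list above.
DOMEDOG_CAPS = frozenset(
    {"linux", "python3.12", "rust", "go", "node", "ffmpeg", "image-render"}
)


def can_run(required: set[str], advertised: frozenset[str]) -> bool:
    """Assumed rule: eligible only if every required capability is advertised."""
    return required <= advertised


def eligible_agents(required: set[str], pools: dict[str, frozenset[str]]) -> list[str]:
    """Names of agents whose advertised set covers the task, sorted for stable output."""
    return sorted(name for name, caps in pools.items() if can_run(required, caps))


# Hypothetical pool holding only the agent documented here.
pools = {"domedog-agent": DOMEDOG_CAPS}

print(eligible_agents({"image-render", "ffmpeg"}, pools))  # domedog advertises both
print(eligible_agents({"screenshot"}, pools))              # headless VM: no match
```

Under this rule a task that needs `screenshot`, `gui`, or `docker` never lands on domedog, which matches the exclusions listed above.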
### debby (Hermes secondary + Zot — intermittent laptop) — probed 2026-06-17 by Hermes

- **Identity**: hostname `debby`, Tailscale `${DEBBY_TS_IP}`
- **OS**: Debian 13 (Trixie), kernel `6.12.90+deb13.1-amd64`, bare metal (KDE Plasma desktop)
- **CPU**: AMD Ryzen 7 5700U with Radeon Graphics, 8 physical cores, 16 threads
- **Memory**: 15 GiB RAM, 15 GiB swap
- **Storage**: `/dev/nvme0n1p2` 453 GB ext4 root, 23 GB free (95% full). No ZFS.
- **GPU**: AMD Radeon Graphics (integrated, Lucienne)
- **Containers**: Docker 29.5.3 installed (daemon not currently running)
- **Hermes Agent**: v0.16.0 (upstream f9c8d95e), DeepSeek v4 Pro primary provider, OpenRouter for vision/fallback, Z.AI/GLM available
- **Zot RPC**: Go binary at `~/.local/bin/zot`, GLM-5.1 model
- **Telegram**: ${HERMES_BOT} + ${ZOT_BOT} in "My Debby" group
- **Layered soul**: commit `817624c`, 6 curated memories, 9 cross-harness skills

### osa (FreeBSD: hermes-osa orchestrator + board host, always-on VPS; + Mevy + Codex) — probed 2026-06-17 by hermes-osa

- **Identity**: hostname `osa.smilepowered.org`, Tailscale `${OSA_TS_IP}`
- **OS**: FreeBSD `15.0-RELEASE-p10`, kernel `FreeBSD osa.smilepowered.org 15.0-RELEASE-p10 FreeBSD 15.0-RELEASE-p10 releng/15.0-n281064-98258a339269 GENERIC amd64`
- **CPU**: Intel Core Processor (Haswell, no TSX), 6 vCPU
- **Memory**: 11 GiB RAM
- **Storage**: ZFS pool `zroot`, 98.5G ONLINE, 23.4G available
- **Jails**: `cms` and `worker` (Bastille jails); Docker not installed
- **Agents on host**:
  - **hermes-osa** — Hermes Agent v0.16.0 (`hermes-bsd` clean-room MIT fork), FreeBSD local CLI runtime + Telegram gateway. **Status: LIVE — validated local chat + Telegram.** Default provider: DeepSeek direct (`provider: deepseek`, `default: deepseek-chat`). OpenRouter available as fallback/manual lane. Telegram/gateway: LIVE — ${HERMES_OSA_BOT} (Mevy/OSA-bot), polling mode, tmux session `hermes-gateway` on osa. Daemon/rc.d: deferred (Track A).
  - **Mevy** — ${HERMES_OSA_BOT} (OSA-bot) — now consolidated under hermes-osa gateway. Token migrated from old backup `.env`.
  - **Codex** — `codex-cli 0.117.0`, ISO builds and validation. Runs in a Bastille jail.
  - **Claude Code** — installed (path: `/home/clawdie/.npm-global/bin/claude`), no dedicated role yet.
- **Provider stack** (hermes-osa):

  ```yaml
  provider: deepseek # primary — direct credits, proven DEEPSEEK_OK
  default: deepseek-chat
  fallback: openrouter # manual lane only; automatic fallback is not configured yet
  ```

- **Z.AI**: deferred (not configured for hermes-osa; available via OpenRouter if needed)
- **Telegram**: LIVE — ${HERMES_OSA_BOT}, polling mode, connected 2026-06-17
- **Gateway**: LIVE — running in tmux session `hermes-gateway`, manual start (no rc.d yet)
- **Launch command**:

  ```sh
  # Start (and attach to) a tmux session, then run the remaining commands inside it.
  tmux new -s hermes-osa
  cd /home/clawdie/ai/hermes-bsd
  export HERMES_HOME=/home/clawdie/.hermes
  . venv/bin/activate # or: . .venv/bin/activate (POSIX `.` works where `source` is unavailable)
  hermes chat
  ```

- **Layered soul**: commit `c9c88fd`, 10 skills, 7 curated memories
- **Future tracks (separate, none blocking)**:
  - Track A: daemon/rc.d promotion (hermes_daemon service, dedicated user)
  - Track B: ~~Telegram/gateway integration~~ DONE (2026-06-17) — gateway daemonization (rc.d) still deferred
  - Track C: ~~Colibri cross-host routing~~ **DONE (2026-06-19)** — `socat` bridge on osa `:9190` (Tailscale-only) + poller/worker loop; colibri PR #83 merged. See CAPABILITY-ROUTING.md
  - Track D: old clawdie_glass cleanup

_See [`../AGENTS.md`](../AGENTS.md) for the canonical agent matrix and operating rules._