---
name: mother
description: Genesis sequence for a new agent joining the Clawdie hive — resolve identity, verify vault provision, register capabilities on the Colibri board, and start the heartbeat/poll loop. Idempotent — safe to re-run on an already-provisioned agent.
triggers:
- "join the hive"
- "mother skill"
- "genesis sequence"
- "onboard this agent"
- "register with colibri"
- "first boot"
- "new agent setup"
- "provision agent"
---
# Mother Skill — Agent Genesis
Trigger: the agent wakes up in a freshly provisioned jail (or container) and needs
to join the hive. By the time this skill runs, the vault has already written a `.env`
into the agent's home directory, so the agent's job is to verify, register, and start polling.
## Prerequisites (must already exist before this skill runs)
- `.env` file present in the agent's home directory (written by `colibri-vault`)
- `colibri-daemon` reachable — either on a local Unix socket (osa) or via the TCP
bridge over Tailscale (remote hosts)
- `colibri` binary in PATH (or at a known location)
- The agent knows its own hostname and OS (`hostname`, `uname`)
## Genesis Sequence
Run these steps in order. The skill is **idempotent**: if the agent already has an
`agent_id` and is registered, skip the registration step and go straight to polling.
### 1. Resolve identity
Determine what this agent IS before registering:
```bash
HOST=$(hostname)
OS=$(uname -s | tr '[:upper:]' '[:lower:]')
```
Capability tags are derived from what's actually installed and reachable on this
host. Use the Host Matrix as reference (`docs/HOST-MATRIX.md` in layered-soul), but
the ground truth is `scripts/verify_facts_probe.py`.
Minimum base tags every agent should carry:
| Tag | Source |
|-----|--------|
| `freebsd` or `linux` | `uname -s` |
| `shell` | always present |
| `hermes` | if Hermes Agent is installed |
| `tailscale` | if `tailscale status` succeeds |
Additional tags per the Capability Vocabulary in `docs/CAPABILITY-ROUTING.md`:
jail isolation (`freebsd-jail`), hardware (`zfs`, `gpu`), runtimes (`python3.12`,
`rust`, `node24`), media (`ffmpeg`, `image-render`).
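The base tags in the table could be probed with a small function like the one below. This is a minimal sketch, not the project's probe: `derive_capabilities` is a hypothetical helper, and the binary name `hermes` is an assumption. `scripts/verify_facts_probe.py` remains the ground truth.

```bash
# Hypothetical helper: print a comma-separated capability list for this host.
derive_capabilities() {
    local caps
    # OS tag from uname, lowercased (freebsd or linux on hive hosts); shell is always present
    caps="$(uname -s | tr '[:upper:]' '[:lower:]'),shell"
    # Assumption: the Hermes Agent installs a binary named `hermes`
    if command -v hermes >/dev/null 2>&1; then
        caps="$caps,hermes"
    fi
    # Tag tailscale only if `tailscale status` actually succeeds
    if command -v tailscale >/dev/null 2>&1 && tailscale status >/dev/null 2>&1; then
        caps="$caps,tailscale"
    fi
    printf '%s\n' "$caps"
}
```

The output can be passed directly to `--capabilities` in Step 3. Further tags from the Capability Vocabulary would follow the same probe-then-append pattern.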
### 2. Verify vault provision
Confirm the vault did its job: the `.env` must contain at least one non-empty `*_API_KEY` entry (the check cannot tell whether the key itself is valid):
```bash
test -f ~/.env && grep -Eq '_API_KEY=.+' ~/.env && echo "vault: provisioned" || echo "vault: MISSING (.env absent or has no API key)"
```
If missing, this agent cannot authenticate to any provider. Report the error and
halt — a human operator must re-run `colibri-vault provision`.
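The one-liner above only prints a message. To actually halt, the check needs a non-zero exit status. A minimal sketch, assuming a hypothetical helper named `verify_vault` that takes the `.env` path as an optional argument:

```bash
# Hypothetical helper: succeed only if the .env exists and holds a non-empty *_API_KEY entry.
verify_vault() {
    local env_file=${1:-$HOME/.env}
    if [ -f "$env_file" ] && grep -Eq '_API_KEY=.+' "$env_file"; then
        echo "vault: provisioned"
    else
        echo "vault: MISSING (.env absent or has no API key)" >&2
        return 1
    fi
}
```

A caller could then stop the genesis sequence with `verify_vault || exit 1` before attempting registration.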
### 3. Register with Colibri
Check if already registered (by name, not UUID — the daemon enforces UNIQUE on names):
```bash
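# Check if agent already exists (exit status 0 = already registered).
# AGENT_NAME and COLIBRI_SOCKET must be set before this runs.
colibri --socket "$COLIBRI_SOCKET" list-agents | grep -qF "\"$AGENT_NAME\""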
```
If NOT registered, register with capabilities derived from Step 1:
```bash
# Example tag set for a FreeBSD host; substitute the tags derived in Step 1
colibri --socket "$COLIBRI_SOCKET" register-agent "$AGENT_NAME" \
    --capabilities freebsd,shell,hermes,tailscale,rc.d,pf,zfs
```
The `COLIBRI_SOCKET` path depends on the host:
| Host | Socket path |
|------|-------------|
| osa (FreeBSD, local daemon) | `/var/run/colibri/colibri.sock` |
| Remote hosts (via socat bridge) | `/tmp/osa-colibri.sock` |
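The table can be expressed as a lookup. A minimal sketch: `socket_for_host` is a hypothetical helper, and it assumes the host running the local daemon reports its hostname as exactly `osa`.

```bash
# Hypothetical helper: map a hostname to the Colibri socket path from the table above.
socket_for_host() {
    case "$1" in
        osa) echo "/var/run/colibri/colibri.sock" ;;  # local daemon
        *)   echo "/tmp/osa-colibri.sock" ;;          # socat bridge on remote hosts
    esac
}
```

For example, `COLIBRI_SOCKET=$(socket_for_host "$(hostname)")` would set the variable used by the commands in this step.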
If registration fails (e.g., name collision), log the error and continue: the agent
is probably already registered under this name, or the board may be unreachable.
The poller will surface this on its next tick.
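The check-then-register flow of this step can be combined into one function. This is a sketch under stated assumptions: `ensure_registered` is a hypothetical helper, it uses only the `list-agents` and `register-agent` subcommands shown above, and it assumes `list-agents` prints agent names in double quotes, as the `grep` in the check implies.

```bash
# Hypothetical helper: register the agent only if its name is not on the board yet.
ensure_registered() {
    local name=$1 caps=$2
    # Fixed-string match on the quoted name, as in the check above
    if colibri --socket "$COLIBRI_SOCKET" list-agents | grep -qF "\"$name\""; then
        echo "already registered: $name"
        return 0
    fi
    if colibri --socket "$COLIBRI_SOCKET" register-agent "$name" --capabilities "$caps"; then
        echo "registered: $name"
    else
        # Per this step: log the failure and continue rather than abort
        echo "register failed for $name; continuing" >&2
    fi
}
```

It would be called as `ensure_registered "$AGENT_NAME" freebsd,shell,hermes`, with the tag list taken from Step 1.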
### 4. Start heartbeat and poll loop
The agent is now a member of the hive. Two recurring jobs keep it alive:
**Heartbeat** (every 5 min): update agent status on the board so the scheduler knows
this agent is still alive and can receive tasks.
```bash
colibri --socket "$COLIBRI_SOCKET" set-agent-status "$AGENT_NAME" active
```
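For a bare agent without a scheduler, the 5-minute cadence could be driven by a small loop helper. A minimal sketch: `run_every` is a hypothetical function, not part of Colibri, and the tick limit exists only to make dry runs possible.

```bash
# Hypothetical helper: run a command every INTERVAL seconds.
# MAX_TICKS=0 means run forever; a positive value stops after that many runs.
run_every() {
    local interval=$1 max_ticks=$2 tick=0
    shift 2
    while [ "$max_ticks" -eq 0 ] || [ "$tick" -lt "$max_ticks" ]; do
        # A failed tick is logged, not fatal, so one missed heartbeat does not kill the loop
        "$@" || echo "tick failed: $*" >&2
        tick=$((tick + 1))
        sleep "$interval"
    done
}
```

The heartbeat would then be started in the background with `run_every 300 0 colibri --socket "$COLIBRI_SOCKET" set-agent-status "$AGENT_NAME" active &`.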
**Poll loop** (every 2 min): check the board for tasks assigned to this agent.
On osa and debby, this runs inside Hermes' internal scheduler — see
`colibri/packaging/freebsd/colibri-agent-loop.md` for the Hermes cronjob setup.
For a bare agent without Hermes cron, a minimal poll loop:
```bash
while true; do
    # colibri_poll.py filters by agent UUID and outputs JSON
    colibri --socket "$COLIBRI_SOCKET" list-tasks --status started \
        | colibri_poll.py
    sleep 120
done
```
## Platform-specific notes
### FreeBSD jail (osa)
- The vault writes `.env` to `/home/clawdie/.env` inside the jail
- The daemon socket is at `/var/run/colibri/colibri.sock` (mounted or shared)
- Capabilities include `freebsd-jail`, `zfs` (if zpool is visible), `rc.d`
- The jail's `/var/run` is tmpfs — `mkdir -p /var/run/colibri` before connecting
### Linux container (debby, domedog)
- The vault writes `.env` to `~/.env`
- The socket is a socat bridge at `/tmp/osa-colibri.sock` pointing to osa:9190
- Capabilities include `docker` (if Docker socket is mounted), `image-render`
(if Pillow/FFmpeg are installed)
- Start the socat shim before the poll loop:
```bash
socat UNIX-LISTEN:/tmp/osa-colibri.sock,fork TCP:${OSA_TS_IP}:9190 &
```
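Because the skill may be re-run, the shim should not be started twice. A minimal sketch of a guarded start: `start_bridge` is a hypothetical helper, and it assumes that an existing socket file means a live bridge, which a stale file left by a crashed `socat` would violate.

```bash
# Hypothetical helper: start the socat bridge unless its socket already exists.
start_bridge() {
    local sock=${1:-/tmp/osa-colibri.sock}
    if [ -S "$sock" ]; then
        echo "bridge socket already present: $sock"
        return 0
    fi
    # Same command as above, backgrounded; OSA_TS_IP must be set by the caller
    socat "UNIX-LISTEN:$sock,fork" "TCP:${OSA_TS_IP}:9190" &
}
```

Calling `start_bridge` before the poll loop then becomes safe on both first boot and re-runs, subject to the stale-socket caveat.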
## Idempotency contract
This skill can be re-run at any time without breaking a live agent:
| Step | Re-run behavior |
|------|-----------------|
| Resolve identity | Safe: hostname and OS don't change |
| Verify vault | Safe: read-only check of `.env`, no side effects |
| Register agent | Name collision = benign error (already registered) |
| Heartbeat/poll | Safe: daemon ignores duplicate status updates |
## Security invariant
The `.env` file is `0600` (written by `colibri-vault` with `Permissions::from_mode(0o600)`).
It contains the agent's ONE provider key — never the Vaultwarden org service-account
credential. A compromised jail can use its own key but cannot reach another tenant's
collection or the vault bootstrap credential.
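The `0600` invariant can be checked from inside the jail or container. A minimal sketch, assuming hypothetical helpers `env_mode` and `check_env_mode`; it tries GNU `stat -c` first and falls back to BSD `stat -f`, so it should work on both Linux and FreeBSD hosts.

```bash
# Hypothetical helper: print the octal permission bits of a file.
env_mode() {
    # GNU stat (Linux) uses -c '%a'; BSD stat (FreeBSD) uses -f '%Lp'
    stat -c '%a' "$1" 2>/dev/null || stat -f '%Lp' "$1" 2>/dev/null
}

# Hypothetical helper: succeed only if the .env is mode 600.
check_env_mode() {
    local file=${1:-$HOME/.env} mode
    mode=$(env_mode "$file") || return 2   # file missing or unreadable
    if [ "$mode" = "600" ]; then
        echo "env mode ok"
    else
        echo "env mode is $mode, expected 600" >&2
        return 1
    fi
}
```

A non-zero result here points to a provisioning problem rather than something the agent should repair on its own.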
_See `docs/HIVE-ONBOARDING.md` for the full onboarding vision,
`docs/CAPABILITY-ROUTING.md` for the routing layer, and
`colibri/packaging/freebsd/colibri-agent-loop.md` for the Hermes cronjob setup._