clawdie-ai/docs/public/operate/operator-commands.md
Operator & claude 3828e5ce83 docs: integrate operator observability + provider fallback work
Brings the public docs in line with what shipped on multitenant over the
last few days. Three new operator-facing pages, three updates to existing
ones, and a CHANGELOG batch.

New pages (docs/public/operate/):
- operator-commands.md — single reference for all Telegram slash commands,
  grouped by purpose (status, structured reports, runtime, sessions, admin
  actions) with auth gating per command. Previously only in-bot /help text.
- provider-fallback.md — operator guide for the cooldown layer: env vars,
  how cooldowns are detected and tracked, /policy surfacing, /clearcooldown
  for manual release, the configured/effective/actual observability triple.
  Includes a "path convention note" flagging that the cooldown file still
  uses the legacy $CLAWDIE_VAR_DIR resolution while test/build status
  files have moved to repo tmp/ — divergence to harmonize later in code.
- structured-reports.md — explains the Observed/Interpretation/Operator
  Notes pattern, lists the six structured reports, documents the
  test/build pipeline contract (status JSON schema + new $AGENT_STATUS_DIR
  → $CLAWDIE_VAR_DIR → tmp/status precedence Codex landed in 1389e17),
  and covers free-text routing (classifyReportIntent + isOpsFlavored).

Updates:
- monitoring.md: appended "Operator-Facing Reports" section pointing at
  the new structured-reports page, and "Provider Fallback Health" pointing
  at the fallback page.
- operate/index.md: added the three new pages to the runbook list.
- architecture/controlplane.md: added "Runtime Observability" section
  documenting the configured/effective/actual triple and linking to the
  new operate pages.
- README.md: expanded the Telegram Commands table (was 10 rows, missing
  every structured report, /policy, /clearcooldown, /budgetreset) and
  added a pointer to operator-commands.md as the full reference. Also
  noted free-text routing.
- CHANGELOG.md: appended an "operator observability + provider fallback,
  apr.2026" batch under [Unreleased] covering provider fallback, the
  reports family, the test/build wrapper pipeline, free-text routing,
  /clearcooldown, the observability triple, the Telegram setMyCommands
  menu, and the new "Verify Before Claiming Remote State" rule in
  AGENTS.md.

No code changes. Slovenian sl/ mirror left untouched (out of localization
scope).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---
Build: pass | Tests: FAIL — Tests  8 failed | 1940 passed (1948)

---
Build: pass | Tests: FAIL — Tests  2 failed | 1949 passed (1951)
2026-04-26 13:01:43 +02:00

7.3 KiB

| title | description |
| --- | --- |
| Operator Commands | Reference for the Telegram slash commands operators use to inspect and control the running agent. |

The agent exposes its operational surface as Telegram slash commands. This page is the single reference for what each command does, who can run it, and which underlying surface it inspects. The Telegram bot also publishes a native command menu via setMyCommands — start typing / in any chat for the live in-app list.

Authorization Layers

Three layers gate the commands. A command may pass through one, two, or all three:

| Gate | Where | Effect |
| --- | --- | --- |
| `requireAdmin` | Per-handler | Only operators on the admin allow-list run it |
| `requireOpsChat` | Per-handler (write/destructive only) | Only the configured ops chat may invoke it |
| Per-chat overrides | `group.jailConfig` (registered groups) | Per-chat model/provider overrides |

Read-only commands (/status, /disk, /report, /testreport, etc.) are admin-gated but not ops-chat-gated — admins can run them from any chat. Destructive commands (/budgetreset, /clearcooldown) require the ops chat.
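The gating described above can be sketched as follows. This is a minimal illustration, not the bot's actual code: the gate names `requireAdmin` and `requireOpsChat` come from this page, but their signatures, the `CommandContext` and `GateConfig` shapes, and the `authorize` helper are assumptions made for the example.

```typescript
// Hypothetical shapes; the real handler context in the codebase may differ.
interface CommandContext {
  userId: number; // Telegram user ID of the sender
  chatId: number; // Telegram chat ID the command arrived in
}

interface GateConfig {
  adminIds: ReadonlySet<number>; // assumed representation of the admin allow-list
  opsChatId: number; // assumed representation of the configured ops chat
}

type GateResult = { ok: true } | { ok: false; reason: string };

function requireAdmin(ctx: CommandContext, cfg: GateConfig): GateResult {
  return cfg.adminIds.has(ctx.userId)
    ? { ok: true }
    : { ok: false, reason: "sender is not on the admin allow-list" };
}

function requireOpsChat(ctx: CommandContext, cfg: GateConfig): GateResult {
  return ctx.chatId === cfg.opsChatId
    ? { ok: true }
    : { ok: false, reason: "command must be sent from the ops chat" };
}

// Read-only commands pass only the admin gate; destructive ones pass both.
function authorize(
  ctx: CommandContext,
  cfg: GateConfig,
  destructive: boolean,
): GateResult {
  const admin = requireAdmin(ctx, cfg);
  if (!admin.ok) return admin;
  return destructive ? requireOpsChat(ctx, cfg) : { ok: true };
}
```

Under these assumptions, an admin running `/status` from a private chat is allowed, while the same admin running `/clearcooldown` outside the ops chat is rejected by the second gate.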

Status & Identity

| Command | Purpose | Surface |
| --- | --- | --- |
| `/ping` | Confirm the bot process is responsive | Direct reply |
| `/chatid` | Print the current chat's JID | Useful for `.env` registration |
| `/whoami` | Show your Telegram identity | Confirms admin-allowlist match |
| `/status` | Compact system summary (jails, ZFS pools, PF, budget) | `src/system-state.ts` snapshot |

Structured Reports

All structured reports follow the same Observed / Interpretation / Operator Notes template. See Structured Reports for the design pattern.

| Command | Report | Source |
| --- | --- | --- |
| `/report` | System & auth: services, jails, PF, controlplane | hostd probes + `probeControlplaneAuth()` |
| `/disk` | ZFS pools and snapshots | `zpool list -H` + `zfs list -H -o name,usedsnap` |
| `/tasks` | Controlplane task queue | `getAllTasks()` (Postgres) |
| `/budgetreport` | Token budgets and burn analytics | `getAllBudgets()` + `getAgentTokenAnalytics()` |
| `/publishreport` | Tenant publish/content state | `loadTenantRegistry()` + webroot inspection |
| `/testreport` | Build and test pass/fail | `tmp/status/build-status.json` + `tmp/status/test-status.json` |

/testreport is fed by scripts/write-test-build-status.sh — see Structured Reports for the write/read contract.
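As an illustration of how a report source such as `zfs list -H -o name,usedsnap` can be turned into structured data, here is a minimal sketch. The `-H` flag makes `zfs list` emit one record per line with tab-separated fields and no header. The `SnapshotUsage` type and `parseZfsUsedSnap` function are hypothetical names for this example, not the bot's own parser.

```typescript
// Hypothetical record type for one line of `zfs list -H -o name,usedsnap`.
interface SnapshotUsage {
  dataset: string; // value of the `name` column
  usedBySnapshots: string; // value of the `usedsnap` column, kept as printed (e.g. "1.50G")
}

// Parse the tab-separated, header-less output produced by the -H flag.
function parseZfsUsedSnap(output: string): SnapshotUsage[] {
  return output
    .split("\n")
    .filter((line) => line.trim() !== "") // skip the trailing empty line
    .map((line) => {
      const [dataset = "", usedBySnapshots = ""] = line.split("\t");
      return { dataset, usedBySnapshots };
    });
}
```

The sizes are left as strings because `zfs list` prints human-readable values unless `-p` is passed; a report that needs arithmetic would likely request parsable output instead.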

Runtime & Policy

| Command | Purpose |
| --- | --- |
| `/policy` | Active runtime policy (default model, overrides, cooldowns, budget state) |
| `/budget` | Alias for `/policy` |
| `/usage` | Token budget per agent |
| `/tokens` | Runtime token burn per agent (last-N analytics) |
| `/model` | Set provider/model for this chat (per-chat override) |
| `/activation` | Set trigger mode (always-respond vs mention-only) |
| `/tts` | Toggle voice replies (on / off / status) |

/policy shows the "Provider fallback cooldown" line when a cooldown is active.

Sessions

| Command | Purpose |
| --- | --- |
| `/new` | Reset this chat's session |
| `/compact` | Compact the session (summarize old, keep recent) |
| `/stop` | Stop a running agent for this chat |
| `/resume` | Resume a budget-paused chat |

Admin Actions (Ops-Chat Only)

| Command | Purpose |
| --- | --- |
| `/budgetreset <id>` | Reset an agent's token budget. Passing `all` as the id requires `confirm` as the second argument. |
| `/clearcooldown [id]` | Clear a provider fallback cooldown |
| `/audit` | Platform ownership audit (which jail/dataset/service belongs to which) |
| `/snapshots [dataset]` | List ZFS snapshots |
| `/scrub <pool> [op]` | ZFS scrub controls (status / start / stop) |
| `/updates` | FreeBSD base + ports update status |
| `/schedule` | Manage scheduled agent tasks (list / add / cancel / done) |
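The confirmation rule for `/budgetreset all` can be sketched as a small argument parser. This is an assumption-laden illustration: `parseBudgetReset`, the `BudgetResetRequest` type, and the rejection messages are hypothetical, and the real handler may parse its arguments differently.

```typescript
// Hypothetical result type for a parsed /budgetreset invocation.
type BudgetResetRequest =
  | { kind: "one"; agentId: string }
  | { kind: "all" }
  | { kind: "rejected"; message: string };

function parseBudgetReset(text: string): BudgetResetRequest {
  // Split on whitespace; element 0 is the command itself.
  const [, target, second] = text.trim().split(/\s+/);
  if (!target) {
    return { kind: "rejected", message: "usage: /budgetreset <id>" };
  }
  if (target === "all") {
    // Resetting every budget is only accepted with an explicit confirmation.
    return second === "confirm"
      ? { kind: "all" }
      : { kind: "rejected", message: "use /budgetreset all confirm" };
  }
  return { kind: "one", agentId: target };
}
```

Requiring a literal second argument keeps a mistyped or forwarded `/budgetreset all` from wiping every agent's budget in one step.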

Free-Text Routing

When a message is addressed to the bot, it recognizes ops-flavored phrasings without requiring a slash command. Examples that route to structured reports instead of the LLM path:

| Phrase | Routed to |
| --- | --- |
| disk usage, how much disk | `/disk` |
| task report, active tasks | `/tasks` |
| budget report, how many tokens | `/budgetreport` |
| are the tests passing, build status | `/testreport` |
| system report, report please | `/report` |

This keeps memory or narrative recall from drifting into a stale answer when fresh structured data is available. The full pattern set lives in classifyReportIntent() in src/report-intent.ts.

A broader isOpsFlavored() matcher also suppresses memory injection on any ops-flavored prompt (services, jails, deploy, auth, controlplane terms), even when no specific report matches — so the LLM answers from live tools rather than narrative recall.
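The two matchers described above can be approximated with ordered regular expressions. The sketch below is built only from the example phrases and terms listed on this page; it is not the pattern set in `src/report-intent.ts`, and the `...Sketch` function names are deliberately different from the real `classifyReportIntent()` and `isOpsFlavored()`.

```typescript
type ReportCommand =
  | "/disk"
  | "/tasks"
  | "/budgetreport"
  | "/testreport"
  | "/report";

// Illustrative patterns derived from the example phrases; order matters,
// so the specific reports are checked before the generic "/report" phrasing.
const REPORT_PATTERNS: ReadonlyArray<readonly [RegExp, ReportCommand]> = [
  [/\bdisk usage\b|\bhow much disk\b/i, "/disk"],
  [/\btask report\b|\bactive tasks\b/i, "/tasks"],
  [/\bbudget report\b|\bhow many tokens\b/i, "/budgetreport"],
  [/\bare the tests passing\b|\bbuild status\b/i, "/testreport"],
  [/\bsystem report\b|\breport please\b/i, "/report"],
];

// Return the report a free-text message should route to, or null for the LLM path.
function classifyReportIntentSketch(text: string): ReportCommand | null {
  for (const [pattern, command] of REPORT_PATTERNS) {
    if (pattern.test(text)) return command;
  }
  return null;
}

// Broader check: ops vocabulary alone is enough to suppress memory injection
// in this sketch, even when no specific report matches.
const OPS_TERMS = /\b(services?|jails?|deploy\w*|auth|controlplane)\b/i;

function isOpsFlavoredSketch(text: string): boolean {
  return OPS_TERMS.test(text);
}
```

In this sketch a message such as "how much disk is left?" routes to `/disk`, while "is the auth service up" matches no report but is still flagged as ops-flavored.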

Help

/help prints the in-bot command list. The list is generated from the same constants that drive the Telegram menu publication, so it reflects whatever is currently registered.
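One way to keep `/help` and the Telegram menu in sync is to derive both from a single constant, as sketched below. The `COMMANDS` array, `renderHelp`, and `toSetMyCommandsPayload` are hypothetical names and the three entries are only a sample. The payload shape follows the Telegram Bot API, where `setMyCommands` takes a `commands` array of objects with `command` (no leading slash) and `description` fields.

```typescript
// Shape of a Telegram BotCommand: `command` carries no leading slash.
interface CommandSpec {
  command: string;
  description: string;
}

// Hypothetical single source of truth; the real constants live in the codebase.
const COMMANDS: ReadonlyArray<CommandSpec> = [
  { command: "ping", description: "Confirm the bot process is responsive" },
  { command: "status", description: "Compact system summary" },
  { command: "testreport", description: "Build and test pass/fail" },
];

// Text printed by /help, one command per line.
function renderHelp(commands: ReadonlyArray<CommandSpec>): string {
  return commands.map((c) => `/${c.command} - ${c.description}`).join("\n");
}

// Request body for the Bot API setMyCommands method.
function toSetMyCommandsPayload(
  commands: ReadonlyArray<CommandSpec>,
): { commands: CommandSpec[] } {
  return {
    commands: commands.map(({ command, description }) => ({
      command,
      description,
    })),
  };
}
```

Because both outputs are computed from `COMMANDS`, adding or removing an entry changes the help text and the published menu together.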