clawdie-ai/docs/public/operate/operator-commands.md at fix/mask-tailscale-ips

Operator & claude 3828e5ce83 docs: integrate operator observability + provider fallback work

Brings the public docs in line with what shipped on multitenant over the
last few days. Three new operator-facing pages, three updates to existing
ones, and a CHANGELOG batch.

New pages (docs/public/operate/):
- operator-commands.md — single reference for all Telegram slash commands,
  grouped by purpose (status, structured reports, runtime, sessions, admin
  actions) with auth gating per command. Previously only in-bot /help text.
- provider-fallback.md — operator guide for the cooldown layer: env vars,
  how cooldowns are detected and tracked, /policy surfacing, /clearcooldown
  for manual release, the configured/effective/actual observability triple.
  Includes a "path convention note" flagging that the cooldown file still
  uses the legacy $CLAWDIE_VAR_DIR resolution while test/build status
  files have moved to repo tmp/ — divergence to harmonize later in code.
- structured-reports.md — explains the Observed/Interpretation/Operator
  Notes pattern, lists the six structured reports, documents the
  test/build pipeline contract (status JSON schema + new $AGENT_STATUS_DIR
  → $CLAWDIE_VAR_DIR → tmp/status precedence Codex landed in 1389e17),
  and covers free-text routing (classifyReportIntent + isOpsFlavored).

Updates:
- monitoring.md: appended "Operator-Facing Reports" section pointing at
  the new structured-reports page, and "Provider Fallback Health" pointing
  at the fallback page.
- operate/index.md: added the three new pages to the runbook list.
- architecture/controlplane.md: added "Runtime Observability" section
  documenting the configured/effective/actual triple and linking to the
  new operate pages.
- README.md: expanded the Telegram Commands table (was 10 rows, missing
  every structured report, /policy, /clearcooldown, /budgetreset) and
  added a pointer to operator-commands.md as the full reference. Also
  noted free-text routing.
- CHANGELOG.md: appended an "operator observability + provider fallback,
  apr.2026" batch under [Unreleased] covering provider fallback, the
  reports family, the test/build wrapper pipeline, free-text routing,
  /clearcooldown, the observability triple, the Telegram setMyCommands
  menu, and the new "Verify Before Claiming Remote State" rule in
  AGENTS.md.

No code changes. Slovenian sl/ mirror left untouched (out of localization
scope).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---
Build: pass | Tests: FAIL — Tests  8 failed | 1940 passed (1948)

---
Build: pass | Tests: FAIL — Tests  2 failed | 1949 passed (1951)

2026-04-26 13:01:43 +02:00


| title | description |
| --- | --- |
| Operator Commands | Reference for the Telegram slash commands operators use to inspect and control the running agent. |

The agent exposes its operational surface as Telegram slash commands. This page is the single reference for what each command does, who can run it, and which underlying surface it inspects. The Telegram bot also publishes a native command menu via `setMyCommands`: start typing `/` in any chat to see the live in-app list.

Authorization Layers

Three layers gate the commands. A command may pass through one, two, or all three:

| Gate | Where | Effect |
| --- | --- | --- |
| `requireAdmin` | Per-handler | Only operators on the admin allow-list run it |
| `requireOpsChat` | Per-handler (write/destructive only) | Only the configured ops chat may invoke it |
| Per-chat overrides | `group.jailConfig` (registered groups) | Per-chat model/provider overrides |

Read-only commands (`/status`, `/disk`, `/report`, `/testreport`, etc.) are admin-gated but not ops-chat-gated, so admins can run them from any chat. Destructive commands (`/budgetreset`, `/clearcooldown`) require the ops chat.
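A minimal sketch of how the two handler-level gates could compose per command. The gate names `requireAdmin` and `requireOpsChat` come from the table above; the `Ctx` shape, the allow-list contents, the ops chat ID, the `GATES` map, and `checkGates` are hypothetical and exist only to illustrate the layering, not to describe the actual handlers.

```typescript
// Hypothetical context shape; the real handler context is not shown on this page.
interface Ctx {
  userId: number;
  chatId: string;
}

// A gate returns null when the caller is allowed, or a denial reason.
type Gate = (ctx: Ctx) => string | null;

// Assumed configuration values, for illustration only.
const ADMIN_ALLOWLIST = new Set<number>([1001, 1002]);
const OPS_CHAT_ID = "ops-chat";

const requireAdmin: Gate = (ctx) =>
  ADMIN_ALLOWLIST.has(ctx.userId) ? null : "not on the admin allow-list";

const requireOpsChat: Gate = (ctx) =>
  ctx.chatId === OPS_CHAT_ID ? null : "only allowed from the ops chat";

// Read-only commands are admin-gated only; destructive ones add the ops-chat gate.
const GATES: Record<string, Gate[]> = {
  "/status": [requireAdmin],
  "/disk": [requireAdmin],
  "/budgetreset": [requireAdmin, requireOpsChat],
  "/clearcooldown": [requireAdmin, requireOpsChat],
};

// Returns the first denial reason, or null when every gate passes.
// Commands missing from the map are denied rather than left ungated.
function checkGates(command: string, ctx: Ctx): string | null {
  const gates = GATES[command];
  if (!gates) return "unknown command";
  for (const gate of gates) {
    const denial = gate(ctx);
    if (denial !== null) return denial;
  }
  return null;
}
```

Under these assumptions, an admin calling `/status` from any chat passes, while the same admin calling `/budgetreset` outside the ops chat is denied by the second gate.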

Status & Identity

| Command | Purpose | Surface |
| --- | --- | --- |
| `/ping` | Confirm the bot process is responsive | Direct reply |
| `/chatid` | Print the current chat's JID | Useful for `.env` registration |
| `/whoami` | Show your Telegram identity | Confirms admin-allowlist match |
| `/status` | Compact system summary (jails, ZFS pools, PF, budget) | `src/system-state.ts` snapshot |

Structured Reports

All structured reports follow the same Observed / Interpretation / Operator Notes template. See the Structured Reports page (structured-reports.md) for the design pattern.
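To make the three-part template concrete, here is a minimal rendering sketch. Only the section names (Observed, Interpretation, Operator Notes) come from this page; the `StructuredReport` shape, the bullet formatting, and both function names are assumptions made for illustration.

```typescript
// Hypothetical report shape built around the three section names in the text.
interface StructuredReport {
  title: string;
  observed: string[];       // raw facts gathered from probes
  interpretation: string[]; // what those facts mean
  operatorNotes: string[];  // suggested follow-ups for the operator
}

// Renders one section as a heading followed by bullet lines.
// An empty section still prints, so the template shape stays constant.
function renderSection(heading: string, lines: string[]): string {
  const body = lines.length > 0 ? lines.map((l) => `- ${l}`) : ["- (none)"];
  return [heading, ...body].join("\n");
}

// Renders the full report with the sections always in the same order.
function renderReport(r: StructuredReport): string {
  return [
    r.title,
    renderSection("Observed", r.observed),
    renderSection("Interpretation", r.interpretation),
    renderSection("Operator Notes", r.operatorNotes),
  ].join("\n\n");
}
```

Keeping the section order fixed in one renderer is what lets every report read the same way regardless of its data source.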

| Command | Report | Source |
| --- | --- | --- |
| `/report` | System & auth: services, jails, PF, controlplane | `hostd` probes + `probeControlplaneAuth()` |
| `/disk` | ZFS pools and snapshots | `zpool list -H` + `zfs list -H -o name,usedsnap` |
| `/tasks` | Controlplane task queue | `getAllTasks()` (Postgres) |
| `/budgetreport` | Token budgets and burn analytics | `getAllBudgets()` + `getAgentTokenAnalytics()` |
| `/publishreport` | Tenant publish/content state | `loadTenantRegistry()` + webroot inspection |
| `/testreport` | Build and test pass/fail | `tmp/status/build-status.json` + `tmp/status/test-status.json` |

`/testreport` is fed by `scripts/write-test-build-status.sh`. See the Structured Reports page (structured-reports.md) for the write/read contract.

Runtime & Policy

| Command | Purpose |
| --- | --- |
| `/policy` | Active runtime policy (default model, overrides, cooldowns, budget state) |
| `/budget` | Alias for `/policy` |
| `/usage` | Token budget per agent |
| `/tokens` | Runtime token burn per agent (last-N analytics) |
| `/model` | Set provider/model for this chat (per-chat override) |
| `/activation` | Set trigger mode (always-respond vs mention-only) |
| `/tts` | Toggle voice replies (`on` / `off` / `status`) |

`/policy` shows the Provider fallback cooldown line when a cooldown is active.

Sessions

| Command | Purpose |
| --- | --- |
| `/new` | Reset this chat's session |
| `/compact` | Compact the session (summarize old, keep recent) |
| `/stop` | Stop a running agent for this chat |
| `/resume` | Resume a budget-paused chat |

Admin Actions (Ops-Chat Only)

| Command | Purpose |
| --- | --- |
| `/budgetreset <id>` | Reset an agent's token budget. `all` requires `confirm` as the second argument. |
| `/clearcooldown [id]` | Clear a provider fallback cooldown |
| `/audit` | Platform ownership audit (which jail/dataset/service belongs to which) |
| `/snapshots [dataset]` | List ZFS snapshots |
| `/scrub <pool> [op]` | ZFS scrub controls (`status` / `start` / `stop`) |
| `/updates` | FreeBSD base + ports update status |
| `/schedule` | Manage scheduled agent tasks (list / add / cancel / done) |
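The `confirm` requirement on `/budgetreset all` can be sketched as a small argument parser. This is an illustrative sketch, not the bot's actual handler: `parseBudgetReset`, the `BudgetResetRequest` type, and the error messages are hypothetical, and the sketch ignores the `@botname` suffix Telegram may append to commands in group chats.

```typescript
// Hypothetical result type: reset one agent, reset all, or reject the input.
type BudgetResetRequest =
  | { kind: "one"; agentId: string }
  | { kind: "all" }
  | { kind: "error"; message: string };

// Parses "/budgetreset <id>" and enforces "confirm" as the second
// argument when the id is the special value "all".
function parseBudgetReset(text: string): BudgetResetRequest {
  const [command, id, second] = text.trim().split(/\s+/);
  if (command !== "/budgetreset") {
    return { kind: "error", message: "not a /budgetreset command" };
  }
  if (!id) {
    return { kind: "error", message: "usage: /budgetreset <id>" };
  }
  if (id === "all") {
    return second === "confirm"
      ? { kind: "all" }
      : { kind: "error", message: "resetting all budgets requires: /budgetreset all confirm" };
  }
  return { kind: "one", agentId: id };
}
```

Requiring an explicit second token for the bulk case means a mistyped or truncated message cannot wipe every budget at once.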

Free-Text Routing

When a message is addressed to the bot and phrased as an ops question, the bot can route it to a structured report without requiring a slash command. Examples that route to structured reports instead of the LLM path:

| Phrase | Routed to |
| --- | --- |
| `disk usage`, `how much disk` | `/disk` |
| `task report`, `active tasks` | `/tasks` |
| `budget report`, `how many tokens` | `/budgetreport` |
| `are the tests passing`, `build status` | `/testreport` |
| `system report`, `report please` | `/report` |

This keeps memory or narrative recall from drifting into a stale answer when fresh structured data is available. The full pattern set lives in `classifyReportIntent()` in `src/report-intent.ts`.

A broader `isOpsFlavored()` matcher also suppresses memory injection on any ops-flavored prompt (services, jails, deploy, auth, controlplane terms), even when no specific report matches, so the LLM answers from live tools rather than narrative recall.
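The two-stage routing described here can be sketched as follows. The patterns cover only the example phrases from the table and the term list named in the text; the real pattern set in `src/report-intent.ts` is not shown on this page, so the regexes, the `Sketch`-suffixed function names, and the `Route` type are all assumptions.

```typescript
type ReportCommand = "/disk" | "/tasks" | "/budgetreport" | "/testreport" | "/report";

// Specific reports are listed before the generic "/report" so that
// "task report please" resolves to "/tasks" rather than "/report".
const REPORT_PATTERNS: Array<[RegExp, ReportCommand]> = [
  [/\bdisk usage\b|\bhow much disk\b/i, "/disk"],
  [/\btask report\b|\bactive tasks\b/i, "/tasks"],
  [/\bbudget report\b|\bhow many tokens\b/i, "/budgetreport"],
  [/\bare the tests passing\b|\bbuild status\b/i, "/testreport"],
  [/\bsystem report\b|\breport please\b/i, "/report"],
];

// Stage 1: return the first matching report command, or null.
function classifyReportIntentSketch(text: string): ReportCommand | null {
  for (const [pattern, command] of REPORT_PATTERNS) {
    if (pattern.test(text)) return command;
  }
  return null;
}

// Stage 2: broader matcher over the ops terms named in the text.
const OPS_TERMS = /\b(services?|jails?|deploy\w*|auth|controlplane)\b/i;

function isOpsFlavoredSketch(text: string): boolean {
  return OPS_TERMS.test(text);
}

type Route =
  | { path: "report"; command: ReportCommand }
  | { path: "llm"; injectMemory: boolean };

// A report match wins; otherwise the prompt goes to the LLM, with
// memory injection suppressed when the prompt is ops-flavored.
function routeFreeText(text: string): Route {
  const command = classifyReportIntentSketch(text);
  if (command !== null) return { path: "report", command };
  return { path: "llm", injectMemory: !isOpsFlavoredSketch(text) };
}
```

In this sketch, a prompt such as "is the jail up?" matches no report but still reaches the LLM without memory injection, which mirrors the suppression behavior described above.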

Help

/help prints the in-bot command list. The list is generated from the same constants that drive the Telegram menu publication, so it reflects whatever is currently registered.
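One way to keep the help text and the Telegram menu in sync is to derive both from a single list, roughly as sketched below. The `BotCommand` shape (a `command` name without the leading slash plus a `description`) matches the entries Telegram's `setMyCommands` method accepts; the `COMMANDS` registry, its contents, and the two builder functions are hypothetical and do not reflect the project's actual constants.

```typescript
// Entry shape accepted by Telegram's setMyCommands: the command name
// is given without the leading slash.
interface BotCommand {
  command: string;
  description: string;
}

// Hypothetical shared registry; descriptions are taken from the tables on this page.
const COMMANDS: ReadonlyArray<BotCommand> = [
  { command: "ping", description: "Confirm the bot process is responsive" },
  { command: "status", description: "Compact system summary" },
  { command: "disk", description: "ZFS pools and snapshots" },
];

// Builds the in-bot /help text, one command per line.
function buildHelpText(commands: ReadonlyArray<BotCommand>): string {
  return commands.map((c) => `/${c.command} - ${c.description}`).join("\n");
}

// Builds the request body for setMyCommands from the same list.
function buildMenuPayload(commands: ReadonlyArray<BotCommand>): { commands: BotCommand[] } {
  return { commands: commands.map((c) => ({ ...c })) };
}
```

Because both outputs come from the same array, adding a command in one place updates the help text and the menu together.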
