clawdie-ai/docs/public/operate/structured-reports.md
Operator & Codex b97e623e3a Document tenant-site verify states
---
Build: pass | Tests: pass — Tests  2057 passed (2057)
2026-04-29 12:04:25 +02:00

173 lines
7.7 KiB
Markdown

---
title: 'Structured Reports'
description: The Observed / Interpretation / Operator Notes pattern, the report family, and the free-text routing layer.
---
The agent's operator-facing reports follow a single template so an operator
or a peer agent can read any of them at a glance and know what is observed
fact, what is interpretation, and what action (if any) is suggested.
## In Plain Language
- A **structured report** is a deterministic snapshot of one slice of the
system (disk, services, tasks, budget, publish state, build/test status).
- Reports are built from **raw inputs** — DB rows, command output, JSON
status files — by a **pure builder function**. The builder has no side
effects and is unit-tested independently of how the report is delivered.
- The result is rendered to HTML for Telegram and could equally be rendered
to JSON for a dashboard or to plain text for a CLI.
- When the agent answers an ops question, it reads the structured report
rather than narrating from memory. This matters because memory drifts;
ZFS pool capacity does not.
## The Three-Section Template
Every structured report has the same three top-level sections:
### Observed
What the report measured, with no interpretation. ZFS shows pool A at 87%
capacity. Build status file says `status: "fail"`. The last task in the
queue was created at 10:23.
This section is the source of truth for the rest of the report. If
`Observed` is empty, the underlying probe failed and the report says so.
### Interpretation
A handful of `findings` extracted from `Observed`, each tagged `info`,
`warn`, or `error`. "Pool A is at 87% capacity." "Tests last run failed.
12 failing tests." "No active controlplane tasks are queued right now."
Findings are short, factual, and avoid recommending action. Their job is
to reduce a wall of data to the few signals that matter.
### Operator Notes
Suggestions, conditional and labeled `note` or `action`. "Largest
snapshot: `tank/data@2026-04-20-weekly` (4.2 GB). Remove only if that
rollback point is no longer needed." "Re-run the test wrapper before
relying on this as evidence the branch is green."
Notes are *suggestions*, not commands. They include the **conditional**
that makes the action correct ("only if X"), so an operator can decide
without re-deriving the context.
## The Report Family
| Report | Module | Slash command | Source |
| ---------- | ------------------------------------- | ---------------- | -------------------------------------------------------- |
| System | `src/reports/system-report.ts` | `/report` | `hostd` probes + controlplane auth probe |
| Disk | `src/reports/disk-report.ts` | `/disk` | `zpool list -H` + `zfs list -H -o name,usedsnap` |
| Tasks | `src/reports/tasks-report.ts` | `/tasks` | `getAllTasks()` (Postgres) |
| Budget | `src/reports/budget-report.ts` | `/budgetreport` | `getAllBudgets()` + `getAgentTokenAnalytics()` |
| Publish | `src/reports/publish-report.ts` | `/publishreport` | tenant registry + webroot inspection |
| Test/Build | `src/reports/test-report.ts` | `/testreport` | `tmp/status/build-status.json` + `test-status.json` |
Each module exports two functions:
```ts
buildXxxReport(inputs) // pure: takes raw inputs, returns a typed report
renderXxxReport(report) // pure: takes the report, returns an HTML string
```
The split lets you unit-test the analysis without touching IO and reuse
the builder against a JSON sink later.
For publish-state reporting specifically, the source of truth is not only the
registry. The report reads live served output from the CMS webroot, and the
setup `verify` step now also checks publish-manifest consistency for tenant
sites. That keeps "declared", "served", and "last published" closer together.
## Test/Build Pipeline
`/testreport` is the only report whose source-of-truth is a file the agent
does not write itself. The contract:
1. `scripts/write-test-build-status.sh` runs `npm run build` and
`npx vitest run --reporter=json --outputFile=...` (or one of them via
`build` / `tests` argument).
2. The wrapper writes two JSON files into the **status directory**:
- `<status-dir>/build-status.json`
- `<status-dir>/test-status.json`
The status directory resolves with this precedence (matched by both the
wrapper and `getDefaultStatusDir()` in `src/reports/test-report.ts`):
1. `$AGENT_STATUS_DIR` if set
2. `$CLAWDIE_VAR_DIR` if set (legacy)
3. `<project-root>/tmp/status` (default)
Per `AGENTS.md` § "Temporary File Storage", artifact paths under repo
`tmp/` are the preferred default — point `$AGENT_STATUS_DIR` elsewhere
only if you have a reason to.
3. `/testreport` reads both files, builds the report, renders it.
The schema for each file is intentionally narrow:
```json
{
"status": "ok" | "fail" | "unknown",
"completedAt": "2026-04-26T10:00:00Z",
"command": "npx vitest run",
"exitCode": 0,
"durationMs": 12345,
"totalTests": 1934,
"failingTests": 0,
"skippedTests": 0,
"failingTestNames": ["..."],
"summary": "..."
}
```
Only `status` and `completedAt` are required; everything else degrades
gracefully. Files older than 6 hours surface as `stale` with a warn finding.
Missing or malformed files surface as `status: "unknown"` with an action
note rather than fabricating success.
The pre-commit and post-commit hooks call this wrapper so commit messages
include a `Build: pass | Tests: 12 failed | 1936 passed (1948)` footer
visible in `git log`.
## Free-Text Routing
When the agent receives a bot-addressed message, `classifyReportIntent()` in
`src/report-intent.ts` checks a set of conservative regexes and routes to
the matching structured report instead of the LLM path. This means an
operator typing "how much disk?" gets a fresh `/disk` snapshot, not a
half-remembered narrative from a session three days ago.
The routing rules are intentionally **narrow** (false negatives are fine,
false positives are not). For broader detection of "this prompt smells
operational", a separate `isOpsFlavored()` matcher catches a wider net of
phrasings (services, jails, deploy, controlplane terms, etc.) — and is
used to **suppress memory injection** on those prompts so the LLM answers
from live tools rather than narrative recall.
| Function | Use |
| ---------------------------- | -------------------------------------------------------------------- |
| `classifyReportIntent(text)` | Hard route → structured report. Only fires on confident phrasings. |
| `isOpsFlavored(text)` | Soft signal → drop memory injection. Wider net, lower bar. |
Both ignore slash-command messages (those are routed by grammy) and
`@assistant` mentions are stripped before matching.
## Why Pure Builders
The pure builder pattern was a deliberate choice over a one-shot
"render-to-HTML now" approach. Three reasons:
- **Testable** — unit tests exercise the analysis logic with synthetic
inputs, no Postgres or pi running.
- **Reusable** — the same `buildDiskReport()` could feed a dashboard widget
or a daily email digest later. We are not committed to Telegram as the
only sink.
- **Inspectable** — when an operator asks "why did the report flag this?",
the answer is a `findings[]` array with explicit codes, not opaque text
generation.
If you add a new report, follow the same shape: a `Report` interface, a
`buildXxxReport()` function with `findings: XxxReportFinding[]` and
`operatorNotes: XxxReportOperatorNote[]`, a `renderXxxReport()` HTML
renderer, and a `*.test.ts` covering the builder independently.