clawdie-ai/docs/public/operate/structured-reports.md

---
title: 'Structured Reports'
description: The Observed / Interpretation / Operator Notes pattern, the report family, and the free-text routing layer.
---

The agent's operator-facing reports follow a single template so an operator
or a peer agent can read any of them at a glance and know what is observed
fact, what is interpretation, and what action (if any) is suggested.

## In Plain Language

- A **structured report** is a deterministic snapshot of one slice of the
  system (disk, services, tasks, budget, publish state, build/test status).
- Reports are built from **raw inputs** — DB rows, command output, JSON
  status files — by a **pure builder function**. The builder has no side
  effects and is unit-tested independently of how the report is delivered.
- The result is rendered to HTML for Telegram and could equally be rendered
  to JSON for a dashboard or to plain text for a CLI.
- When the agent answers an ops question, it reads the structured report
  rather than narrating from memory. This matters because memory drifts;
  ZFS pool capacity does not.

## The Three-Section Template

Every structured report has the same three top-level sections:

### Observed

What the report measured, with no interpretation. ZFS shows pool A at 87%
capacity. Build status file says `status: "fail"`. The last task in the
queue was created at 10:23.

This section is the source of truth for the rest of the report. If
`Observed` is empty, the underlying probe failed and the report says so.

### Interpretation

A handful of `findings` extracted from `Observed`, each tagged `info`,
`warn`, or `error`. "Pool A is at 87% capacity." "Tests last run failed.
12 failing tests." "No active controlplane tasks are queued right now."

Findings are short, factual, and avoid recommending action. Their job is
to reduce a wall of data to the few signals that matter.

### Operator Notes

Suggestions, conditional and labeled `note` or `action`. "Largest
snapshot: `tank/data@2026-04-20-weekly` (4.2 GB). Remove only if that
rollback point is no longer needed." "Re-run the test wrapper before
relying on this as evidence the branch is green."

Notes are *suggestions*, not commands. They include the **conditional**
that makes the action correct ("only if X"), so an operator can decide
without re-deriving the context.

## The Report Family

| Report     | Module                                | Slash command    | Source                                                   |
| ---------- | ------------------------------------- | ---------------- | -------------------------------------------------------- |
| System     | `src/reports/system-report.ts`        | `/report`        | `hostd` probes + controlplane auth probe                 |
| Disk       | `src/reports/disk-report.ts`          | `/disk`          | `zpool list -H` + `zfs list -H -o name,usedsnap`         |
| Tasks      | `src/reports/tasks-report.ts`         | `/tasks`         | `getAllTasks()` (Postgres)                               |
| Budget     | `src/reports/budget-report.ts`        | `/budgetreport`  | `getAllBudgets()` + `getAgentTokenAnalytics()`           |
| Publish    | `src/reports/publish-report.ts`       | `/publishreport` | tenant registry + webroot inspection                     |
| Test/Build | `src/reports/test-report.ts`          | `/testreport`    | `tmp/status/build-status.json` + `test-status.json`      |

Each module exports two functions:

```ts
buildXxxReport(inputs)  // pure: takes raw inputs, returns a typed report
renderXxxReport(report) // pure: takes the report, returns an HTML string
```

The split lets you unit-test the analysis without touching IO and reuse
the builder against a JSON sink later.

For publish-state reporting specifically, the source of truth is not only the
registry. The report reads live served output from the CMS webroot, and the
setup `verify` step now also checks publish-manifest consistency for tenant
sites. That keeps "declared", "served", and "last published" closer together.

## Test/Build Pipeline

`/testreport` is the only report whose source-of-truth is a file the agent
does not write itself. The contract:

1. `scripts/write-test-build-status.sh` runs `npm run build` and
   `npx vitest run --reporter=json --outputFile=...` (or one of them via
   `build` / `tests` argument).
2. The wrapper writes two JSON files into the **status directory**:
   - `<status-dir>/build-status.json`
   - `<status-dir>/test-status.json`

   The status directory resolves with this precedence (matched by both the
   wrapper and `getDefaultStatusDir()` in `src/reports/test-report.ts`):

   1. `$AGENT_STATUS_DIR` if set
   2. `$CLAWDIE_VAR_DIR` if set (legacy)
   3. `<project-root>/tmp/status` (default)

   Per `AGENTS.md` § "Temporary File Storage", artifact paths under repo
   `tmp/` are the preferred default — point `$AGENT_STATUS_DIR` elsewhere
   only if you have a reason to.

3. `/testreport` reads both files, builds the report, renders it.

The schema for each file is intentionally narrow:

```json
{
  "status": "ok" | "fail" | "unknown",
  "completedAt": "2026-04-26T10:00:00Z",
  "command": "npx vitest run",
  "exitCode": 0,
  "durationMs": 12345,
  "totalTests": 1934,
  "failingTests": 0,
  "skippedTests": 0,
  "failingTestNames": ["..."],
  "summary": "..."
}
```

Only `status` and `completedAt` are required; everything else degrades
gracefully. Files older than 6 hours surface as `stale` with a warn finding.
Missing or malformed files surface as `status: "unknown"` with an action
note rather than fabricating success.

The pre-commit and post-commit hooks call this wrapper so commit messages
include a `Build: pass | Tests: 12 failed | 1936 passed (1948)` footer
visible in `git log`.

## Free-Text Routing

When the agent receives a bot-addressed message, `classifyReportIntent()` in
`src/report-intent.ts` checks a set of conservative regexes and routes to
the matching structured report instead of the LLM path. This means an
operator typing "how much disk?" gets a fresh `/disk` snapshot, not a
half-remembered narrative from a session three days ago.

The routing rules are intentionally **narrow** (false negatives are fine,
false positives are not). For broader detection of "this prompt smells
operational", a separate `isOpsFlavored()` matcher catches a wider net of
phrasings (services, jails, deploy, controlplane terms, etc.) — and is
used to **suppress memory injection** on those prompts so the LLM answers
from live tools rather than narrative recall.

| Function                     | Use                                                                  |
| ---------------------------- | -------------------------------------------------------------------- |
| `classifyReportIntent(text)` | Hard route → structured report. Only fires on confident phrasings.   |
| `isOpsFlavored(text)`        | Soft signal → drop memory injection. Wider net, lower bar.           |

Both ignore slash-command messages (those are routed by grammy) and
`@assistant` mentions are stripped before matching.

## Why Pure Builders

The pure builder pattern was a deliberate choice over a one-shot
"render-to-HTML now" approach. Three reasons:

- **Testable** — unit tests exercise the analysis logic with synthetic
  inputs, no Postgres or pi running.
- **Reusable** — the same `buildDiskReport()` could feed a dashboard widget
  or a daily email digest later. We are not committed to Telegram as the
  only sink.
- **Inspectable** — when an operator asks "why did the report flag this?",
  the answer is a `findings[]` array with explicit codes, not opaque text
  generation.

If you add a new report, follow the same shape: a `Report` interface, a
`buildXxxReport()` function with `findings: XxxReportFinding[]` and
`operatorNotes: XxxReportOperatorNote[]`, a `renderXxxReport()` HTML
renderer, and a `*.test.ts` covering the builder independently.