Operator & Claude Code eef58906a6 Propose initial explanation-domain registry

Delivers Claude-side deliverables from EXPLANATION-GROUNDER-PROPOSAL.md
"Claude Work Split": 10 coarse-grained domains with aliases, runtime
facts, curated sources, and exclusions; a flagged list of stale
grounding-source candidates per the recent audits; and a seed prompt
corpus covering single-domain, plain-language, mixed-subject, and
adversarial cases.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---
Build: FAIL | Tests: FAIL — 15 failed

2026-05-06 00:16:14 +02:00


Explanation Domains — Initial Registry

Date: 2026-05-06
Owner: Claude (drafting), Codex (final authority, implementation)
Scope: Claude-side deliverables from EXPLANATION-GROUNDER-PROPOSAL.md lines 408-441: initial domain registry, per-domain grounding-pack inputs, seed corpus, and stale-source flags. Designed to be the input that Codex's generic explanation pipeline consumes.

Design choices

A few choices shape the registry below; flagging them here so they're visible rather than buried in the entries.

- Domains are coarse-grained. Ten domains, not thirty. A domain covers a topic family wide enough that most prompts naturally land in exactly one. Sub-topics are aliases, not their own domains.
- Grounding packs are mostly facts + small snippets, not whole files, per Codex's preferred shape. The largest snippet a domain should include is one focused doc section, not a full source file dump.
- Curated sources beat discovered ones. Each entry names exact files. No glob discovery. If a file moves, the registry breaks loudly, which is preferable to silently grounding from a moved-or-deleted path.
- Runtime facts use existing config exports. No new accessors are needed for the MVP: every fact below already lives in src/config.ts or is trivially derivable from src/jail-registry.ts.
- Stale sources are explicitly excluded. Per the audit/sweep work, some files describe wrong runtime defaults. Those are flagged below and must not become grounding sources without cleanup.
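
The "breaks loudly" choice for curated sources could be enforced with a check along these lines. This is only a sketch: `assertSourcesExist` and the injected `exists` callback are hypothetical names, not part of the existing codebase.

```typescript
// Hypothetical validator. `exists` is injected so the sketch does not depend
// on a particular runtime; in Node it could be fs.existsSync.
function assertSourcesExist(
  paths: string[],
  exists: (path: string) => boolean,
): void {
  const missing = paths.filter((p) => !exists(p));
  if (missing.length > 0) {
    // Fail the whole registry load rather than grounding from a partial set.
    throw new Error(`Registry references missing files: ${missing.join(", ")}`);
  }
}
```

Running a check like this at registry load time, rather than at answer time, would surface a moved file before any prompt is grounded from it.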

Initial domain registry (10 domains)

Each entry has the shape Codex's pipeline will consume: aliases (what the user might call this), runtime facts (live config values to inject), sources (curated files/sections), and exclusions (sources that look relevant but should not be grounded from).
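
As a rough illustration, the entry shape described above might be typed as follows. The type and field names (`ExplanationDomain`, `SourceRef`, `mode`) are assumptions made for this sketch, not the schema Codex's pipeline defines; the example values are taken from the `jails` entry in this document.

```typescript
// Hypothetical types; names are illustrative assumptions only.
type SourceRef =
  | { path: string; mode: "full" }
  | { path: string; mode: "curated"; symbols: string[] }
  | { path: string; mode: "lines"; start: number; end: number };

interface ExplanationDomain {
  key: string;
  aliases: string[];       // what the user might call this
  runtimeFacts: string[];  // names of live config values to inject
  sources: SourceRef[];    // curated files or sections
  exclusions: string[];    // look relevant, must not be grounded from
}

const jailsEntry: ExplanationDomain = {
  key: "jails",
  aliases: ["jail", "jails", "bastille", "subnet"],
  runtimeFacts: ["SUBNET_BASE", "DB_RUNTIME", "RUNTIME_ID"],
  sources: [
    { path: "infra/jails.yaml", mode: "full" },
    {
      path: "src/jail-schema.ts",
      mode: "curated",
      symbols: ["JailRegistrySchema", "resolveJailIp"],
    },
    { path: "src/jail-registry.ts", mode: "full" },
  ],
  exclusions: ["doc/THREE-BIRD-ARCHITECTURE.md"],
};
```

The three `mode` variants mirror the annotations used in the entries below: "full", "curated", and extraction by line range.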

1. `memory`

User asks how chat memory / long-term recall works.

Aliases: memory, memories, recall, remember, long-term memory, chat memory, session memory, conversation history
Runtime facts:
- DB_RUNTIME
- DB_HOST
- MEMORY_DB_NAME
- MEMORY_DB_USER
- EMBED_BASE_URL (empty = embeddings disabled)
- EMBED_MODEL, EMBED_DIMENSIONS
Sources:
- src/memory-pg.ts (curated: exports + searchMemories + storeMemory signatures)
- docs/internal/POSTGRES-MEMORY.md (full — recently rewritten, current)
- docs/internal/POSTGRES-HYBRID-MEMORY.md (full — recently rewritten, current)
Exclusions: docs/internal/HOST-DB-*.md (historical reviews; framed as "not topology source of truth")

2. `database`

User asks about PostgreSQL setup, naming, splits.

Aliases: database, databases, postgres, postgresql, db, sql, schema
Runtime facts:
- DB_RUNTIME
- DB_HOST
- MEMORY_DB_NAME, OPS_DB_NAME, SKILLS_DB_NAME, FORGEJO_DB_NAME
- TENANT_ID (drives root-vs-tenant naming)
Sources:
- src/db-identifiers.ts (full — small, authoritative)
- src/config.ts lines 651-700 (the SUBNET_BASE / DB_RUNTIME / DB_HOST / *_DB_NAME block — extract by line range, not whole file)
- docs/internal/POSTGRES-MEMORY.md (full — recently rewritten)
- docs/internal/POSTGRES-PERMISSIONS.md (full — recently rewritten)
Exclusions: same as memory
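
Extracting a source by line range, as suggested for src/config.ts above, could look roughly like this. `extractLineRange` is a hypothetical helper name; the sketch assumes 1-indexed, inclusive ranges.

```typescript
// Returns lines start..end of `text` (1-indexed, inclusive).
function extractLineRange(text: string, start: number, end: number): string {
  if (!Number.isInteger(start) || !Number.isInteger(end) || start < 1 || end < start) {
    throw new RangeError(`Invalid line range ${start}-${end}`);
  }
  const lines = text.split(/\r?\n/);
  if (end > lines.length) {
    // A range past the end of the file suggests the file changed; fail loudly.
    throw new RangeError(`Range ${start}-${end} exceeds file length ${lines.length}`);
  }
  return lines.slice(start - 1, end).join("\n");
}
```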

3. `jails` ← pilot domain per Codex's "Immediate Next Step"

User asks about FreeBSD jail topology, names, IPs, layout.

Aliases: jail, jails, bastille, controlplane jail, db jail, cms jail, git jail, worker jail, subnet
Runtime facts:
- SUBNET_BASE
- DB_RUNTIME (controls whether the db jail exists)
- For each role in infra/jails.yaml: name, ip_suffix, resolved IP via getJailIp(role)
- RUNTIME_ID (drives ${RUNTIME_ID}-controlplane naming)
Sources:
- infra/jails.yaml (full — the canonical registry)
- src/jail-schema.ts (curated: JailRegistrySchema + resolveJailIp)
- src/jail-registry.ts (full — small)
- docs/internal/POSTGRES-MEMORY.md (lines covering the optional db jail path only — not the whole doc)
Exclusions:
- doc/THREE-BIRD-ARCHITECTURE.md, doc/CONTROLPLANE-ARCHITECTURE.md, doc/HARNESS-VALIDATION-HANDOFF.md (historical handoffs with 10.0.0.x IPs)
- docs/internal/HOST-DB-*.md (historical reviews)
- .agent/skills/tmux-screenshot/SKILL.md (still has 10.0.0.3 — see flagged list below)
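
The per-role runtime facts for this domain could be assembled roughly as below. This sketch assumes that `SUBNET_BASE` holds the first three octets and `ip_suffix` the last one; the real `getJailIp` in src/jail-registry.ts may work differently. `JailRole`, `resolveIp`, and `jailFacts` are hypothetical names.

```typescript
// Assumption: SUBNET_BASE = first three octets, ip_suffix = final octet.
interface JailRole {
  name: string;
  ip_suffix: number;
}

function resolveIp(subnetBase: string, role: JailRole): string {
  if (!Number.isInteger(role.ip_suffix) || role.ip_suffix < 1 || role.ip_suffix > 254) {
    throw new RangeError(`ip_suffix out of range for ${role.name}: ${role.ip_suffix}`);
  }
  return `${subnetBase}.${role.ip_suffix}`;
}

// One fact line per role, ready to inject into a grounding pack.
function jailFacts(subnetBase: string, roles: Record<string, JailRole>): string[] {
  return Object.entries(roles).map(
    ([role, jail]) => `${role}: name=${jail.name} ip=${resolveIp(subnetBase, jail)}`,
  );
}
```

For example, with the placeholder documentation subnet `192.0.2`, `jailFacts("192.0.2", { db: { name: "db", ip_suffix: 3 } })` yields a single line, `db: name=db ip=192.0.2.3`.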

4. `controlplane`

User asks how the bridge decides delegation, how main chat answers vs. specialists.

Aliases: controlplane, bridge, delegate, delegation, routing, main chat, specialist
Runtime facts:
- CONTROLPLANE_RUNNER (pi / aider / codex)
- Whether heartbeatConfig is enabled
Sources:
- src/controlplane.ts (curated: top-level docstring + getControlplaneDbRuntimeIdentity)
- src/report-intent.ts (full — small, contains the routing decision logic)
- docs/internal/ROUTING-SOURCE-OF-TRUTH-SPLIT-HANDOFF.md (full — the design rationale)
Exclusions: none specific (no widespread legacy)

5. `embeddings`

User asks how vector search / similarity / hybrid retrieval works.

Aliases: embedding, embeddings, vector, pgvector, similarity, hybrid search, semantic search, bge-m3
Runtime facts:
- EMBED_BASE_URL, EMBED_MODEL, EMBED_DIMENSIONS
- OPENROUTER_API_KEY presence (controls fallback path)
Sources:
- src/memory-pg.ts (curated: embedding + search functions)
- docs/internal/POSTGRES-HYBRID-MEMORY.md (full)
Exclusions: none

6. `tenants` (multi-tenant model)

User asks how additive tenants work, naming, isolation.

Aliases: tenant, tenants, multi-tenant, additive tenant, mevy, clawdie, runtime id
Runtime facts:
- RUNTIME_ID, TENANT_ID, SERVICE_NAME
- PLATFORM_NAMESPACE
- List of additive tenants from src/tenant-registry.ts (if any)
Sources:
- src/db-identifiers.ts (full — naming derivation lives here)
- src/tenant-registry.ts (curated: type definitions + main loader)
- src/platform-identity.ts (full — small)
- docs/internal/MULTITENANT.md (full — referenced from src/config.ts:211 as canonical)
Exclusions: none

7. `pi` (the agent harness)

User asks how the pi runtime / TUI / extensions work.

Aliases: pi, pi runtime, pi tui, pi extension, clawdie-harness, coding agent
Runtime facts:
- PI_TUI_BIN (resolved)
- CONTROLPLANE_RUNNER
- List of extension files in .pi/extensions/clawdie-harness/
Sources:
- .pi/extensions/clawdie-harness/controlplane-tools.ts (full — small)
- docs/internal/PI-SKILLS-INTEGRATION.md (conditional — see flag below, may need cleanup before grounding)
- scripts/glass.sh (full — small)
Exclusions:
- docs/internal/CLEAN-RESET-PI-TUI.md (historical, has stale 10.0.0.3 references)

8. `aider` (the editor harness)

User asks how aider integrates, runs, and what flags are used.

Aliases: aider, aider runner, aider session
Runtime facts:
- CONTROLPLANE_RUNNER
- CONTROLPLANE_AIDER_FLAGS
- CONTROLPLANE_AIDER_TMUX_SESSION
- AIDER_BIN resolved path
- GLASS_AIDER_FLAGS
Sources:
- src/controlplane-aider-runner.ts (curated: top-level docstring + entry function)
- scripts/glass.sh (full)
- docs/internal/sessions/2026-04-12-aider-harness-migration.md (full — historical context for why aider is wired this way)
Exclusions: none

9. `storage` (ZFS, datasets, snapshots)

User asks about ZFS layout, snapshots, sanoid, retention.

Aliases: zfs, zpool, pool, dataset, snapshot, snapshots, sanoid, retention, compression
Runtime facts:
- ZFS_POOL, ZFS_PREFIX
- DB_COMPRESSION
- DB_RUNTIME (controls whether DB datasets matter)
Sources:
- .agent/skills/sanoid/SKILL.md (full — pending verification it's current; currently no flag against it)
- .agent/skills/zfs-snapshot/SKILL.md (full)
- .agent/skills/zfs-scrub/SKILL.md (full)
- docs/internal/POSTGRES-MEMORY.md (lines on ZFS dataset layout for DB_RUNTIME=host)
Exclusions: none confirmed; recommend a quick audit of these three skills before they become grounding sources (they were not in the earlier sweep scope).

10. `telegram` (the chat surface)

User asks how Telegram messages are received, dispatched, formatted.

Aliases: telegram, telegram bot, chat, bot, grammy, controlplane-telegram
Runtime facts:
- TELEGRAM_BOT_TOKEN presence (boolean)
- MAIN_GROUP_FOLDER
- List of registered groups from runtime state
Sources:
- src/channels/telegram.ts (curated: renderMarkdownProse + outbound shape)
- src/controlplane-telegram.ts (curated: bridge entry function)
- src/index.ts lines 460-510 (the explanation-intent intercept block, current)
Exclusions: none
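
Several entries above list the presence of a secret (`TELEGRAM_BOT_TOKEN`, `OPENROUTER_API_KEY`) rather than its value. One way to collect both kinds of fact is sketched below. `collectRuntimeFacts` is a hypothetical helper, and a plain key/value record stands in here for the exports of src/config.ts.

```typescript
// Hypothetical helper: secrets are reported as presence booleans, never as values.
type ConfigValues = Record<string, string | undefined>;

function collectRuntimeFacts(
  config: ConfigValues,
  valueKeys: string[],
  presenceKeys: string[],
): Record<string, string | boolean> {
  const facts: Record<string, string | boolean> = {};
  for (const key of valueKeys) {
    // Empty string is kept as-is (for EMBED_BASE_URL it means "disabled").
    facts[key] = config[key] ?? "";
  }
  for (const key of presenceKeys) {
    const value = config[key];
    facts[`${key}_present`] = value !== undefined && value.trim() !== "";
  }
  return facts;
}
```

Keeping the secret itself out of the returned record means it cannot leak into a grounding pack even if the pack is logged or echoed back.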

Stale or dangerous source candidates (flagged)

Per the recent audit/sweep work, these files contain wrong runtime defaults and must not be grounding sources without cleanup. Listing them so the registry implementation can short-circuit if any of these paths sneak in:

File	Issue	Status
`doc/THREE-BIRD-ARCHITECTURE.md`	`10.0.0.x` jail IPs	Historical, not patched
`doc/CONTROLPLANE-ARCHITECTURE.md`	`10.0.0.3` db jail	Historical, not patched
`doc/HARNESS-VALIDATION-HANDOFF.md`	mixed historical IPs	Historical, not patched
`doc/SESSION-HANDOFF-2026-04-18.md`	historical context only	Historical
`doc/HANDOFF-ISO-AGENT.md`	historical context only	Historical
`doc/GIT-JAIL-PLAN.md`	historical context only	Historical
`docs/internal/CLEAN-RESET-PI-TUI.md`	`10.0.0.3`	Stale, not patched
`docs/internal/PI-SKILLS-INTEGRATION.md`	`clawdie_brain` connection string	Stale, not patched
`docs/internal/STRAPI-FREEBSD-SETUP.md`	`10.0.1.3` db jail	Stale, not patched
`docs/internal/HOST-DB-READINESS-REVIEW.md`	historical review only	Framed as historical, OK to skip
`docs/internal/HOST-DB-REBOOT-REVIEW.md`	historical review only	Framed as historical, OK to skip
`docs/internal/HOST-DB-RECOVERY-PLAN.md`	historical review only	Framed as historical, OK to skip
`.agent/skills/tmux-screenshot/SKILL.md`	`10.0.0.3` db jail reference	Not yet rewritten
`CHANGELOG.md`	`10.0.0.3` (line 100)	Historical record, leave alone
`AGENTS.md`	`10.0.0.3` (line 408)	Should be cleaned up; loaded by Claude Code
`DB_ADMIN_AGENT.md`	`10.0.0.3`	Historical

Recommended exclusion enforcement: the pipeline should accept a fixed denylist (EXCLUDED_GROUNDING_SOURCES) in addition to the per-domain exclusions list, so a misconfigured registry can't accidentally pull from a known-stale source.
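
A minimal sketch of that enforcement follows. The constant name comes from the recommendation above; `assertNotExcluded` is a hypothetical function name, and the set is abbreviated to a few paths from the flagged table (a real denylist would carry every flagged path).

```typescript
// Abbreviated: a few paths copied from the flagged table above.
const EXCLUDED_GROUNDING_SOURCES: ReadonlySet<string> = new Set([
  "doc/THREE-BIRD-ARCHITECTURE.md",
  "doc/CONTROLPLANE-ARCHITECTURE.md",
  "docs/internal/CLEAN-RESET-PI-TUI.md",
  ".agent/skills/tmux-screenshot/SKILL.md",
]);

// Hypothetical check: rejects a domain whose sources hit either the global
// denylist or that domain's own exclusions.
function assertNotExcluded(
  domainKey: string,
  sources: string[],
  domainExclusions: string[] = [],
): void {
  const blocked = sources.filter(
    (path) => EXCLUDED_GROUNDING_SOURCES.has(path) || domainExclusions.indexOf(path) !== -1,
  );
  if (blocked.length > 0) {
    throw new Error(`Domain "${domainKey}" lists excluded sources: ${blocked.join(", ")}`);
  }
}
```

Because the global set is checked independently of the per-domain list, forgetting an exclusion in one registry entry does not reopen the path.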

Seed corpus for the explanation pipeline

Real prompts to validate the pipeline against, grouped by what they test. Codex can lift these into tests; not exhaustive, designed to surface edge cases early.

Single-domain, technical phrasing (baseline)

Prompt	Expected domain
`how does memory work in this project`	memory
`explain how the controlplane decides what to delegate`	controlplane
`how do databases work here`	database
`what is the jail layout`	jails
`explain the embedding pipeline`	embeddings
`how does the multi-tenant model work`	tenants
`how does aider run in this repo`	aider
`explain the zfs setup`	storage
`how is the telegram bot wired up`	telegram
`how does the pi runtime work`	pi

Plain-language phrasing (variant)

The grounder should let the LLM pick tone from the prompt. No separate "plain" responder per domain.

Prompt	Expected domain
`explain memory simply`	memory
`what's the database setup in plain english`	database
`tell me how jails work for a non-technical person`	jails

Mixed-subject (multiple domains hit)

Codex's pipeline must decide: ground from both, or pick a primary? Recommend: ground from both, label clearly, let the LLM choose what to emphasize.

Prompt	Expected domains
`how does memory and embeddings work together`	memory + embeddings
`explain the database and jail relationship`	database + jails
`how do tenants and databases interact`	tenants + database

Ambiguous (worth pinning explicitly)

Prompt	Expected domain	Why
`explain pi`	pi	Could mean math, but in this repo `pi` is the agent runtime — registry alias resolves it
`how does git work`	none / fall through	Generic question; no project-specific grounding adds value. The pipeline should detect "no domain match" and pass through to the existing memory-filter + LLM path
`what's new in the database code`	none	Explanation-shaped phrasing, but really a "what changed" query. It should already be caught by the `EXPLANATION_PATTERNS` regex, but the answer is a git log, not grounding

Adversarial (should NOT be explanation-grounded)

Prompt	Why not
`restart the database`	Operational verb; routes to delegation, not grounder
`is the db jail up`	Live state probe; routes to delegation
`please reply in english`	Conversational preference; routes to memory layer

These should be handled by the routing slice already; listing them so the pipeline test corpus also asserts the explanation grounder doesn't fire on them.
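
If Codex lifts the corpus into tests, a table-driven shape along these lines might work. Everything here is a sketch: `CorpusCase`, `seedCorpus`, and `runCorpus` are hypothetical names, and `classify` stands for the whole decision (routing slice plus grounder), with an empty `expected` list meaning the grounder must not fire.

```typescript
// Hypothetical corpus shape; `expected: []` means "no explanation grounding".
interface CorpusCase {
  prompt: string;
  expected: string[];
}

// A few rows taken from the tables above.
const seedCorpus: CorpusCase[] = [
  { prompt: "what is the jail layout", expected: ["jails"] },
  { prompt: "explain the database and jail relationship", expected: ["database", "jails"] },
  { prompt: "how does git work", expected: [] },
  { prompt: "restart the database", expected: [] },
];

// Returns one message per mismatching case; an empty array means all passed.
function runCorpus(
  cases: CorpusCase[],
  classify: (prompt: string) => string[],
): string[] {
  const failures: string[] = [];
  for (const { prompt, expected } of cases) {
    // Compare as sorted copies so domain order does not matter.
    const actual = classify(prompt).slice().sort();
    const wanted = expected.slice().sort();
    if (actual.join(",") !== wanted.join(",")) {
      failures.push(`"${prompt}": expected [${wanted.join(", ")}], got [${actual.join(", ")}]`);
    }
  }
  return failures;
}
```

Keeping the adversarial rows in the same table as the positive ones means a single runner asserts both that the grounder fires where it should and that it stays silent where it should not.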

What I am NOT proposing

- A subject-extraction model. The simplest version of extractExplanationSubjects(text) is "match aliases against the prompt, return matched domain keys." LLM-based subject extraction is premature; revisit if regex-alias matching can't disambiguate.
- Per-domain prose templates. The point of the grounder is the LLM writes the prose. Templates = drift again.
- A second-tier "subdomain" registry. Ten coarse domains is the baseline. If a domain becomes unwieldy, split it then; do not split preemptively.
- Migration of the two existing responders. Codex's plan keeps memory-architecture.ts and database-architecture.ts as temporary guardrails. This proposal does not touch them: they continue to intercept their own prompts before the grounder sees them.
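
The "match aliases against the prompt" version of `extractExplanationSubjects` could be as small as the sketch below. The alias table is an abbreviated, hand-copied stand-in for the registry, and the whole-word, case-insensitive matching rule is an assumption of this sketch rather than a decided behaviour.

```typescript
// Abbreviated alias table; a real one would be derived from the registry entries.
const ALIASES: Record<string, string[]> = {
  memory: ["memory", "memories", "recall", "long-term memory"],
  embeddings: ["embedding", "embeddings", "vector", "pgvector"],
  database: ["database", "databases", "postgres", "db"],
  pi: ["pi", "pi runtime"],
};

// Escapes regex metacharacters so aliases are matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns the keys of every domain with at least one alias in the prompt,
// in the order the domains appear in ALIASES.
function extractExplanationSubjects(text: string): string[] {
  const matched: string[] = [];
  for (const [domain, aliases] of Object.entries(ALIASES)) {
    // \b keeps short aliases such as "pi" from matching inside "pipeline".
    const pattern = new RegExp(`\\b(?:${aliases.map(escapeRegExp).join("|")})\\b`, "i");
    if (pattern.test(text)) matched.push(domain);
  }
  return matched;
}
```

Returning every matched key, rather than only the first, fits the mixed-subject recommendation to ground from both domains. An empty result is the "no domain match" signal for falling through to the existing path.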

Suggested implementation order

For Codex:

1. Implement the pipeline against the jails domain only (Codex's pilot per the proposal's "Immediate Next Step"). Smallest scope, no migration from existing responders, validates the grounding-pack shape end-to-end.
2. Add 2-3 more domains once the pilot ships green: controlplane, embeddings, tenants. These have no existing responders, low risk.
3. Run side-by-side with the two existing deterministic responders for a week.
4. Decide promotion direction based on observed answer quality and token cost.

For Claude (after this proposal):

- Stand by: the registry is hand-curated, not auto-derived. Future updates are one-domain-at-a-time additions to this file as the pipeline grows.
- If Codex flags a stale grounding source mid-implementation, slot it into the flagged list above with a recommended cleanup.
- If a real-world prompt the seed corpus doesn't cover surfaces a regression, add it to the corpus.
