docs: skill-based feature migration plan for clawdie-ai (Sam & Claude)

Reframes pruning as skill extraction: telegram-voice, telegram-images,
and browser-jail become opt-in skill packages instead of deletions.

- Import dependency analysis: 3,200 lines browser (orphaned), 383 lines
  Telegram voice/images (live paths), with hidden-core deps mapped
- Three approaches evaluated: event bus, config-gated, build-flag
- Recommendation: config-gated for TS, trait-based for Colibri Rust
- Browser family: delete (already dead, glasspane supersedes)
- 7 open questions for agent input
Sam & Claude 2026-06-13 19:00:41 +02:00
parent 8d5284df3b
commit 028a7e4adf

# Skill-Based Feature Migration Plan
**Date:** 13.jun.2026
**Status:** PROPOSED — open for agent input
**Authors:** Sam & Claude (analysis), open for Codex/Hermes review
**Builds on:** `doc/COLIBRI-CONTROLPLANE-PLAN.md`, PR #7 (Tier-A removal), unmerged
`chore/prune-browser-family` branch
---
## TL;DR
Instead of deleting ~3,600 lines of Telegram-voice, image-extraction, and
browser-jail code from clawdie-ai, restructure them into **opt-in skill
packages**. The core message loop stays lean. Features load only when their
skill is enabled. This maps to Colibri's existing skill catalog (`colibri-skills`
crate, MCP `list_skills` / `intake_task` tools) and gives us a clean migration
path from TypeScript conditional imports to Rust trait-based skill modules.
Section 4 narrows this for the browser family, which is recommended for
deletion rather than packaging.
---
## 1. Context
### What PR #7 (Tier-A) already removed
| Module | Importers | Why removed |
|--------|----------:|-------------|
| `browser-operator.ts` | 0 | Orphaned (dead code) |
| `tts.ts` | 0 | Orphaned (dead code) |
| `vision.ts` | 0 | Orphaned (dead code) |
| **Total** | **805 lines** | 6 files, confirmed zero-importer |
### What the unmerged Tier-B branch does
Branch `chore/prune-browser-family` (`8ee71c7`) removes:
| Module | Lines | Importers | Untangle |
|--------|------:|-----------|----------|
| `browser-orchestrator.ts` | 614 | 0 | Clean delete |
| `browser-backend/*` (7 files) | 775 | 0 TS (1 path ref) | Clean delete + hostd path cleanup |
| `browser-session-registry.ts` | 350 | 1 (`controlplane-db.ts`) | Delete `ensure*Schema` call |
| `browser-credentials-store.ts` | 168 | 1 (same) | Same |
| `browser-grant-tokens.ts` | 289 | 1 (same) | Same |
| **Total** | **2,196** | | |
Plus `scripts/browser-jail-clone-validation/` (~470 lines shell).
### What we considered pruning but shouldn't delete
The original "delete tenant-*/platform-*/surface-*" plan was **mis-bucketed**.
These modules have SaaS-sounding names but are deeply core:
| Module | Lines | Importers | Reality |
|--------|------:|-----------|---------|
| `tenant-registry.ts` | 1,263 | **28** | Config, db, api, pages, hostd, doctor, telegram |
| `platform-identity.ts` | 18 | **23** | hostd daemon, watchdog, db — core identity |
| `memory-pg.ts` | 300 | **8** | agent-runner, session-compaction — DB pool |
| `platform-layout.ts` | 253 | **8** | Config, pages, doctor, hostd types |
| `surface-inventory.ts` | 96 | **5** | Host routing depends on it (`local-hosts.ts`) |
| `platform-audit.ts` | 221 | **5** | `ResourceOwner` type used by hostd **authz** |
**These stay.** They are not SaaS residue — they are load-bearing.
---
## 2. The Three Target Skill Packages
### 2.1 `telegram-voice` skill
**Source modules:**
| Module | Lines | Current import sites |
|--------|------:|---------------------|
| `transcription.ts` | ~250 | `channels/telegram.ts` (voice messages) |
| `stt-guard.ts` | ~71 | `channels/telegram.ts`, `telegram-commands.ts`, `startup-report.ts` |
| **Total** | **~321** | 5 import sites across 3 files |
**Core hook needed:** `onVoiceMessage(audioBlob, chatId) → transcriptionText`
**Behavior when skill not loaded:** Telegram voice messages are logged and
skipped. No transcription. No cooldown enforcement.
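
A minimal sketch of how the core handler could honor this contract, assuming the loaded skill is passed in as a nullable object. `VoiceSkill`, `VoiceResult`, and `handleVoiceMessage` are hypothetical names used for illustration; only `transcribeVoice` and `checkSttCooldown` come from this plan. The sketch assumes `checkSttCooldown` returns `true` when transcription is allowed, and uses `Uint8Array` (the base type of Node's `Buffer`) for the audio payload.

```typescript
// Hypothetical shape of the loaded skill; the two function names follow the
// exports planned for skills/telegram-voice/mod.ts.
interface VoiceSkill {
  transcribeVoice(audio: Uint8Array): Promise<string>;
  // Assumption: true means this chat may transcribe right now.
  checkSttCooldown(chatId: string): boolean;
}

type VoiceResult =
  | { status: "skipped"; reason: "skill-not-loaded" | "cooldown" }
  | { status: "transcribed"; text: string };

// Core-side handler. With no skill loaded it logs and skips the message and
// never consults cooldown state, matching the behavior described above.
async function handleVoiceMessage(
  skill: VoiceSkill | null,
  audio: Uint8Array,
  chatId: string,
  log: (msg: string) => void = () => {},
): Promise<VoiceResult> {
  if (skill === null) {
    log(`voice message from ${chatId} skipped: telegram-voice not loaded`);
    return { status: "skipped", reason: "skill-not-loaded" };
  }
  if (!skill.checkSttCooldown(chatId)) {
    return { status: "skipped", reason: "cooldown" };
  }
  return { status: "transcribed", text: await skill.transcribeVoice(audio) };
}
```

Returning a tagged result instead of `string | undefined` lets the caller tell "skill not loaded" apart from "cooldown active" without inspecting log output.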
### 2.2 `telegram-images` skill
**Source modules:**
| Module | Lines | Current import sites |
|--------|------:|---------------------|
| `outbound-images.ts` | 62 | `index.ts:166` (main message-processing loop) |
| **Total** | **62** | 1 import site |
**Core hook needed:** `preMessageRender(agentOutput) → { extractedPaths, cleanedOutput }`
**Behavior when skill not loaded:** Agent output is passed through without
temp-image-path extraction, so outgoing messages may show raw file paths
instead of rendered images.
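
The hook above can be sketched as follows. The function name `extractTmpImagePaths` is the export planned in Phase 4; the matching rule (absolute `/tmp/` paths with a common image extension) and the `skillEnabled` flag are assumptions made for illustration, and the real logic in `outbound-images.ts` may differ.

```typescript
interface ExtractedImages {
  extractedPaths: string[];
  cleanedOutput: string;
}

// Assumption: temp images appear in agent output as absolute /tmp/ paths
// ending in a common image extension.
const TMP_IMAGE_PATH = /\/tmp\/[\w.\/-]+\.(?:png|jpe?g|gif|webp)\b/g;

function extractTmpImagePaths(agentOutput: string): ExtractedImages {
  const extractedPaths = agentOutput.match(TMP_IMAGE_PATH) ?? [];
  const cleanedOutput = agentOutput
    .replace(TMP_IMAGE_PATH, "")   // drop the paths from the visible text
    .replace(/[ \t]{2,}/g, " ")    // collapse the gaps they leave behind
    .trim();
  return { extractedPaths: [...extractedPaths], cleanedOutput };
}

// Pass-through used when the skill is not enabled: nothing is extracted and
// the output reaches the channel unchanged.
function preMessageRender(agentOutput: string, skillEnabled: boolean): ExtractedImages {
  return skillEnabled
    ? extractTmpImagePaths(agentOutput)
    : { extractedPaths: [], cleanedOutput: agentOutput };
}
```

Keeping the disabled path as an identity function means the call site in `index.ts` does not need its own branch.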
### 2.3 `browser-jail` skill
**Source modules:**
| Module | Lines | Current import sites |
|--------|------:|---------------------|
| `browser-orchestrator.ts` | 614 | 0 (orphaned) |
| `browser-backend/*` | 775 | 0 TS, 1 path ref in `hostd/privileged-commands.ts` |
| `browser-session-registry.ts` | 350 | 1 (`controlplane-db.ts`) |
| `browser-credentials-store.ts` | 168 | 1 (same) |
| `browser-grant-tokens.ts` | 289 | 1 (same) |
| hostd browser-clone ops | ~600 | `privileged-commands.ts`, `hostd-authorization.ts` |
| `scripts/browser-jail-clone-validation/` | ~470 | 0 |
| **Total** | **~3,266** | |
**Core hooks needed:** `onBrowserClone(config)`, `onBrowserDestroy(cloneId)`,
`onBrowserReap()`, `onPrivilegedOp(op, args) → result`
**Behavior when skill not loaded:** Browser-clone privileged commands return
"skill not loaded." No browser jail schema initialization.
**Note:** This code is already nearly dead (0 importers on the orchestrator;
schema-init is the only coupling). It is a stronger candidate for full
deletion than for skill-ification; see the recommendation in section 4 and the
open questions in section 7.
---
## 3. Architecture: Three Approaches
### Approach A: Event Bus (cleanest, most work)
```
┌──────────────────────────────────────────────────┐
│  Core message loop (channels/telegram.ts, index) │
│                                                  │
│  skillBus.emit('voice_message', { audio, chat }) │
│  // no-op if no listener registered              │
└───────────────┬──────────────────────────────────┘
                │ emit
    ┌───────────┼───────────┐
    ▼           ▼           ▼
┌────────┐ ┌────────┐ ┌──────────┐
│ voice  │ │ images │ │ browser  │
│ skill  │ │ skill  │ │ skill    │
└────────┘ └────────┘ └──────────┘
```
- Skills register event listeners at startup
- Core has zero knowledge of skill internals
- Truly decoupled — load/unload dynamically
**Pros:**
- Maps perfectly to Colibri's future Rust skill trait system
- Clean separation of concerns
- Skills can be added/removed without touching core
- Each skill fully testable in isolation
**Cons:**
- Most implementation work — need to design the bus, error handling, async lifecycle
- TypeScript/Node.js has no built-in runtime plugin loader; skills would need either a startup directory scan or explicit registration
- Risk of over-engineering for TS code that's heading toward Rust replacement
- Async error propagation across the bus needs careful design
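
To make the trade-off concrete, here is a minimal sketch of what such a bus could look like in TypeScript. `SkillBus`, `SkillEvents`, and the `onError` callback are hypothetical; only the `voice_message` event name appears in the diagram above. Error handling is one possible policy (isolate each listener), not a settled design.

```typescript
type Listener<P> = (payload: P) => Promise<void> | void;

// Hypothetical event map; only 'voice_message' is named in this plan.
interface SkillEvents {
  voice_message: { audio: Uint8Array; chat: string };
}

class SkillBus<E> {
  private listeners = new Map<keyof E, Listener<any>[]>();

  // Skills call this at startup to subscribe to a core event.
  on<K extends keyof E>(event: K, listener: Listener<E[K]>): void {
    const list = this.listeners.get(event) ?? [];
    list.push(listener);
    this.listeners.set(event, list);
  }

  // Resolves to the number of listeners invoked; 0 means the emit was a
  // no-op. A failing listener is reported via onError and does not stop
  // the remaining listeners.
  async emit<K extends keyof E>(
    event: K,
    payload: E[K],
    onError: (err: unknown) => void = () => {},
  ): Promise<number> {
    const list = this.listeners.get(event) ?? [];
    for (const listener of list) {
      try {
        await listener(payload);
      } catch (err) {
        onError(err);
      }
    }
    return list.length;
  }
}
```

Listeners run sequentially here so that ordering stays deterministic; running them concurrently with `Promise.allSettled` would be the alternative if skills must not delay each other.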
### Approach B: Config-Gated Modules (pragmatic — RECOMMENDED)
```
┌──────────────────────────────────────────────────┐
│  config.ts                                       │
│  ENABLED_SKILLS=telegram-voice,browser-jail      │
└──────────────────────────────────────────────────┘
┌────┴────────────────────────────────────────────┐
│  channels/telegram.ts                           │
│                                                 │
│  if (skills.has('telegram-voice')) {            │
│    const { transcribeVoice } = await import(    │
│      '../skills/telegram-voice/mod'             │
│    );                                           │
│    await transcribeVoice(buffer);               │
│  }                                              │
└─────────────────────────────────────────────────┘
```
- Existing imports wrapped in config checks
- Skills become config entries
- Each skill module stays self-contained but is only loaded when enabled
**Pros:**
- Least risky — wrap existing imports in conditional blocks
- Skills are config entries (`ENABLED_SKILLS=telegram-voice,browser-jail`)
- Simple clawdie lane = `ENABLED_SKILLS=` (empty) — lean runtime
- Full deployment = full skill set
- Maps to Colibri's skill catalog concept (skills are named, declarable)
- Each skill stays self-contained for future extraction
**Cons:**
- Skill code still ships in the build output; it is just never loaded when disabled
- Not true plugin isolation (code is present on disk, just dormant)
- Dynamic `import()` resolves its specifier relative to the compiled file at runtime, so paths must match the `dist/` layout
- Less elegant than a proper event bus
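
A minimal sketch of the config side of this approach, assuming `ENABLED_SKILLS` arrives as a raw comma-separated string. `KNOWN_SKILLS`, `ParsedSkills`, and `parseEnabledSkills` are hypothetical names; the known-skill list below assumes the browser family is deleted as recommended in section 4, so `browser-jail` would be reported as unknown.

```typescript
// Hypothetical registry of skill names the core knows how to load.
const KNOWN_SKILLS = ["telegram-voice", "telegram-images"] as const;
type SkillName = (typeof KNOWN_SKILLS)[number];

interface ParsedSkills {
  enabled: Set<SkillName>;
  unknown: string[]; // names present in the env value but not in KNOWN_SKILLS
}

function parseEnabledSkills(raw: string | undefined): ParsedSkills {
  const enabled = new Set<SkillName>();
  const unknown: string[] = [];
  for (const name of (raw ?? "").split(",").map((s) => s.trim())) {
    if (name === "") continue; // ENABLED_SKILLS= (empty) means no skills
    if ((KNOWN_SKILLS as readonly string[]).includes(name)) {
      enabled.add(name as SkillName);
    } else {
      unknown.push(name); // surfaced so a typo does not silently disable a skill
    }
  }
  return { enabled, unknown };
}
```

At startup this would be called once with the environment value (for example `parseEnabledSkills(process.env.ENABLED_SKILLS)`), and any `unknown` entries logged as warnings.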
### Approach C: Build-Flag Exclusion (simplest, binary-level)
```
// tsconfig.features.json
// "compilerOptions": { ... }
// Features excluded via tsc project references
// Simple clawdie build: tsc --build tsconfig.minimal.json
// Full build: tsc --build tsconfig.full.json
```
- `tsc` with feature flags or project references
- Simple clawdie binary doesn't compile voice/images/browser at all
**Pros:**
- Dead simple
- Smallest binary (excluded code isn't compiled)
- Clean for ISO/live-USB builds
**Cons:**
- Compile-time only — can't toggle at runtime
- Requires separate builds for different skill sets
- Doesn't map to Colibri's runtime skill catalog model
- Two build targets to maintain
---
## 4. Recommendation
### Short-term (clawdie-ai TypeScript): Approach B (config-gated)
Rationale:
1. Least risky way to get "simple clawdie = no voice/images/browser"
2. Doesn't require a hook system redesign in TS code being superseded by Rust
3. Skill packages (`skills/telegram-voice/mod.ts`, etc.) map cleanly to future
Colibri Rust skills
4. Gets the runtime-footprint and complexity reduction without surgery on the live
message loop
5. Config entries are declarable and visible — operators see what's loaded
### Long-term (Colibri Rust): Approach A (event bus / trait registration)
When these features migrate into `colibri-skills` as Rust modules:
- Each skill implements a `Skill` trait with `register_hooks(&mut SkillBus)`
- The `colibri-skills` catalog declares available skills
- MCP `list_skills` exposes loaded skills to editors
- `colibri_intake_task` matches task capabilities to agent skill sets
- Skills compile as separate crates or feature-gated modules within
`colibri-skills`
### Browser family: delete, don't skill-ify
The browser-jail feature is **already dead** — 0 importers on the orchestrator,
the only coupling is a schema-init call. Colibri's `colibri-glasspane` fully
supersedes the browser-clone supervision model. Skill-ifying dead code adds
complexity for no benefit.
**Recommendation:** Merge the existing `chore/prune-browser-family` branch
as-is. Delete the browser family. Do not package it as a skill.
### Telegram voice + images: skill-ify
These features are **live code on active paths** (Telegram channel, main message
loop). Operators may still want them on deployed servers. Skill-ifying them gives
us:
- Simple clawdie = no voice/image processing (lean)
- Deployed clawdie = opt-in via config
- Clean extraction path to Colibri Rust skills later
---
## 5. Implementation Plan (Approach B)
### Phase 1: Browser family deletion (merge existing branch)
- [ ] Merge `chore/prune-browser-family` (`8ee71c7`) into main
- [ ] Clean up hostd browser-clone privileged ops (~60 lines in
`hostd/privileged-commands.ts`)
- [ ] Remove `scripts/browser-jail-clone-validation/`
- [ ] Remove browser-related docs: `BROWSER-JAIL-HANDOFF.md`,
`BROWSER-JAIL-TEMPLATE-CLONE-PROPOSAL.md`
- [ ] Remove `jail/skills/agent-browser/SKILL.md`
- [ ] Verify: `tsc --noEmit` clean, all tests pass
**Estimated removal:** ~3,200 lines
**Risk:** Low (orphaned code + one schema-init untangle)
**Assignee:** Open (Claude/Hermes can do the merge + hostd cleanup)
### Phase 2: Skill config infrastructure
- [ ] Add `ENABLED_SKILLS` to `config.ts` (comma-separated string from env)
- [ ] Add `skills/` directory convention: each skill has `mod.ts` (entrypoint)
- [ ] Add skill-loading utility: `loadSkill(name) → Promise<module>`
- [ ] Add startup log line listing loaded skills
**Estimated effort:** ~100 lines new code
**Risk:** Low
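
One way the `loadSkill` utility and the startup log line could look, as a sketch under stated assumptions: the `./skills/<name>/mod.js` specifier, the kebab-case name check, the `Importer` injection point, and `formatLoadedSkills` are all hypothetical. The importer is injectable so the utility can be tested without touching the file system.

```typescript
type SkillModule = Record<string, unknown>;
type Importer = (specifier: string) => Promise<SkillModule>;

// Default importer. The specifier is resolved at runtime relative to the
// compiled file, so it has to be valid in the build output layout.
const defaultImporter: Importer = (specifier) => import(specifier);

const cache = new Map<string, Promise<SkillModule>>();

// Assumption: skill names are kebab-case directory names under skills/.
function loadSkill(name: string, importer: Importer = defaultImporter): Promise<SkillModule> {
  if (!/^[a-z][a-z0-9-]*$/.test(name)) {
    return Promise.reject(new Error(`invalid skill name: ${name}`));
  }
  let pending = cache.get(name);
  if (pending === undefined) {
    pending = importer(`./skills/${name}/mod.js`);
    cache.set(name, pending);
    // Do not keep a failed import cached; a later call may retry.
    pending.catch(() => { cache.delete(name); });
  }
  return pending;
}

// Startup log line listing what was loaded.
function formatLoadedSkills(names: string[]): string {
  return `skills loaded: ${names.length > 0 ? names.join(", ") : "(none)"}`;
}
```

Rejecting names that are not plain identifiers keeps an operator-supplied `ENABLED_SKILLS` value from being turned into an arbitrary import path.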
### Phase 3: Extract `telegram-voice` skill
- [ ] Create `skills/telegram-voice/mod.ts` — exports `transcribeVoice`,
`checkSttCooldown`
- [ ] Move `transcription.ts` and `stt-guard.ts` into `skills/telegram-voice/`
- [ ] Wrap 5 import sites in `channels/telegram.ts`, `telegram-commands.ts`,
`startup-report.ts` with config-gated dynamic imports
- [ ] Add `skills/telegram-voice/README.md` documenting the hook contract
- [ ] Test: with skill disabled, voice messages are skipped gracefully
- [ ] Test: with skill enabled, voice transcription works as before
**Estimated effort:** ~1 day
**Risk:** Medium — touches Telegram channel (live path), but behavior is
identical when skill is enabled
### Phase 4: Extract `telegram-images` skill
- [ ] Create `skills/telegram-images/mod.ts` — exports `extractTmpImagePaths`
- [ ] Move `outbound-images.ts` into `skills/telegram-images/`
- [ ] Wrap `index.ts:166` import site with config-gated dynamic import
- [ ] Test: with skill disabled, agent output passes through (raw paths shown)
- [ ] Test: with skill enabled, image extraction works as before
**Estimated effort:** ~half day
**Risk:** Medium — touches main message loop (`index.ts`), but the hook is
simple (extract-and-clean)
### Phase 5: Documentation + Colibri skill catalog mapping
- [ ] Document the skill system in `doc/CONTROLPLANE-ARCHITECTURE.md`
- [ ] Map TS skill packages to planned Colibri Rust skill crates:
  - `skills/telegram-voice/` → `colibri-skills::telegram_voice`
  - `skills/telegram-images/` → `colibri-skills::telegram_images`
- [ ] Update `colibri-skills` catalog to declare these as planned skills
- [ ] Update MCP `list_skills` tool schema to include capability metadata
---
## 6. What the Skill Hook Contract Looks Like (Approach B)
Each skill module exports a standard interface:
```typescript
// skills/telegram-voice/mod.ts
export interface SkillManifest {
  name: string;           // "telegram-voice"
  version: string;        // "1.0.0"
  capabilities: string[]; // ["voice-transcription", "stt-guard"]
  hooks: string[];        // ["onVoiceMessage", "onVoiceCooldown"]
}

export const manifest: SkillManifest = {
  name: "telegram-voice",
  version: "1.0.0",
  capabilities: ["voice-transcription", "stt-guard"],
  hooks: ["onVoiceMessage", "onVoiceCooldown"],
};

export async function transcribeVoice(audioBlob: Buffer): Promise<string> {
  // existing transcription logic
}

export function checkSttCooldown(chatId: string): boolean {
  // existing stt-guard logic
}
```
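
Because skills are loaded with a dynamic `import()`, the manifest arrives untyped at runtime. A small type guard, sketched below, could check it before the skill is registered. `isSkillManifest` and `readManifest` are hypothetical helpers; the `SkillManifest` interface is repeated only to keep the example self-contained.

```typescript
interface SkillManifest {
  name: string;
  version: string;
  capabilities: string[];
  hooks: string[];
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((v) => typeof v === "string");
}

// Runtime check for a dynamically imported module's `manifest` export.
function isSkillManifest(value: unknown): value is SkillManifest {
  if (typeof value !== "object" || value === null) return false;
  const m = value as Record<string, unknown>;
  return (
    typeof m.name === "string" &&
    typeof m.version === "string" &&
    isStringArray(m.capabilities) &&
    isStringArray(m.hooks)
  );
}

// Returns the manifest, or throws if the module does not match the contract
// or declares a different name than the one it was loaded under.
function readManifest(expectedName: string, mod: { manifest?: unknown }): SkillManifest {
  const manifest = mod.manifest;
  if (!isSkillManifest(manifest)) {
    throw new Error(`skill ${expectedName}: missing or malformed manifest`);
  }
  if (manifest.name !== expectedName) {
    throw new Error(`skill ${expectedName}: manifest declares ${manifest.name}`);
  }
  return manifest;
}
```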
Core code loads conditionally:
```typescript
// channels/telegram.ts
const voiceSkill = config.skills.includes('telegram-voice')
  ? await import('../skills/telegram-voice/mod')
  : null;

// Later, in the voice message handler.
// Assumes checkSttCooldown returns true when transcription is allowed.
if (!voiceSkill) {
  logger.debug('voice message skipped: telegram-voice skill not loaded');
} else if (!voiceSkill.checkSttCooldown(chatId)) {
  logger.debug('voice message skipped: STT cooldown active');
} else {
  const text = await voiceSkill.transcribeVoice(audioBlob);
  // ... process text
}
```
---
## 7. Open Questions for Agent Input
1. **Event bus vs config-gated?**
Claude recommends config-gated (Approach B) for TS as a stepping stone. Codex
/ Hermes — do you agree, or is the event bus (Approach A) worth the extra work
even for TS code heading toward Rust replacement?
2. **Browser family: delete or skill-ify?**
Claude recommends deletion (already orphaned, glasspane supersedes it). Anyone
see a reason to keep browser-jail as a skill package?
3. **`stripe-config.ts` + `stripe-tools.ts`: skill or core?**
Currently 365 lines across 2 files. Stripe is a billing capability, not a
core runtime path. Should this become a `stripe-billing` skill? Or is it
small enough to leave in core?
4. **`openrouter-status` (109 lines): skill or core?**
Status-display only (startup report + telegram commands). Not in the request
loop. Could be a `provider-status` skill, or left as-is since it's tiny.
5. **Skill loading mechanism for TS:**
Should we use `await import()` (dynamic, async) or pre-require all skills and
gate at call-site (static, sync)? Dynamic import is cleaner but has Node.js
`dist/` path quirks. Static pre-require is simpler but defeats the "not
loaded" benefit.
6. **Colibri Rust skill trait:**
When these migrate to Rust, should skills be separate crates in the workspace
(`colibri-skill-voice`, `colibri-skill-browser`) or feature-gated modules
within `colibri-skills`? Separate crates = cleaner compilation boundaries but
workspace bloat. Feature-gated = fewer crates but coarser.
7. **FreeBSD impact:**
Does skill-gating change anything for the ISO build? The simple clawdie binary
on the live USB would have `ENABLED_SKILLS=` (empty). Does the ISO build
process need to know about skill flags, or is it purely runtime config?
---
## 8. Current Import Dependency Map (reference data)
Gathered from `rg` analysis of clawdie-ai at commit `8d5284d`.
### Fully orphaned (0 importers — safe to delete)
| Module | Lines |
|--------|------:|
| `browser-orchestrator.ts` | 614 |
### Near-orphaned (1 importer, schema-init only)
| Module | Lines | Importer |
|--------|------:|----------|
| `browser-session-registry.ts` | 350 | `controlplane-db.ts` |
| `browser-credentials-store.ts` | 168 | `controlplane-db.ts` |
| `browser-grant-tokens.ts` | 289 | `controlplane-db.ts` |
| `browser-backend/*` (7 files) | 775 | 0 TS (1 path ref in hostd) |
### Single-importer on live path (skill candidates)
| Module | Lines | Importer | Path |
|--------|------:|----------|------|
| `transcription.ts` | ~250 | `channels/telegram.ts` | Voice messages |
| `outbound-images.ts` | 62 | `index.ts:166` | Message rendering |
| `stripe-config.ts` | 41 | `config.ts` | Status display |
### Low coupling (2-4 importers)
| Module | Lines | Importers |
|--------|------:|-----------|
| `openrouter-status.ts` | 109 | 2 (`startup-report.ts`, `telegram-commands.ts`; status display) |
| `stt-guard.ts` | ~71 | 3 (telegram channel, commands, startup) |
| `tenant-site-content.ts` | 237 | 4 (pages, dashboard, publish-report, tenant-publish) |
### Heavily core — DO NOT TOUCH
| Module | Lines | Importers |
|--------|------:|-----------|
| `tenant-registry.ts` | 1,263 | **28** |
| `platform-identity.ts` | 18 | **23** |
| `memory-pg.ts` | 300 | **8** |
| `platform-layout.ts` | 253 | **8** |
| `surface-inventory.ts` | 96 | **5** (host routing dep) |
| `platform-audit.ts` | 221 | **5** (hostd authz dep) |
---
## 9. Validation Notes
### Linux (domedog / debby)
- PR #7 (Tier-A): merged, tested, clean
- Tier-B branch (`chore/prune-browser-family`): written, untested on current main
- Skill extraction: not yet started
### FreeBSD 15 (Codex)
- Colibri `colibri-mcp` PR #32: **validated** (16 tests, stdio handshake, 2.4M
ELF binary)
- This plan: needs FreeBSD review — does the skill-gating approach make sense
for the FreeBSD deployment model?
### Outstanding FreeBSD items
- Tier-B branch needs FreeBSD `tsc --noEmit` + test run after merge
- Skill extraction phases need FreeBSD validation once implemented
- ISO build impact assessment (does `ENABLED_SKILLS` need to be a build-time or
runtime-only flag?)
---
## 10. Attribution
| Agent | Role |
|-------|------|
| Sam (operator) | Direction, cost-mode policy, FreeBSD validation |
| Claude (domedog) | Import dependency analysis, approach evaluation, this document |
| Hermes (debby) | Open — review requested |
| Codex (FreeBSD 15) | Open — FreeBSD feasibility review requested |
---
*This document supersedes the inline pruning notes from the PR #7 commit
message and the stale "delete tenant-*/platform-*/surface-*" plan, which was
mis-bucketed.*