From 028a7e4adfa3afb05406cea85628bb6e73d53a74 Mon Sep 17 00:00:00 2001
From: Sam & Claude
Date: Sat, 13 Jun 2026 19:00:41 +0200
Subject: [PATCH] docs: skill-based feature migration plan for clawdie-ai (Sam & Claude)

Reframes pruning as skill extraction: telegram-voice, telegram-images, and
browser-jail become opt-in skill packages instead of deletions.

- Import dependency analysis: 3,200 lines browser (orphaned), 383 lines
  Telegram voice/images (live paths), with hidden-core deps mapped
- Three approaches evaluated: event bus, config-gated, build-flag
- Recommendation: config-gated for TS, trait-based for Colibri Rust
- Browser family: delete (already dead, glasspane supersedes)
- 7 open questions for agent input
---
 doc/SKILL-MIGRATION-HANDOFF.md | 507 +++++++++++++++++++++++++++++++++
 1 file changed, 507 insertions(+)
 create mode 100644 doc/SKILL-MIGRATION-HANDOFF.md

diff --git a/doc/SKILL-MIGRATION-HANDOFF.md b/doc/SKILL-MIGRATION-HANDOFF.md
new file mode 100644
index 0000000..7c31db6
--- /dev/null
+++ b/doc/SKILL-MIGRATION-HANDOFF.md
@@ -0,0 +1,507 @@
+# Skill-Based Feature Migration Plan
+
+**Date:** 13.jun.2026
+**Status:** PROPOSED (open for agent input)
+**Authors:** Sam & Claude (analysis), open for Codex/Hermes review
+**Builds on:** `doc/COLIBRI-CONTROLPLANE-PLAN.md`, PR #7 (Tier-A removal), unmerged
+`chore/prune-browser-family` branch
+
+---
+
+## TL;DR
+
+Instead of deleting ~3,600 lines of Telegram-voice, image-extraction, and browser-jail
+code from clawdie-ai, restructure them into **opt-in skill packages**. The core message
+loop stays lean. Features load only when their skill is enabled. This maps to Colibri's
+existing skill catalog (`colibri-skills` crate, MCP `list_skills` / `intake_task` tools)
+and gives us a clean migration path from TypeScript conditional imports to Rust
+trait-based skill modules. Section 4 makes one exception: delete the browser family.
+
+---
+
+## 1. Context
+
+### What PR #7 (Tier-A) already removed
+
+| Module | Importers | Why removed |
+|--------|----------:|-------------|
+| `browser-operator.ts` | 0 | Orphaned, dead code |
+| `tts.ts` | 0 | Orphaned, dead code |
+| `vision.ts` | 0 | Orphaned, dead code |
+| **Total** | **0** | 805 lines across 6 files, confirmed zero-importer |
+
+### What the unmerged Tier-B branch does
+
+Branch `chore/prune-browser-family` (`8ee71c7`) removes:
+
+| Module | Lines | Importers | Untangle |
+|--------|------:|-----------|----------|
+| `browser-orchestrator.ts` | 614 | 0 | Clean delete |
+| `browser-backend/*` (7 files) | 775 | 0 TS (1 path ref) | Clean delete + hostd path cleanup |
+| `browser-session-registry.ts` | 350 | 1 (`controlplane-db.ts`) | Delete `ensure*Schema` call |
+| `browser-credentials-store.ts` | 168 | 1 (same) | Same |
+| `browser-grant-tokens.ts` | 289 | 1 (same) | Same |
+| **Total** | **2,196** | | |
+
+Plus `scripts/browser-jail-clone-validation/` (~470 lines shell).
+
+### What we considered pruning but shouldn't delete
+
+The original "delete tenant-*/platform-*/surface-*" plan was **mis-bucketed**.
+These modules have SaaS-sounding names but are deeply core:
+
+| Module | Lines | Importers | Reality |
+|--------|------:|-----------|---------|
+| `tenant-registry.ts` | 1,263 | **28** | Config, db, api, pages, hostd, doctor, telegram |
+| `platform-identity.ts` | 18 | **23** | hostd daemon, watchdog, db (core identity) |
+| `memory-pg.ts` | 300 | **8** | agent-runner, session-compaction (DB pool) |
+| `platform-layout.ts` | 253 | **8** | Config, pages, doctor, hostd types |
+| `surface-inventory.ts` | 96 | **5** | Host routing depends on it (`local-hosts.ts`) |
+| `platform-audit.ts` | 221 | **5** | `ResourceOwner` type used by hostd **authz** |
+
+**These stay.** They are not SaaS residue; they are load-bearing.
+
+---
+
+## 2. The Three Target Skill Packages
+
+### 2.1 `telegram-voice` skill
+
+**Source modules:**
+
+| Module | Lines | Current import sites |
+|--------|------:|---------------------|
+| `transcription.ts` | ~250 | `channels/telegram.ts` (voice messages) |
+| `stt-guard.ts` | ~71 | `channels/telegram.ts`, `telegram-commands.ts`, `startup-report.ts` |
+| **Total** | **~321** | 5 import sites across 3 files |
+
+**Core hook needed:** `onVoiceMessage(audioBlob, chatId) → transcriptionText`
+
+**Behavior when skill not loaded:** Telegram voice messages are logged and
+skipped. No transcription. No cooldown enforcement.
+
+### 2.2 `telegram-images` skill
+
+**Source modules:**
+
+| Module | Lines | Current import sites |
+|--------|------:|---------------------|
+| `outbound-images.ts` | 62 | `index.ts:166` (main message-processing loop) |
+| **Total** | **62** | 1 import site |
+
+**Core hook needed:** `preMessageRender(agentOutput) → { extractedPaths, cleanedOutput }`
+
+**Behavior when skill not loaded:** Agent output is passed through without
+temp-image-path extraction. Image rendering may show raw paths instead of
+rendered images.
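The `preMessageRender` hook described in section 2.2 can be sketched as below. This is a minimal illustration, not the contents of `outbound-images.ts`: the `/tmp` path pattern, the `RenderResult` type, and the `passThrough` helper are all assumptions introduced here to show the shape of the contract and the disabled-skill fallback.

```typescript
// Hypothetical sketch of the telegram-images hook contract.
// The path pattern and helper names are assumptions, not the real module.
interface RenderResult {
  extractedPaths: string[];
  cleanedOutput: string;
}

// Assumption: temp images are absolute paths under /tmp with a common image extension.
const TMP_IMAGE_PATTERN = /\/tmp\/[\w.\/-]+\.(?:png|jpe?g|gif|webp)/g;

// Skill enabled: pull image paths out of the agent output and return the remaining text.
function preMessageRender(agentOutput: string): RenderResult {
  const matches = agentOutput.match(TMP_IMAGE_PATTERN) ?? [];
  const cleanedOutput = agentOutput
    .replace(TMP_IMAGE_PATTERN, "")
    .replace(/[ \t]{2,}/g, " ") // collapse the gap left by a removed path
    .trim();
  return { extractedPaths: [...matches], cleanedOutput };
}

// Skill disabled: output passes through untouched, so raw paths stay visible.
function passThrough(agentOutput: string): RenderResult {
  return { extractedPaths: [], cleanedOutput: agentOutput };
}
```

Keeping the disabled path as a function with the same return type means the caller in `index.ts` would not need to branch on whether the skill is loaded after the initial selection.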
+
+### 2.3 `browser-jail` skill
+
+**Source modules:**
+
+| Module | Lines | Current import sites |
+|--------|------:|---------------------|
+| `browser-orchestrator.ts` | 614 | 0 (orphaned) |
+| `browser-backend/*` | 775 | 0 TS, 1 path ref in `hostd/privileged-commands.ts` |
+| `browser-session-registry.ts` | 350 | 1 (`controlplane-db.ts`) |
+| `browser-credentials-store.ts` | 168 | 1 (same) |
+| `browser-grant-tokens.ts` | 289 | 1 (same) |
+| hostd browser-clone ops | ~600 | `privileged-commands.ts`, `hostd-authorization.ts` |
+| `scripts/browser-jail-clone-validation/` | ~470 | 0 |
+| **Total** | **~3,266** | |
+
+**Core hooks needed:** `onBrowserClone(config)`, `onBrowserDestroy(cloneId)`,
+`onBrowserReap()`, `onPrivilegedOp(op, args) → result`
+
+**Behavior when skill not loaded:** Browser-clone privileged commands return
+"skill not loaded." No browser jail schema initialization.
+
+**Note:** This skill is already nearly dead (0 importers on orchestrator,
+schema-init is the only coupling). It's the strongest candidate for full
+deletion rather than skill-ification; see open questions below.
+
+---
+
+## 3. Architecture: Three Approaches
+
+### Approach A: Event Bus (cleanest, most work)
+
+```
+┌──────────────────────────────────────────────────┐
+│ Core message loop (channels/telegram.ts, index)  │
+│                                                  │
+│ skillBus.emit('voice_message', { audio, chat })  │
+│ // no-op if no listener registered               │
+└───────────────┬──────────────────────────────────┘
+                │ emit
+    ┌───────────┼───────────┐
+    ▼           ▼           ▼
+┌────────┐ ┌────────┐ ┌──────────┐
+│ voice  │ │ images │ │ browser  │
+│ skill  │ │ skill  │ │ skill    │
+└────────┘ └────────┘ └──────────┘
+```
+
+- Skills register event listeners at startup
+- Core has zero knowledge of skill internals
+- Truly decoupled: load/unload dynamically
+
+**Pros:**
+- Maps perfectly to Colibri's future Rust skill trait system
+- Clean separation of concerns
+- Skills can be added/removed without touching core
+- Each skill fully testable in isolation
+
+**Cons:**
+- Most implementation work: need to design the bus, error handling, async lifecycle
+- TypeScript doesn't have great runtime plugin loading (startup scan or explicit registration)
+- Risk of over-engineering for TS code that's heading toward Rust replacement
+- Async error propagation across the bus needs careful design
+
+### Approach B: Config-Gated Modules (pragmatic, RECOMMENDED)
+
+```
+┌──────────────────────────────────────────────────┐
+│ config.ts                                        │
+│ ENABLED_SKILLS=telegram-voice,browser-jail       │
+└──────────────────────────────────────────────────┘
+       │
+  ┌────┴────────────────────────────────────────────┐
+  │ channels/telegram.ts                            │
+  │                                                 │
+  │ if (skills.has('telegram-voice')) {             │
+  │   const { transcribeVoice } = await import(     │
+  │     '../skills/telegram-voice/mod'              │
+  │   );                                            │
+  │   await transcribeVoice(buffer);                │
+  │ }                                               │
+  └─────────────────────────────────────────────────┘
+```
+
+- Existing imports wrapped in config checks
+- Skills become config entries
+- Each skill module stays self-contained but is only loaded when enabled
+
+**Pros:**
+- Least risky: wrap existing imports in conditional blocks
+- Skills are config entries (`ENABLED_SKILLS=telegram-voice,browser-jail`)
+- Simple clawdie lane = `ENABLED_SKILLS=` (empty), a lean runtime
+- Full deployment = full skill set
+- Maps to Colibri's skill catalog concept (skills are named, declarable)
+- Each skill stays self-contained for future extraction
+
+**Cons:**
+- Still compiled into the binary, just not executed
+- Not true plugin isolation (code is present, just dormant)
+- Dynamic `import()` has Node.js quirks (needs `dist/` structure awareness)
+- Less elegant than a proper event bus
+
+### Approach C: Build-Flag Exclusion (simplest, binary-level)
+
+```
+// tsconfig.features.json
+// "compilerOptions": { ... }
+// Features excluded via tsc project references
+
+// Simple clawdie build: tsc --build tsconfig.minimal.json
+// Full build: tsc --build tsconfig.full.json
+```
+
+- `tsc` with feature flags or project references
+- Simple clawdie binary doesn't compile voice/images/browser at all
+
+**Pros:**
+- Dead simple
+- Smallest binary (excluded code isn't compiled)
+- Clean for ISO/live-USB builds
+
+**Cons:**
+- Compile-time only: can't toggle at runtime
+- Requires separate builds for different skill sets
+- Doesn't map to Colibri's runtime skill catalog model
+- Two build targets to maintain
+
+---
+
+## 4. Recommendation
+
+### Short-term (clawdie-ai TypeScript): Approach B (config-gated)
+
+Rationale:
+1. Least risky way to get "simple clawdie = no voice/images/browser"
+2. Doesn't require a hook system redesign in TS code being superseded by Rust
+3. Skill packages (`skills/telegram-voice/mod.ts`, etc.) map cleanly to future
+   Colibri Rust skills
+4. Gets the runtime-footprint and complexity reduction (disabled skills are never
+   loaded, though still compiled) without surgery on the live message loop
+5. Config entries are declarable and visible: operators see what's loaded
+
+### Long-term (Colibri Rust): Approach A (event bus / trait registration)
+
+When these features migrate into `colibri-skills` as Rust modules:
+- Each skill implements a `Skill` trait with `register_hooks(&mut SkillBus)`
+- The `colibri-skills` catalog declares available skills
+- MCP `list_skills` exposes loaded skills to editors
+- `colibri_intake_task` matches task capabilities to agent skill sets
+- Skills compile as separate crates or feature-gated modules within
+  `colibri-skills`
+
+### Browser family: delete, don't skill-ify
+
+The browser-jail feature is **already dead**: 0 importers on the orchestrator, and
+the only coupling is a schema-init call. Colibri's `colibri-glasspane` fully
+supersedes the browser-clone supervision model. Skill-ifying dead code adds
+complexity for no benefit.
+
+**Recommendation:** Merge the existing `chore/prune-browser-family` branch
+as-is. Delete the browser family. Do not package it as a skill.
+
+### Telegram voice + images: skill-ify
+
+These features are **live code on active paths** (Telegram channel, main message
+loop). Operators may still want them on deployed servers. Skill-ifying them gives
+us:
+- Simple clawdie = no voice/image processing (lean)
+- Deployed clawdie = opt-in via config
+- Clean extraction path to Colibri Rust skills later
+
+---
+
+## 5. Implementation Plan (Approach B)
+
+### Phase 1: Browser family deletion (merge existing branch)
+
+- [ ] Merge `chore/prune-browser-family` (`8ee71c7`) into main
+- [ ] Clean up hostd browser-clone privileged ops (~60 lines in
+      `hostd/privileged-commands.ts`)
+- [ ] Remove `scripts/browser-jail-clone-validation/`
+- [ ] Remove browser-related docs: `BROWSER-JAIL-HANDOFF.md`,
+      `BROWSER-JAIL-TEMPLATE-CLONE-PROPOSAL.md`
+- [ ] Remove `jail/skills/agent-browser/SKILL.md`
+- [ ] Verify: `tsc --noEmit` clean, all tests pass
+
+**Estimated removal:** ~3,200 lines
+**Risk:** Low (orphaned code + one schema-init untangle)
+**Assignee:** Open (Claude/Hermes can do the merge + hostd cleanup)
+
+### Phase 2: Skill config infrastructure
+
+- [ ] Add `ENABLED_SKILLS` to `config.ts` (comma-separated string from env)
+- [ ] Add `skills/` directory convention: each skill has `mod.ts` (entrypoint)
+- [ ] Add skill-loading utility: `loadSkill(name) → Promise`
+- [ ] Add startup log line listing loaded skills
+
+**Estimated effort:** ~100 lines new code
+**Risk:** Low
+
+### Phase 3: Extract `telegram-voice` skill
+
+- [ ] Create `skills/telegram-voice/mod.ts`, exporting `transcribeVoice` and
+      `checkSttCooldown`
+- [ ] Move `transcription.ts` and `stt-guard.ts` into `skills/telegram-voice/`
+- [ ] Wrap 5 import sites in `channels/telegram.ts`, `telegram-commands.ts`,
+      `startup-report.ts` with config-gated dynamic imports
+- [ ] Add `skills/telegram-voice/README.md` documenting the hook contract
+- [ ] Test: with skill disabled, voice messages are skipped gracefully
+- [ ] Test: with skill enabled, voice transcription works as before
+
+**Estimated effort:** ~1 day
+**Risk:** Medium. Touches Telegram channel (live path), but behavior is
+identical when skill is enabled
+
+### Phase 4: Extract `telegram-images` skill
+
+- [ ] Create `skills/telegram-images/mod.ts`, exporting `extractTmpImagePaths`
+- [ ] Move `outbound-images.ts` into `skills/telegram-images/`
+- [ ] Wrap `index.ts:166` import site with config-gated dynamic import
+- [ ] Test: with skill disabled, agent output passes through (raw paths shown)
+- [ ] Test: with skill enabled, image extraction works as before
+
+**Estimated effort:** ~half day
+**Risk:** Medium. Touches main message loop (`index.ts`), but the hook is
+simple (extract-and-clean)
+
+### Phase 5: Documentation + Colibri skill catalog mapping
+
+- [ ] Document the skill system in `doc/CONTROLPLANE-ARCHITECTURE.md`
+- [ ] Map TS skill packages to planned Colibri Rust skill crates:
+  - `skills/telegram-voice/` → `colibri-skills::telegram_voice`
+  - `skills/telegram-images/` → `colibri-skills::telegram_images`
+- [ ] Update `colibri-skills` catalog to declare these as planned skills
+- [ ] Update MCP `list_skills` tool schema to include capability metadata
+
+---
+
+## 6. What the Skill Hook Contract Looks Like (Approach B)
+
+Each skill module exports a standard interface:
+
+```typescript
+// skills/telegram-voice/mod.ts
+export interface SkillManifest {
+  name: string;           // "telegram-voice"
+  version: string;        // "1.0.0"
+  capabilities: string[]; // ["voice-transcription", "stt-guard"]
+  hooks: string[];        // ["onVoiceMessage", "onVoiceCooldown"]
+}
+
+export const manifest: SkillManifest = {
+  name: "telegram-voice",
+  version: "1.0.0",
+  capabilities: ["voice-transcription", "stt-guard"],
+  hooks: ["onVoiceMessage", "onVoiceCooldown"],
+};
+
+export async function transcribeVoice(audioBlob: Buffer): Promise<string> {
+  // existing transcription logic
+}
+
+export function checkSttCooldown(chatId: string): boolean {
+  // existing stt-guard logic
+}
+```
+
+Core code loads conditionally:
+
+```typescript
+// channels/telegram.ts
+const voiceSkill = config.skills.includes('telegram-voice')
+  ? await import('../skills/telegram-voice/mod')
+  : null;
+
+// Later in voice message handler:
+if (voiceSkill && voiceSkill.checkSttCooldown(chatId)) {
+  const text = await voiceSkill.transcribeVoice(audioBlob);
+  // ... process text
+} else {
+  logger.debug('voice message skipped: telegram-voice skill not loaded or cooldown check failed');
+}
+```
+
+---
+
+## 7. Open Questions for Agent Input
+
+1. **Event bus vs config-gated?**
+   Claude recommends config-gated (Approach B) for TS as a stepping stone. Codex
+   / Hermes: do you agree, or is the event bus (Approach A) worth the extra work
+   even for TS code heading toward Rust replacement?
+
+2. **Browser family: delete or skill-ify?**
+   Claude recommends deletion (already orphaned, glasspane supersedes it). Anyone
+   see a reason to keep browser-jail as a skill package?
+
+3. **`stripe-config` + `stripe-tools.ts`: skill or core?**
+   Currently 365 lines across 2 files. Stripe is a billing capability, not a
+   core runtime path. Should this become a `stripe-billing` skill? Or is it
+   small enough to leave in core?
+
+4. **`openrouter-status` (109 lines): skill or core?**
+   Status-display only (startup report + telegram commands). Not in the request
+   loop. Could be a `provider-status` skill, or left as-is since it's tiny.
+
+5. **Skill loading mechanism for TS:**
+   Should we use `await import()` (dynamic, async) or pre-require all skills and
+   gate at call-site (static, sync)? Dynamic import is cleaner but has Node.js
+   `dist/` path quirks. Static pre-require is simpler but defeats the "not
+   loaded" benefit.
+
+6. **Colibri Rust skill trait:**
+   When these migrate to Rust, should skills be separate crates in the workspace
+   (`colibri-skill-voice`, `colibri-skill-browser`) or feature-gated modules
+   within `colibri-skills`? Separate crates = cleaner compilation boundaries but
+   workspace bloat. Feature-gated = fewer crates but coarser.
+
+7. **FreeBSD impact:**
+   Does skill-gating change anything for the ISO build? The simple clawdie binary
+   on the live USB would have `ENABLED_SKILLS=` (empty). Does the ISO build
+   process need to know about skill flags, or is it purely runtime config?
+
+---
+
+## 8. Current Import Dependency Map (reference data)
+
+Gathered from `rg` analysis of clawdie-ai at commit `8d5284d`.
+
+### Fully orphaned (0 importers, safe to delete)
+
+| Module | Lines |
+|--------|------:|
+| `browser-orchestrator.ts` | 614 |
+
+### Near-orphaned (at most 1 importer, schema-init only)
+
+| Module | Lines | Importer |
+|--------|------:|----------|
+| `browser-session-registry.ts` | 350 | `controlplane-db.ts` |
+| `browser-credentials-store.ts` | 168 | `controlplane-db.ts` |
+| `browser-grant-tokens.ts` | 289 | `controlplane-db.ts` |
+| `browser-backend/*` (7 files) | 775 | 0 TS (1 path ref in hostd) |
+
+### One or two importers on a live or status path (skill candidates)
+
+| Module | Lines | Importer(s) | Path |
+|--------|------:|-------------|------|
+| `transcription.ts` | ~250 | `channels/telegram.ts` | Voice messages |
+| `outbound-images.ts` | 62 | `index.ts:166` | Message rendering |
+| `stripe-config.ts` | 41 | `config.ts` | Status display |
+| `openrouter-status.ts` | 109 | `startup-report.ts`, `telegram-commands.ts` | Status display |
+
+### Low coupling (3-4 importers)
+
+| Module | Lines | Importers |
+|--------|------:|-----------|
+| `stt-guard.ts` | ~71 | 3 (telegram channel, commands, startup) |
+| `tenant-site-content.ts` | 237 | 4 (pages, dashboard, publish-report, tenant-publish) |
+
+### Heavily core: DO NOT TOUCH
+
+| Module | Lines | Importers |
+|--------|------:|-----------|
+| `tenant-registry.ts` | 1,263 | **28** |
+| `platform-identity.ts` | 18 | **23** |
+| `memory-pg.ts` | 300 | **8** |
+| `platform-layout.ts` | 253 | **8** |
+| `surface-inventory.ts` | 96 | **5** (host routing dep) |
+| `platform-audit.ts` | 221 | **5** (hostd authz dep) |
+
+---
+
+## 9. Validation Notes
+
+### Linux (domedog / debby)
+
+- PR #7 (Tier-A): merged, tested, clean
+- Tier-B branch (`chore/prune-browser-family`): written, untested on current main
+- Skill extraction: not yet started
+
+### FreeBSD 15 (Codex)
+
+- Colibri `colibri-mcp` PR #32: **validated** (16 tests, stdio handshake, 2.4M
+  ELF binary)
+- This plan: needs FreeBSD review. Does the skill-gating approach make sense
+  for the FreeBSD deployment model?
+
+### Outstanding FreeBSD items
+
+- Tier-B branch needs FreeBSD `tsc --noEmit` + test run after merge
+- Skill extraction phases need FreeBSD validation once implemented
+- ISO build impact assessment (does `ENABLED_SKILLS` need to be a build-time or
+  runtime-only flag?)
+
+---
+
+## 10. Attribution
+
+| Agent | Role |
+|-------|------|
+| Sam (operator) | Direction, cost-mode policy, FreeBSD validation |
+| Claude (domedog) | Import dependency analysis, approach evaluation, this document |
+| Hermes (debby) | Open, review requested |
+| Codex (FreeBSD 15) | Open, FreeBSD feasibility review requested |
+
+---
+
+*This document supersedes the inline pruning notes from the PR #7 commit
+message and the stale "delete tenant-*/platform-*/surface-*" plan, which was
+mis-bucketed.*
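Phase 2 in section 5 calls for an `ENABLED_SKILLS` setting and a `loadSkill` utility but does not show either. One possible shape is sketched below. Only `ENABLED_SKILLS`, `loadSkill`, and the `skills/<name>/mod` convention come from the plan; the `SkillModule` and `Importer` types, the `parseEnabledSkills` helper, and the injected importer argument are assumptions made so the gate can be exercised without touching the filesystem.

```typescript
// Hypothetical sketch of the Phase 2 utilities; type and helper names are assumptions.
type SkillModule = Record<string, unknown>;
type Importer = (path: string) => Promise<SkillModule>;

// Parse a comma-separated ENABLED_SKILLS value into a set of skill names.
// Empty or undefined input yields an empty set (the "simple clawdie" lane).
function parseEnabledSkills(raw: string | undefined): Set<string> {
  return new Set(
    (raw ?? "")
      .split(",")
      .map((name) => name.trim())
      .filter((name) => name.length > 0),
  );
}

// Load a skill only when it is enabled. Disabled skills resolve to null
// and the importer is never called, so their module code is never evaluated.
async function loadSkill(
  name: string,
  enabled: Set<string>,
  importer: Importer,
): Promise<SkillModule | null> {
  if (!enabled.has(name)) return null;
  return importer(`../skills/${name}/mod`);
}
```

In production code the importer would presumably be `(path) => import(path)`, which is where the `dist/` path quirks raised in open question 5 would need to be handled. Passing it in as an argument keeps that concern in one place and lets tests substitute a fake.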