Colibri Skills Plan
Status: Phase 1 scaffolded — read-only split-brain consumer
Crate: crates/colibri-skills
Purpose
colibri-skills is Colibri's read-only runtime consumer for reviewed skill
artifacts authored in the Clawdie-AI repo. It does not author, edit, or store
canonical skills. Clawdie-AI remains the source of truth; Colibri indexes and
serves typed/runtime views.
Clawdie-AI repo (source of truth)
  docs/astro-howto/
  docs/forgejo-admin/
  docs/vaultwarden-onboarding/
  ...
Colibri colibri-skills crate (read-only consumer)
  reads committed skill artifacts
  validates checksums
  indexes Markdown/transcript chunks
  exposes Skill, SkillArtifact, SkillChunk structs
  serves CLI/TUI/search later
This keeps the split-brain model explicit:
- system_skills: committed built-in knowledge / manuals / reviewed skillpacks
- system_brain: user and agent memory
- system_ops: live runtime, task, service, and daemon state
Seed artifact: Astro how-to
The first concrete skillpack is docs/astro-howto/ in Clawdie-AI. It is useful
because it is not just prose; it includes transcript, generated how-to docs,
commands, screenshots, contact sheet, manifest, checksums, and scripts.
{
  "skill_id": "astro-howto",
  "source": "local video-derived training artifact",
  "inputs": [
    "transcript_local.txt",
    "screenshots/",
    "contact-sheet/contact_sheet.jpg"
  ],
  "outputs": [
    "docs/HOWTO.md",
    "docs/COMMANDS.md",
    "docs/SCREENSHOTS.md",
    "docs/SUMMARY.md"
  ],
  "verification": "can user create and run an Astro project?",
  "media": "screenshots/*.jpg (paths + hashes, not blobs)",
  "manifest": "run_manifest.json",
  "checksums": "artifacts.sha256"
}
Pipeline shape:
video → local transcript → topic extraction → how-to/runbook
→ screenshots/contact sheet → commands → verification test
→ manifest + checksums → reviewed skill artifact → Colibri read-only index
Ownership
| Layer | Role | Writes | Reads |
|---|---|---|---|
| Clawdie-AI | Source of truth | Skill artifacts via PR | N/A |
| colibri-skills | Runtime consumer | Writes only to the runtime store; source repo remains read-only for the skills consumer. | Indexed skill structs from committed artifacts |
| Agents | Authors/reviewers | Candidate skill artifact PRs | Skill content for task routing |
| system_brain | Agent/user memory | Personal/user/agent context | Not canonical skill docs |
| system_ops | Runtime state | Live task/service state | Not skills |
What colibri-skills does
- Read skill manifests from a configured Clawdie-AI checkout path
- Parse run_manifest.json
- Validate checksums against artifacts.sha256
- Classify artifacts as document, image, script, transcript, manifest, checksum, report, contact sheet, or other
- Index Markdown/transcript chunks for search
- Expose stable typed structs for daemon/client/TUI callers
- Persist runtime index metadata in SQLite
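The classification step in the list above could look roughly like the following sketch. The variant names mirror the categories listed, but the `ArtifactType` variants, the `classify_artifact` function, and every matching rule are assumptions made for illustration; the crate's actual rules may differ.

```rust
use std::path::Path;

/// Artifact categories named in the plan. Variant names are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArtifactType {
    Document,
    Image,
    Script,
    Transcript,
    Manifest,
    Checksum,
    Report,
    ContactSheet,
    Other,
}

/// Classify an artifact by its path relative to the skill directory.
/// All rules below are illustrative guesses, not the crate's real rules.
pub fn classify_artifact(relative_path: &str) -> ArtifactType {
    let path = Path::new(relative_path);
    let file_name = path
        .file_name()
        .and_then(|n| n.to_str())
        .unwrap_or("")
        .to_ascii_lowercase();
    let extension = path
        .extension()
        .and_then(|e| e.to_str())
        .unwrap_or("")
        .to_ascii_lowercase();

    // Well-known file names take priority over extension-based rules.
    if file_name == "run_manifest.json" {
        return ArtifactType::Manifest;
    }
    if extension == "sha256" {
        return ArtifactType::Checksum;
    }
    if file_name.starts_with("contact_sheet") {
        return ArtifactType::ContactSheet;
    }
    if file_name.starts_with("transcript") {
        return ArtifactType::Transcript;
    }
    if file_name.contains("report") {
        return ArtifactType::Report;
    }

    match extension.as_str() {
        "md" | "markdown" => ArtifactType::Document,
        "jpg" | "jpeg" | "png" | "webp" => ArtifactType::Image,
        "sh" | "py" => ArtifactType::Script,
        _ => ArtifactType::Other,
    }
}
```

Checking exact file names before extensions keeps `contact-sheet/contact_sheet.jpg` from being counted as an ordinary screenshot.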
What colibri-skills does not do
- Author, edit, or create skills
- Store image blobs in SQLite (it stores paths and hashes only)
- Replace system_brain
- Replace system_ops
- Own provider/API budget logic
- Require nonportable local source media paths at runtime
Phase 1 delivered
The scaffold crate now provides:
- Skill
- SkillManifest
- SkillArtifact
- SkillChunk
- ArtifactType
- SkillStatus
- ImportSummary
- SearchResult
- unit tests for artifact classification and status/summary behavior
Phase 1 is intentionally scaffold-only: compile and type proof, no runtime import behavior yet.
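As an illustration of what a type-only scaffold can prove, here is a minimal sketch of `SkillStatus` that round-trips the three status strings used by the schema target (active, archived, superseded). The variant names, the `as_str` helper, and the `String` error type are assumptions, not the crate's confirmed API.

```rust
use std::fmt;
use std::str::FromStr;

/// Lifecycle state of a skill. Variant names are assumed from the
/// status values listed in the schema target.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SkillStatus {
    Active,
    Archived,
    Superseded,
}

impl SkillStatus {
    /// The string form that would be stored in the `status` column.
    pub fn as_str(self) -> &'static str {
        match self {
            SkillStatus::Active => "active",
            SkillStatus::Archived => "archived",
            SkillStatus::Superseded => "superseded",
        }
    }
}

impl fmt::Display for SkillStatus {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}

impl FromStr for SkillStatus {
    type Err = String;

    /// Reject unknown strings instead of silently defaulting to `Active`.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "active" => Ok(SkillStatus::Active),
            "archived" => Ok(SkillStatus::Archived),
            "superseded" => Ok(SkillStatus::Superseded),
            other => Err(format!("unknown skill status: {other}")),
        }
    }
}
```

Keeping the string mapping in one place means a later storage layer and the CLI can share it without restating the literals.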
SQLite schema target
CREATE TABLE system_skills (
    skill_id      TEXT PRIMARY KEY,
    display_name  TEXT NOT NULL,
    source_path   TEXT NOT NULL,           -- relative within Clawdie-AI repo
    manifest_hash TEXT,                    -- sha256 of run_manifest.json
    created_at    TEXT NOT NULL,           -- ISO 8601
    updated_at    TEXT NOT NULL,
    verification  TEXT,                    -- natural-language verification test
    status        TEXT NOT NULL DEFAULT 'active' -- active, archived, superseded
);

CREATE TABLE system_skill_artifacts (
    artifact_id   INTEGER PRIMARY KEY AUTOINCREMENT,
    skill_id      TEXT NOT NULL REFERENCES system_skills(skill_id),
    artifact_type TEXT NOT NULL,
    relative_path TEXT NOT NULL,           -- within the skill directory
    file_name     TEXT NOT NULL,
    mime_type     TEXT,
    size_bytes    INTEGER,
    sha256_hash   TEXT NOT NULL,
    UNIQUE(skill_id, relative_path)
);

CREATE TABLE system_skill_chunks (
    chunk_id        INTEGER PRIMARY KEY AUTOINCREMENT,
    skill_id        TEXT NOT NULL REFERENCES system_skills(skill_id),
    artifact_id     INTEGER NOT NULL REFERENCES system_skill_artifacts(artifact_id),
    chunk_type      TEXT NOT NULL,
    heading         TEXT,
    content         TEXT NOT NULL,
    line_start      INTEGER,
    line_end        INTEGER,
    tokens_estimate INTEGER
);

CREATE INDEX idx_skills_status ON system_skills(status);
CREATE INDEX idx_artifacts_skill ON system_skill_artifacts(skill_id);
CREATE INDEX idx_artifacts_type ON system_skill_artifacts(artifact_type);
CREATE INDEX idx_chunks_skill ON system_skill_chunks(skill_id);
CREATE INDEX idx_chunks_type ON system_skill_chunks(chunk_type);

CREATE VIRTUAL TABLE IF NOT EXISTS skill_fts USING fts5(
    content,
    heading,
    skill_id,
    chunk_type,
    content=system_skill_chunks,
    content_rowid=chunk_id
);
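Searching `skill_fts` with raw user input can fail, because FTS5 treats characters such as `-`, `:` and `*` as query syntax. The sketch below shows one way to quote each term before binding it to a `MATCH` parameter. The function name `fts5_match_expr` and the `SEARCH_SQL` query are hypothetical, and the choice of a plain AND-of-terms search is an assumption. Because `skill_fts` is an external-content table, the index would also need triggers or a rebuild step to stay in sync with `system_skill_chunks`; that part is not shown.

```rust
/// Turn free-form input into an FTS5 MATCH expression in which every
/// whitespace-separated term is a quoted string. Returns `None` for
/// blank input so callers can skip the query entirely.
pub fn fts5_match_expr(user_query: &str) -> Option<String> {
    let terms: Vec<String> = user_query
        .split_whitespace()
        // Inside an FTS5 string, a double quote is escaped by doubling it.
        .map(|term| format!("\"{}\"", term.replace('"', "\"\"")))
        .collect();

    if terms.is_empty() {
        None
    } else {
        // Space-separated phrases are combined with an implicit AND.
        Some(terms.join(" "))
    }
}

/// Hypothetical search statement; bind the expression above to `?1`.
pub const SEARCH_SQL: &str = "SELECT skill_id, heading, content \
     FROM skill_fts WHERE skill_fts MATCH ?1 ORDER BY rank LIMIT 20";
```

For example, the input `astro dev-server` becomes `"astro" "dev-server"`, so the hyphen is matched as text rather than parsed as an operator.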
Import flow target
- Read Clawdie-AI checkout path from config/env.
- Scan for directories containing run_manifest.json.
- Parse manifest and derive skill metadata.
- Read artifacts, compute SHA-256, and verify artifacts.sha256 when present.
- Chunk Markdown by heading and transcripts by timestamp/segment.
- Upsert SQLite rows idempotently.
- Return ImportSummary with skills found/indexed/skipped, artifacts, chunks, checksum failures, and errors.
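The "chunk Markdown by heading" step might be sketched as below. The `MarkdownChunk` struct and `chunk_markdown` function are hypothetical names; the fields are chosen to line up with the `heading`, `content`, `line_start`, and `line_end` columns of `system_skill_chunks`. The sketch assumes ATX-style (`#`) headings only and ignores `#` lines inside fenced code blocks.

```rust
/// One heading-delimited chunk of a Markdown document.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MarkdownChunk {
    pub heading: Option<String>,
    pub content: String,
    pub line_start: usize, // 1-based, inclusive
    pub line_end: usize,   // 1-based, inclusive
}

/// True for lines like `# Title` or `### Title` (1 to 6 hashes, then a space).
fn is_atx_heading(line: &str) -> bool {
    let hashes = line.chars().take_while(|c| *c == '#').count();
    (1..=6).contains(&hashes) && line[hashes..].starts_with(' ')
}

/// Push the pending chunk, unless it has neither a heading nor any text.
fn flush(
    chunks: &mut Vec<MarkdownChunk>,
    heading: Option<String>,
    body: &mut Vec<&str>,
    line_start: usize,
    line_end: usize,
) {
    let content = body.join("\n").trim().to_string();
    body.clear();
    if heading.is_none() && content.is_empty() {
        return;
    }
    chunks.push(MarkdownChunk { heading, content, line_start, line_end });
}

/// Split Markdown into chunks that each start at a heading.
pub fn chunk_markdown(text: &str) -> Vec<MarkdownChunk> {
    let mut chunks = Vec::new();
    let mut heading: Option<String> = None;
    let mut body: Vec<&str> = Vec::new();
    let mut start = 1;
    let mut last_line = 0;
    let mut in_fence = false;

    for (idx, line) in text.lines().enumerate() {
        let line_no = idx + 1;
        last_line = line_no;
        // Track fenced code so shell comments are not taken for headings.
        if line.trim_start().starts_with("```") {
            in_fence = !in_fence;
        }
        if !in_fence && is_atx_heading(line) {
            flush(&mut chunks, heading.take(), &mut body, start, line_no - 1);
            heading = Some(line.trim_start_matches('#').trim().to_string());
            start = line_no;
        } else {
            body.push(line);
        }
    }
    flush(&mut chunks, heading.take(), &mut body, start, last_line);
    chunks
}
```

Text that appears before the first heading is kept as a chunk with `heading: None`, so a preamble is still searchable.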
CLI surface target
colibri list-skills
colibri show-skill <id>
colibri search-skills <query>
colibri index-skills
colibri verify-skill <id>
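A dependency-free way to map these subcommands onto a typed command could look like the sketch below. The `SkillsCommand` enum and `parse_skills_command` function are hypothetical; the real CLI may well use an argument-parsing crate instead.

```rust
/// Typed form of the planned subcommands. Names are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SkillsCommand {
    ListSkills,
    ShowSkill { id: String },
    SearchSkills { query: String },
    IndexSkills,
    VerifySkill { id: String },
}

/// Parse the arguments that follow the `colibri` binary name.
pub fn parse_skills_command(args: &[&str]) -> Result<SkillsCommand, String> {
    match args {
        ["list-skills"] => Ok(SkillsCommand::ListSkills),
        ["index-skills"] => Ok(SkillsCommand::IndexSkills),
        ["show-skill", id] => Ok(SkillsCommand::ShowSkill { id: id.to_string() }),
        ["verify-skill", id] => Ok(SkillsCommand::VerifySkill { id: id.to_string() }),
        // Join the remaining words so an unquoted multi-word query works.
        ["search-skills", words @ ..] if !words.is_empty() => {
            Ok(SkillsCommand::SearchSkills { query: words.join(" ") })
        }
        [] => Err("missing subcommand".to_string()),
        [other, ..] => Err(format!("unknown or incomplete subcommand: {other}")),
    }
}
```

A caller would typically collect `std::env::args().skip(1)` into a `Vec<String>`, borrow each element as `&str`, and pass the resulting slice to this function.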
Portability rules
- Store image paths and hashes, not blobs.
- Treat local provenance paths like /home/samob/Videos/... as metadata only.
- Verify checksums against committed artifacts, not local source paths.
- Store paths relative to the Clawdie-AI repo.
- Normal tests run with only local SQLite and committed test fixtures; keep PostgreSQL, remote Forgejo, and local media as optional integration dependencies.
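These rules can be enforced when the checksum file is read. The sketch below assumes `artifacts.sha256` uses the common `sha256sum` layout (`<64 hex chars>  <path>` per line), which the plan does not confirm. The function names are hypothetical. Digest computation is left out; the comparison takes already-computed digests as input.

```rust
use std::collections::BTreeMap;
use std::path::{Component, Path};

/// True when `path` is relative and never climbs out of its base directory.
pub fn is_portable_relative(path: &str) -> bool {
    !path.is_empty()
        && Path::new(path)
            .components()
            .all(|c| matches!(c, Component::Normal(_) | Component::CurDir))
}

/// Parse checksum lines into a path -> lowercase digest map.
/// Assumes the `sha256sum` text layout; rejects non-portable paths.
pub fn parse_checksum_file(text: &str) -> Result<BTreeMap<String, String>, String> {
    let mut entries = BTreeMap::new();
    for (idx, raw) in text.lines().enumerate() {
        let line = raw.trim();
        if line.is_empty() {
            continue;
        }
        let (hash, rest) = line
            .split_once(char::is_whitespace)
            .ok_or_else(|| format!("line {}: expected '<sha256>  <path>'", idx + 1))?;
        // `sha256sum` marks binary mode with a leading `*` on the path.
        let path = rest.trim_start().trim_start_matches('*');
        if hash.len() != 64 || !hash.chars().all(|c| c.is_ascii_hexdigit()) {
            return Err(format!("line {}: not a SHA-256 hex digest", idx + 1));
        }
        if !is_portable_relative(path) {
            return Err(format!("line {}: non-portable path {path}", idx + 1));
        }
        entries.insert(path.to_string(), hash.to_ascii_lowercase());
    }
    Ok(entries)
}

/// Paths whose computed digest is missing or differs from the expected one.
/// `computed` is assumed to hold lowercase hex digests keyed by the same paths.
pub fn checksum_failures(
    expected: &BTreeMap<String, String>,
    computed: &BTreeMap<String, String>,
) -> Vec<String> {
    expected
        .iter()
        .filter(|(path, hash)| computed.get(*path).map(String::as_str) != Some(hash.as_str()))
        .map(|(path, _)| path.clone())
        .collect()
}
```

Rejecting absolute and `..` paths at parse time means a checksum entry can only ever point at a committed file inside the skill directory, never at local source media.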
Future skillpacks
astro-howto
forgejo-admin
vaultwarden-onboarding
freebsd-update-reboot
colibri-iso-build
zed-on-freebsd
pi-headless-login
Implementation phases
| Phase | What | Depends on |
|---|---|---|
| 1 | Scaffold crate + structs + schema plan | Nothing |
| 2 | Manifest parser (run_manifest.json → SkillManifest) | Phase 1 |
| 3 | Checksum validator (artifacts.sha256 → verify) | Phase 2 |
| 4 | Markdown/transcript chunker | Phase 1 |
| 5 | SQLite storage + FTS5 search | Phases 3, 4 |
| 6 | CLI commands (list, show, search, index, verify) | Phase 5 |
| 7 | Daemon/client/TUI integration | Phase 6 |
Related sources
- clawdie-ai/docs/astro-howto/
- clawdie-ai/docs/VAULTWARDEN-SETUP.md
- clawdie-ai/bootstrap/skills-memory/artifact.sql
- clawdie-ai/src/split-brain-status.ts