clawdie/colibri

Sam & Claude b878b4bdfb

docs: rewrite negative patterns as positive actionable instructions

Convert 'do not', 'cannot', 'never', 'avoid', 'don't' patterns across
AGENTS.md, README.md, and 11 docs/*.md files into positive,
actionable instructions that tell the reader what TO do.

Preserved: hard safety constraints (MUST NOT agent boundaries,
vault credential confinement intent) — these are enforceable
guardrails where the prohibition IS the instruction.

2026-06-21 13:09:19 +02:00

Colibri Skills Plan

Status: Phase 1 scaffolded — read-only split-brain consumer

Crate: crates/colibri-skills

Purpose

colibri-skills is Colibri's read-only runtime consumer for reviewed skill artifacts authored in the Clawdie-AI repo. It does not author, edit, or store canonical skills. Clawdie-AI remains the source of truth; Colibri indexes and serves typed/runtime views.

Clawdie-AI repo (source of truth)
  docs/astro-howto/
  docs/forgejo-admin/
  docs/vaultwarden-onboarding/
  ...

Colibri colibri-skills crate (read-only consumer)
  reads committed skill artifacts
  validates checksums
  indexes Markdown/transcript chunks
  exposes Skill, SkillArtifact, SkillChunk structs
  serves CLI/TUI/search later

This keeps the split-brain model explicit:

system_skills: committed built-in knowledge / manuals / reviewed skillpacks
system_brain: user and agent memory
system_ops: live runtime, task, service, and daemon state

Seed artifact: Astro how-to

The first concrete skillpack is docs/astro-howto/ in Clawdie-AI. It is a useful seed because it is more than prose: it includes a transcript, generated how-to docs, commands, screenshots, a contact sheet, a manifest, checksums, and scripts.

{
  "skill_id": "astro-howto",
  "source": "local video-derived training artifact",
  "inputs": [
    "transcript_local.txt",
    "screenshots/",
    "contact-sheet/contact_sheet.jpg"
  ],
  "outputs": [
    "docs/HOWTO.md",
    "docs/COMMANDS.md",
    "docs/SCREENSHOTS.md",
    "docs/SUMMARY.md"
  ],
  "verification": "can user create and run an Astro project?",
  "media": "screenshots/*.jpg (paths + hashes, not blobs)",
  "manifest": "run_manifest.json",
  "checksums": "artifacts.sha256"
}

Pipeline shape:

video → local transcript → topic extraction → how-to/runbook
→ screenshots/contact sheet → commands → verification test
→ manifest + checksums → reviewed skill artifact → Colibri read-only index

Ownership

Layer	Role	Writes	Reads
Clawdie-AI	Source of truth	Skill artifacts via PR	N/A
`colibri-skills`	Runtime consumer	Writes only to the runtime store; source repo remains read-only for the skills consumer.	Indexed skill structs from committed artifacts
Agents	Authors/reviewers	Candidate skill artifact PRs	Skill content for task routing
`system_brain`	Agent/user memory	Personal/user/agent context	Not canonical skill docs
`system_ops`	Runtime state	Live task/service state	Not skills

What `colibri-skills` does

Read skill manifests from a configured Clawdie-AI checkout path
Parse run_manifest.json
Validate checksums against artifacts.sha256
Classify artifacts as document, image, script, transcript, manifest, checksum, report, contact sheet, or other
Index Markdown/transcript chunks for search
Expose stable typed structs for daemon/client/TUI callers
Persist runtime index metadata in SQLite
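
The classification step in the list above can be sketched with the standard library alone. The variant names and the file-name and extension rules here are assumptions for illustration; the crate's real `ArtifactType` and its classification logic may differ.

```rust
use std::path::Path;

/// Artifact categories named in this plan. Variant names are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArtifactType {
    Document,
    Image,
    Script,
    Transcript,
    Manifest,
    Checksum,
    Report,
    ContactSheet,
    Other,
}

/// Classify an artifact by its path relative to the skill directory.
/// The rules are illustrative heuristics, not the crate's actual logic.
pub fn classify(relative_path: &str) -> ArtifactType {
    let path = Path::new(relative_path);
    let file_name = path
        .file_name()
        .and_then(|name| name.to_str())
        .unwrap_or("")
        .to_ascii_lowercase();
    let extension = path
        .extension()
        .and_then(|ext| ext.to_str())
        .unwrap_or("")
        .to_ascii_lowercase();

    // Well-known file names take priority over extensions.
    if file_name == "run_manifest.json" {
        return ArtifactType::Manifest;
    }
    if file_name == "artifacts.sha256" {
        return ArtifactType::Checksum;
    }
    if file_name.starts_with("contact_sheet") {
        return ArtifactType::ContactSheet;
    }
    if file_name.starts_with("transcript") {
        return ArtifactType::Transcript;
    }
    if file_name.contains("report") {
        return ArtifactType::Report;
    }

    // Fall back to the file extension.
    match extension.as_str() {
        "md" => ArtifactType::Document,
        "jpg" | "jpeg" | "png" => ArtifactType::Image,
        "sh" | "py" => ArtifactType::Script,
        _ => ArtifactType::Other,
    }
}
```

Checking names before extensions keeps `contact-sheet/contact_sheet.jpg` out of the generic image bucket, which matches the plan's separate "contact sheet" category.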

What `colibri-skills` does not do

Author, edit, or create skills
Store image blobs in SQLite (it stores paths and hashes only)
Replace system_brain
Replace system_ops
Own provider/API budget logic
Require nonportable local source media paths at runtime

Phase 1 delivered

The scaffold crate now provides:

Skill
SkillManifest
SkillArtifact
SkillChunk
ArtifactType
SkillStatus
ImportSummary
SearchResult
unit tests for artifact classification and status/summary behavior

Phase 1 is intentionally scaffold-only: it proves that the crate compiles and the types fit together, with no runtime import behavior yet.
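
As a rough illustration of what a scaffold type might look like, here is a minimal `SkillStatus` sketch that round-trips the three status strings used by the schema target (active, archived, superseded). The trait choices and error type are assumptions; the crate's real definition may differ.

```rust
use std::fmt;
use std::str::FromStr;

/// Lifecycle status of an indexed skill. `Active` is the default,
/// matching the schema's `DEFAULT 'active'`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum SkillStatus {
    #[default]
    Active,
    Archived,
    Superseded,
}

impl SkillStatus {
    /// Text form suitable for a `status TEXT` column.
    pub fn as_str(self) -> &'static str {
        match self {
            SkillStatus::Active => "active",
            SkillStatus::Archived => "archived",
            SkillStatus::Superseded => "superseded",
        }
    }
}

impl fmt::Display for SkillStatus {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}

impl FromStr for SkillStatus {
    type Err = String;

    /// Parse the stored text form; unknown values are rejected.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "active" => Ok(SkillStatus::Active),
            "archived" => Ok(SkillStatus::Archived),
            "superseded" => Ok(SkillStatus::Superseded),
            other => Err(format!("unknown skill status: {other}")),
        }
    }
}
```

Rejecting unknown strings rather than mapping them to a default keeps a corrupted or hand-edited row from silently appearing as an active skill.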

SQLite schema target

CREATE TABLE system_skills (
    skill_id TEXT PRIMARY KEY,
    display_name TEXT NOT NULL,
    source_path TEXT NOT NULL,            -- relative within Clawdie-AI repo
    manifest_hash TEXT,                   -- sha256 of run_manifest.json
    created_at TEXT NOT NULL,             -- ISO 8601
    updated_at TEXT NOT NULL,
    verification TEXT,                    -- natural-language verification test
    status TEXT NOT NULL DEFAULT 'active' -- active, archived, superseded
);

CREATE TABLE system_skill_artifacts (
    artifact_id INTEGER PRIMARY KEY AUTOINCREMENT,
    skill_id TEXT NOT NULL REFERENCES system_skills(skill_id),
    artifact_type TEXT NOT NULL,
    relative_path TEXT NOT NULL,          -- within the skill directory
    file_name TEXT NOT NULL,
    mime_type TEXT,
    size_bytes INTEGER,
    sha256_hash TEXT NOT NULL,
    UNIQUE(skill_id, relative_path)
);

CREATE TABLE system_skill_chunks (
    chunk_id INTEGER PRIMARY KEY AUTOINCREMENT,
    skill_id TEXT NOT NULL REFERENCES system_skills(skill_id),
    artifact_id INTEGER NOT NULL REFERENCES system_skill_artifacts(artifact_id),
    chunk_type TEXT NOT NULL,
    heading TEXT,
    content TEXT NOT NULL,
    line_start INTEGER,
    line_end INTEGER,
    tokens_estimate INTEGER
);

CREATE INDEX idx_skills_status ON system_skills(status);
CREATE INDEX idx_artifacts_skill ON system_skill_artifacts(skill_id);
CREATE INDEX idx_artifacts_type ON system_skill_artifacts(artifact_type);
CREATE INDEX idx_chunks_skill ON system_skill_chunks(skill_id);
CREATE INDEX idx_chunks_type ON system_skill_chunks(chunk_type);

CREATE VIRTUAL TABLE IF NOT EXISTS skill_fts USING fts5(
    content,
    heading,
    skill_id,
    chunk_type,
    content=system_skill_chunks,
    content_rowid=chunk_id
);
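
The `system_skill_chunks` columns (heading, content, line_start, line_end, tokens_estimate) suggest a chunker shaped roughly like the following. This is a minimal sketch, not the planned implementation: the `Chunk` struct and function names are hypothetical, the four-characters-per-token estimate is an assumption, and headings inside fenced code blocks are not treated specially.

```rust
/// One heading-delimited chunk. Field names mirror the
/// `system_skill_chunks` columns; the struct itself is hypothetical.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Chunk {
    pub heading: Option<String>,
    pub content: String,
    pub line_start: usize, // 1-based, inclusive
    pub line_end: usize,   // 1-based, inclusive
    pub tokens_estimate: usize,
}

/// Rough token estimate: about four characters per token (an assumption).
fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Return the heading text if `line` is an ATX heading
/// (one to six `#` characters followed by a space).
fn parse_heading(line: &str) -> Option<String> {
    let hashes = line.chars().take_while(|c| *c == '#').count();
    if (1..=6).contains(&hashes) && line[hashes..].starts_with(' ') {
        Some(line[hashes..].trim().to_string())
    } else {
        None
    }
}

/// Push the pending chunk unless it has neither a heading nor any text.
fn flush(
    chunks: &mut Vec<Chunk>,
    heading: Option<String>,
    body: &[&str],
    start: usize,
    end: usize,
) {
    let content = body.join("\n").trim().to_string();
    if heading.is_none() && content.is_empty() {
        return;
    }
    let tokens_estimate = estimate_tokens(&content);
    chunks.push(Chunk { heading, content, line_start: start, line_end: end, tokens_estimate });
}

/// Split Markdown into chunks, starting a new chunk at every heading.
pub fn chunk_markdown(markdown: &str) -> Vec<Chunk> {
    let mut chunks = Vec::new();
    let mut heading: Option<String> = None;
    let mut body: Vec<&str> = Vec::new();
    let mut start = 1usize;
    let mut last_line = 0usize;

    for (index, line) in markdown.lines().enumerate() {
        let line_no = index + 1;
        if let Some(title) = parse_heading(line) {
            // Close the previous chunk on the line before this heading.
            flush(&mut chunks, heading.take(), &body, start, line_no - 1);
            body.clear();
            heading = Some(title);
            start = line_no;
        } else {
            body.push(line);
        }
        last_line = line_no;
    }
    flush(&mut chunks, heading, &body, start, last_line);
    chunks
}
```

Text that appears before the first heading becomes a chunk with `heading: None`, which maps onto the nullable `heading` column.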

Import flow target

Read Clawdie-AI checkout path from config/env.
Scan for directories containing run_manifest.json.
Parse manifest and derive skill metadata.
Read artifacts, compute SHA-256, and verify artifacts.sha256 when present.
Chunk Markdown by heading and transcripts by timestamp/segment.
Upsert SQLite rows idempotently.
Return ImportSummary with skills found/indexed/skipped, artifacts, chunks, checksum failures, and errors.
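
The checksum step of the flow above could look roughly like this. The sketch assumes `artifacts.sha256` uses the common `sha256sum` line format (hex digest, a space, a mode marker, then the path), which this plan does not confirm. Because the Rust standard library has no SHA-256, the caller is assumed to supply the computed digests; both function names are hypothetical.

```rust
use std::collections::BTreeMap;

/// Parse a checksum list in `sha256sum` format (assumed):
/// `<64 hex digits><space><' ' or '*'><path>` per line.
/// Returns a map from relative path to lowercase hex digest.
pub fn parse_checksum_list(text: &str) -> Result<BTreeMap<String, String>, String> {
    let mut entries = BTreeMap::new();
    for (index, line) in text.lines().enumerate() {
        if line.trim().is_empty() {
            continue;
        }
        let (digest, rest) = line
            .split_once(' ')
            .ok_or_else(|| format!("line {}: missing path", index + 1))?;
        if digest.len() != 64 || !digest.chars().all(|c| c.is_ascii_hexdigit()) {
            return Err(format!("line {}: not a SHA-256 hex digest", index + 1));
        }
        // After the first space comes a mode marker: ' ' (text) or '*' (binary).
        let path = rest
            .strip_prefix(' ')
            .or_else(|| rest.strip_prefix('*'))
            .unwrap_or(rest);
        entries.insert(path.to_string(), digest.to_ascii_lowercase());
    }
    Ok(entries)
}

/// Compare expected digests with digests computed by the caller and
/// return the paths that are missing or whose digest differs.
pub fn checksum_failures(
    expected: &BTreeMap<String, String>,
    computed: &BTreeMap<String, String>,
) -> Vec<String> {
    expected
        .iter()
        .filter(|(path, digest)| computed.get(*path) != Some(*digest))
        .map(|(path, _)| path.clone())
        .collect()
}
```

The length of the returned list is the kind of value that could feed the "checksum failures" count in `ImportSummary`.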

CLI surface target

colibri list-skills
colibri show-skill <id>
colibri search-skills <query>
colibri index-skills
colibri verify-skill <id>
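
One way to model the subcommands above is an enum plus a small parser over the arguments that follow the binary name. This is a hypothetical sketch using only the standard library; the enum, its variants, and `parse_command` are assumptions, and a real CLI would likely use an argument-parsing crate instead.

```rust
/// Subcommands from the CLI surface target. Names are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SkillCommand {
    List,
    Show { id: String },
    Search { query: String },
    Index,
    Verify { id: String },
}

/// Parse the arguments that follow the `colibri` binary name.
pub fn parse_command(args: &[&str]) -> Result<SkillCommand, String> {
    match args {
        ["list-skills"] => Ok(SkillCommand::List),
        ["index-skills"] => Ok(SkillCommand::Index),
        ["show-skill", id] => Ok(SkillCommand::Show { id: id.to_string() }),
        ["verify-skill", id] => Ok(SkillCommand::Verify { id: id.to_string() }),
        // Join the remaining words so multi-word queries work unquoted.
        ["search-skills", words @ ..] if !words.is_empty() => {
            Ok(SkillCommand::Search { query: words.join(" ") })
        }
        [] => Err("missing subcommand".to_string()),
        [name, ..] => Err(format!("unknown or malformed subcommand: {name}")),
    }
}
```

For example, `parse_command(&["show-skill", "astro-howto"])` yields `SkillCommand::Show` with the id `astro-howto`, while `show-skill` without an id is reported as malformed.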

Portability rules

Store image paths and hashes, not blobs.
Treat local provenance paths like /home/samob/Videos/... as metadata only.
Verify checksums against committed artifacts, not local source paths.
Store paths relative to the Clawdie-AI repo.
Normal tests run with only local SQLite and committed test fixtures; keep PostgreSQL, remote Forgejo, and local media as optional integration dependencies.
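
The relative-path rule can be enforced with a small guard before a path is stored. The following is a minimal sketch under the assumption that any absolute path or `..` segment should be rejected; the function name is hypothetical and the crate may apply a different rule.

```rust
use std::path::{Component, Path};

/// Return true when `candidate` is safe to store as a repo-relative path:
/// non-empty, not absolute, and free of `..` segments that could escape
/// the checkout. Illustrative only, not the crate's actual rule.
pub fn is_portable_relative_path(candidate: &str) -> bool {
    if candidate.is_empty() {
        return false;
    }
    Path::new(candidate)
        .components()
        .all(|component| matches!(component, Component::Normal(_) | Component::CurDir))
}
```

A local provenance path under `/home` fails this check because its first component is the root directory, so it can only be kept as metadata and never as a stored artifact path.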

Future skillpacks

astro-howto
forgejo-admin
vaultwarden-onboarding
freebsd-update-reboot
colibri-iso-build
zed-on-freebsd
pi-headless-login

Implementation phases

Phase	What	Depends on
1	Scaffold crate + structs + schema plan	Nothing
2	Manifest parser (`run_manifest.json` → `SkillManifest`)	Phase 1
3	Checksum validator (`artifacts.sha256` → verify)	Phase 2
4	Markdown/transcript chunker	Phase 1
5	SQLite storage + FTS5 search	Phases 3, 4
6	CLI commands (`list`, `show`, `search`, `index`, `verify`)	Phase 5
7	Daemon/client/TUI integration	Phase 6

Related sources

clawdie-ai/docs/astro-howto/
clawdie-ai/docs/VAULTWARDEN-SETUP.md
clawdie-ai/bootstrap/skills-memory/artifact.sql
clawdie-ai/src/split-brain-status.ts
