Linux/FreeBSD Cross-platform Rust control plane core https://clawdie.si

Sam & Claude 89e47363ef feat(store): T2.x Phase 1 eval harness — agent self-report	2026-06-28 08:23:05 +02:00

Schema + store + daemon hook for the eval harness (Phase 1 of T2.x).

Per docs/wiki/t2x-eval-harness.md, the eval harness records multi-dimensional success measurement per task, beyond the boolean 'did it exit 0?' that T1.5 already captures. Phase 1 uses agent self-report (exit code → quality 1.0 or 0.0). Phases 2/3/4 will layer on local-llm eval, cloud-llm eval, and model-selection routing.

Schema (colibri-store):
- New task_evals table: task_id, agent_id, eval_mode, completion_status, quality_score, correctness_check, eval_provider, eval_latency_ms, eval_cost_usd, evaluated_at. CHECK constraints enforce the enum fields. Intentionally no FK to tasks: we don't want DELETE CASCADE to destroy eval history and we don't want a missing task row to block eval writes.
- task_costs gets quality_score and eval_mode columns for dashboard display.
- Migrations use IF NOT EXISTS / try-block pattern for idempotent reopens.

Store API:
- write_task_eval: INSERT OR REPLACE, so the same task_id can be upgraded (e.g. skip → agent → local-llm → cloud-llm)
- read_task_eval
- list_task_evals_by_agent
- list_all_task_evals
- eval_summary(window_hours): aggregated rollup for Phase 3 routing

Daemon integration:
- New TaskCompletion struct consolidates what used to be 8 args to an inline cost-capture closure. The struct is a stable API that future eval modes (local-llm, cloud-llm) can populate with eval_provider, eval_latency_ms, eval_cost_usd without touching the hook signature.
- record_task_completion(state, &TaskCompletion): single atomic hook now writes both task_costs AND task_evals. Called from heartbeat's poll_exit path; designed so RPC-completion and periodic-snapshot paths (the gap flagged in feat/rpc-task-dispatch for persistent RPC agents) can call the same function.
- Hardcoded eval_mode='agent' in Phase 1; future phases pass different values, and the function itself is mode-agnostic.

MCP tool:
- colibri_get_task_eval(task_id): returns the eval record for a task.

Client:
- Client::get_task_eval() async method.

Tests:
- 6 new store tests: roundtrip, insert-or-replace upgrade path, list-by-agent filter, eval_summary aggregation, CHECK constraint enforcement, export_json integration.
- tool_dispatch test updated for new tool count (20 → 21).

All gates green: cargo fmt, clippy -D warnings, cargo test workspace, wiki-lint --strict (187/0).
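
The Phase 1 behaviour described in this commit message (exit code mapped to a quality score, and a write that replaces any earlier eval for the same task_id) can be illustrated with a small in-memory sketch. This is not the colibri-store code: the `TaskCompletion` fields, the `EvalStore` type, and the `self_report` helper are assumptions made for illustration, and a `HashMap` stands in for the SQLite `INSERT OR REPLACE`.

```rust
use std::collections::HashMap;

/// Eval modes named in the commit message (skip, agent, local-llm, cloud-llm).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EvalMode {
    Skip,
    Agent,
    LocalLlm,
    CloudLlm,
}

/// Hypothetical stand-in for the daemon's TaskCompletion struct; the real
/// field set is not shown on this page.
#[derive(Debug, Clone)]
struct TaskCompletion {
    task_id: String,
    agent_id: String,
    exit_code: i32,
}

/// A reduced task_evals row: only a few of the listed columns are modelled.
#[derive(Debug, Clone, PartialEq)]
struct TaskEval {
    task_id: String,
    agent_id: String,
    eval_mode: EvalMode,
    quality_score: f64,
}

/// Phase 1 self-report: exit code 0 maps to quality 1.0, anything else to 0.0.
fn self_report(done: &TaskCompletion) -> TaskEval {
    TaskEval {
        task_id: done.task_id.clone(),
        agent_id: done.agent_id.clone(),
        eval_mode: EvalMode::Agent,
        quality_score: if done.exit_code == 0 { 1.0 } else { 0.0 },
    }
}

/// In-memory stand-in for the store. Rows are keyed by task_id, so a second
/// write for the same task replaces the first, mimicking INSERT OR REPLACE.
#[derive(Default)]
struct EvalStore {
    rows: HashMap<String, TaskEval>,
}

impl EvalStore {
    fn write_task_eval(&mut self, eval: TaskEval) {
        self.rows.insert(eval.task_id.clone(), eval);
    }

    fn read_task_eval(&self, task_id: &str) -> Option<&TaskEval> {
        self.rows.get(task_id)
    }
}

fn main() {
    let mut store = EvalStore::default();
    let done = TaskCompletion {
        task_id: "t-1".to_string(),
        agent_id: "a-1".to_string(),
        exit_code: 0,
    };

    // Phase 1: the agent's own exit code is the only signal.
    store.write_task_eval(self_report(&done));
    println!("{:?}", store.read_task_eval("t-1"));

    // A later phase could upgrade the same task_id with a stronger eval mode.
    store.write_task_eval(TaskEval {
        task_id: "t-1".to_string(),
        agent_id: "a-1".to_string(),
        eval_mode: EvalMode::LocalLlm,
        quality_score: 0.8,
    });
    println!("{:?}", store.read_task_eval("t-1"));
}
```

Keying on task_id alone is what makes the upgrade path possible: there is only ever one eval row per task, and the most recent write wins.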
.agent/skills	fix(skills): remove duplicate PF validate line in freebsd-admin SKILL	2026-06-28 00:20:33 +02:00
.forgejo/workflows	chore(ci): add wiki-lint to CI for parity with ci-checks.sh	2026-06-25 22:50:19 +02:00
astro/wiki	fix(astro): EN index reads from src/content, not ../../docs/wiki	2026-06-28 00:59:13 +02:00
crates	feat(store): T2.x Phase 1 eval harness — agent self-report	2026-06-28 08:23:05 +02:00
docs	docs: remove legacy references — positive framing pass (11 files) (#248)	2026-06-28 00:07:17 +02:00
manifests	refactor: rename smoke→test across provider contracts and docs	2026-06-27 11:54:30 +02:00
packaging	fix(dashboard): restore dual-proof lightbox — screenshot + text badges	2026-06-27 22:34:40 +02:00
scripts	style: restore main green — fmt + prettier drift (Sam & Claude)	2026-06-27 17:19:57 +02:00
src	refactor: clear pi-era residue from the harness-neutral agent path	2026-06-23 18:04:45 +02:00
tests	feat(rc): rename test agent and load provider env (Sam & Codex)	2026-06-15 07:35:44 +02:00
.env.example	Auto-load .env for the DeepSeek probe; gitignore .env (Sam & Claude)	2026-05-26 14:27:41 +02:00
.gitignore	Auto-load .env for the DeepSeek probe; gitignore .env (Sam & Claude)	2026-05-26 14:27:41 +02:00
.prettierignore	chore: adopt markdown formatting gate + one-shot prettier sweep (Sam & Claude)	2026-06-04 20:13:47 +02:00
.prettierrc	chore: adopt markdown formatting gate + one-shot prettier sweep (Sam & Claude)	2026-06-04 20:13:47 +02:00
AGENTS.md	docs: delete 3 stale docs; repoint refs to successor	2026-06-24 16:58:49 +02:00
Cargo.lock	feat(deploy): add colibri-deploy crate + MCP tools	2026-06-27 18:57:55 +02:00
Cargo.toml	feat(deploy): add colibri-deploy crate + MCP tools	2026-06-27 18:57:55 +02:00
LICENSE	release: colibri 0.11.0 + relicense AGPL-3.0 -> MIT	2026-06-20 22:05:47 +02:00
README.md	docs: fix README referrer to moved headroom-sidecar wiki page	2026-06-24 17:34:42 +02:00
rust-toolchain.toml	Scaffold Colibri Phase 1: colibri-probe DeepSeek cache smoke (Sam & Claude)	2026-05-26 10:08:23 +02:00

README.md

Colibri

The Clawdie control plane core — a small, cross-platform (FreeBSD + Linux) Rust daemon. Developed from an operator USB environment; deploys as the Clawdie service on bare FreeBSD hardware (ZFS RAID1, PostgreSQL + pgvector, bhyve VMs, Bastille jails). Unifies coordination (task board, agent registry, skills catalog) with cache-first cost discipline (byte-stable prompt prefixes, cache-hit metering).

Status: all workspace gates (fmt, clippy, test, release build) are green. The Round 2 audit is closed. Current priorities: ISO boot/runtime validation, Pi spawn end-to-end, and cost-mode enforcement (see docs/MULTI-AGENT-HOST-PLAN.md). Treat this summary as a snapshot: check the crate table below and run the gate commands for current counts.

FreeBSD build lane handoff: docs/FREEBSD-BUILD-LANE-HANDOFF.md. ISO acceptance runbook: docs/ISO-ACCEPTANCE-RUNBOOK.md. Clawdie Studio/Zed proposal: docs/CLAWDIE-STUDIO-PROPOSAL.md. External MCP host prototype: docs/COLIBRI-EXTERNAL-MCP-PROTOTYPE.md. Optional Headroom compression sidecar: docs/wiki/headroom-sidecar.md.

Workspace

Crate	Role
`colibri` (root)	Workspace root + probe binaries (colibri-probe, runtime-inventory)
`colibri-mcp`	MCP bridge for editor integration (Zed, Claude Code) via stdio JSON-RPC
`colibri-contracts`	JSON schema contracts (golden tests)
`colibri-deepseek`	DeepSeek cache-hit probe, prefix metering
`colibri-runtime`	Host status ingestion, runtime inventory
`colibri-glasspane`	Agent 5-state machine (zot/pi JSONL events → state)
`colibri-daemon`	Always-on Unix socket server, session lifecycle
`colibri-client`	Typed Unix-socket client + operator CLI
`colibri-glasspane-tui`	ratatui live dashboard (FreeBSD-native)
`colibri-store`	Embedded SQLite coordination (task board, agents, skills)
`colibri-skills`	Skills catalog crate
`clawdie`	Host installer/deployer: ZFS layout + `clawdie` service (FreeBSD/Linux)
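
The daemon and client rows in the table describe a Unix-socket server and a typed client, and the architecture section calls the wire format newline-JSON. As a rough sketch of that framing (one JSON document per line in each direction), the following uses only the Rust standard library. The `LineClient` type, the `fake_daemon` helper, and the `{"op":"ping"}` / `{"ok":true}` payloads are all hypothetical; the real request schema and the daemon's socket path are not given in this README.

```rust
use std::io::{BufRead, BufReader, Write};
use std::os::unix::net::UnixStream;
use std::thread::JoinHandle;

/// Hypothetical newline-delimited JSON client: one request line out,
/// one response line back.
struct LineClient {
    reader: BufReader<UnixStream>,
    writer: UnixStream,
}

impl LineClient {
    fn new(stream: UnixStream) -> std::io::Result<Self> {
        // Clone the handle so reads are buffered while writes stay direct.
        let reader = BufReader::new(stream.try_clone()?);
        Ok(Self { reader, writer: stream })
    }

    fn request(&mut self, json_line: &str) -> std::io::Result<String> {
        self.writer.write_all(json_line.as_bytes())?;
        self.writer.write_all(b"\n")?;
        let mut response = String::new();
        self.reader.read_line(&mut response)?;
        Ok(response.trim_end().to_string())
    }
}

/// Stand-in for the daemon: reads one line, answers with a fixed
/// (hypothetical) reply, and returns the request it saw.
fn fake_daemon(server: UnixStream) -> JoinHandle<std::io::Result<String>> {
    std::thread::spawn(move || -> std::io::Result<String> {
        let mut reader = BufReader::new(server.try_clone()?);
        let mut line = String::new();
        reader.read_line(&mut line)?;
        let mut writer = server;
        writer.write_all(b"{\"ok\":true}\n")?;
        Ok(line.trim_end().to_string())
    })
}

fn main() -> std::io::Result<()> {
    // A connected socket pair replaces the real daemon socket here; a real
    // client would use UnixStream::connect with the daemon's socket path.
    let (client_end, server_end) = UnixStream::pair()?;
    let daemon = fake_daemon(server_end);

    let mut client = LineClient::new(client_end)?;
    let reply = client.request(r#"{"op":"ping"}"#)?;
    println!("reply: {reply}");

    let seen = daemon.join().expect("daemon thread panicked")?;
    println!("daemon saw: {seen}");
    Ok(())
}
```

Line framing keeps the protocol easy to debug with ordinary shell tools, at the cost of requiring that no JSON payload contains a raw newline.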

Build

cargo build --release

Test

cargo test --workspace
cargo clippy --workspace --all-targets -- -D warnings

Architecture

colibri-daemon (always-on Unix socket server)
  ├── glasspane      — agent state machine (zot/pi JSONL → idle/working/blocked/done)
  ├── store          — SQLite coordination (tasks, agents, skills)
  ├── socket         — newline-JSON socket API
  ├── session        — append-only JSONL sessions, 3-region prompt assembly
  └── spawner        — agent subprocess management (retry/backoff, FreeBSD jail confinement)

colibri-client       — CLI tools (colibri, colibri-test-agent)
colibri-glasspane-tui — ratatui dashboard
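
To make the glasspane line in the diagram concrete, here is a minimal sketch of a state machine that folds JSONL events into one of the four states named there (idle, working, blocked, done). The crate table mentions five states; the fifth is not named in this README, so it is left out. The event names and the `"type"` field are assumptions, since the actual zot/pi event schema is not shown, and the string matching is a crude stand-in for a real JSON parser.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AgentState {
    Idle,
    Working,
    Blocked,
    Done,
}

/// Hypothetical event kinds; the real zot/pi event names are not given.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AgentEvent {
    TaskStarted,
    NeedsInput,
    InputProvided,
    TaskFinished,
}

/// Pure transition function: the current state plus one event gives the next state.
fn step(state: AgentState, event: AgentEvent) -> AgentState {
    use AgentEvent::*;
    use AgentState::*;
    match (state, event) {
        (Idle, TaskStarted) => Working,
        (Working, NeedsInput) => Blocked,
        (Blocked, InputProvided) => Working,
        (Working, TaskFinished) => Done,
        // Events that do not apply to the current state leave it unchanged.
        (current, _) => current,
    }
}

/// Crude classifier for one JSONL line: strips whitespace and looks for an
/// assumed `"type":"<name>"` pair. Unrecognized lines yield None.
fn parse_event(line: &str) -> Option<AgentEvent> {
    let compact: String = line.chars().filter(|c| !c.is_whitespace()).collect();
    let kinds = [
        ("task_started", AgentEvent::TaskStarted),
        ("needs_input", AgentEvent::NeedsInput),
        ("input_provided", AgentEvent::InputProvided),
        ("task_finished", AgentEvent::TaskFinished),
    ];
    kinds
        .iter()
        .find(|(name, _)| compact.contains(&format!("\"type\":\"{name}\"")))
        .map(|(_, event)| *event)
}

/// Replays a whole JSONL log from the Idle state, skipping unknown lines.
fn replay(jsonl: &str) -> AgentState {
    jsonl
        .lines()
        .filter_map(parse_event)
        .fold(AgentState::Idle, step)
}

fn main() {
    let log = r#"{"type": "task_started"}
{"type": "needs_input"}
{"type": "input_provided"}
{"type": "task_finished"}"#;
    println!("{:?}", replay(log));
}
```

Keeping `step` as a pure function means the state shown on a dashboard can always be recomputed by replaying the event log from the start.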

Probe binaries

# DeepSeek cache probe (needs DEEPSEEK_API_KEY)
cargo run --release --bin colibri-probe

# Runtime inventory manifest
cargo run --release --bin colibri-runtime-inventory

FreeBSD

Target x86_64-unknown-freebsd (Rust Tier-2). TLS uses rustls for clean static linking (no openssl-sys dependency). Default DB path: /var/db/colibri/colibri.sqlite.