# Colibri Docs Index

A quick-reference guide to every document in this folder.

| Document | One-liner | Audience |
|---|---|---|
| CLAWDIE-INSTALLER-VALIDATION.md | FreeBSD validation handoff for the clawdie installer | Codex (FreeBSD) |
| CLAWDIE-STUDIO.md | Zed/Claude Code + control plane integration (bare-metal deployment option) | Sam & agents |
| COLIBRI-EXTERNAL-MCP-PROTOTYPE.md | Colibri as MCP host for external stdio servers (jailed, 3-tier trust) | Agents |
| COLIBRI-JAILED-AGENT-SPAWN-DESIGN.md | FreeBSD jail confinement for spawned agents (accepted & implemented) | Rust agents |
| COLIBRI-TOKENOMICS-TRIFECTA.md | Strategic vision: useful tokens, cost-per-intelligence, measurement | All |
| ISO-ACCEPTANCE-RUNBOOK.md | Post-boot acceptance commands after staging Colibri into an ISO | Codex (FreeBSD) |
| ISO-SERVICE-LAYOUT.md | rc.conf service layout for the ISO image | All |
| MULTI-AGENT-HOST.md | Current sprint: multi-agent task-board tests + CLI surface gaps | All agents |
| VAULT-PROVISION-RUNBOOK.md | First-proof runbook: vault → jail → .env chain (clean CLI) | Agents, Sam |