docs/model-selection-and-eval #244

Merged
clawdie merged 2 commits from docs/model-selection-and-eval into main 2026-06-27 22:24:29 +02:00
Owner
No description provided.
clawdie added 2 commits 2026-06-27 22:24:04 +02:00
Cargo fmt drift in the new cost pipeline integration tests:
- Multi-line .args() calls (8+ args per line)
- Multi-line assert!() with format strings
- Braced if-let-else blocks

Sam & Claude
docs(wiki): model selection + evaluation harness design
Some checks are pending
CI / rust (pull_request) Waiting to run
CI / markdown (pull_request) Waiting to run
CI / port (pull_request) Waiting to run
CI / agent-jail-pkgs (pull_request) Waiting to run
b096168aee
New wiki page: model-selection-and-eval.md (445 lines)

Completes the T2.x trifecta design:
- Evaluation harness: 3 modes (self-report, local LLM, cloud LLM)
- Model selection: weighted scoring (success rate, cost, capability, latency)
- Integration with hive-routing: data flow + implementation phases
- 4 implementation phases, ~10 days total, ~570 lines
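
A minimal sketch of the weighted scoring idea listed above, under stated assumptions: the wiki page itself is not reproduced in this PR, so every name here (`ModelStats`, `Weights`, `score`, `select_best`), the normalization of cost and latency against the worst candidate, and the example weights are hypothetical illustrations rather than the design's actual API.

```rust
/// Hypothetical per-model statistics, e.g. as gathered by an eval harness.
/// Field names and units are illustrative assumptions, not the real schema.
#[derive(Debug, Clone)]
struct ModelStats {
    name: &'static str,
    success_rate: f64,  // fraction of tasks passed, in 0.0..=1.0
    cost_per_task: f64, // arbitrary cost units, lower is better
    capability: f64,    // capability rating, in 0.0..=1.0
    latency_ms: f64,    // typical latency, lower is better
}

/// Hypothetical relative weights for the four criteria.
struct Weights {
    success_rate: f64,
    cost: f64,
    capability: f64,
    latency: f64,
}

/// Map a "lower is better" value into 0.0..=1.0, where the worst
/// candidate scores 0.0 and a value of zero scores 1.0.
fn inverse_norm(value: f64, worst: f64) -> f64 {
    if worst <= 0.0 {
        return 1.0;
    }
    (1.0 - value / worst).clamp(0.0, 1.0)
}

/// Weighted average of the four normalized criteria.
fn score(m: &ModelStats, w: &Weights, max_cost: f64, max_latency: f64) -> f64 {
    let total = w.success_rate + w.cost + w.capability + w.latency;
    if total <= 0.0 {
        return 0.0;
    }
    let weighted = w.success_rate * m.success_rate.clamp(0.0, 1.0)
        + w.cost * inverse_norm(m.cost_per_task, max_cost)
        + w.capability * m.capability.clamp(0.0, 1.0)
        + w.latency * inverse_norm(m.latency_ms, max_latency);
    weighted / total
}

/// Return the highest-scoring model, or None if there are no candidates.
fn select_best<'a>(models: &'a [ModelStats], w: &Weights) -> Option<&'a ModelStats> {
    let max_cost = models.iter().map(|m| m.cost_per_task).fold(0.0_f64, f64::max);
    let max_latency = models.iter().map(|m| m.latency_ms).fold(0.0_f64, f64::max);
    models.iter().max_by(|a, b| {
        score(a, w, max_cost, max_latency).total_cmp(&score(b, w, max_cost, max_latency))
    })
}
```

With weights that favor cost and latency, a cheaper model with a lower success rate can outrank a stronger but more expensive one. Putting all the weight on success rate reverses that ordering. The weights are therefore the main tuning knob in a scheme like this.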

Indexed in both en/index.md and sl/index.md.

Follows PR #241 (conflict marker fix) and the now-merged screenshot
pipeline. The eval harness provides the feedback loop that makes
model-selection decisions data-driven rather than heuristic.

Sam & Claude
clawdie merged commit 80543c5f46 into main 2026-06-27 22:24:29 +02:00
Reference: clawdie/colibri#244