docs: Colibri Tokenomics — trifecta framework (performance/speed/cost) #15
1 changed files with 18 additions and 17 deletions
@@ -22,11 +22,11 @@ measure-everything agent runtime. The "trifecta" is our north star.
 
 ## The Trifecta
 
-| Axis | What it means for agents | Colibri surface |
-|-------------|---------------------------------------------------|---------------------------------------|
-| Performance | Did the agent get it right? Task success rate | Task outcomes, eval harness (T1.6) |
-| Speed | Tokens/second, cache-hit ratio, latency | `colibri-deepseek` cache probe, T1.4 |
-| Cost | Dollars per task. Not per token — per *result* | `cost.rs` CostMode, escalation, metering |
+| Axis | What it means for agents | Colibri surface |
+| ----------- | ---------------------------------------------- | ---------------------------------------- |
+| Performance | Did the agent get it right? Task success rate | Task outcomes, eval harness (T1.6) |
+| Speed | Tokens/second, cache-hit ratio, latency | `colibri-deepseek` cache probe, T1.4 |
+| Cost | Dollars per task. Not per token — per _result_ | `cost.rs` CostMode, escalation, metering |
 
 You cannot optimize one without understanding impact on the other two.
 A cheap model that needs 5 retries is more expensive than a capable model
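Review note: the "dollars per result" framing in this hunk can be checked with a little arithmetic. The sketch below is illustrative only. `ModelProfile` and `expected_cost_per_result` are hypothetical names, not part of `cost.rs`, and it assumes attempts are independent and retried until the first success, so the expected number of attempts is 1 / success rate.

```rust
/// Hypothetical per-model profile. Field names and any prices fed into it
/// are assumptions for illustration, not values taken from Colibri.
#[derive(Debug, Clone, Copy)]
pub struct ModelProfile {
    /// Dollars spent on one attempt at the task (all tokens included).
    pub cost_per_attempt_usd: f64,
    /// Probability that a single attempt succeeds, in (0, 1].
    pub success_rate: f64,
}

/// Expected dollars per successful result, assuming independent attempts
/// repeated until the first success (1 / success_rate attempts on average).
/// Returns `None` for inputs that make the estimate meaningless.
pub fn expected_cost_per_result(m: ModelProfile) -> Option<f64> {
    // Written as a positive check so that NaN inputs are rejected too.
    let valid = m.success_rate > 0.0
        && m.success_rate <= 1.0
        && m.cost_per_attempt_usd >= 0.0;
    if !valid {
        return None;
    }
    Some(m.cost_per_attempt_usd / m.success_rate)
}
```

With made-up numbers, a model at $0.01 per attempt that succeeds one time in five costs about $0.05 per result, while a model at $0.04 per attempt that succeeds every time costs $0.04: the per-token-cheaper model loses on cost per result.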
@@ -43,7 +43,7 @@ strategy:
 1. **Maximize cache-hit surface**: byte-stable system prefix, skills,
 tool definitions, agent identity — warm once, reuse thousands of times
 2. **Spend where it counts**: conversation turns, tool results, novel
-context — these are unavoidable, so make them *useful* (VSpecs,
+context — these are unavoidable, so make them _useful_ (VSpecs,
 rich context, HTML plans)
 3. **Trim where it doesn't**: auto-compaction, summarization, tool result
 truncation — Colibri's 3-region model already does this
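Review note: step 1 of the strategy in this hunk depends on the prefix being byte-identical across requests. The following is a minimal sketch of one way to get that property, not a description of how `session.rs` assembles prompts. `stable_prefix` and `shared_prefix_len` are hypothetical helpers, and the line-based layout is an assumption.

```rust
use std::collections::BTreeMap;

/// Assemble the cacheable prefix in a fixed order. `BTreeMap` iterates in
/// key order, so the order in which tool definitions were registered
/// cannot change the resulting bytes.
pub fn stable_prefix(
    system: &str,
    identity: &str,
    tools: &BTreeMap<String, String>,
) -> String {
    let mut out = String::new();
    out.push_str(system);
    out.push('\n');
    out.push_str(identity);
    out.push('\n');
    for (name, definition) in tools {
        out.push_str(name);
        out.push_str(": ");
        out.push_str(definition);
        out.push('\n');
    }
    out
}

/// Number of leading bytes two prompts share. This is an upper bound on
/// what a prefix cache could reuse between the two requests.
pub fn shared_prefix_len(a: &str, b: &str) -> usize {
    a.bytes()
        .zip(b.bytes())
        .take_while(|(x, y)| x == y)
        .count()
}
```

Two requests built from the same system text, identity, and tool set then share at least the whole prefix, whatever order the tools were inserted in; only the conversation turns after it differ.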
@@ -96,6 +96,7 @@ Cost ████████░░ $0.047 avg/task (target: <$0.05)
 ### Model selection arbitrage
 
 Given a task, Colibri should be able to answer:
+
 - Can this task be handled by a cheap model (DeepSeek V3, Gemini Flash)?
 - Is the cache-hit ratio high enough that the premium model is actually cheaper?
 - What's the cost delta between models for this specific task type?
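Review note: the second question in this hunk (whether a high cache-hit ratio makes the premium model cheaper) reduces to a blended input price. The sketch below assumes a provider that bills cached input tokens at a separate, lower rate. `Pricing` and `attempt_cost_usd` are hypothetical names, and every price passed in is an assumption supplied by the caller, not a real price sheet.

```rust
/// Hypothetical price sheet in dollars per million tokens.
#[derive(Debug, Clone, Copy)]
pub struct Pricing {
    /// Input tokens that miss the cache.
    pub input_per_mtok_usd: f64,
    /// Input tokens served from the cache.
    pub cached_input_per_mtok_usd: f64,
    /// Output tokens.
    pub output_per_mtok_usd: f64,
}

/// Dollar cost of one attempt, splitting input tokens into cached and
/// fresh portions according to `cache_hit_ratio` (clamped to [0, 1]).
pub fn attempt_cost_usd(
    p: &Pricing,
    input_tokens: u64,
    output_tokens: u64,
    cache_hit_ratio: f64,
) -> f64 {
    let h = cache_hit_ratio.clamp(0.0, 1.0);
    let input = input_tokens as f64;
    let cached = input * h;
    let fresh = input - cached;
    let micro_dollars = cached * p.cached_input_per_mtok_usd
        + fresh * p.input_per_mtok_usd
        + output_tokens as f64 * p.output_per_mtok_usd;
    micro_dollars / 1_000_000.0
}
```

The cost delta for a task type is then the difference between two `attempt_cost_usd` calls, one per model, each with its own observed hit ratio. With a warm cache on the premium model and a cold one on the cheap model, the sign of that delta can flip in the premium model's favor.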
@@ -134,17 +135,17 @@ text+image token budgets.
 
 ## Integration with Existing Work
 
-| Colibri component | Trifecta role | Status |
-|------------------------------|-----------------------------------------|---------|
-| `colibri-deepseek` | Cache probe, hit metering | ✅ done |
-| `colibri-daemon/cost.rs` | CostMode, budget enforcement | ✅ done |
-| `colibri-daemon/session.rs` | 3-region prompt, compaction | ✅ done |
-| Cache warming (T1.4 PR3b) | Pre-warm stable prefix | ✅ done |
-| Prompt discipline (T1.4) | Byte-stable assembly, cost-aware trim | 🔧 WIP |
-| Trifecta dashboard (T1.5) | Per-task cost/speed/perf metrics | 📋 plan |
-| Eval harness (T1.6) | Task success measurement | 📋 plan |
-| Model selection (T2.x) | Arbitrage engine, cost-aware routing | 📋 plan |
-| VSpec support (T2.x) | Image tokens in prompt assembly | 📋 plan |
+| Colibri component | Trifecta role | Status |
+| --------------------------- | ------------------------------------- | ------- |
+| `colibri-deepseek` | Cache probe, hit metering | ✅ done |
+| `colibri-daemon/cost.rs` | CostMode, budget enforcement | ✅ done |
+| `colibri-daemon/session.rs` | 3-region prompt, compaction | ✅ done |
+| Cache warming (T1.4 PR3b) | Pre-warm stable prefix | ✅ done |
+| Prompt discipline (T1.4) | Byte-stable assembly, cost-aware trim | 🔧 WIP |
+| Trifecta dashboard (T1.5) | Per-task cost/speed/perf metrics | 📋 plan |
+| Eval harness (T1.6) | Task success measurement | 📋 plan |
+| Model selection (T2.x) | Arbitrage engine, cost-aware routing | 📋 plan |
+| VSpec support (T2.x) | Image tokens in prompt assembly | 📋 plan |
 
 ## Reference
 