2026-06-02 17:56:08 +02:00
1 changed files with 18 additions and 17 deletions
--- a/docs/COLIBRI-TOKENOMICS-TRIFECTA.md
+++ b/docs/COLIBRI-TOKENOMICS-TRIFECTA.md
@@ -22,11 +22,11 @@ measure-everything agent runtime. The "trifecta" is our north star.

 ## The Trifecta

-| Axis        | What it means for agents                          | Colibri surface                       |
-|-------------|---------------------------------------------------|---------------------------------------|
-| Performance | Did the agent get it right? Task success rate     | Task outcomes, eval harness (T1.6)    |
-| Speed       | Tokens/second, cache-hit ratio, latency           | `colibri-deepseek` cache probe, T1.4  |
-| Cost        | Dollars per task. Not per token — per *result*    | `cost.rs` CostMode, escalation, metering |
+| Axis        | What it means for agents                       | Colibri surface                          |
+| ----------- | ---------------------------------------------- | ---------------------------------------- |
+| Performance | Did the agent get it right? Task success rate  | Task outcomes, eval harness (T1.6)       |
+| Speed       | Tokens/second, cache-hit ratio, latency        | `colibri-deepseek` cache probe, T1.4     |
+| Cost        | Dollars per task. Not per token — per _result_ | `cost.rs` CostMode, escalation, metering |

 You cannot optimize one without understanding impact on the other two.
 A cheap model that needs 5 retries is more expensive than a capable model
@@ -43,7 +43,7 @@ strategy:
 1. **Maximize cache-hit surface**: byte-stable system prefix, skills,
   tool definitions, agent identity — warm once, reuse thousands of times
 2. **Spend where it counts**: conversation turns, tool results, novel
-   context — these are unavoidable, so make them *useful* (VSpecs,
+   context — these are unavoidable, so make them _useful_ (VSpecs,
   rich context, HTML plans)
 3. **Trim where it doesn't**: auto-compaction, summarization, tool result
   truncation — Colibri's 3-region model already does this
@@ -96,6 +96,7 @@ Cost         ████████░░  $0.047 avg/task (target: <$0.05)
 ### Model selection arbitrage

 Given a task, Colibri should be able to answer:
+
 - Can this task be handled by a cheap model (DeepSeek V3, Gemini Flash)?
 - Is the cache-hit ratio high enough that the premium model is actually cheaper?
 - What's the cost delta between models for this specific task type?
@@ -134,17 +135,17 @@ text+image token budgets.

 ## Integration with Existing Work

-| Colibri component            | Trifecta role                           | Status  |
-|------------------------------|-----------------------------------------|---------|
-| `colibri-deepseek`           | Cache probe, hit metering               | ✅ done |
-| `colibri-daemon/cost.rs`     | CostMode, budget enforcement            | ✅ done |
-| `colibri-daemon/session.rs`  | 3-region prompt, compaction             | ✅ done |
-| Cache warming (T1.4 PR3b)    | Pre-warm stable prefix                  | ✅ done |
-| Prompt discipline (T1.4)     | Byte-stable assembly, cost-aware trim   | 🔧 WIP  |
-| Trifecta dashboard (T1.5)    | Per-task cost/speed/perf metrics        | 📋 plan |
-| Eval harness (T1.6)          | Task success measurement                | 📋 plan |
-| Model selection (T2.x)       | Arbitrage engine, cost-aware routing    | 📋 plan |
-| VSpec support (T2.x)         | Image tokens in prompt assembly         | 📋 plan |
+| Colibri component           | Trifecta role                         | Status  |
+| --------------------------- | ------------------------------------- | ------- |
+| `colibri-deepseek`          | Cache probe, hit metering             | ✅ done |
+| `colibri-daemon/cost.rs`    | CostMode, budget enforcement          | ✅ done |
+| `colibri-daemon/session.rs` | 3-region prompt, compaction           | ✅ done |
+| Cache warming (T1.4 PR3b)   | Pre-warm stable prefix                | ✅ done |
+| Prompt discipline (T1.4)    | Byte-stable assembly, cost-aware trim | 🔧 WIP  |
+| Trifecta dashboard (T1.5)   | Per-task cost/speed/perf metrics      | 📋 plan |
+| Eval harness (T1.6)         | Task success measurement              | 📋 plan |
+| Model selection (T2.x)      | Arbitrage engine, cost-aware routing  | 📋 plan |
+| VSpec support (T2.x)        | Image tokens in prompt assembly       | 📋 plan |

 ## Reference