colibri/docs/COLIBRI-TOKENOMICS-TRIFECTA.md

# Colibri Tokenomics — The Trifecta Framework

**Source:** Indie Devdan, "Agent Specs: The Unreasonable Effectiveness of Useful Tokens"
(https://www.youtube.com/watch?v=o4KZH_KSqYQ)
**Date:** 01.jun.2026
**Status:** Strategic vision — maps to existing T1.4/T1.5 work

> **Scope:** This applies to the full Colibri control plane.

## Core Thesis

```
More useful tokens > fewer useful tokens
Cost per intelligence > cost per token
If you don't measure, you can't improve
```

The video validates what Colibri is already building: a cache-first,
measure-everything agent runtime. The "trifecta" is our north star.

## The Trifecta

| Axis        | What it means for agents                       | Colibri surface                          |
| ----------- | ---------------------------------------------- | ---------------------------------------- |
| Performance | Did the agent get it right? Task success rate  | Task outcomes, eval harness (T1.6)       |
| Speed       | Tokens/second, cache-hit ratio, latency        | `colibri-deepseek` cache probe, T1.4     |
| Cost        | Dollars per task. Not per token — per _result_ | `cost.rs` CostMode, escalation, metering |

Optimize each dimension with full awareness of its impact on the other two.
A cheap model that needs 5 retries is more expensive than a capable model
that gets it right in one shot.
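
The retry arithmetic can be made concrete. This is a minimal sketch, assuming every attempt costs the same and succeeds independently with a fixed probability, so the expected number of attempts is `1 / success_rate`. The function name and the numbers in the note below are illustrative and are not taken from Colibri's `cost.rs`.

```rust
/// Expected dollars spent per successful task.
///
/// Assumes independent attempts that each cost `cost_per_attempt` and succeed
/// with probability `success_rate`; retrying until success then takes
/// `1 / success_rate` attempts on average.
fn cost_per_success(cost_per_attempt: f64, success_rate: f64) -> Option<f64> {
    // Reject NaN, zero, negative, or >1 success rates and negative costs.
    if !(success_rate > 0.0 && success_rate <= 1.0) || cost_per_attempt < 0.0 {
        return None; // a model that never succeeds has no finite cost per result
    }
    Some(cost_per_attempt / success_rate)
}
```

Under these assumptions, a $0.01 attempt that succeeds one time in five costs about $0.05 per result, which is more than a $0.04 attempt that succeeds every time.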

## Token Arbitrage (the "golden line")

**Arbitrage tokens for maximum value.** Cache-hit tokens cost ~10% of fresh
tokens (DeepSeek pricing), so every token in the stable prefix that hits
cache is 90% cheaper: a 10× discount. Design prompts to maximize cache-hit
prefixes. The arbitrage strategy:

1. **Maximize cache-hit surface**: byte-stable system prefix, skills,
   tool definitions, agent identity — warm once, reuse thousands of times
2. **Spend where it counts**: conversation turns, tool results, novel
   context — these are unavoidable, so make them _useful_ (VSpecs,
   rich context, HTML plans)
3. **Trim where it doesn't**: auto-compaction, summarization, tool result
   truncation — Colibri's 3-region model already does this
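
The pricing side of this strategy can be sketched as a small cost function. It assumes the ~10% cache-hit ratio quoted above. The per-million-token price parameter and both function names are hypothetical, chosen only for illustration.

```rust
/// Fraction of the fresh-token price charged for a cache-hit token.
/// Assumption: the ~10% figure quoted above; check current provider pricing.
const CACHE_HIT_PRICE_RATIO: f64 = 0.10;

/// Input cost in dollars for one request.
/// `fresh_price_per_mtok` is a hypothetical price per million fresh input tokens.
fn input_cost(cache_hit_tokens: u64, fresh_tokens: u64, fresh_price_per_mtok: f64) -> f64 {
    let per_token = fresh_price_per_mtok / 1_000_000.0;
    let hit = cache_hit_tokens as f64 * per_token * CACHE_HIT_PRICE_RATIO;
    let fresh = fresh_tokens as f64 * per_token;
    hit + fresh
}

/// Dollars saved relative to paying the fresh price for every input token.
fn cache_savings(cache_hit_tokens: u64, fresh_price_per_mtok: f64) -> f64 {
    let per_token = fresh_price_per_mtok / 1_000_000.0;
    cache_hit_tokens as f64 * per_token * (1.0 - CACHE_HIT_PRICE_RATIO)
}
```

By construction, `input_cost` plus `cache_savings` equals what the same request would cost with no cache hits, which makes the savings figure easy to audit.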

### Existing Colibri arbitrage infrastructure

```
T1.4 Prompt Discipline (code present, integration in progress):
  Region 1: STABLE_SYSTEM_PREFIX          → cache-hit (90% cheaper)
  Region 2: conversation log (compacted)  → fresh tokens
  Region 3: volatile scratch (empty)      → zero cost

CostMode escalation (Fast → Smart → Max):
  Fast:    500K budget, compact tool results, 5 turns
  Smart:   2M budget, keep tool results, 20 turns  ← default
  Max:     8M budget, full context, 100 turns

Cache warming (T1.4 PR3b, merged):
  Pre-warm STABLE_SYSTEM_PREFIX on daemon startup
  Re-warm every N hours (configurable)
  ~3,500 tokens per warm cycle → pays off in ~7 agent tasks
```
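
The escalation ladder in the listing above could be modelled roughly as follows. This is a sketch only: it assumes the budgets are token counts, and the method names and the escalation trigger are hypothetical. The real `cost.rs` may name or structure these differently.

```rust
/// Hypothetical mirror of the three modes in the listing above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CostMode {
    Fast,
    Smart,
    Max,
}

impl CostMode {
    /// Budget for the mode (assumed to be a token count).
    fn token_budget(self) -> u64 {
        match self {
            CostMode::Fast => 500_000,
            CostMode::Smart => 2_000_000,
            CostMode::Max => 8_000_000,
        }
    }

    /// Maximum number of turns allowed in the mode.
    fn max_turns(self) -> u32 {
        match self {
            CostMode::Fast => 5,
            CostMode::Smart => 20,
            CostMode::Max => 100,
        }
    }

    /// Next mode up the ladder, or `None` when already at `Max`.
    fn escalate(self) -> Option<CostMode> {
        match self {
            CostMode::Fast => Some(CostMode::Smart),
            CostMode::Smart => Some(CostMode::Max),
            CostMode::Max => None,
        }
    }
}

impl Default for CostMode {
    fn default() -> Self {
        CostMode::Smart // marked as the default in the listing
    }
}

/// Mode to continue in: unchanged while within limits, one step up when a
/// limit is exhausted, `None` when `Max` itself is exhausted.
/// Assumption: this trigger is illustrative, not Colibri's actual policy.
fn next_mode(mode: CostMode, tokens_used: u64, turns_used: u32) -> Option<CostMode> {
    if tokens_used >= mode.token_budget() || turns_used >= mode.max_turns() {
        mode.escalate()
    } else {
        Some(mode)
    }
}
```

Returning `Option` makes the end of the ladder explicit, so a caller has to decide what happens when even `Max` runs out.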

## What We Still Need (Trifecta Dashboard)

The video's core message: observability isn't optional for production
agents. Colibri already captures the raw data. What's missing is the
trifecta view:

### Per-task cost tracking

```
task_id: "abc123"
model: "deepseek-v4-flash"
tokens_in: 45,230   (12,100 cache-hit, 33,130 fresh)
tokens_out: 2,847
cost: $0.047         (cache savings: $0.012)
latency: 8.3s
success: true
```
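
One possible shape for such a record is sketched below. The field names follow the example above, but the struct itself and its helper methods are assumptions, not an existing Colibri type.

```rust
/// Hypothetical per-task record; field names follow the example above.
#[derive(Debug, Clone)]
struct TaskRecord {
    task_id: String,
    model: String,
    cache_hit_tokens: u64,
    fresh_tokens: u64,
    tokens_out: u64,
    cost_usd: f64,
    latency_secs: f64,
    success: bool,
}

impl TaskRecord {
    /// Total input tokens: cache-hit plus fresh.
    fn tokens_in(&self) -> u64 {
        self.cache_hit_tokens + self.fresh_tokens
    }

    /// Share of input tokens served from cache, in [0, 1].
    fn cache_hit_ratio(&self) -> f64 {
        let total = self.tokens_in();
        if total == 0 {
            0.0
        } else {
            self.cache_hit_tokens as f64 / total as f64
        }
    }
}
```

Storing the cache-hit and fresh counts separately and deriving `tokens_in` keeps the three numbers from drifting out of sync.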

### Trifecta balance sheet

```
Performance  ████████░░  82% task success (rolling 24h)
Speed        ██████░░░░  61% cache-hit ratio
Cost         ████████░░  $0.047 avg/task (target: <$0.05)
```
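
The three headline numbers could be derived from a window of task outcomes along these lines. This is a sketch under assumptions: the `Outcome` and `Trifecta` types are hypothetical, the window is whatever slice the caller selects (for example the last 24 hours), and the cache-hit ratio is token-weighted across the window rather than averaged per task.

```rust
/// Minimal slice of a task record needed for the summary (hypothetical shape).
struct Outcome {
    success: bool,
    cache_hit_tokens: u64,
    fresh_tokens: u64,
    cost_usd: f64,
}

/// The three headline numbers of the balance sheet.
#[derive(Debug, PartialEq)]
struct Trifecta {
    success_rate: f64,
    cache_hit_ratio: f64,
    avg_cost_usd: f64,
}

/// Summarize a window of outcomes; returns `None` for an empty window.
fn summarize(window: &[Outcome]) -> Option<Trifecta> {
    if window.is_empty() {
        return None;
    }
    let n = window.len() as f64;
    let successes = window.iter().filter(|o| o.success).count() as f64;
    let hits: u64 = window.iter().map(|o| o.cache_hit_tokens).sum();
    let fresh: u64 = window.iter().map(|o| o.fresh_tokens).sum();
    let input = hits + fresh;
    let cost: f64 = window.iter().map(|o| o.cost_usd).sum();
    Some(Trifecta {
        success_rate: successes / n,
        // Token-weighted: large prompts count for more than small ones.
        cache_hit_ratio: if input == 0 { 0.0 } else { hits as f64 / input as f64 },
        avg_cost_usd: cost / n,
    })
}
```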

### Model selection arbitrage

Given a task, Colibri should be able to answer:

- Can this task be handled by a cheap model (DeepSeek V3, Gemini Flash)?
- Is the cache-hit ratio high enough that the premium model is actually cheaper?
- What's the cost delta between models for this specific task type?
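
A first cut at answering these questions is to rank candidate models by expected cost per successful task. The sketch below assumes per-model statistics are already available for the task type (for example from measured task records). `ModelStats` and `cheapest_per_result` are hypothetical names, and the model names in the usage note are placeholders.

```rust
/// Hypothetical per-model statistics for one task type.
struct ModelStats {
    name: &'static str,
    /// Average dollars per attempt, with cache discounts already applied.
    avg_cost_per_attempt_usd: f64,
    /// Observed success rate, expected to lie in (0, 1].
    success_rate: f64,
}

/// Pick the model with the lowest expected cost per successful task.
/// Models with a non-positive success rate are skipped.
fn cheapest_per_result(candidates: &[ModelStats]) -> Option<&ModelStats> {
    candidates
        .iter()
        .filter(|m| m.success_rate > 0.0)
        .min_by(|a, b| {
            let ca = a.avg_cost_per_attempt_usd / a.success_rate;
            let cb = b.avg_cost_per_attempt_usd / b.success_rate;
            ca.total_cmp(&cb)
        })
}
```

With a "cheap" model at $0.01 per attempt and a 20% success rate against a "premium" model at $0.04 and 100%, this picks the premium model, since $0.04 per result beats $0.05.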

## Visual Specs (VSpecs) — Future Input Modality

The video introduces "VSpecs": plans with embedded images generated by
GPT Image 2. Multimodal models (Gemini 3.5 Flash, GPT-5) read these
images as "useful tokens" — a UI mockup is worth 1000 words of text
description.

For Colibri: this means the prompt assembly pipeline should eventually
support image tokens in Region 2 (conversation log). NOT for T1.4 —
this is T2.x territory. But the cost model should be ready for mixed
text+image token budgets.

## Golden Rules (from the video, adapted for Colibri)

1. **Measure everything.** Every tool call, every token, every dollar.
   Colibri's glasspane architecture already captures the event stream;
   the trifecta dashboard makes it actionable.

2. **Arbitrage cache vs spend.** The stable prefix is free money.
   Maximize its size, minimize its churn.

3. **Cost per intelligence, not per token.** Compare cost-per-successful-task,
   not raw model prices in isolation. A $0.05 task that
   works is infinitely cheaper than a $0.01 task that fails.

4. **Trade-offs are engineering.** There is no "best" model. There is
   only the right model for THIS task, under THESE constraints.

5. **Closed loop: measure → analyze → improve.** The trifecta dashboard
   isn't a report — it's a feedback loop. Every task feeds back into
   model selection, prompt design, and cache strategy.

## Integration with Existing Work

| Colibri component           | Trifecta role                         | Status  |
| --------------------------- | ------------------------------------- | ------- |
| `colibri-deepseek`          | Cache probe, hit metering             | ✅ done |
| `colibri-daemon/cost.rs`    | CostMode, budget enforcement          | ✅ done |
| `colibri-daemon/session.rs` | 3-region prompt, compaction           | ✅ done |
| Cache warming (T1.4 PR3b)   | Pre-warm stable prefix                | ✅ done |
| Prompt discipline (T1.4)    | Byte-stable assembly, cost-aware trim | 🔧 WIP  |
| Trifecta dashboard (T1.5)   | Per-task cost/speed/perf metrics      | 📋 plan |
| Eval harness (T1.6)         | Task success measurement              | 📋 plan |
| Model selection (T2.x)      | Arbitrage engine, cost-aware routing  | 📋 plan |
| VSpec support (T2.x)        | Image tokens in prompt assembly       | 📋 plan |

## Reference

- Video: "Agent Specs: The Unreasonable Effectiveness of Useful Tokens"
  https://www.youtube.com/watch?v=o4KZH_KSqYQ
- Colibri T1.4 Prompt Discipline: `docs/T1.4-PROMPT-DISCIPLINE-PLAN.md`
- Colibri Glasspane Design: `docs/COLIBRI-GLASSPANE-DESIGN.md`