zot/internal at b188801c32eae959bd333f4a94ec8aec08f8c5c1 - clawdie/zot

mirror of https://github.com/patriceckhart/zot.git synced 2026-06-26 21:36:31 +02:00

History

patriceckhart b188801c32 perf(prompt): slim system prompt from 1064 tokens to 126 tokens Earlier I bloated the default system prompt on purpose to cross Anthropic's 1024-token cacheable-prefix floor, on the theory that small prompts lose every fresh session to R=0. That theory turned out to be wrong: the real reason fresh sessions looked expensive was the double-counting bug in Stream() (message_start and message_delta both ship cumulative usage, we were summing both). Once that was fixed, the padding stopped earning its bytes. New default: You are zot, a lightweight terminal coding agent. The name stands for 'zero-overhead tool'; if the user asks what zot means, answer exactly that. Your output renders in a TUI that understands markdown for prose and plain text for tool output. Use markdown freely, keep answers concise, and let tool calls speak for themselves rather than narrating them in prose before you invoke them. Act first, then summarise what you did. Removed the tool-listing section (the provider already advertises tools in the request's tools[] array, so listing them in prose was pure duplication) and the full operating-guidelines block (frontier models already internalise "prefer edit over write", "read before editing", "don't run sudo", etc.). Benchmarked head-to-head on a fresh 2-turn session with the same scenario: heavy prompt (1064 tokens total): turn 1: in=7 R=0 W=7550 out=117 $0.0501 turn 2: in=6 R=2714 W=2176 out=61 $0.0165 total: $0.0666 slim prompt (126 tokens total): turn 1: in=1522 R=0 W=3636 out=111 $0.0334 turn 2: in=6 R=3642 W=60 out=71 $0.0040 total: $0.0374 Slim is 44% cheaper across two turns. Turn 2 is where it really pays off (W=60 vs W=2176): every extra token in the base prompt gets re-written on every turn because the trailing-user cache checkpoint keeps advancing. --system-prompt, --append-system-prompt, and $ZOT_HOME/SYSTEM.md still work and take precedence for users who want more biasing. 'what does zot mean?' still returns exactly 'zero-overhead tool'.		2026-04-19 19:23:16 +02:00
..
agent	perf(prompt): slim system prompt from 1064 tokens to 126 tokens	2026-04-19 19:23:16 +02:00
assets	add logo to callback page	2026-04-18 10:15:53 +02:00
auth	add auto compaction	2026-04-18 10:34:08 +02:00
core	fix(no-yolo): don't auto-refuse tool calls in non-interactive modes	2026-04-19 19:17:05 +02:00
extproto	feat(ext): phase 4 - full-event interception, arg rewrites, /reload-ext	2026-04-19 17:02:04 +02:00
provider	perf(anthropic): fix cost double-count, tighten caching, correct catalog	2026-04-19 18:57:18 +02:00
skills	perf(prompt): cut system prompt to the bone (410 -> 54 tokens)	2026-04-19 17:39:38 +02:00
tui	perf(read): drop line numbers from model-facing output	2026-04-19 17:33:05 +02:00