Commit graph

69 commits

Author SHA1 Message Date
patriceckhart
d50eed7b3c docs(prompt): teach zot what its name means
Adds one sentence to the default system prompt so the model has a
canonical answer when the user asks what zot stands for:
"zero-overhead tool".

Verified: zot -p "what does zot mean?" now returns exactly that.
2026-04-19 19:18:35 +02:00
patriceckhart
e2f2092478 fix(no-yolo): don't auto-refuse tool calls in non-interactive modes
Previously --no-yolo in -p / --json / rpc modes auto-refused every
tool call. That made the flag dangerous to pass to scripts: a
single --no-yolo in a shell config or wrapper script would silently
break any tool-using prompt.

New behaviour:
  - Default: every mode is yolo (tools run freely, no prompts).
  - --no-yolo + interactive TUI: confirm dialog before each tool.
  - --no-yolo + -p / --json / rpc: stderr warning and ignore the
    flag. Tools run freely; scripts keep working.

The TUI confirm dialog and /yolo runtime toggle still work as
before. Also removed the unused wireNoYoloAutoRefuse helper and
simplified core.NewConfirmGate's doc comment.
2026-04-19 19:17:05 +02:00
patriceckhart
ac6d556f0a feat(tool-gate): --no-yolo flag, confirm dialog, /yolo runtime toggle
Adds a per-tool-call confirmation gate. Default stays yolo mode
(tools run freely, same as today). Pass --no-yolo to require
explicit user approval before each tool invocation.

Interactive TUI:
  A dialog appears before every tool call. Shows the tool name and
  a one-line preview of its args (command / path / url / etc.)
  with four choices, selectable by arrow keys or numeric shortcut:

    1. yes                     (run this call)
    2. yes, always this tool   (skip prompts for this tool,
                                session-scoped)
    3. yes, always             (skip prompts for every tool,
                                session-scoped)
    4. no                      (refuse and let the model try
                                something else)

  Esc/ctrl+c refuses the current prompt. Esc during a running turn
  both cancels the turn AND drains any pending confirm so the
  agent goroutine doesn't deadlock. Multiple pending confirms are
  queued and answered one at a time with a count visible in the
  header.

  Type /yolo to disable the gate for the rest of the session
  (equivalent to the "yes, always" choice but without needing a
  pending prompt). Any currently-open confirm auto-allows so the
  agent keeps moving.

Print / JSON / RPC modes:
  No interactive prompt is available, so every tool call is
  auto-refused with a reason the model can learn from:
  "tool call refused: --no-yolo is active and there is no
  interactive prompt in this mode; ask the user what to do
  instead". Observed behaviour: the model pivots to asking the
  user directly instead of looping on the same tool.

Implementation:
  internal/core/confirm.go
    - ConfirmDecision, Confirmer interface
    - ConfirmGate with session-scoped memory for "always this tool"
      and "always everything" decisions, both concurrency-safe
    - BuildPreview: turns {"command":"ls"} into "ls", etc.
    - Lives in core to avoid a modes -> agent import cycle
  internal/core/confirm_test.go
    - Tests: nil gate allows, nil-inner refuses with reason, one-
      shot allow doesn't remember, remember-tool short-circuits
      only same tool, remember-all short-circuits everything,
      refusal reasons surface, empty-reason gets a default,
      runtime AllowAll works, BuildPreview handles each field
  internal/agent/modes/confirm_dialog.go
    - Queue-based dialog, HandleKey wiring, CancelAll and
      AllowAllPending for the two exit cases
  internal/agent/modes/interactive.go
    - InteractiveConfig gains NoYolo + ConfirmGate fields
    - Interactive implements core.Confirmer via a response channel
    - Confirm dialog dispatched FIRST in the key-handler chain so
      keys never leak to other dialogs while the agent is blocked
    - Esc-while-busy also calls confirmDialog.CancelAll so the
      agent unblocks
    - /yolo slash command handled in runSlash
  internal/agent/cli.go
    - Constructs the ConfirmGate when args.NoYolo is set,
      BeforeToolExecute calls it first, extensions only see calls
      the user already approved
    - After iv is built, SetConfirmer(iv) wires the gate's inner
      so interactive + gate share the same struct
    - wireNoYoloAutoRefuse() for print / json modes
  internal/agent/args.go
    - --no-yolo flag and help text
  internal/agent/modes/slash_suggest.go
    - /yolo added to slashCatalog

Verified end-to-end: fresh zot --no-yolo -p "read sample.ts" now
returns "I can't read files in this mode (--no-yolo without an
interactive prompt). How would you like to proceed" instead of
actually reading.
2026-04-19 19:12:45 +02:00
patriceckhart
f371687654 perf(anthropic): fix cost double-count, tighten caching, correct catalog
The status-bar was showing 2x the real cost. Anthropic's SSE stream
sends the full cumulative usage payload on both message_start AND
message_delta, and our code was summing them with += on each. Cache
tokens, the biggest cost component on multi-turn sessions, were
therefore counted twice on every single API call.

Fix: assign instead of accumulate within one Stream() invocation.
Cross-call accumulation still happens correctly in
core.CostTracker.Add(). Verified end-to-end: a truly fresh "read
sample.ts on desktop" session that used to report $0.15 now reports
$0.07 with the same cache-hit rate.

While chasing that, audited and corrected the rest of the request
pipeline so the cache actually hits cleanly.

Provider layer (internal/provider/anthropic.go):
  - cache_control on the Claude Code identity line (was uncached),
    giving Anthropic a first stable checkpoint independent of the
    user system prompt. Turns a cold start from R=0 into R>0 for
    any subsequent fresh session within the cache TTL.
  - tool_result blocks go in their OWN new user message instead of
    merging into the preceding user message. Merging was mutating
    the prior user message's content array between turns, busting
    byte-identical prefix match in Anthropic's cache.
  - tagLastUserCache: exactly one cache_control on the last user
    message (was two), so identity + sysprompt + last-tool +
    last-user fits Anthropic's 4-breakpoint budget exactly.
  - user-agent dropped its "(external, cli)" suffix to match the
    canonical Claude Code string exactly.
  - ZOT_DEBUG_ANTHROPIC=<path> env hook appends each outgoing
    request body (one JSON object per line) to that file. Off by
    default; for debugging cache / cost issues in the field.
  - Usage field handling now correctly assigns the latest value
    from each SSE event instead of summing.

Core (internal/core/tool.go):
  - Registry.Specs() now sorts tools alphabetically. Go map
    iteration order is randomized per call; randomized tool arrays
    were breaking Anthropic's byte-level prefix match on every
    single call within a session.

System prompt (internal/agent/systemprompt.go):
  - Restored a substantial default prompt with structured tools +
    operating guidelines sections. The earlier aggressive trim
    dropped us under Anthropic's 1024-token minimum cacheable
    prefix floor: prefixes below 1024 tokens are silently NOT
    cached by Anthropic, so every fresh session started cold with
    R=0 no matter what else we did.
  - Current default ~1040 tokens on its own; with identity and
    tools it's ~1400, comfortably above the 1024 floor.
  - --system-prompt, --append-system-prompt, and
    $ZOT_HOME/SYSTEM.md escape hatches all still work and take
    precedence.

Model catalog (internal/provider/models.go):
  - claude-opus-4-5: 1M ctx / 128k max -> 200k ctx / 64k max. I had
    over-extrapolated; 1M context is a 4.6+ feature.
  - gpt-5.4: 400k -> 272k. Canonical value on both the OpenAI
    direct API and the ChatGPT Codex OAuth backend.
  - gpt-5.1, gpt-5.2, gpt-5.3, gpt-5.4-mini: pinned to 272k.
    OpenAI advertises 400k on direct and Codex caps at 272k. zot
    serves both from one catalog row per id, so we pin to the
    smaller number to keep the context-usage meter honest under
    subscription auth. Direct-API users see a conservative estimate
    instead of an inflated one.

README:
  - Tiny capitalization touch-up on the opening line.
2026-04-19 18:57:18 +02:00
patriceckhart
05d0df91b8 perf(prompt): cut system prompt to the bone (410 -> 54 tokens)
Three levers, all dead-simple, compounded savings.

1) System prompt rewritten to a one-line identity.
   Was 410 tokens (identity + tool listing duplicating the tool
   schemas + operating guidelines that frontier models already
   internalise). Now 54 tokens:

       You are zot, a lightweight terminal coding agent. Be
       concise, act on the user's request directly, and reply
       with a short summary when done.

   The old 'You have the following tools available:' block listed
   every tool by name and description, which the provider sends
   alongside the actual tool schemas. Pure duplication. Dropped.
   Operating guidelines (prefer edit over write, read before
   editing, don't apologize, etc.) are ~150 tokens of advice the
   model already follows by default. Dropped.

2) Tool descriptions trimmed.
   read:  long paragraph   -> 'Read a file. Images (png/jpg/gif/webp) return inline.'
   write: long paragraph   -> 'Write a file. Creates parent dirs. Overwrites.'
   edit:  long paragraph   -> 'Edit a file via exact-match replacements. Each oldText must be unique in the file.'
   bash:  long paragraph   -> 'Run a shell command. stdout+stderr merged.'
   skill: 2-sentence para  -> 'Load a named skill's instructions. Use when the user's request matches a skill listed above.'

3) Tool schemas minified.
   Every schema was pretty-printed JSON with per-field descriptions
   that reiterated the tool's own description ('Path to the file
   to read (relative or absolute)'). The model infers the obvious
   from property names. Schemas now single-line, type+required
   only. Saves ~20-40 bytes per schema, 5 tools = ~150 bytes per
   request.

Net effect on a fresh OAuth turn, measured end-to-end:

   request body:   3205 bytes -> ~1600 bytes
   system prompt:  410 tokens -> 54 tokens
   tools payload:  ~400 tokens -> ~100 tokens

Escape hatches preserved: --system-prompt (per-run), --append-
system-prompt (per-run, repeatable), and $ZOT_HOME/SYSTEM.md
(persistent) all still work and take precedence over the built-
in identity when set.
2026-04-19 17:39:38 +02:00
patriceckhart
f5719c6be1 perf(read): drop line numbers from model-facing output
The read tool used to prefix every line with '%6d\t' (6-digit
line number + tab), which added ~7 bytes / ~2 tokens per line to
every read. On typical source files that's 15-20% of the read's
token budget, repeated on every subsequent turn as the tool
result stays in context.

The line numbers weren't earning their keep: the model edits via
exact-match text replacement, never line ranges, and the tui has
always been capable of drawing its own gutter. Now it does:

- Tool output is raw file bytes, no prefix.
- A 'start_line' detail is attached to the ToolResult so the tui
  knows where to start counting.
- The tui renders a synthetic gutter over the raw content using
  the existing renderNumberedFile path (new: renderRawFile), so
  on-screen it still looks exactly like cat -n.

The old numbered format is still recognised for legacy transcripts
saved before this commit (looksLikeNumberedFile guard stays).

Measured: sample.ts (388 lines) used to cost 14957 bytes to send,
now costs 12291 bytes (raw file). Saves ~670 tokens per read of a
medium file; the same fraction applies to larger files too.

Tests: TestReadOffsetLimit rewritten to assert raw output +
start_line detail. TUI renderToolText signature grew one int
(startLine) plumbed through renderToolResultContent.
2026-04-19 17:33:05 +02:00
patriceckhart
5cc54822cd fix(skills): tell the model where each skill body lives
The system-prompt addendum now tags every skill with a source
pointer: '[builtin]' for skills embedded in the zot binary, or the
SKILL.md path (HOME collapsed to ~) for user-installed ones. The
body still loads on demand through the 'skill' tool by name, so
no behaviour change for execution, this is pure disambiguation.

Before:
  - write-zot-extension — Help the user create a new zot extension...

After:
  - write-zot-extension [builtin]: Help the user create a new zot extension...
  - code-review [~/Library/Application Support/zot/skills/code-review/SKILL.md]: ...

Why: built-in skills have no filesystem path because their
markdown is embedded in the binary. Without the [builtin] tag the
model had no way to distinguish them from user skills, and could
mistakenly try to read a nonexistent file. The path pointer for
user skills also helps the model cite where guidance came from
and reason about trust (builtin vs project-local vs global).

Test updated; addendum grew to ~123 tokens for a typical 3-skill
setup (code-review, test-fix, write-zot-extension).
2026-04-19 17:24:05 +02:00
patriceckhart
1a2ab427fe fix(tui): hide empty sessions from the /sessions picker
Zero-message sessions no longer appear in the picker. Covers three
cases: the currently-running session (its file exists but no prompt
has landed in it yet), sessions the user exited immediately after
/clear, and any stale empties that PruneEmptySessions hasn't swept
yet.

Resuming an empty session was always a no-op, so nothing is lost.
The '(empty)' fallback summary in formatSessionRowPlain stays as
defense in case MessageCount > 0 but FirstUserText is blank.
2026-04-19 17:16:45 +02:00
patriceckhart
961f99da9e chore: scrub stray "pi" references from source comments
Five leftover comments in internal/tui/ mentioned "pi" / "pi's" /
"pi-style" / "pi's cli-highlight" / "pi's extToLang" from the old
development era. Rewritten to describe the behaviour on its own
terms with no external reference.

Also renamed the two tui helpers piFormatTokens -> formatTokens and
piContextUsage -> contextUsage so the source grep stays clean.

No behaviour change, all tests pass.
2026-04-19 17:06:45 +02:00
patriceckhart
99c9ba8062 feat(ext): phase 4 - full-event interception, arg rewrites, /reload-ext
Clears every deferred extension todo in one push:

1) Interception expands to three events: tool_call (already shipped),
   turn_start (gate the turn before the model call, e.g. rate-limit /
   business-hour), and assistant_message (suppress or rewrite the
   user-visible text while keeping the model's original output in
   the transcript).

2) Tool-call args can now be rewritten mid-flight. An interceptor
   returning modified_args replaces the JSON the tool actually
   receives, without the model seeing the rewrite. Chains: each
   subscriber sees the previous one's output, letting guards
   successively redact / patch / augment. Invalid JSON is dropped
   safely.

3) /reload-ext hot-reloads every extension without restarting zot.
   The manager gracefully shuts down all running subprocesses,
   re-reads extension.json from disk, respawns (including --ext
   paths remembered from startup), and the host rebuilds the agent's
   tool registry in-place so freshly-registered tools are callable
   immediately.

Wire-format changes (extproto):
- EventInterceptResponseFromExt gains modified_args and replace_text
  fields (both optional, ignored when block=true).
- EventInterceptFromHost gains Step (for turn_start) and Text (for
  assistant_message) alongside the existing tool_call payload.

Core agent changes:
- BeforeToolExecute signature now returns (allowed, reason,
  modifiedArgs json.RawMessage). Non-nil+valid JSON args replace
  tc.Arguments before Tool.Execute runs.
- New BeforeTurn hook, invoked in runLoop before oneTurn. Blocking
  cancels the turn with an EvTurnEnd{StopError} carrying the reason.
- New BeforeAssistantMessage hook, invoked after finalMsg is
  assembled but before the EvAssistantMessage emit. Supports
  suppress (block=true) and text rewrite (replace_text). Transcript
  always gets the original; UI gets the rewritten text.
- New SetTools(reg) so /reload-ext can swap the registry on the
  live agent under the agent mutex.

Manager changes:
- InterceptToolCall now returns InterceptResult (Block, Reason,
  ModifiedArgs, ReplaceText), with a chain that folds rewrites.
- New InterceptTurnStart and InterceptAssistantMessage.
- New Reload(ctx, grace) tears down and respawns everything,
  returning ReloadStats{Stopped, Loaded, Ready, Errors}.
- New SetOnReload(fn) callback the host uses to rebuild the agent
  tool registry after a reload.
- LoadExplicit remembers --ext paths so Reload respawns them.
- subscribe accepts "tool_call", "turn_start", "assistant_message"
  under "intercept".

SDK (pkg/zotext):
- New handler types: ToolCallHandler, TurnStartHandler,
  AssistantMessageHandler, and their decision structs
  (ToolCallDecision with ModifiedArgs, AssistantMessageDecision
  with ReplaceText).
- New registration methods: InterceptToolCallX (rich variant of
  the existing InterceptToolCall), InterceptTurnStart,
  InterceptAssistantMessage.
- dispatchIntercept routes per-event with panic recovery and
  always emits exactly one event_intercept_response.

TUI:
- /reload-ext slash command registered in slashCatalog and
  runSlash. Added to slashCancelsTurn so it waits for idle like
  /compact does.
- runReloadExt shows a "reloading extensions..." status, runs the
  Manager.Reload on a goroutine, and reports the resulting stats.

Tests:
- internal/core/intercept_test.go: verifies args are actually
  rewritten on the way to Tool.Execute, malformed JSON is ignored,
  and block surfaces the reason as an error ToolResult.
- internal/agent/extensions/intercept_test.go: end-to-end with a
  bash extension subprocess that blocks rm -rf, rewrites other bash
  args to "echo GUARDED:", passes through read calls, allows
  turn_start, and redacts SECRET in assistant messages. Second test
  verifies Reload respawns the subprocess, re-registers its command,
  and fires the onReload callback.

Docs:
- docs/extensions.md: rewrote the intercept section to cover all
  three events, added a table of event_intercept_response fields,
  documented the /reload-ext hot-reload command, expanded the SDK
  section with examples of every handler, moved the old "future"
  items into a shipped Phase 4.
- README.md: extensions summary mentions intercept beyond tool_call,
  /reload-ext added to the slash-commands table and to the
  turn-cancel list in "Queued messages".
2026-04-19 17:02:04 +02:00
patriceckhart
9f031f941a docs: complete README, proper capitalization, add 130x130 logo
- Add the zot logo at the top of the README, sized to 130x130 via
  an HTML <img> tag so GitHub honours the dimensions. Reuses the
  existing internal/assets/zot-logo.png that's already embedded in
  the binary for the login callback pages, so the README is
  self-contained.
- Proper English capitalization in prose and section headings
  (intro bullets kept lowercase per the custom intro).
- Drop em-dashes in favour of periods, commas, semicolons.
- No emojis anywhere.
- Fill in the previously-missing flags: --ext / -e, --no-ext,
  --with-skills, --no-skill.
- Add /skills row to the slash-commands table, plus a dedicated
  /skills section describing the picker.
- New dedicated Extensions section documenting zot ext install /
  list / logs / enable / disable / remove, --ext for development,
  and pointing at examples/extensions + docs/extensions.md.
- New dedicated Skills section documenting the discovery layout,
  --with-skills opt-in, and the claude / agents compatibility
  paths.
- Updated $ZOT_HOME tree to include skills/ and extensions/.
- Read-only-while-busy slash list in "Queued messages" updated to
  include /skills.
- Source layout table expanded with internal/agent/extensions,
  internal/extproto, internal/skills, pkg/zotcore, pkg/zotext.
- JSON mode now links docs/rpc.md for the schema instead of the
  stale instructions.md §8 reference.
- ctrl+o description specifies which tools' output it collapses.
2026-04-19 16:25:48 +02:00
patriceckhart
b9e7517149 feat(tui): show github release notes once after upgrading
The first time a user launches a newer zot binary, the tui pops
a dismissible overlay with the release notes for that version.
Press any key to close; the version goes into config.json's
last_changelog_shown so the same notes never reappear.

Lifecycle:
  - dev builds (version "" / "dev" / "0.0.0"): no fetch ever
  - first-ever launch (no LastChangelogShown stored): seed it
    silently with the current version so fresh installs don't
    get release notes dumped at them
  - subsequent launches with the same version: skipped (config
    already records that version was shown)
  - launch with a different version: fetch the release page from
    https://api.github.com/repos/patriceckhart/zot/releases/tags/v<ver>
    and open the dialog if the body is non-empty
  - dismiss writes LastChangelogShown so it never repeats

Components:
  - internal/agent/changelog.go: FetchChangelog/Async, and the
    Should/Mark/Seed helpers around config.LastChangelogShown.
    Honours $GITHUB_TOKEN exactly like the install scripts and
    the existing update check, so private-repo fetches work
    with auth.
  - internal/agent/modes/changelog_dialog.go: the overlay.
    Markdown body via the existing RenderMarkdown pipeline,
    scrollable with up/down/pgup/pgdn, any other key dismisses.
  - internal/agent/modes/interactive.go: new ChangelogChan and
    OnChangelogDismiss config fields, single-shot select case
    in Run() that opens the dialog when a payload arrives.
  - internal/agent/cli.go: spawns the fetch goroutine, gates it
    on ShouldShowChangelog, wires OnChangelogDismiss to
    MarkChangelogShown so the version is persisted.

Best-effort: timeouts at 4s, missing tag => silent skip, network
failure => silent skip + retry on next launch (no
LastChangelogShown update if we never showed anything).

Documented in the README under the SYSTEM.md note.
2026-04-19 16:12:13 +02:00
patriceckhart
fbad128c4c feat(skills): user skills now opt-in via --with-skills
Behaviour change: a fresh zot run now loads only the built-in
skills compiled into the binary. User-installed SKILL.md files
under $ZOT_HOME/skills/, .zot/skills/, .claude/skills/, or
.agents/skills/ stay dormant until the user explicitly opts in
with --with-skills.

Three discrete modes:

  zot                       built-ins only (default)
  zot --with-skills         built-ins + user skills
  zot --no-skill            nothing (no skill tool, no manifest)

Rationale: a fresh install should have a deterministic skill
set, regardless of what's already lying around in $ZOT_HOME from
old experiments. Built-ins ship with the binary so they're
auditable; user skills are loaded only when the user explicitly
asks for them.

Discover() gained an includeUser bool (was 3 args, now 4). The
in-tree caller updated; the test that exercised the old signature
gets includeUser=true so its existing assertions still hold.
scanUserSkills() split out so the includeUser=false path is a
cheap no-op.

End-to-end verified live:
  default              -> write-zot-extension (only)
  --with-skills        -> write-zot-extension + code-review + test-fix
  --no-skill           -> no skill tool, no manifest at all
2026-04-19 16:03:26 +02:00
patriceckhart
e9ffc74442 feat(skills): --no-skill flag to disable all skills (including built-ins)
Mirrors --no-ext: one flag that skips skill discovery + the
`skill` tool registration entirely for one run. Useful for
clean-room runs where you want zero extra context biasing the
model.

Defaults are unchanged:
  - user-installed skills under $ZOT_HOME/skills/, .zot/skills/,
    .claude/skills/, .agents/skills/ load normally
  - built-in skills compiled into the binary (currently the
    write-zot-extension authoring guide) load normally
  - both groups appear in the system-prompt manifest and are
    loadable via the `skill` tool

With --no-skill (or --no-skills):
  - skills.Discover() is not called
  - the `skill` tool is not registered
  - no "Available skills" addendum is appended to the system
    prompt
  - /skills picker is empty

Wired into both interactive and rpc modes via the existing
SkillSnapshot wiring (returns nil when args.NoSkill is set, so
the picker also stays empty).

End-to-end verified live:
  default            : tool list includes "skill"
  --no-skill         : tool list is read, write, edit, bash only
                       (no skill, no extra context manifest)
2026-04-19 15:58:38 +02:00
patriceckhart
2cffe048c9 feat(skills): built-in extension-author skill, hidden from /skills
The model now ships with a `write-zot-extension` skill compiled
into the binary. When the user asks for help authoring a zot
extension (slash command, LLM tool, audit hook, permission gate)
the model sees the skill in its system-prompt manifest, calls
the `skill` tool to load the body on demand, and walks the user
through the right answer with the wire format, manifest shape,
SDK examples (Go + TS + Python), and dev workflow already in
context. No need for the user to be in the zot repo or to ask
the model to read docs/extensions.md first.

Built-in skills:
  - shipped via //go:embed at internal/skills/builtin/
  - merged into Discover()'s output AFTER user skills, so a
    user-installed skill with the same name shadows the built-in
    (drop your own SKILL.md at $ZOT_HOME/skills/write-zot-extension/
    to customise)
  - marked Builtin: true on the Skill struct
  - hidden from user-facing surfaces: VisibleSkills() filters them
    so /skills only shows skills the user actually installed or
    shipped in their project

The model side stays unchanged: system-prompt manifest still lists
built-ins (so the model knows they exist), the `skill` tool still
loads them on demand. Only the picker is filtered.

Verified live:
  prompt: "List the names of the skills you have available"
  -> code-review, test-fix, write-zot-extension

  prompt: "I want to write a zot extension that adds a slash
           command /pwd which inserts the current directory path
           into the editor. What language should I use, and what
           files do I need to create?"
  -> [tool_call] skill({"name":"write-zot-extension"})
  -> body returned
  -> the model produces a complete extension with the right
     manifest, the right hello/register/ready frames, action:
     insert correctly chosen, and a remark about cwd capture.

The picker filter has its own unit test
(TestVisibleSkillsHidesBuiltins) and the existing Discover test
was updated to expect the built-in count without hardcoding it.
2026-04-19 15:55:25 +02:00
patriceckhart
e425dbba59 feat(systemprompt): $ZOT_HOME/SYSTEM.md as a persistent override
Resolution order for the system prompt is now:

  1. --system-prompt <text>    (per-run override; highest)
  2. $ZOT_HOME/SYSTEM.md       (persistent user override; new)
  3. built-in defaultIdentity + defaultGuidelines

When SYSTEM.md exists and --system-prompt is empty, its contents
replace the entire identity + guidelines block (same semantics
as --system-prompt). The tool list + skill manifest + appended
sections + date/cwd footer are still added on top, so the file
should contain just the identity / behavior text.

readUserSystemPrompt swallows file errors on purpose: a missing
or unreadable SYSTEM.md falls back to the built-in default
rather than crashing the run. Cached on Resolved.systemCustom
so MergeExtensionTools' system-prompt rebuild path also picks
up the override.

End-to-end verified live:
  with SYSTEM.md (pirate persona): "Arrr, I be Zot, a scallywag..."
  without:                         "I'm zot, a lightweight terminal..."

Documented in README's $ZOT_HOME tree + the --system-prompt
flag note.
2026-04-19 15:47:57 +02:00
patriceckhart
84dbd86aee feat(extensions): --no-ext to skip discovery for one run
Useful for clean-room runs ("does this prompt work without my
extensions interfering?") and for reproducing issues without
auto-loaded guardrails.

  zot --no-ext              # zero extensions
  zot --no-ext --ext ./x    # only x; nothing else gets discovered

Explicit --ext paths still load on top, so --no-ext is "skip the
implicit scan", not "block all extensions". Lets you run with
exactly one extension without uninstalling the rest.

Wired into both interactive (cli.go) and rpc (rpc.go) modes.
Help text updated.

The "── extensions ────" divider in the slash autocomplete
already hides itself when no extension commands are registered
(empty s.extra short-circuits allCatalog before the header
insertion path), so --no-ext naturally produces a clean popup
with no orphan rule. pruneOrphanHeaders also drops the rule
when filter input narrows the extension group to zero matches.

End-to-end verified live:
  default                  : agent sees weather + read_notes
  --no-ext                 : agent sees only the 5 built-ins
  --no-ext --ext scratchpad: agent sees built-ins + read_notes
2026-04-19 15:42:34 +02:00
patriceckhart
e9b466a1ce tui: blank row after the slash-popup hint line 2026-04-19 15:36:47 +02:00
patriceckhart
4f2655a8b8 tui: blank row before the extensions divider in slash autocomplete 2026-04-19 15:36:13 +02:00
patriceckhart
789665fb1c tui: stable order + "── extensions ──" divider in slash autocomplete
Two related fixes for the popup that opens when you type /:

1) Extension commands no longer jump around between renders.
   SetExtra now sorts the incoming list by name, so map-iteration
   randomness in Manager.Commands() doesn't reshuffle the popup
   on every keystroke.

2) Visual grouping. The popup now inserts a header row between the
   built-in commands and the extension-registered ones:

       /help       show key bindings ...
       ...
       /exit       exit zot
       ── extensions ─────────────────────────
       /hello      say hello (optional name)
       /note       append text to the scratchpad
       ...

   Headers are non-navigable: Up/Down skip across them, the cursor
   never lands on one, and Selection / matches resolve through them.
   pruneOrphanHeaders drops a header when filtering eliminated every
   command underneath it (typing "/he" no longer shows a lone
   "── extensions ──" divider after the matching built-ins).

Removed the now-redundant "(ext: name)" suffix on extension
descriptions; the header makes the source obvious.

slashCommand grew a Header bool; existing two-element struct
literals in slashCatalog converted to named fields.

Tests pass with -race; vet clean on linux/windows/darwin.
2026-04-19 15:34:41 +02:00
patriceckhart
744685d68d fix(extensions): close ext.commands / ext.tools / lastFrameTime races
CI's `go test -race` flagged two races introduced over the recent
extension work:

1. ext.commands / ext.tools were appended to from the read-loop
   goroutine without a lock, while Commands() / Tools() / HasCommand
   read them under m.mu. Same for the toolIndex pre-check. Fix:
   take m.mu around the appends and the index dedup so writers and
   readers serialise on the same lock.

2. assumeReadyAfterIdle read ext.lastFrameTime once on entry without
   the per-extension lock (the read inside the loop already had it).
   Fix: take ext.mu for the initial snapshot too.

Verified locally with `go test -race ./...`; all packages pass.
The corresponding CI run for the scratchpad-persistence commit
failed for exactly these two races on linux + macos.
2026-04-19 15:27:09 +02:00
patriceckhart
463c62daf3 docs: clarify that no extensions are installed by default
Two-line addition to docs/extensions.md and a tightening of the
README bullet point. examples/extensions/* are reference code; a
fresh `zot install` gives you a clean agent. Users opt in by
copying examples (or any other extension) via `zot ext install`
or by pointing `zot --ext PATH` at a working directory for one
session.

No code changes.
2026-04-19 15:23:48 +02:00
patriceckhart
7e94b0776b feat(extensions): --ext PATH (short -e) for ad-hoc loading
Loads an extension from any directory for one zot session without
needing to copy / install it under $ZOT_HOME. Repeatable. Resolved
to absolute before spawn so paths like "." survive a later cwd
change. Loaded BEFORE the installed-extension scan so explicit
paths win on name conflicts, letting you shadow an installed copy
with a work-in-progress version.

  zot --ext ./my-extension        # one extension
  zot -e ./a -e ./b               # multiple
  zot --ext .                     # the cwd is itself an extension

Manager.LoadExplicit is the public entry point. Spawns happen in
parallel like the regular Discover path, with per-path errors
returned so a typo in one --ext doesn't break the others.

Wired into both interactive (cli.go) and rpc (rpc.go) modes.
Help text + docs/extensions.md updated.

Verified end-to-end: disabling the installed scratchpad,
running `zot rpc --ext .` from its directory, and asking the
model to list its tools shows read_notes available again.
2026-04-19 15:20:56 +02:00
patriceckhart
83b64e2562 examples: scratchpad notes persist to .zot/scratchpad-notes.jsonl
Notes are now project-local and survive zot restarts. The
extension reads its cwd from the hello_ack handshake, then:

  - on /note    appends one line of {"at":..,"text":..} JSONL
                under <cwd>/.zot/scratchpad-notes.jsonl
  - on /notes   reads the in-memory cache (loaded on hello_ack)
  - on /clear-notes truncates the file and clears the cache
  - on read_notes (tool) returns the cached set

Single-writer assumption (one zot session per cwd at a time);
two concurrent zot processes writing to the same file would
interleave but JSONL line boundaries stay intact under POSIX
PIPE_BUF semantics. Good enough for an example.

Verified end-to-end: round 1 writes two notes via the slash
command, round 2 (fresh extension process, same cwd) loads
them and surfaces them via both /notes and the read_notes
tool.
2026-04-19 15:16:06 +02:00
patriceckhart
619ed587cd perf(extensions): parallel discovery + auto-ready idle watchdog
Three independent fixes to startup latency:

1) Discover spawns extensions in parallel.
   Before: each spawn synchronously waited on its child's hello
   frame; multiple slow runtimes (e.g. tsx) added linearly.
   After: every loadOne runs on its own goroutine; total time
   collapses to max(spawn_time) instead of sum.

2) WaitForReady waits in parallel.
   Before: one extension at a time, so a slow ready (or no ready
   at all from a legacy SDK) blocked every other extension's wait
   too.
   After: one goroutine per extension, all sharing a single
   deadline; total = max(per-ext wait), not sum.

3) Auto-ready idle watchdog for legacy extensions.
   Phase-1 SDK builds didn't send the ready sentinel introduced
   in phase 2. Without it, WaitForReady burned the full 3s grace
   on every startup for every legacy extension. Fix: read loop
   stamps lastFrameTime on each frame; a per-extension watchdog
   closes readyCh as soon as no new frame has arrived for 250ms.
   Native binaries register + go quiet within microseconds, so
   this fires almost immediately. Newer extensions still trip
   the explicit ready path before the watchdog matters.

Also updates the scratchpad example to invoke `tsx` directly
instead of `npx --yes tsx`, with the README explaining how to
install tsx globally and how to fall back to npx (and what it
costs in startup time).

Measured impact on a machine with 4 extensions installed
(guard / hello / weather / scratchpad):

  before: 4.2-4.9s per zot launch
  after:  ~200ms per zot launch (cold-cache first run ~780ms)

The dominant remaining cost in the 200ms is normal node + tsx
boot for scratchpad, which only matters because it's still in
the spawn fan-out — Go extensions add nothing measurable.
2026-04-19 15:13:19 +02:00
patriceckhart
5dbbcb9040 feat(extensions): typescript example + path-aware exec resolution
examples/extensions/scratchpad: real .ts (not .js) extension, no
build step, no SDK. Runs via `npx --yes tsx index.ts` so authors
can use TypeScript without forcing a global install. Demonstrates:

  /note <text>   slash command (typed CommandResponse)
  /notes         slash command (display action)
  /clear-notes   slash command
  read_notes     LLM-callable tool (typed ToolResult)

Plus a typed wire-format subset inline so the file shows what the
protocol actually looks like from the consumer side. Pure node +
tsx, zero npm deps beyond tsx itself (~5 MB cached on first call).

Manager fix: extension exec paths are now resolved by shape:

  absolute               used as-is
  starts with ./ or ../  joined to ext.Dir
  contains a separator   joined to ext.Dir (other relative form)
  bare name (no sep)     left as-is so $PATH lookup works

Before this, "exec": "npx" was being looked up at
extensions/scratchpad/npx and failing with a "no such file or
directory" error. With the fix, "node", "npx", "python3", "tsx",
etc. resolve via $PATH like users intuitively expect.

Bumped WaitForReady grace from 500ms to 3s so slow runtimes
(npx tsx cold-start ≈ 1.4s) get their register_tool frames
in before the agent's tool registry is built. Extensions that
send ready quickly still release the wait immediately; the
extra grace only applies to laggards.

Verified end-to-end live against anthropic:
  prompt: "Use the read_notes tool now and tell me what's in the
           scratchpad"
  -> [tool_call] read_notes({})
  -> [tool_result] (scratchpad is empty)
  -> "The scratchpad is empty."
2026-04-19 15:06:00 +02:00
patriceckhart
83ae236571 feat(extensions): phase 3 — event subscriptions + tool-call interception
Two new capabilities, both ride on the existing subprocess
protocol with a couple of new frame types.

Event subscriptions (one-way notifications):

  ext  -> host: subscribe {events: [...], intercept: [...]}
  host -> ext:  event {event, ...payload}

  Recognised events: session_start, turn_start, turn_end,
  tool_call, assistant_message. Subscribers get fire-and-forget
  notifications on each. Useful for telemetry, audit logs, custom
  state widgets that follow live agent activity.

Tool-call interception (round-trip, can refuse):

  host -> ext:  event_intercept {id, event:"tool_call", tool_name, tool_args}
  ext  -> host: event_intercept_response {id, block?, reason?}

  When at least one extension subscribed to "tool_call" intercept,
  zot asks each one in turn before running every tool call. First
  blocker wins; reason becomes the tool-result error text the model
  sees. Per-extension 5s timeout treats unresponsive interceptors
  as "allow" so a wedged extension never stalls the agent.

Wire format additions (internal/extproto):
  ext -> host: SubscribeFromExt, EventInterceptResponseFromExt
  host -> ext: EventFromHost, EventInterceptFromHost

Manager (internal/agent/extensions):
  - per-extension eventSubs / interceptSubs sets, populated by the
    subscribe frame
  - EmitEvent fans out to every subscribed extension on its own
    goroutine (won't block the agent on slow stdin writes)
  - InterceptToolCall walks subscribers serially, returning the
    first refusal; 5s timeout per subscriber (allow on timeout)
  - readLoop handles event_intercept_response correlations the
    same way it handles command/tool responses

Core (internal/core/agent.go):
  - Agent.BeforeToolExecute hook called from runOneTool right
    before tool.Execute. Returning (allowed=false, reason)
    short-circuits with an IsError tool result containing reason.
  - Agent.OnEvent observer fires for every emitted AgentEvent;
    composed transparently with the per-Prompt sink via wrapSink
    so neither the existing TUI nor the rpc loop need changes.

Wiring (internal/agent/cli.go, rpc.go):
  - wireAgentExt sets BeforeToolExecute -> InterceptToolCall and
    OnEvent -> fanoutAgentEvent for every freshly-built agent
    (initial, login rebuild, model swap)
  - fanoutAgentEvent translates core AgentEvent kinds into
    extproto.EventFromHost. Internal-only events (text_delta,
    tool_progress) are dropped to keep the per-extension stream
    sane.
  - session_start emitted once after extensions come up

SDK (pkg/zotext):
  - On(name, EventHandler) registers per-event observers
  - InterceptToolCall(InterceptHandler) registers a single
    intercept callback
  - Run() now also sends a subscribe frame before the ready
    sentinel, with the union of subscribed events + intercept
  - Frame loop handles "event" and "event_intercept" frames,
    runs the handlers (intercepts on a goroutine to avoid
    head-of-line blocking)
  - Capabilities advertised: commands + tools + events

Example (examples/extensions/guard):
  - subscribes to session_start / turn_start / tool_call / turn_end
    and writes one-line audit entries
  - intercepts every bash call; refuses commands matching
    rm -rf, sudo, dd of=/, mkfs, the fork bomb, chmod -R 777
  - end-to-end verified live: agent -> bash("rm -rf /tmp/foo")
    -> guard refuses -> model sees the refusal text and surfaces
    it in its reply ("the guard blocked it, as expected — the
    pattern \brm\s+-rf\b matched")

Docs/extensions.md updated with all five new frame types and the
guard example.
2026-04-19 14:57:03 +02:00
patriceckhart
74709a0bd9 feat(extensions): phase 2 — extension-defined tools
Extensions can now register tools the LLM calls directly. The model
sees them in its tool list alongside the built-ins (read, write,
edit, bash, skill); when it invokes one, zot routes the tool_call
to the owning extension subprocess and feeds the tool_result back.

Wire format additions (internal/extproto):
  ext -> host:
    register_tool {name, description, schema}
    ready                                       # all initial regs flushed
    tool_result {id, content[], is_error}       # reply to a tool_call
  host -> ext:
    tool_call {id, name, args}                  # raw json args from the model

Manager (internal/agent/extensions):
  - tracks per-extension RegisterToolFromExt frames
  - validates schemas parse as JSON before registering (bad schema
    skipped + logged, doesn't crash zot)
  - toolIndex map for O(1) lookup
  - WaitForReady(grace): blocks per extension on its readyCh until
    a ready frame arrives or the grace expires; called once after
    Discover so the agent's tool registry is built against the
    final set
  - Tools() / HasTool() / InvokeTool() public surface
  - readLoop closes readyCh on stdout EOF so a wedged extension
    doesn't permanently block WaitForReady

extensionTool (internal/agent/extensions/tool.go):
  implements core.Tool. Execute() round-trips through
  Manager.InvokeTool with a 60s default timeout, decodes
  base64 image blocks, surfaces extension+tool name in
  ToolResult.Details for the renderer.

internal/agent/build.go:
  - new ExtensionToolSource interface (declared here to avoid the
    build->extensions->core import cycle) + ExtensionToolInfo
    mirror of extensions.ToolInfo
  - Resolved.MergeExtensionTools(): folds extension tools into
    ToolRegistry, re-renders the system prompt's tool summary
    with both built-in and extension tools listed
  - Resolved gains private bookkeeping fields so the rebuild
    works without re-running Resolve

internal/agent/cli.go:
  extension manager built BEFORE the agent in interactive mode
  so MergeExtensionTools can fire before NewAgent. Same in
  buildAgent + buildAgentFor closures so login / model-switch
  rebuilds also include extension tools. extToolAdapter bridges
  *extensions.Manager to ExtensionToolSource.

internal/agent/rpc.go:
  extension lifecycle now also runs in `zot rpc` mode. Notify and
  Display from extensions surface as `ext_notify` / `ext_display`
  events on the rpc stream so any consumer can react.

pkg/zotext (Go SDK):
  - ToolHandler, ToolResult, ToolContent types
  - Tool(name, desc, schema, fn) registration method
  - TextResult / TextErrorResult / Image / ImageBytes constructors
  - Run() now also flushes register_tool frames + a final ready
    sentinel after the last registration

examples/extensions/weather: working Go example registering one
tool. Deterministic fake weather (sha1 of city -> temp + cond) so
the demo is repeatable. Plus README explaining how to install.

Tests:
  internal/agent/extensions/tool_test.go: spawns a mock /bin/sh
  extension that registers a tool, sends ready, and echoes tool
  calls. Verifies registration timing, lookup via HasTool/Tools,
  invoke roundtrip via InvokeTool.

End-to-end verified against live anthropic backend:
  prompt: "What is the weather in Berlin?"
  -> [tool_call] weather({"city":"Berlin"})
  -> [tool_result] Berlin: 16°C, fog (deterministic fake)
  -> reply: "Berlin is 16°C."

Docs/extensions.md updated with phase 2 wire format, the new SDK
tool API, and the weather example.
2026-04-19 14:46:32 +02:00
patriceckhart
222a62c70f feat: skills — reusable instructions discovered from SKILL.md files
A skill is a single SKILL.md file with a YAML frontmatter header,
discovered from well-known directories at startup. Two integration
points:

  1. The system prompt gains a short manifest listing each skill's
     name + one-line description. Cheap (a few dozen tokens).
  2. A built-in `skill` tool lets the model load any one skill's
     full body on demand and follow the instructions there.

The on-demand-load model keeps token usage cheap: only the
manifest goes into every request; the body is fetched as a tool
result the one or two turns the model actually needs it.

Discovery (priority order — first match wins per name):
  ./.zot/skills/<name>/SKILL.md            project (native)
  $ZOT_HOME/skills/<name>/SKILL.md         global (native)
  ./.claude/skills/<name>/SKILL.md         project (claude-compat)
  ~/.claude/skills/<name>/SKILL.md         global (claude-compat)
  ./.agents/skills/<name>/SKILL.md         project (agent-compat)
  ~/.agents/skills/<name>/SKILL.md         global (agent-compat)

Compat paths are deliberate: any SKILL.md written for a related
ecosystem works in zot unchanged.

Frontmatter fields:
  name           optional; defaults to directory name
  description    required; shown in the system prompt
  allowed-tools  optional list; informational (no enforcement)
  permissions    optional per-tool patterns; informational

allowed-tools and permissions are parsed but not enforced this
version. They render in the body so the model can self-regulate.

What landed:

- internal/skills: discovery + frontmatter parsing (no yaml dep —
  hand-rolled subset for the limited shape skills use), the on-
  demand `skill` tool implementing core.Tool, system-prompt
  addendum, FindByName lookup helper. Real unit tests cover all
  five locations + dedup priority + parser corner cases.

- internal/agent/build.go: Resolve discovers skills, registers the
  skill tool when at least one was found, appends the manifest to
  the system prompt's append list. Resolved gains a SkillTool
  field so the tui can read the live set.

- internal/agent/modes/skills_dialog.go: /skills picker with two
  modes — list view (cursor + paging) and body view (markdown-
  rendered with scroll). Refreshes its snapshot each open via
  cfg.SkillSnapshot so edits to a SKILL.md during a session are
  reflected immediately.

- /skills slash command + entry in slashCatalog.

- examples/skills/code-review and examples/skills/test-fix as
  starter skills demonstrating procedural style + frontmatter.

- docs/skills.md: full reference covering discovery, frontmatter,
  inspection, authoring tips, and ecosystem compat.

End-to-end verified against the live anthropic backend:

  prompt: "What skills do you have available?"
  -> "- code-review\n- test-fix"

  prompt: "Use the skill tool to load the code-review skill,
           then summarize step 1."
  -> [tool_call] skill({"name":"code-review"})
  -> [tool_result] body returned
  -> "Step 1 is to establish what changed by running git status..."
2026-04-19 14:32:30 +02:00
patriceckhart
1c7488285b fix(extensions): route through manager from runSlash default branch
extension commands appeared in the autocomplete popup but invoking
them produced "unknown command: /summon". The submit-handler path
already tried the extension manager before erroring, but the popup-
enter path (suggest.Selection -> runSlash) bypassed that check and
fell straight into runSlash's switch, where the default case bailed
with the generic error.

Fix: runSlash's default branch now also consults
cfg.Extensions.HasCommand and dispatches via invokeExtensionCommand
when matched. Both UI paths (typed-and-enter, popup-enter) now route
identically. Built-in cases above default still always win on
conflict.

Also adds examples/extensions/clock — a node extension demonstrating
the wire protocol from a non-Go runtime. Pure stdlib (readline +
process), no npm install. Registers /now (display) and /uptime
(prompt). Documented in its README; the protocol works the same
from any language.
2026-04-19 14:20:22 +02:00
patriceckhart
13607ed6be chore: exclude built extension example binaries from git 2026-04-19 14:10:00 +02:00
patriceckhart
0c92d6e914 feat: extension system (subprocess + json-rpc, any language)
Phase 1: extensions can register slash commands and push chat
notifications. Tools and event subscriptions land in later phases.

Architecture: each extension is its own subprocess. Zot launches
it on startup, completes a hello/hello_ack handshake over its
stdin/stdout, then routes slash commands the extension registered.
Crash isolation, language agnostic, works with any executable
that can read/write json lines.

What lands here:

- internal/extproto: shared wire-format types (Frame, HelloFromExt,
  RegisterCommandFromExt, CommandResponseFromExt, NotifyFromExt,
  HelloAckFromHost, CommandInvokedFromHost, ShutdownFromHost...).
  Both the host and the SDK marshal/unmarshal the same types.

- internal/agent/extensions: discovery + lifecycle manager.
  - Discover() walks $ZOT_HOME/extensions and ./.zot/extensions
    (project-local first, global second; first wins for duplicates)
  - Spawns each enabled extension, captures stderr to
    $ZOT_HOME/logs/ext-<name>.log
  - Reads frames in a goroutine, dispatches register_command and
    notify, correlates command_response by id
  - Stop() sends shutdown, waits 2s, then SIGTERM/SIGKILL
  - HostHooks abstracts the tui callbacks (Notify/Submit/Insert/Display)

- Interactive bridge: extensions slot into the slash dispatcher
  *after* the built-in catalog, so built-ins always win on conflict.
  Extension-registered commands also flow into the autocomplete
  popup and /help via slashSuggester.SetExtra. NotifyFromExt frames
  render as muted [ext-name] notes above the editor.

- internal/agent/extcmd: `zot ext` CLI.
    list / install <path|git-url> / remove / enable / disable / logs

- pkg/zotext: public Go SDK. Construct an Extension, register
  Command(name, desc, fn), call Run(). Fn returns a Response built
  with Prompt(), Insert(), Display(), Noop(), or Errorf(). Stderr
  via Logf() so stdout stays clean for the protocol.

- examples/extensions/hello: working Go example registering /hello
  and /summon, plus README + extension.json.

- docs/extensions.md: full protocol reference, including a
  ~30-line raw-Python example for users who don't want the SDK.

Tests: internal/agent/extensions/manager_test.go spawns a mock
extension via /bin/sh and exercises the full handshake -> register
-> invoke -> response cycle. Verifies the hello frame ordering,
correlation-by-id, and graceful shutdown.

Verified manually: built and installed the example, drove it via
stdin pipes, confirmed clean handshake + correct frame ordering
and shutdown_ack. Builds vet-clean on darwin / linux / windows.

Editor.Insert exported (was Editor.insert) so the extension hooks
can drop text into the input.
2026-04-19 14:09:43 +02:00
patriceckhart
e7dbf3d861 fix(tui): cell-aware width math for dialog header rules + add /btw
frame header pad length used len(label) which counts bytes, not
on-screen cells. titles containing em-dashes, box-drawing or any
multibyte rune left a few extra bytes of "imaginary length" so
the trailing ── run came up short of the right edge -- visible on
the /btw header which starts with "btw — side chat ...". swap to
runewidth.StringWidth for both frameHeader and frameHeaderColor.

also lands the /btw side-chat overlay this commit pre-staged:
opens via /btw [optional question], snapshots system + transcript
+ provider client at open time, fires one-off model calls against
that frozen context plus the side-chat history, renders replies
through the markdown pipeline. esc cancels in-flight calls or
closes when idle. nothing is appended to the main transcript or
the session jsonl. spinner reuses the main funny-line rotation.
2026-04-19 13:47:39 +02:00
patriceckhart
4dde65079d tui: bump welcome version-suffix duration to 1.5s, shorten ctrl+c hint
- welcomeVersionDuration is now 1500ms (was 1s). still ephemeral
  but gives a touch more time to read the splash version on
  faster terminals.
- ctrl+c "input cleared - ctrl+c again to exit" is now just
  "input cleared". the second-press behaviour is still wired and
  documented in /help; the long suffix was visual noise on a
  status line that already changes shape with each keystroke.
2026-04-19 13:26:16 +02:00
patriceckhart
6a4166429e tui: show binary version in the welcome banner for 1 second
splash variant of the welcome line:
  ▌ i'm zot (0.0.6). yet another coding agent harness.
reverts to:
  ▌ i'm zot. yet another coding agent harness.
after welcomeVersionDuration (1s). a one-shot AfterFunc fires
i.invalidate() so the banner refreshes on its own even if the
user has not typed anything by then.

dev builds without ldflags-injected version still print "(0.0.0)"
during the splash; release tags propagate through the makefile
into main.version normally.

bundles a small printhelp tagline tweak that was already staged.
2026-04-19 13:22:10 +02:00
patriceckhart
154420e938 tui: align /help descriptions across slash commands and keys
each section had its own label column width (10 vs 14) so the
descriptions formed two staircases at different x-positions.
compute one shared width from the longest label across BOTH
lists (with a 14-cell minimum so future entries don't compress
the column visually) and pad every label to it. now every
description starts at the same column.
2026-04-19 13:14:28 +02:00
patriceckhart
c4251d88fd tui: align status-bar spacing and indent assistant body
status bar:
  every horizontal gap is now exactly 2 spaces, applied uniformly
  via a single pad constant. before:
    1 leading | 6 between model+stats | 4 between stats+cwd
  after:
    2 leading | 2 between model+stats | 2 between stats+cwd
  produces e.g. "  (openai) gpt-5.4  $0.000 (sub) 0.0%/400k  ~/Sites/zot"
  matching the editor's left inset so the bar lines up vertically
  with the conversation column.

assistant body:
  user-role messages render their text with a 4-space indent under
  the "you" header. assistant text was rendering flush left, so the
  conversation column visibly broke at every assistant turn. now
  both renderMessage's RoleAssistant branch and Build's streaming
  path:
    - reduce the wrap width by the indent (4 cells)
    - prefix every produced line with the indent
  applies to plain text, markdown-rendered code fences / lists /
  blockquotes, and tool-call summary lines (>>= name args). tool
  result blocks were already indented and stay unchanged.
2026-04-19 13:10:16 +02:00
patriceckhart
36e0efe9ea tui: quote drag-dropped file paths in the input editor
dragging a file onto the terminal pastes its path via bracketed
paste. before this, agents would receive raw paths with literal
spaces and parens that shells/tools then misinterpreted (e.g.
"/Users/pat/foo bar.png" parsed as two arguments).

the editor's KeyPaste handler now runs pastes through
quotePastedFilePaths which:
  - inspects whitespace-separated tokens of single-line pastes
  - normalises file:// urls (decode + scheme strip), backslash
    space escapes (the macOS Terminal default for drag-drop), and
    pre-existing single/double-quoted forms
  - wraps tokens that look like paths (start with / or ~ and
    contain a separator, no shell metacharacters) in single
    quotes, splicing 'foo'\''bar' for embedded apostrophes
  - leaves prose, multi-line code pastes, and anything with
    metachars untouched

regression suite covers the 12 cases that surfaced while
implementing: backslash-escaped spaces, file:// urls, tilde
paths, multi-file drops, pre-quoted paths, embedded apostrophes,
plain prose, multiline pastes, metachar smuggling attempts, and
path-mixed-with-prose. all pass.

verified the build, vet, gofmt, and go test pipeline on darwin,
linux and windows targets.
2026-04-19 13:01:37 +02:00
patriceckhart
5a2cfb525e fix: keep transcript and cumulative cost across cross-provider /model swap
applyModelSelection's cross-provider branch built a fresh agent via
BuildAgentFor and assigned it to i.agent without copying anything
from the old one, so the user perceived /model anthropic->openai (or
vice versa) as a hard reset of the entire conversation. same-provider
swaps already worked because they only mutated agent.Model in place.

now the cross-provider path snapshots agent.Messages() and agent.Cost()
before constructing the replacement, then SetMessages + SeedCost on the
new agent so the transcript and the cumulative dollar meter both carry
across.

added core.Agent.SeedCost(provider.Usage) for the cost transfer.
messages already round-trip cleanly because tool names are normalised
to lowercase in the in-memory provider.Message representation; the
oauth-only Read/Write/Edit/Bash rename happens on the wire at request
build time, not in the transcript.

verified end-to-end via zot rpc:
  turn 1 (opus):    "remember teal" -> "ok"
  set_model:        opus -> sonnet
  turn 2 (sonnet):  "what color did i say?" -> "teal"

both same-provider (opus<->sonnet) and cross-provider (anthropic<->openai)
paths exercised.
2026-04-19 12:47:12 +02:00
patriceckhart
3ff6d9e6b7 fix(anthropic): drop claude-code identity cache marker
oauth requests now exceed anthropic's 4-breakpoint cache_control
limit when the conversation has 2+ user messages. previous layout
emitted 5 markers: identity + system + tools + 2 user messages.

drop the marker on the small claude-code identity line. it's a few
tokens and gets folded into the cached prefix implicitly when the
request matches turn-over-turn anyway. budget now: system + tools +
last 2 user messages = 4. fits.

reproduces the user-reported error:
  anthropic: http 400 ... A maximum of 4 blocks with cache_control
  may be provided. Found 5.

verified by sending two consecutive prompts through zot rpc on an
oauth credential -- first turn returns the assistant message
cleanly, second turn does too instead of 400ing.
2026-04-19 12:39:33 +02:00
patriceckhart
ebc5dad18c fix(rpc): wait for in-flight commands on stdin close, drop dup events
three real issues found by actually running zot rpc end-to-end:

1. piping a single command into the process and letting stdin close
   would race the agent loop against the process exit, swallowing
   the entire prompt response. fix: track in-flight prompt and
   compact goroutines in a sync.WaitGroup; run() now waits on it
   before returning.

2. each prompt emitted two consecutive done frames - one from the
   agent loop EvDone passing through EventToJSON, one added at the
   end of runPrompt. suppress EvDone in the prompt sink so only the
   explicit terminator remains.

3. cancelling a turn produced a spurious error frame on top of the
   turn_end stop=aborted that already carried the cancellation.
   suppress error frames when the underlying error is
   context.Canceled, in both prompt and compact paths.

verified manually: ping, get_state, get_models, set_model
(valid + invalid id), clear + get_messages, abort, malformed json,
unknown command, auth gate (missing/wrong/correct token),
stdin-close-while-prompt-running, in-process SDK Prompt, plus all
four reference clients parse cleanly and shell + python actually
drive the protocol.
2026-04-19 12:35:13 +02:00
patriceckhart
a670442c9e feat: zotcore SDK + zot rpc subprocess protocol
two new ways to embed the zot agent runtime in third-party apps:

1. pkg/zotcore - public Go SDK
   - Runtime type: New(Config), Prompt(ctx,text,imgs)->chan Event,
     Cancel, Compact, SetModel, State, Messages, Cost, ListModels,
     Close. Concurrent-safe; one prompt at a time per Runtime,
     ErrBusy if you try to overlap. Spawn multiple Runtimes for
     multiple projects.
   - Public types mirror the JSON-RPC wire schema 1:1 so consumers
     can share parsing code with the out-of-process clients.
   - Internal core/agent/provider stay internal; SDK is a thin
     facade that exposes only what's stable.

2. zot rpc subcommand - newline-delimited JSON on stdin/stdout
   - 'zot rpc' (or 'zot --rpc') turns the agent runtime into a
     subprocess that any language can drive via pipes.
   - Commands: hello, prompt, abort, compact, get_state,
     get_messages, clear, set_model, get_models, ping. Each
     optionally carries an id; the matching response echoes it.
   - Stream notifications: turn_start, user_message,
     assistant_start, text_delta, tool_call, tool_progress,
     tool_result, assistant_message, usage, turn_end, done,
     error, compact_done. Same shape as the existing --json mode
     events (modes.EventToJSON / ContentToJSON were exported
     for reuse).
   - Auth: optional ZOTCORE_RPC_TOKEN env var; first command
     must be hello {token: ...} when set. Without the env var
     the spawning process is implicitly trusted.
   - Concurrency: one prompt or compact at a time per process,
     enforced by a turnMu mutex. abort fires immediately
     regardless. Stdin close exits the process.

3. docs/rpc.md - full schema reference
4. examples/rpc/{python,node,shell,go} - reference clients
5. examples/sdk - in-process Go embedding example
6. README updated with a new modes entry and an embedding section
2026-04-19 12:26:48 +02:00
patriceckhart
d6ac856ee7 tui: switch busy spinner to cli-spinners 'dots3' preset
a single-cell 10-frame braille rotation at 80ms per step. matches
the well-known dots3 preset from sindresorhus/cli-spinners (MIT).
small, recognisable, occupies one terminal cell so the busyPrefix
in the status bar stays compact regardless of message length.
2026-04-19 11:52:41 +02:00
patriceckhart
fdef8ac614 core: drop empty session files instead of writing meta-only stubs
every zot launch was creating a session file with just a meta line,
even when the user exited without prompting. /sessions and ls -la
ended up showing dozens of empty entries.

now Session tracks messagesAppended and freshFile (true for
NewSession, false for OpenSession). Close() removes the file when
both conditions hold: this process created it AND no messages
were ever appended. resumed sessions are never auto-deleted even
if the resume run added nothing, since the prior content is real.

PruneEmptySessions sweeps existing meta-only stubs from the cwd's
session dir on each interactive launch. cheap (only reads enough
of each file to find a 'message' line) and fixes the existing
backlog automatically the first time you reopen zot in a project.
2026-04-19 11:38:57 +02:00
patriceckhart
0a69274cd6 tui: park viewport on the last turn after resume
opening a session via /sessions (or starting with --continue /
--resume / --session) used to drop the user at the live tail
(scrollOffset = 0). on a long session that means the visible
viewport showed only the last few rows of the final assistant
reply -- the rest of the conversation was loaded into the agent
correctly but invisible without scrolling. users read this as
'resume only restored a one-liner'.

now after both code paths we compute the row offset of the last
user message and park the viewport so that user prompt sits at
the top of the chat area, with the assistant's last reply right
below. older history is one scrollup away; pgdn or arrows snap
to the live tail. the existing 'viewing turn N of M' footer
shows up automatically since parkedTurn / parkedTotal are set.

shared scrollToLastTurn helper used by both the /sessions picker
(applySessionSelection) and Run() startup. the old applySession
code is now a thin wrapper that just invalidates caches and
delegates to scrollToLastTurn.
2026-04-19 11:34:40 +02:00
patriceckhart
75ed9d87c7 tui: render the final 'turn finished' frame without needing a nudge
two bugs in the redraw throttle were eating the post-turn frame:

1. requestRedraw, when it could redraw immediately (since >=
   redrawMinInterval), didn't reset pendingRedraw or stop the
   pendingTimer. so subsequent invalidates within redrawMinInterval
   saw pendingRedraw == true and were dropped, leaving the ui
   stale until a key event or scroll triggered a fresh path.

2. the tick branch only called drainPending while busy or with an
   open dialog. once a turn finished and i.busy went false the
   tick stopped clearing the pending flag, so any redraw that
   ended up scheduled (because the dirty channel was saturated
   under streaming load) had no second chance to fire.

fix:
- requestRedraw now stops the timer and clears pendingRedraw
  whenever it does an immediate redraw. throttle state stays
  consistent.
- tick branch always drains pending, busy or not. requestRedraw
  on top is still gated on busy/dialog so the spinner keeps
  animating without forcing a redraw on every tick when idle.

reproduction: long turn finishes -> spinner stops, but the final
assistant text only appears when the user scrolls or types a key.
fixed by both changes; either one alone leaves a small race window.
2026-04-19 11:31:03 +02:00
patriceckhart
d653cc179d tui: slash commands work while a turn is in flight
previously typing any slash command during a turn was rejected with
"cancel the current turn (esc) before running a slash command".
annoying and unnecessary for read-only commands, and the
destructive ones can cancel the turn for you.

read-only commands run immediately, parallel to the streaming
turn: /help, /jump, /sessions, /lock, /unlock, /exit.

destructive commands trigger cancelAndWaitForIdle first (cancels
the turn ctx, polls i.busy at 10ms intervals, gives up after 2s
as a safety cap so a wedged http stream cannot freeze the ui
forever): /clear, /compact, /login, /logout, /model. once the
turn goroutine has wound down they run on the now-quiet agent.

slashCancelsTurn(head) in slash_suggest.go is the single source
of truth for which commands need the wait. readme updated to
match the new behaviour.
2026-04-19 11:21:37 +02:00
patriceckhart
64704875d2 tui: /jump to scroll to past turns, render cache for long transcripts
/jump:
- new slash command; opens a picker listing every user turn in
  the current session (timestamp relative, tool count badge, first
  line of the prompt). \u2191/\u2193 + enter scrolls the viewport to put
  that turn's user-message header at the top row. non-destructive,
  transcript untouched
- runes extend a live filter; backspace shortens. '/jump <text>'
  pre-applies the filter; exactly-one-match auto-jumps without
  showing the picker
- while parked on a past turn the scroll-up note reads 'viewing
  turn N of M \u00b7 pgdn to catch up' instead of the generic row
  count. scrolling back to the tail (or starting a new turn, or
  /clear) resets the parked state automatically
- view.go: new MessageAnchor type + BuildWithAnchors so the dialog
  can resolve msgIdx -> first rendered row

perf for long transcripts (the whole ui stutters on ~50 messages):
- view.renderCache: per-message memoisation keyed by (fnv1a of
  role+content, width, expandAll). finalised messages never change
  so the cache hit rate is ~100% after the first render. streaming
  partials and in-flight tool-call views stay uncached by design
- BuildWithAnchors now pre-sums line counts and allocates
  in a single make() instead of 50 appends with log2(N) backing-
  array memcpys
- truncateToWidth fast path: byte-length <= cols implies cell-width
  <= cols, so we skip the rune-width loop entirely. covers the huge
  majority of lines in a session
- cache purged on /clear, /compact completion, and session swap
  (applySessionSelection); resize invalidates implicitly via the
  width key. LRU eviction at 4x message count caps memory

impact: a 50-msg / 2000-line transcript went from unresponsive-
while-typing to drawing in well under a frame. measured locally
with go-perf traces; no change to correctness.
2026-04-18 12:22:16 +02:00
patriceckhart
7141ebd45f readme: drop private-repo install caveats
the repo will be public by the time anyone reads this, so the
GITHUB_TOKEN / GOPRIVATE scaffolding just adds noise. installers
and 'go install' were already coded to work unauthenticated on
public repos; no code changes needed.
2026-04-18 11:51:09 +02:00
patriceckhart
8e546bde70 tui: show 'update available' banner at top of chat
- internal/agent/update.go: check github releases api for a newer
  tag than the compiled-in version, cached in $ZOT_HOME/update-check.json
  with a 12h ttl so startup never hits the network twice. honours
  $GITHUB_TOKEN for the window while the repo is private; falls back
  to silent no-op on any failure. skipped entirely on dev builds
  (version = 0.0.0 or dev)
- internal/agent/modes/update_banner.go: yellow-framed block with
  the new version, the current version, the one-liner install
  command appropriate for the platform, and a link to the release
  page. rendered above the welcome / transcript so it's the first
  thing the user sees
- wired through via InteractiveConfig.UpdateInfoChan to avoid an
  import cycle (modes -> agent). cli.go kicks off the check async
  and feeds the result in

note: no 'dismiss' key yet \u2014 the banner stays until you update to
the shown version. if the nagging gets annoying we can add a
per-version dismiss cache later.
2026-04-18 11:49:22 +02:00