Commit graph

60 commits

Author SHA1 Message Date
patriceckhart
cf7ddf5322 Add quick model switch shortcuts (Ctrl+1..9) with /settings model shortcuts sub-view
2026-06-23 06:56:24 +02:00
mi-skam
1f663cb867 fix: use osascript for macos clipboard image paste 2026-06-21 22:31:34 +02:00
mi-skam
4a6d6915ca Add clipboard paste support 2026-06-21 14:47:06 +02:00
patriceckhart
1d7dc39fe8 docs: document custom providers and updated insecure flag [release=skip]
Cover named custom providers in models.json (provider-level baseUrl and api format, model-level baseUrl override, derived API-key env vars, /login support, no-probe key storage). Note built-in models stay visible and correct the credential-resolution order. Update the --insecure description to cover models.json baseUrl endpoints.
2026-06-16 20:31:21 +02:00
patriceckhart
ab7fb37046 Scope --insecure TLS to explicit base URL, drop global transport override
Builds on s3rj1k's --insecure flag (#35) but limits insecure TLS to the
resolved inference client for an explicit --base-url, instead of mutating
http.DefaultTransport process-wide. Built-in providers, auth, and model
discovery keep normal certificate verification. Documents the flag in
the CLI reference.
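
A minimal sketch of the scoping described above, assuming a hypothetical
newInferenceClient helper (the real constructor is not shown in this log).
The insecure client gets its own cloned transport, so http.DefaultTransport
is never mutated:

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
)

// newInferenceClient is a hypothetical helper, not zot's actual API. It
// relaxes TLS verification only on the client it returns.
func newInferenceClient(insecure bool) *http.Client {
	if !insecure {
		return &http.Client{} // nil Transport falls back to http.DefaultTransport
	}
	// Clone so proxy and timeout defaults carry over without sharing state.
	tr := http.DefaultTransport.(*http.Transport).Clone()
	if tr.TLSClientConfig == nil {
		tr.TLSClientConfig = &tls.Config{}
	}
	tr.TLSClientConfig.InsecureSkipVerify = true
	return &http.Client{Transport: tr}
}

// skipsVerify reports whether the client's effective transport skips
// certificate verification.
func skipsVerify(c *http.Client) bool {
	rt := c.Transport
	if rt == nil {
		rt = http.DefaultTransport
	}
	tr, ok := rt.(*http.Transport)
	return ok && tr.TLSClientConfig != nil && tr.TLSClientConfig.InsecureSkipVerify
}

func main() {
	fmt.Println(skipsVerify(newInferenceClient(true)))  // explicit --base-url with --insecure
	fmt.Println(skipsVerify(newInferenceClient(false))) // everything else stays strict
}
```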

Co-authored-by: s3rj1k <evasive.gyron@gmail.com>
2026-06-16 07:41:38 +02:00
patriceckhart
cde9298410 Add !command shell escape and fix VS Code terminal repaints
- Shell escape: typing "!cmd" runs it via the bash tool's shell in the
  session cwd, honoring the /jail sandbox. Output is parked below the
  transcript as a styled terminal-log block until the next prompt or
  /clear, so it never enters the model conversation. Shares busy state
  with the agent: esc cancels it and no turn or other escape can start
  while one is in flight.
- VS Code terminal: full repaints used \x1b[2J, which xterm.js scrolls
  into scrollback and duplicates the frame. Clear in place via cursor
  home + erase-to-end under keepScrollback; Clear()/Resize() no longer
  eagerly wipe. Force a viewport-safe Invalidate on slash/file popup
  open and close transitions there.
- Restore the live tool-call overlay behavior (keep in-flight boxes
  visible until the tool_result reaches the transcript) and drop the
  forced repaint at turn start.
- Document the shell escape in the README.
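
The in-place clear from the VS Code item can be sketched as follows. The
escape sequences are standard ANSI/xterm controls; the clearSequence name and
its boolean parameter are assumptions for illustration, not zot's renderer:

```go
package main

import "fmt"

const (
	eraseScreen = "\x1b[2J" // erase entire display (the sequence the commit moves away from)
	cursorHome  = "\x1b[H"  // move the cursor to row 1, column 1
	eraseToEnd  = "\x1b[J"  // erase from the cursor to the end of the display
)

// clearSequence is a hypothetical helper returning the bytes a full repaint
// would emit before drawing the frame.
func clearSequence(keepScrollback bool) string {
	if keepScrollback {
		// Clear in place: nothing is pushed into scrollback.
		return cursorHome + eraseToEnd
	}
	return eraseScreen + cursorHome
}

func main() {
	fmt.Printf("%q\n", clearSequence(true))
	fmt.Printf("%q\n", clearSequence(false))
}
```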
2026-06-04 18:05:17 +02:00
patriceckhart
dfd25012b6 Add JSON theming, theme-only extensions, and docs
- User themes from $ZOT_HOME/themes/*.json with partial overrides
  (colors, syntax, spinner) and dark/light fallback.
- /settings color-theme picker; selection persisted in config.json.
- Theme-only extensions: extension.json plus theme.json (or
  themes/theme.json) load without spawning a subprocess.
- write-zot-themes built-in skill and docs/themes.md.
- README, extensions docs, and embedded docs index updated.
2026-05-30 11:34:42 +02:00
patriceckhart
3ce114c8de feat(update): fast-forward installed extensions during zot update
After the binary swap succeeds, zot update now walks
$ZOT_HOME/extensions/ and runs git pull --ff-only on every
extension that is a git checkout.

Per-extension behaviour:
- disabled extensions: skipped
- no .git/ directory: skipped (no remote to pull from)
- dirty worktree: stashed (--include-untracked) before the pull,
  popped after; conflict on pop leaves markers in place with a
  warning rather than discarding the runtime state
- diverged / offline / any git failure: reported as failed and the
  next extension is processed
- timeout per extension: 60s
- no build step is ever executed; authors commit the runnable
  artifact, or the user rebuilds manually and /reload-ext

zot update itself never aborts because of an extension. The
binary swap is the source of truth for success.
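
The per-extension rules above can be read as a small decision function. This
is a sketch under assumptions: extState and planUpdate are hypothetical names,
and the exact git arguments zot passes are not shown in this log. It only
plans the git invocations; running them under the 60s timeout is left out:

```go
package main

import "fmt"

// extState is a hypothetical summary of one extension directory.
type extState struct {
	Disabled bool // extension is disabled in config
	HasGit   bool // directory contains .git/
	Dirty    bool // worktree has uncommitted or untracked files
}

// planUpdate returns the git argument lists to run in order, or a skip
// reason when the extension should not be touched.
func planUpdate(s extState) (steps [][]string, skip string) {
	switch {
	case s.Disabled:
		return nil, "disabled"
	case !s.HasGit:
		return nil, "no .git directory"
	}
	if s.Dirty {
		steps = append(steps, []string{"stash", "push", "--include-untracked"})
	}
	steps = append(steps, []string{"pull", "--ff-only"})
	if s.Dirty {
		// A conflict here leaves markers in place; the caller only warns.
		steps = append(steps, []string{"stash", "pop"})
	}
	return steps, ""
}

func main() {
	steps, skip := planUpdate(extState{HasGit: true, Dirty: true})
	fmt.Println(steps, skip)
}
```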

Implementation in packages/agent/extupdate.go (~150 LoC), 13 unit
tests covering each branch including stash+pop with untracked
runtime files, diverged history, unreachable remote, and the
mixed-state scenario. README's Extensions section documents the
new behaviour.
2026-05-27 09:37:59 +02:00
patriceckhart
fa7d8d8be5 refactor: split source into packages/{provider,core,tui,agent}
Single Go module, four top-level packages under packages/. Import
paths become github.com/patriceckhart/zot/packages/<name>; downstream
consumers can depend on individual packages without pulling the rest.

Layout:
  packages/provider/     LLM clients + catalog
  packages/provider/auth/ credential store + OAuth + login server
  packages/core/         agent loop, sessions, cost
  packages/tui/          terminal toolkit + chat view
  packages/agent/        CLI wiring, system prompt
    extensions/ extproto/ modes/ tools/ skills/ swarm/
    sdk/  (was pkg/zotcore, package renamed zotcore -> sdk)
    ext/  (was pkg/zotext, package renamed zotext -> ext)

internal/ and pkg/ removed. The internal/assets logo moved into
packages/provider/auth/assets.

Public Go SDK identifiers renamed:
  pkg/zotcore (package zotcore) -> packages/agent/sdk (package sdk)
  pkg/zotext  (package zotext)  -> packages/agent/ext (package ext)

This breaks Go-based extensions and embedders; the JSON wire protocol
for extensions and RPC is unchanged, so non-Go extensions, already-
built extension binaries, and zot rpc consumers are unaffected.

Docs, examples, and the built-in write-zot-extension skill updated
for the new paths and identifiers. Shadow-bug fixes in code samples
(ext := ext.New -> e := ext.New).
2026-05-27 09:07:15 +02:00
patriceckhart
d55af26936 document default thinking level 2026-05-26 18:11:30 +02:00
patriceckhart
37ef90bbb3 add configurable thinking level 2026-05-26 18:07:33 +02:00
patriceckhart
80c0ac97a5 feat: auto-swarm summary, system-prompt nudge, README/docs
- Sub-agents are long-lived daemons that keep running on the inbox
  after the initial task, so agent.Wait() never unblocks for them.
  Replaced the Wait-based watcher with a per-turn OnTurnEnd callback:
  Agent.SetOnTurnEnd installs it under the agent mutex, the runner's
  stdout decoder fires it on every turn_end event from the child.
- trackSwarmAgent now subscribes via SetOnTurnEnd. First turn_end per
  tracked sub-agent marks it done; when every entry is done, zot
  flushes one [auto-swarm update] turn via SubmitOrQueue summarising
  each agent's status / task / transcript tail (and the turn error if
  any) so the main agent can recap the collective outcome in chat.
- System addendum extended to tell the model to expect that update
  message and treat it as observed state, not a new user request.
- README: /settings row in the slash-commands table, new /settings
  subsection covering both toggles, auto-swarm paragraph appended to
  the /swarm subsection, /settings listed as a read-only mid-turn
  command.
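
A rough sketch of the "first turn_end marks done, flush once when all are
done" bookkeeping described above. swarmTracker and its methods are
hypothetical names; the real code hooks in through Agent.SetOnTurnEnd and
SubmitOrQueue, which are not reproduced here:

```go
package main

import (
	"fmt"
	"sync"
)

// swarmTracker is a hypothetical stand-in for the auto-swarm bookkeeping.
type swarmTracker struct {
	mu      sync.Mutex
	done    map[string]bool
	flushed bool
	flush   func() // called exactly once, when every tracked agent is done
}

func newSwarmTracker(ids []string, flush func()) *swarmTracker {
	t := &swarmTracker{done: make(map[string]bool, len(ids)), flush: flush}
	for _, id := range ids {
		t.done[id] = false
	}
	return t
}

// onTurnEnd is what a per-agent turn-end callback would invoke.
func (t *swarmTracker) onTurnEnd(id string) {
	t.mu.Lock()
	fire := t.markDone(id)
	t.mu.Unlock()
	if fire {
		t.flush() // run outside the lock so flush may call back in
	}
}

// markDone records the first turn end for id and reports whether this call
// completed the set. Callers must hold t.mu.
func (t *swarmTracker) markDone(id string) bool {
	if _, tracked := t.done[id]; !tracked || t.flushed {
		return false
	}
	t.done[id] = true
	for _, d := range t.done {
		if !d {
			return false
		}
	}
	t.flushed = true
	return true
}

func main() {
	t := newSwarmTracker([]string{"a", "b"}, func() { fmt.Println("[auto-swarm update]") })
	t.onTurnEnd("a")
	t.onTurnEnd("a") // later turns from a long-lived agent are ignored
	t.onTurnEnd("b") // prints the update once
	t.onTurnEnd("b")
}
```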
2026-05-25 18:47:29 +02:00
patriceckhart
5293277d36 Load user skills by default 2026-05-24 20:12:06 +02:00
patriceckhart
cbd9442039 Document expanded provider and model catalog in README 2026-05-24 11:53:42 +02:00
patriceckhart
8cd8410ace Use ASCII ellipses throughout 2026-05-22 17:19:29 +02:00
patriceckhart
81c913aef9 feat(/study): accept an optional file or directory argument
/study previously hard-coded the prompt to 'the current directory'.
It now takes an optional path - typed, drag-dropped, or selected via
the @ file picker - and tailors the prompt to whatever was passed,
distinguishing files from directories via os.Stat and rendering paths
under cwd as relative for readability. With no argument, behaviour is
unchanged.

Examples:
  /study                          -> current directory (old behaviour)
  /study internal                 -> directory internal
  /study [dir:internal/]          -> directory internal (via @-picker)
  /study cmd/zot/main.go          -> file cmd/zot/main.go
  /study [file:cmd/zot/main.go]   -> file cmd/zot/main.go (via @-picker)
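
A sketch of the file-versus-directory distinction, assuming a hypothetical
describeStudyTarget helper. It does not parse the [dir:...] / [file:...]
tokens produced by the @-picker, and the returned wording only mirrors the
examples above:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// describeStudyTarget is a hypothetical helper: it classifies arg via
// os.Stat and renders paths under cwd as relative.
func describeStudyTarget(cwd, arg string) (string, error) {
	arg = strings.TrimSpace(arg)
	if arg == "" {
		return "the current directory", nil
	}
	abs := arg
	if !filepath.IsAbs(abs) {
		abs = filepath.Join(cwd, arg)
	}
	info, err := os.Stat(abs)
	if err != nil {
		return "", err
	}
	shown := abs
	rel, err := filepath.Rel(cwd, abs)
	outside := rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator))
	if err == nil && !outside {
		shown = rel // under cwd: show the shorter relative form
	}
	if info.IsDir() {
		return "directory " + shown, nil
	}
	return "file " + shown, nil
}

func main() {
	cwd, err := os.Getwd()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, arg := range []string{"", ".", "no-such-path"} {
		desc, err := describeStudyTarget(cwd, arg)
		fmt.Printf("%q -> %q (err: %v)\n", arg, desc, err)
	}
}
```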
2026-05-19 18:37:27 +02:00
patriceckhart
1aea23e419 swarm: drop git-worktree / isolation; agents share the host cwd
Each swarm subagent now runs with cwd == the parent zot's RepoRoot, just
like the main agent. No per-agent git worktree, no swarm/<id> branch, no
SetIsolation toggle, no 'i' dashboard shortcut, no --isolated flag. The
previous worktree flow was confusing (toggling 'i' on a running agent
couldn't reseat its cwd, so edits kept landing in the host repo anyway)
and shipped without a real use case.

Concretely:

- delete internal/swarm/worktree.go and the WorktreeManager interface.
- Config loses Worktree; SpawnReq loses Isolated; Agent loses Branch and
  Isolated; AgentSnapshot loses Branch and Isolated; agentMeta loses
  branch and isolated (older meta.json files still decode -- unknown JSON
  keys are ignored -- and buildDetachedAgent coerces any stale per-
  worktree Dir back to the live RepoRoot so detached agents resume in
  the right place).
- Swarm.Remove no longer calls into any worktree manager, so it can't
  accidentally git-worktree-remove the user's actual source tree; it
  only clears <swarm-root>/agents/<id>/.
- runner.go drops the <Dir>/.zot/session.json fallback (every plausible
  Dir is now the user's repo, where a stray .zot/ would litter the
  source tree); SessionPath is required and Spawn always populates it
  under <swarm-root>/agents/<id>/session.json.
- swarm dialog: remove isolate/SetIsolateFunc, the 'i' key handler, the
  MODE column, the mode/branch lines in the transcript header. Fix the
  transcript-view cursor row math (row += 4 was counting a now-removed
  branch row, leaving the caret one row above the editor accent bar).
- swarm slash command: drop /swarm isolate, /swarm unisolate, and the
  --isolated flag on /swarm new; trim the spawn-flag parser and tests.
- README and slash-suggest description updated; site copy updated in a
  separate commit.

Tests adjusted accordingly; full suite green.
2026-05-17 00:01:29 +02:00
patriceckhart
467ad0a990 modes: adapt slide-back chord hint to terminal
The "Press Option+up to slide back into input" hint shown under the
sliding-in queue was correct on Ghostty, iTerm2 (Meta=Option),
Terminal.app (Use Option as Meta), Alacritty, and Kitty -- all of
which send CSI 1;3A for Option+Up, which the input parser reads as
KeyUp + Alt.

VS Code's integrated terminal (xterm.js on macOS) swallows plain
Option as a compose modifier by default, so Option+Up never reaches
zot as an Alt-modified arrow. Option+Shift+Up does work there:
xterm.js emits CSI 1;4A (Shift+Alt), which the parser already
accepts as alt=true. The binding has always worked in VS Code;
only the displayed hint was misleading.

Fix: a small slideBackChordHint() helper that returns
"Option+Shift+up" when TERM_PROGRAM=vscode and "Option+up"
otherwise. interactive.go's queue-hint row calls it instead of
hardcoding the chord. The binding itself is unchanged -- both
chords work on every terminal -- the hint just adapts to what the
host actually delivers.
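
A minimal sketch of the helper. The commit names it slideBackChordHint();
passing the TERM_PROGRAM value as a parameter is an assumption made here so
the function is easy to exercise:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// slideBackChordHint picks the chord to display. The termProgram parameter
// is an assumption; the original presumably reads $TERM_PROGRAM itself.
func slideBackChordHint(termProgram string) string {
	// Case-insensitive match covers vscode / VSCode / VSCODE.
	if strings.EqualFold(termProgram, "vscode") {
		return "Option+Shift+up"
	}
	return "Option+up"
}

func main() {
	hint := slideBackChordHint(os.Getenv("TERM_PROGRAM"))
	fmt.Printf("Press %s to slide back into input\n", hint)
}
```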

README.md gains a one-sentence note under Queued messages
documenting both chords and that the hint adapts via
$TERM_PROGRAM.

Tests cover the VS Code branch, case-insensitive detection
(VSCode / VSCODE / VsCode), and the default for "", "ghostty",
"iTerm.app", "Apple_Terminal", "alacritty", "kitty".
2026-05-16 12:01:05 +02:00
patriceckhart
b11e6ed4e4 swarm: introduce /swarm dashboard, /btw-style transcript view, and per-session scope
A /swarm subsystem for long-running parallel subagents. Each agent runs
in its own subprocess against a fresh git worktree (branch swarm/<id>)
with its own persistent session file and unix-socket inbox; the parent
zot stays in the main session and pokes / observes them via the
dashboard.

Highlights:

- New internal/swarm package: Agent, Spawn/Resume/Kill/Remove, event log
  (events.jsonl), inbox protocol (listen/dial), worktree manager, exec
  runner that spawns "zot --swarm-agent ...".
- New internal/agent/swarm_agent.go: daemon-mode child entry point.
  Reuses the standard agent loop but persists turns to the supervisor-
  chosen session.json and streams events as JSONL on stdout. Mirror to
  events.jsonl is dormant while the supervisor's stdout pipe is alive so
  events do not get double-written.
- Resume reattaches in place: reuses the same worktree, session, branch
  and inbox path; carries forward the prior transcript replayed from
  events.jsonl. Resume no longer re-fires the original Task as a fresh
  user turn -- that was producing "agent busy; send cancel first" races.
- core.NewSessionAtPath plus an openOrCreateSession fallback so the
  child actually persists its session.json at the supervisor-chosen path
  on first spawn instead of running with sess==nil.
- Dashboard in internal/agent/modes/swarm_dialog.go + swarm_slash.go:
  list / new / kill / remove / resume / logs / send subcommands plus an
  interactive picker. Transcript view is /btw-style: an always-on
  inline editor at the bottom, streaming auto-follow, inline busy
  spinner with the agent's current activity such as "thinking" or
  "tool: edit". /model inside the spawn editor pops the global model
  picker.
- Per-session scope: each spawn is stamped with the host session's id
  and only shows in that session's /swarm dashboard. Pre-upgrade agents
  -- empty session_id -- remain visible everywhere as a safety net. The
  active scope is re-applied whenever loadSession swaps sessions.
- Resolve falls back to the provider's default model when the persisted
  cfg.Model is no longer in the catalogue, warns on stderr, and rewrites
  config.json so the next launch is silent.
- ReadEventLog folds back-to-back same-type identical-payload events
  within 250ms so events.jsonl files polluted by the old supervisor +
  mirror double-write read back cleanly.
- DrawLog gains an idle no-op fast path: identical buffer plus identical
  cursor = emit nothing, so the terminal's cursor blink keeps ticking in
  dialogs whose underlying agent is idle.
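
The event-log folding in the ReadEventLog item can be sketched like this. The
event type, its field names, and the choice to measure the 250ms window from
the last kept event are assumptions; the real events.jsonl schema is not shown
in this log:

```go
package main

import (
	"fmt"
	"time"
)

// event is a hypothetical, simplified row from events.jsonl.
type event struct {
	Type    string
	Payload string
	At      time.Time
}

const dedupWindow = 250 * time.Millisecond

// foldDuplicates drops an event when it has the same type and payload as the
// previously kept event and arrived within dedupWindow of it.
func foldDuplicates(in []event) []event {
	var out []event
	for _, ev := range in {
		if n := len(out); n > 0 {
			prev := out[n-1]
			gap := ev.At.Sub(prev.At)
			if prev.Type == ev.Type && prev.Payload == ev.Payload && gap >= 0 && gap <= dedupWindow {
				continue // double-write from supervisor + mirror
			}
		}
		out = append(out, ev)
	}
	return out
}

func main() {
	t0 := time.Now()
	log := []event{
		{"turn_end", "{}", t0},
		{"turn_end", "{}", t0.Add(10 * time.Millisecond)}, // folded
		{"turn_end", "{}", t0.Add(time.Second)},           // kept: outside the window
	}
	fmt.Println(len(foldDuplicates(log)))
}
```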

Slash UX:

- New /swarm command with subcommands; the suggester picks it up.
- README.md documents the full dashboard, CLI, and persistence story,
  and explicitly notes that /session export does NOT bundle subagents
  -- their worktree and unix-socket inbox cannot round-trip through a
  .zotsession.

Tests cover: SpawnReq + Resume lifecycle, session-id scoping + persistence,
default-child-args spawn vs resume contract, NewSessionAtPath at a fixed
path, model fallback when the configured model is gone, swarm dialog
behaviour -- auto-open editor, /model in spawn editor, transcript grows
without internal scroll, busy spinner, multi-message send -- event-log
dedup, swarm emitter dormant-until-orphan, and the DrawLog idle no-op +
change-breaks-fast-path invariants.
2026-05-16 11:53:20 +02:00
patriceckhart
ce78a49b87 add deepseek provider (api-key, openai-compatible v4 catalog) 2026-05-10 16:49:31 +02:00
patriceckhart
25cb7d5003 remove /yolo slash command
The runtime escape hatch was redundant: --no-yolo's per-call dialog
already exposes 'yes-always-this-session' which flips ConfirmGate
into allow-all mode without a separate command. Once a session
starts with --no-yolo, the only way to disable confirmations is now
to either pick the always-this-session option in the dialog or exit
and relaunch.

- slash_suggest.go: drop the /yolo entry from the autocomplete list.
- interactive.go: remove the case "/yolo" dispatch (falls through
  to 'unknown command') and delete the orphaned runYoloOn method.
- README: drop the /yolo row from the slash-command table and the
  trailing reference in the --no-yolo flag description.
2026-05-08 08:18:16 +02:00
patriceckhart
ef93175bf9 add Google Gemini provider
- internal/provider/gemini.go: REST client against
  generativelanguage.googleapis.com/v1beta/models/{id}:streamGenerateContent
  ?alt=sse, mapping our message/tool format to Gemini's Content/Part schema
  and translating SSE chunks into the existing assistant-message event
  stream. Handles text, tool calls, thought-summary parts, and per-model
  thinking config (thinkingBudget for 2.5, thinkingLevel for 3.x with
  Gemini-3-Pro pinned to LOW minimum).
- internal/provider/discover.go: DiscoverGoogle pages /v1beta/models and
  filters to chat-capable ids (skips embeddings, AQA).
- internal/provider/models.go: catalog entries for gemini-2.5-pro,
  2.5-flash, 2.5-flash-lite, 2.0-flash, 2.0-flash-lite.
- internal/auth: 'google' is a recognized provider; API-key probe hits
  /v1beta/models with x-goog-api-key. OAuth flows reject google with a
  clear 'API-key only' error since Gemini Advanced subscriptions don't
  issue API tokens.
- internal/agent: env lookup for GEMINI_API_KEY / GOOGLE_API_KEY,
  default model gemini-2.5-pro, NewClient wires provider.NewGemini,
  background model discovery, /login + /logout + rescue dialog all
  include google.
- README: new ### Google Gemini section with auth model, free-tier
  limits, and reasoning-config notes.
2026-05-07 21:15:34 +02:00
patriceckhart
380eca9615 drop rescue status bar claim from readme 2026-05-06 20:46:29 +02:00
patriceckhart
0a85be3c33 add model rescue picker and silent transient retries 2026-05-06 20:40:31 +02:00
patriceckhart
07c073055d Document Kimi provider support 2026-05-05 11:04:07 +02:00
patriceckhart
25b2bd4c96 feat: changelog on update, full-width highlights, @ file picker docs
Changelog dialog now shows only the changelog section from release notes with headings in accent color. Works for local 0.0.0 builds (fetches latest release). Full-width highlight bars fixed everywhere via erase-to-EOL and trailing ANSI preservation in truncateToWidth. Session ops dialog fixed. README documents the @ file picker.
2026-04-25 11:24:09 +02:00
patriceckhart
cb9de10ec6 feat: add ollama as first-class provider
Adds --provider ollama with auto-detection of local ollama at localhost:11434. No API key required for local models. Optional --api-key and --base-url for remote/authenticated instances. Uses the OpenAI chat completions client internally. Unknown models are accepted without catalog entries. Updated README with ollama documentation.
2026-04-24 19:13:45 +02:00
patriceckhart
b25a2bc854 feat: custom models with baseUrl + domain migration to www.zot.sh
Adds baseUrl support in models.json for local models (ollama, vLLM, etc). Migrates all install URLs and references from zot.patriceckhart.com to www.zot.sh.
2026-04-24 14:00:31 +02:00
patriceckhart
b245be02e5 feat(models): support user-defined models via models.json
Reads $ZOT_HOME/models.json at startup and merges user-defined models into the active catalog with highest precedence. Provider keys like openai-codex are normalized. Documented in README.
2026-04-23 23:09:32 +02:00
patriceckhart
1f28b62a47 style: replace middle-dot separators with ascii hyphens
Swept the TUI strings and README for the stray U+00B7 MIDDLE DOT
(\u00b7) separators left over from earlier UI iterations. They read
fine on terminals that render the glyph as a small bullet, but
on some fonts (especially the telegram desktop client, a few
linux terminal fonts) it renders as an off-center dot that
looks like a smudge or a broken pipe. Plain ' - ' is universally
readable and matches every other separator already in the
status bar and dialogs.

Touched:
  README.md                    paragraph separators
  modes/btw_dialog.go          header joiner
  modes/help.go                table row separators
  modes/interactive.go         status bar tags, telegram mirror
  modes/jump_dialog.go         row separators
  modes/login_dialog.go        header joiners, status line
  modes/model_dialog.go        model + source joiner
  modes/slash_suggest.go       commands list
  tui/view.go                  assorted tui separators

No functional change. go test ./... still passes.
2026-04-21 17:39:08 +02:00
patriceckhart
ce806272e0 feat(tui): /study slash command to prime the agent on the current project
New built-in /study command that runs a single canned prompt:
"Read and understand everything in the current directory." The
first thing most sessions need is project context, and typing
the full sentence every time is friction; /study turns that
into one keystroke-saving shortcut.

Dispatched through the same queue-or-start path as a typed
prompt, so it behaves identically:

  - idle  -> startTurn(studyPrompt)
  - busy  -> queued behind the running turn, delivered next

Also added to the README slash-commands table so /help output
and the top-level docs stay in sync with slashCatalog.
2026-04-21 08:59:53 +02:00
patriceckhart
13b8947fba tweak(tui): ctrl+c no longer interrupts a running turn
A single ctrl+c during a busy turn used to cancel the turn
(same as esc). That misfired a lot in practice because ctrl+c
is reflex muscle-memory ("be quiet" in a shell) rather than a
deliberate decision to kill a multi-minute model call you have
already paid tokens for. Users kept aborting expensive turns by
accident.

New behavior:

- busy + first ctrl+c  -> arms the exit hint, status line
                          reads "press ctrl+c again to exit,
                          esc to cancel the turn"; the turn
                          keeps running.
- busy + second ctrl+c (within ctrlCExitWindow = 2s)
                       -> exits zot.
- busy + esc           -> cancels the running turn (unchanged).
- idle + ctrl+c        -> clears editor/queue as before;
                          second press within 2s exits.

The double-tap-to-exit pattern now works the same from busy and
idle, which also matches the habits from python repls and
similar tools.
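
The double-tap rule can be sketched as a tiny state holder. ctrlCExitWindow
is named in the commit; the ctrlCState type and press method are hypothetical,
and resetting the armed state on other keys is left out:

```go
package main

import (
	"fmt"
	"time"
)

// ctrlCExitWindow matches the 2s window named in the commit message.
const ctrlCExitWindow = 2 * time.Second

// ctrlCState is a hypothetical holder for the "armed" exit hint.
type ctrlCState struct {
	armedAt time.Time // zero value means not armed
}

// press handles one ctrl+c and reports whether zot should exit. The first
// press only arms the hint; a second press inside the window exits. The
// running turn is never cancelled here; that stays on esc.
func (s *ctrlCState) press(now time.Time) (exit bool) {
	if !s.armedAt.IsZero() && now.Sub(s.armedAt) <= ctrlCExitWindow {
		return true
	}
	s.armedAt = now
	return false
}

func main() {
	var s ctrlCState
	now := time.Now()
	fmt.Println(s.press(now))                  // first press arms the hint
	fmt.Println(s.press(now.Add(time.Second))) // second press within 2s exits
}
```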

Also:
- assistant body keeps a 4-cell right gutter that mirrors the
  4-space left indent so wrapped prose sits in a symmetric
  column instead of kissing the terminal edge on ultra-wide
  windows. The prose cap itself is gone; the new
  assistantBodyRightPad constant replaces maxAssistantWidth.

- README Keys table + Queued messages paragraph updated to
  describe the new ctrl+c / esc split so the docs match the
  code.
2026-04-20 18:23:59 +02:00
patriceckhart
ad60f82390 docs(readme): point install one-liners at zot.patriceckhart.com
The website already redirects /install.sh and /install.ps1 to
the raw github files with a 301, so the short domain is the
stable public entry point for the installers. Updated the three
command snippets in the install section to match.

Nothing else moves -- the rest of the github URLs in the readme
(release page, clone, go install) still use github.com directly
since those aren't proxied.
2026-04-20 17:12:26 +02:00
patriceckhart
b29c6e7e2d chore: drop homebrew distribution
The homebrew-tap repo was never created and maintaining a
separate tap for a small tool adds release-pipeline surface
for no real benefit (install.sh and go install cover macos
already). Removed from:

- README.md install section
- .goreleaser.yaml brews block + the release header that
  advertised the brew one-liner
- .github/workflows/release.yml env export for
  HOMEBREW_TAP_TOKEN (no longer consumed)

No other surfaces referenced it. Installers (install.sh /
install.ps1) never mentioned brew.
2026-04-20 14:17:58 +02:00
patriceckhart
c08057804d docs(readme): dedicated /session subsection with fork + tree
Table row already covered the four ops in a dense one-liner; added
a full "### /session" subsection next to /sessions with one
paragraph per op (export, import, fork, tree) spelling out
defaults, path-handling, and the parent/child invariants behind
the tree view.
2026-04-20 11:37:52 +02:00
patriceckhart
7794a253b9 feat(session): /session fork + /session tree
Branch semantics for conversations: rewind to a past user message
and continue from there in a new session, with a visual tree
picker to switch between branches later.

/session fork
  Opens the /jump turn picker in fork mode. Pick any past user
  message; zot copies every message from the session start up to
  and including that turn into a new session file, records the
  parent id + fork point in the new meta, and swaps the running
  agent onto the new branch. The parent session file stays on
  disk unchanged; you can return to it later via /session tree.

/session tree
  Shows every session in the current cwd arranged by parent/child
  relationships. Depth-first flatten with two-space indent per
  level; the current session is tagged "[current]". Pick any
  other entry to switch into it (same semantics as /sessions).

Why both commands:
  /sessions remains the "flat list of everything in this
  directory" resume picker. /session tree is the fork-aware
  variant. /session fork is the equivalent of git branch; /session
  tree is the equivalent of checkout.

core additions:

  SessionMeta gains two fields:
    - Parent    string  (parent session ID, empty for roots)
    - ForkPoint int     (0-indexed message position of the cut)

  core.BranchSession(parentPath, root, cwd, version, upToIdx)
    Reads the parent session, writes a new session file in
    SessionsDir(root, cwd) containing the first upToIdx message
    rows + any usage rows that came before the cut. The new meta
    records Parent=<parent id>, ForkPoint=<upToIdx>, fresh id,
    cwd, Started, Version.

  core.BuildSessionTree(root, cwd) []*TreeNode
    Walks every session file in the cwd dir, reads each one's
    meta, links children to parents by ID. Returns the forest
    rooted at parentless sessions. Missing-parent sessions (if
    the parent file was manually deleted) surface as roots so
    they stay discoverable.
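
  A sketch of the parent/child linking and the depth-first flatten used by
  the tree view. The meta and treeNode types below are simplified,
  hypothetical stand-ins for SessionMeta and TreeNode, and the input is an
  in-memory slice rather than session files on disk:

```go
package main

import (
	"fmt"
	"strings"
)

// meta is a hypothetical subset of the session meta row.
type meta struct{ ID, Parent string }

type treeNode struct {
	Meta     meta
	Children []*treeNode
}

// buildForest links children to parents by ID. Sessions whose parent is
// empty or missing become roots so they stay discoverable.
func buildForest(metas []meta) []*treeNode {
	nodes := make(map[string]*treeNode, len(metas))
	for _, m := range metas {
		nodes[m.ID] = &treeNode{Meta: m}
	}
	var roots []*treeNode
	for _, m := range metas {
		n := nodes[m.ID]
		p, ok := nodes[m.Parent]
		if ok && m.Parent != "" && m.Parent != m.ID {
			p.Children = append(p.Children, n)
		} else {
			roots = append(roots, n)
		}
	}
	return roots
}

// flatten walks the forest depth-first with a two-space indent per level.
func flatten(nodes []*treeNode, depth int, out []string) []string {
	for _, n := range nodes {
		out = append(out, strings.Repeat("  ", depth)+n.Meta.ID)
		out = flatten(n.Children, depth+1, out)
	}
	return out
}

func main() {
	forest := buildForest([]meta{
		{ID: "parent"},
		{ID: "fork-1", Parent: "parent"},
		{ID: "fork-2", Parent: "parent"},
		{ID: "orphan", Parent: "deleted"},
	})
	for _, row := range flatten(forest, 0, nil) {
		fmt.Println(row)
	}
}
```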

  core.FindSessionByID(root, cwd, id) string
    O(n) lookup used when resolving a tree pick back to a file
    path. Files in the dir are small in practice.

  readSessionMeta helper (unexported) reads just the first line
    of a session file and decodes the meta; avoids loading the
    whole transcript when BuildSessionTree only needs the
    parent/id pair.

tui additions:

  session_tree_dialog.go
    Flat list with indent-based nesting to match the other
    picker dialogs' shape. Up/down moves; enter switches; esc
    cancels. Rows show "<relative-when> <prompt-preview> N msgs"
    with a muted "[current]" tag on the current session.

  interactive.go
    - sessionTreeDialog field + constructor.
    - /session fork / /session tree cases in doSessionOp.
    - doSessionFork flips pendingFork=true and opens the
      jumpDialog over the agent's current messages.
    - The jump-dialog key handler checks pendingFork; if set,
      routes the selection to applyForkSelection instead of the
      normal applyJumpSelection. pendingFork clears on select
      OR on dismiss so a later plain /jump isn't hijacked.
    - applyForkSelection calls FlushSession (so the branch gets
      everything in memory, not just what was lazy-flushed),
      then core.BranchSession, then LoadSession to swap.
    - doSessionTree calls FlushSession first so the tree shows
      the true current message count, then
      core.BuildSessionTree, then hands the forest to the tree
      dialog.
    - applySessionTreeSelection hands the picked path to
      LoadSession.

tests:

  TestBranchSessionCopiesPrefix
    Parent with three messages; branch at upToIdx=2; verify the
    child has exactly 2 messages, parent ID matches, fork point
    = 2, ID rotated.

  TestBuildSessionTree
    Parent + 2 branches off it; verify roots=[parent],
    roots[0].Children has both branches.

README: /session row expanded to cover all four ops.
2026-04-20 11:10:56 +02:00
patriceckhart
ef80f9cd80 feat(session): /session export + import with portable .zotsession file
Lets one user hand a conversation off to another machine or
user. New slash command:

  /session                    picker with export / import rows
  /session export             defaults to ~/Downloads/<name>.zotsession
  /session export ~/foo       writes ~/foo.zotsession
  /session export ~/bar/x.zs  writes to that exact path (ext added if missing)
  /session import <path>      loads and switches to it

Exported file is the same jsonl the live session writes, with
the meta row rewritten to strip the source user's cwd. The
importer rotates the id and cwd to claim the copy, so the
imported session becomes a first-class entry in the current
user's sessions/ directory and shows up in /sessions,
/jump, and on-disk summaries like any other.

core/session_portable.go (new)
  - ExportSession(src, dst) string  returns the resolved
    output path. dst can be a file, a directory, or a bare
    name missing the .zotsession ext; all three shapes land
    somewhere sensible.
  - ImportSession(src, root, cwd, version) string  returns
    the newly-created session file path, ready for
    OpenSession.
  - firstUserPrompt() + slugify() build descriptive
    "20260420-080305-3f268850-say-hello-in-one-sentence.zotsession"
    filenames when exporting into a directory.
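
  A sketch of how such a filename could be assembled. The slug rules and the
  exportName signature are assumptions inferred from the example filename
  above, not the actual session_portable.go code:

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// slugify lowercases s and joins runs of ASCII letters and digits with
// single hyphens. This rule set is an assumption.
func slugify(s string) string {
	var b strings.Builder
	pendingDash := false
	for _, r := range strings.ToLower(s) {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') {
			if pendingDash && b.Len() > 0 {
				b.WriteByte('-')
			}
			pendingDash = false
			b.WriteRune(r)
		} else {
			pendingDash = true
		}
	}
	return b.String()
}

// exportName builds "<timestamp>-<id>-<slug>.zotsession".
func exportName(started time.Time, id, firstPrompt string) string {
	name := started.Format("20060102-150405") + "-" + id
	if slug := slugify(firstPrompt); slug != "" {
		name += "-" + slug
	}
	return name + ".zotsession"
}

// exampleStamp reproduces the timestamp from the example filename.
var exampleStamp = time.Date(2026, 4, 20, 8, 3, 5, 0, time.UTC)

func main() {
	fmt.Println(exportName(exampleStamp, "3f268850", "Say hello in one sentence."))
}
```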

core/session_portable_test.go (new)
  - Full round trip: write → export → import into a
    different cwd → OpenSession → message payloads match.
  - Verifies the exported meta drops the original cwd.
  - Verifies the .zotsession extension is appended when
    missing from dst.

modes/session_ops_dialog.go (new)
  - Tiny picker matching the telegramDialog / logoutDialog
    shape: arrow keys, enter, esc. Two rows (export / import)
    with muted hint text.

modes/interactive.go
  - sessionOpsDialog field + constructor + key dispatch +
    render selector, identical boilerplate to the other small
    dialogs.
  - openSessionOpsDialog, doSessionOp, doSessionExport,
    doSessionImport. Export uses CurrentSessionPath (new
    config hook); import calls core.ImportSession then routes
    through the existing LoadSession so the agent switches to
    the new file.
  - defaultExportDir (~/Downloads → ~ → /tmp fallback),
    expandTilde, friendlyPath helpers.

cli.go
  - CurrentSessionPath: sess.Path getter wired into the
    interactive config.

slash_suggest.go + README
  - /session listed in the slash catalog and the README
    commands table, with a short description of the two
    direct forms.

Not wired into the session_dialog.go picker (which stays
resume-only); a later change could add "export this one"
directly from the picker rows if that's useful.
2026-04-20 10:04:33 +02:00
patriceckhart
9a32f9cf5c docs(readme): document /telegram bridge + split from standalone daemon
README: the Telegram section now leads with "two ways to run it"
and splits into a "From inside the TUI" subsection (covering
/telegram connect/disconnect/status, the you:/zot: mirroring
convention, the · tg · status tag, and the refuse-when-daemon-
running guard) followed by the existing "Standalone daemon"
subsection (unchanged content, renamed heading).

No code change; description only.
2026-04-20 09:35:25 +02:00
patriceckhart
098a79743d feat(tui): /telegram connect | disconnect | status
The Telegram bridge can now mirror into the running TUI session.
Runs inside the zot process (no daemon needed); DMs from the
paired user become prompts in the current agent, and the
assistant's final text is sent back to Telegram. You see the full
conversation in the TUI in real time and on your phone.

UI:
  - /telegram or /tg with no arg opens a picker (connect /
    disconnect / status) that reflects current state.
  - /telegram connect  starts the bridge. Refuses if bot.json
    has no token (tells you to run `zot telegram-bot setup`) or
    if the background daemon is already polling.
  - /telegram disconnect  stops the bridge cleanly.
  - /telegram status  one-liner: "connected as @botname, paired
    with user X" / "background daemon running (pid N)" /
    "not configured" / "disconnected".
  - Status bar gets a "· tg · ~/cwd" tag while the bridge is
    active, next to the "· jailed ·" tag if that's also on.

How it's wired:

  internal/agent/modes/telegram/bridge.go (new)
    A slim Bridge type that owns the long-poll loop + typing
    indicator + reply sender but delegates the agent side to a
    Host interface. Not an agent itself - just a courier that
    pushes inbound DMs at a host and relays outbound text.

  internal/agent/modes/telegram_dialog.go (new)
    Picker with connect / disconnect / status rows. Shape
    mirrors the logout dialog: arrow keys, enter, esc.

  internal/agent/modes/interactive.go
    - New SubmitOrQueue(text, images) that runs if idle or
      queues if busy. Telegram Host calls this so DMs use the
      same queuing semantics as the user's editor submit.
    - New CancelTurn() for when Telegram sends /stop.
    - telegramHost adapter wires the Interactive to the
      bridge without a cyclic import (bridge lives in
      modes/telegram, interactive in modes; the adapter is
      in modes so it's fine).
    - EvAssistantMessage handler now also forwards the final
      visible text to the bridge when active (goroutine, so
      the network call doesn't hold the event-loop lock).
    - Bridge is stopped on zot exit via a defer in Run().

  internal/tui/view.go
    StatusBarParams gains Telegram bool; the cwd line builds a
    composite "· jailed · tg · ~/cwd" when both tags apply.

  internal/agent/modes/slash_suggest.go
    /telegram added to the slash catalog.

Collision safety:
  /telegram connect refuses when the background daemon
  (telegram.IsRunning via bot.pid) is alive. Two concurrent
  long-poll consumers of the same bot always race, and one
  drops half the updates; refusing up front beats silently
  half-working. The refusal message tells the user exactly
  what to do.

Attachments:
  Image attachments arriving in Telegram are downloaded and
  queued as user-prompt images (same code path as drag-drop).
  Non-image attachments are ignored for now.

Pairing:
  First Telegram user to DM /start claims the bridge; the id
  is persisted to bot.json so subsequent connects are already
  paired. Anyone else DMing the bot gets "this bot is paired
  with a different user."

README: /telegram row added to the slash-commands table.
2026-04-20 09:18:04 +02:00
patriceckhart
b6fc3fd886 rename: /lock -> /jail, /unlock -> /unjail
User-facing slash commands renamed to /jail and /unjail. The
internal Sandbox type (Lock/Unlock/Locked methods, atomic.Bool
field) keeps its mutex-style names because those describe the
implementation, not the feature. Everything the user sees swaps:

- slashCatalog: /jail + /unjail entries and descriptions.
- runSlash handlers: case "/jail" / case "/unjail"; status line
  reports "jailed to <cwd>" / "unjailed".
- Status bar tag: "· jailed · ~/cwd" (was "· locked ·").
- Sandbox error messages: "jailed: path X is outside sandbox
  root Y (use /unjail to disable)" etc.
- README: table rows, section heading, body text, busy-mode
  section all updated.
- Website (/Users/pat/Sites/zot): Tools section prose updated.
- SDK doc comment in pkg/zotcore refers to /jail.

Internal identifiers (Sandbox, Lock(), Unlock(), Locked(),
CheckPath, CheckCommand, slashCancelsTurn switch) unchanged.

Verified: go vet clean, go test -race ./... clean, bun
typecheck + lint + build clean on the site.
2026-04-20 08:57:40 +02:00
patriceckhart
e2f2092478 fix(no-yolo): don't auto-refuse tool calls in non-interactive modes
Previously --no-yolo in -p / --json / rpc modes auto-refused every
tool call. That made the flag dangerous to pass to scripts: a
single --no-yolo in a shell config or wrapper script would silently
break any tool-using prompt.

New behaviour:
  - Default: every mode is yolo (tools run freely, no prompts).
  - --no-yolo + interactive TUI: confirm dialog before each tool.
  - --no-yolo + -p / --json / rpc: stderr warning and ignore the
    flag. Tools run freely; scripts keep working.
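
The three cases above can be read as one small decision function. This is a sketch only: the Mode type and gatePolicy are hypothetical names for illustration, not identifiers from the zot source.

```go
package main

import "fmt"

// Mode is a hypothetical enum for how zot was started.
type Mode int

const (
	ModeInteractive Mode = iota // interactive TUI
	ModePrint                   // -p
	ModeJSON                    // --json
	ModeRPC                     // rpc
)

// gatePolicy returns whether tool calls go through the confirm dialog
// and whether --no-yolo is being ignored with a stderr warning.
func gatePolicy(mode Mode, noYolo bool) (confirm, warnIgnored bool) {
	switch {
	case !noYolo:
		return false, false // default: yolo in every mode
	case mode == ModeInteractive:
		return true, false // dialog before each tool call
	default:
		return false, true // no prompt available: warn, run freely
	}
}

func main() {
	names := map[Mode]string{ModeInteractive: "tui", ModePrint: "-p", ModeJSON: "--json", ModeRPC: "rpc"}
	for _, m := range []Mode{ModeInteractive, ModePrint, ModeJSON, ModeRPC} {
		confirm, warn := gatePolicy(m, true)
		fmt.Printf("%-6s confirm=%v warnIgnored=%v\n", names[m], confirm, warn)
	}
}
```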

The TUI confirm dialog and /yolo runtime toggle still work as
before. Also removed the unused wireNoYoloAutoRefuse helper and
simplified core.NewConfirmGate's doc comment.
2026-04-19 19:17:05 +02:00
patriceckhart
ac6d556f0a feat(tool-gate): --no-yolo flag, confirm dialog, /yolo runtime toggle
Adds a per-tool-call confirmation gate. Default stays yolo mode
(tools run freely, same as today). Pass --no-yolo to require
explicit user approval before each tool invocation.

Interactive TUI:
  A dialog appears before every tool call. Shows the tool name and
  a one-line preview of its args (command / path / url / etc.)
  with four choices, selectable by arrow keys or numeric shortcut:

    1. yes                     (run this call)
    2. yes, always this tool   (skip prompts for this tool,
                                session-scoped)
    3. yes, always             (skip prompts for every tool,
                                session-scoped)
    4. no                      (refuse and let the model try
                                something else)

  Esc/ctrl+c refuses the current prompt. Esc during a running turn
  both cancels the turn AND drains any pending confirm so the
  agent goroutine doesn't deadlock. Multiple pending confirms are
  queued and answered one at a time with a count visible in the
  header.

  Type /yolo to disable the gate for the rest of the session
  (equivalent to the "yes, always" choice but without needing a
  pending prompt). Any currently-open confirm auto-allows so the
  agent keeps moving.
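
A minimal sketch of the session-scoped memory described above, assuming a gate that remembers "always this tool" and "always" answers behind a mutex. Gate, Decision, and the ask callback are illustrative names; the real ConfirmGate in internal/core/confirm.go may be shaped differently.

```go
package main

import (
	"fmt"
	"sync"
)

// Decision is a hypothetical stand-in for the four dialog choices.
type Decision int

const (
	Yes Decision = iota
	YesAlwaysTool
	YesAlways
	No
)

// Gate remembers "always" answers for the lifetime of one session.
type Gate struct {
	mu       sync.Mutex
	allowAll bool
	tools    map[string]bool
	ask      func(tool string) Decision // asks the user; assumed to block
}

func NewGate(ask func(string) Decision) *Gate {
	return &Gate{tools: map[string]bool{}, ask: ask}
}

// AllowAll mirrors the /yolo runtime toggle.
func (g *Gate) AllowAll() {
	g.mu.Lock()
	g.allowAll = true
	g.mu.Unlock()
}

// Allowed reports whether the tool call may run, asking only when
// no remembered decision covers it.
func (g *Gate) Allowed(tool string) bool {
	g.mu.Lock()
	remembered := g.allowAll || g.tools[tool]
	g.mu.Unlock()
	if remembered {
		return true
	}
	d := g.ask(tool) // the lock is not held while waiting on the user
	g.mu.Lock()
	defer g.mu.Unlock()
	switch d {
	case YesAlwaysTool:
		g.tools[tool] = true
	case YesAlways:
		g.allowAll = true
	}
	return d != No
}

func main() {
	asked := 0
	g := NewGate(func(string) Decision { asked++; return YesAlwaysTool })
	first := g.Allowed("bash")
	second := g.Allowed("bash") // remembered, no second prompt
	other := g.Allowed("read")  // different tool, prompts again
	fmt.Println(first, second, other, "prompts:", asked)
}
```

Releasing the lock before calling ask keeps a slow user answer from blocking AllowAll, which is the path /yolo would take in this sketch.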

Print / JSON / RPC modes:
  No interactive prompt is available, so every tool call is
  auto-refused with a reason the model can learn from:
  "tool call refused: --no-yolo is active and there is no
  interactive prompt in this mode; ask the user what to do
  instead". Observed behaviour: the model pivots to asking the
  user directly instead of looping on the same tool.

Implementation:
  internal/core/confirm.go
    - ConfirmDecision, Confirmer interface
    - ConfirmGate with session-scoped memory for "always this tool"
      and "always everything" decisions, both concurrency-safe
    - BuildPreview: turns {"command":"ls"} into "ls", etc.
    - Lives in core to avoid a modes -> agent import cycle
  internal/core/confirm_test.go
    - Tests: nil gate allows, nil-inner refuses with reason, one-
      shot allow doesn't remember, remember-tool short-circuits
      only same tool, remember-all short-circuits everything,
      refusal reasons surface, empty-reason gets a default,
      runtime AllowAll works, BuildPreview handles each field
  internal/agent/modes/confirm_dialog.go
    - Queue-based dialog, HandleKey wiring, CancelAll and
      AllowAllPending for the two exit cases
  internal/agent/modes/interactive.go
    - InteractiveConfig gains NoYolo + ConfirmGate fields
    - Interactive implements core.Confirmer via a response channel
    - Confirm dialog dispatched FIRST in the key-handler chain so
      keys never leak to other dialogs while the agent is blocked
    - Esc-while-busy also calls confirmDialog.CancelAll so the
      agent unblocks
    - /yolo slash command handled in runSlash
  internal/agent/cli.go
    - Constructs the ConfirmGate when args.NoYolo is set,
      BeforeToolExecute calls it first, extensions only see calls
      the user already approved
    - After iv is built, SetConfirmer(iv) wires the gate's inner
      so interactive + gate share the same struct
    - wireNoYoloAutoRefuse() for print / json modes
  internal/agent/args.go
    - --no-yolo flag and help text
  internal/agent/modes/slash_suggest.go
    - /yolo added to slashCatalog

Verified end-to-end: fresh zot --no-yolo -p "read sample.ts" now
returns "I can't read files in this mode (--no-yolo without an
interactive prompt). How would you like to proceed" instead of
actually reading.
2026-04-19 19:12:45 +02:00
patriceckhart
f371687654 perf(anthropic): fix cost double-count, tighten caching, correct catalog
The status-bar was showing 2x the real cost. Anthropic's SSE stream
sends the full cumulative usage payload on both message_start AND
message_delta, and our code was summing them with += on each. Cache
tokens, the biggest cost component on multi-turn sessions, were
therefore counted twice on every single API call.

Fix: assign instead of accumulate within one Stream() invocation.
Cross-call accumulation still happens correctly in
core.CostTracker.Add(). Verified end-to-end: a truly fresh "read
sample.ts on desktop" session that used to report $0.15 now reports
$0.07 with the same cache-hit rate.
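
To illustrate the fix: a sketch of assign-within-a-stream versus accumulate-across-calls. The Usage and Tracker types are simplified stand-ins, not the real provider or core.CostTracker types, and the sketch assumes every usage payload carries all fields.

```go
package main

import "fmt"

// Usage is a hypothetical, trimmed-down token usage record.
type Usage struct {
	Input, Output, CacheRead int
}

// streamUsage folds the usage payloads seen during one stream.
// Each payload is cumulative, so the latest value replaces the
// previous one instead of being added to it.
func streamUsage(events []Usage) Usage {
	var u Usage
	for _, ev := range events {
		u = ev // assign, not +=
	}
	return u
}

// Tracker accumulates across API calls, where summing is correct.
type Tracker struct{ Total Usage }

func (t *Tracker) Add(u Usage) {
	t.Total.Input += u.Input
	t.Total.Output += u.Output
	t.Total.CacheRead += u.CacheRead
}

func main() {
	// Two payloads from one call: one at start, one at the final delta.
	events := []Usage{
		{Input: 120, Output: 1, CacheRead: 5000},
		{Input: 120, Output: 340, CacheRead: 5000},
	}
	var t Tracker
	t.Add(streamUsage(events)) // first API call
	t.Add(streamUsage(events)) // second API call
	fmt.Printf("%+v\n", t.Total)
}
```

Summing inside streamUsage would have counted CacheRead as 10000 for a single call, which is the doubling the status bar was showing.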

While chasing that, audited and corrected the rest of the request
pipeline so the cache actually hits cleanly.

Provider layer (internal/provider/anthropic.go):
  - cache_control on the Claude Code identity line (was uncached),
    giving Anthropic a first stable checkpoint independent of the
    user system prompt. Turns a cold start from R=0 into R>0 for
    any subsequent fresh session within the cache TTL.
  - tool_result blocks go in their OWN new user message instead of
    merging into the preceding user message. Merging was mutating
    the prior user message's content array between turns, busting
    byte-identical prefix match in Anthropic's cache.
  - tagLastUserCache: exactly one cache_control on the last user
    message (was two), so identity + sysprompt + last-tool +
    last-user fits Anthropic's 4-breakpoint budget exactly.
  - user-agent dropped its "(external, cli)" suffix to match the
    canonical Claude Code string exactly.
  - ZOT_DEBUG_ANTHROPIC=<path> env hook appends each outgoing
    request body (one JSON object per line) to that file. Off by
    default; for debugging cache / cost issues in the field.
  - Usage field handling now correctly assigns the latest value
    from each SSE event instead of summing.

Core (internal/core/tool.go):
  - Registry.Specs() now sorts tools alphabetically. Go map
    iteration order is randomized per call; randomized tool arrays
    were breaking Anthropic's byte-level prefix match on every
    single call within a session.
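
A sketch of the deterministic ordering, assuming the registry keeps its tools in a map keyed by name. Registry and Spec here are simplified stand-ins for the real types in internal/core/tool.go.

```go
package main

import (
	"fmt"
	"sort"
)

// Spec is a simplified tool description.
type Spec struct{ Name, Description string }

// Registry holds tools in a map, so range order is not stable.
type Registry struct{ tools map[string]Spec }

// Specs returns the tool specs sorted by name so the serialized
// request is byte-identical from one call to the next.
func (r *Registry) Specs() []Spec {
	names := make([]string, 0, len(r.tools))
	for name := range r.tools {
		names = append(names, name)
	}
	sort.Strings(names)
	out := make([]Spec, 0, len(names))
	for _, name := range names {
		out = append(out, r.tools[name])
	}
	return out
}

func main() {
	r := &Registry{tools: map[string]Spec{
		"write": {Name: "write"},
		"bash":  {Name: "bash"},
		"read":  {Name: "read"},
		"edit":  {Name: "edit"},
	}}
	for _, s := range r.Specs() {
		fmt.Println(s.Name)
	}
}
```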

System prompt (internal/agent/systemprompt.go):
  - Restored a substantial default prompt with structured tools +
    operating guidelines sections. The earlier aggressive trim
    dropped us under Anthropic's 1024-token minimum cacheable
    prefix floor: prefixes below 1024 tokens are silently NOT
    cached by Anthropic, so every fresh session started cold with
    R=0 no matter what else we did.
  - Current default ~1040 tokens on its own; with identity and
    tools it's ~1400, comfortably above the 1024 floor.
  - --system-prompt, --append-system-prompt, and
    $ZOT_HOME/SYSTEM.md escape hatches all still work and take
    precedence.

Model catalog (internal/provider/models.go):
  - claude-opus-4-5: 1M ctx / 128k max -> 200k ctx / 64k max. I had
    over-extrapolated; 1M context is a 4.6+ feature.
  - gpt-5.4: 400k -> 272k. Canonical value on both the OpenAI
    direct API and the ChatGPT Codex OAuth backend.
  - gpt-5.1, gpt-5.2, gpt-5.3, gpt-5.4-mini: pinned to 272k.
    OpenAI advertises 400k on direct and Codex caps at 272k. zot
    serves both from one catalog row per id, so we pin to the
    smaller number to keep the context-usage meter honest under
    subscription auth. Direct-API users see a conservative estimate
    instead of an inflated one.

README:
  - Tiny capitalization touch-up on the opening line.
2026-04-19 18:57:18 +02:00
patriceckhart
99c9ba8062 feat(ext): phase 4 - full-event interception, arg rewrites, /reload-ext
Clears every deferred extension todo in one push:

1) Interception expands to three events: tool_call (already shipped),
   turn_start (gate the turn before the model call, e.g. rate-limit /
   business-hour), and assistant_message (suppress or rewrite the
   user-visible text while keeping the model's original output in
   the transcript).

2) Tool-call args can now be rewritten mid-flight. An interceptor
   returning modified_args replaces the JSON the tool actually
   receives, without the model seeing the rewrite. Chains: each
   subscriber sees the previous one's output, letting guards
   successively redact / patch / augment. Invalid JSON is dropped
   safely.

3) /reload-ext hot-reloads every extension without restarting zot.
   The manager gracefully shuts down all running subprocesses,
   re-reads extension.json from disk, respawns (including --ext
   paths remembered from startup), and the host rebuilds the agent's
   tool registry in-place so freshly-registered tools are callable
   immediately.
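
The arg-rewrite chain from item 2 can be sketched as a fold over interceptors. The names below (Result, Interceptor, foldIntercepts, and the three example interceptors) are hypothetical and only loosely modeled on the InterceptResult fields listed further down; the real manager talks to subprocesses rather than calling Go functions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Result is a hypothetical shape for one interceptor's answer.
type Result struct {
	Block        bool
	Reason       string
	ModifiedArgs json.RawMessage
}

// Interceptor sees the args as left by the previous subscriber.
type Interceptor func(tool string, args json.RawMessage) Result

// foldIntercepts runs the chain in order. A block stops the chain;
// a valid rewrite replaces the args for the next subscriber; an
// invalid rewrite is dropped and the current args are kept.
func foldIntercepts(chain []Interceptor, tool string, args json.RawMessage) (json.RawMessage, bool, string) {
	cur := args
	for _, ic := range chain {
		res := ic(tool, cur)
		if res.Block {
			return nil, false, res.Reason
		}
		if len(res.ModifiedArgs) > 0 && json.Valid(res.ModifiedArgs) {
			cur = res.ModifiedArgs
		}
	}
	return cur, true, ""
}

// blockRm refuses any bash call whose args mention "rm -rf".
func blockRm(tool string, args json.RawMessage) Result {
	if tool == "bash" && strings.Contains(string(args), "rm -rf") {
		return Result{Block: true, Reason: "rm -rf is not allowed"}
	}
	return Result{}
}

// guardBash rewrites every other bash call to a harmless echo.
func guardBash(tool string, _ json.RawMessage) Result {
	if tool != "bash" {
		return Result{}
	}
	return Result{ModifiedArgs: json.RawMessage(`{"command":"echo GUARDED"}`)}
}

// brokenRewrite returns malformed JSON, which the fold ignores.
func brokenRewrite(string, json.RawMessage) Result {
	return Result{ModifiedArgs: json.RawMessage(`{not json`)}
}

func main() {
	chain := []Interceptor{blockRm, guardBash, brokenRewrite}
	out, ok, reason := foldIntercepts(chain, "bash", json.RawMessage(`{"command":"ls"}`))
	fmt.Println(string(out), ok, reason)
	_, ok, reason = foldIntercepts(chain, "bash", json.RawMessage(`{"command":"rm -rf /"}`))
	fmt.Println(ok, reason)
}
```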

Wire-format changes (extproto):
- EventInterceptResponseFromExt gains modified_args and replace_text
  fields (both optional, ignored when block=true).
- EventInterceptFromHost gains Step (for turn_start) and Text (for
  assistant_message) alongside the existing tool_call payload.

Core agent changes:
- BeforeToolExecute signature now returns (allowed, reason,
  modifiedArgs json.RawMessage). Non-nil+valid JSON args replace
  tc.Arguments before Tool.Execute runs.
- New BeforeTurn hook, invoked in runLoop before oneTurn. Blocking
  cancels the turn with an EvTurnEnd{StopError} carrying the reason.
- New BeforeAssistantMessage hook, invoked after finalMsg is
  assembled but before the EvAssistantMessage emit. Supports
  suppress (block=true) and text rewrite (replace_text). Transcript
  always gets the original; UI gets the rewritten text.
- New SetTools(reg) so /reload-ext can swap the registry on the
  live agent under the agent mutex.

Manager changes:
- InterceptToolCall now returns InterceptResult (Block, Reason,
  ModifiedArgs, ReplaceText), with a chain that folds rewrites.
- New InterceptTurnStart and InterceptAssistantMessage.
- New Reload(ctx, grace) tears down and respawns everything,
  returning ReloadStats{Stopped, Loaded, Ready, Errors}.
- New SetOnReload(fn) callback the host uses to rebuild the agent
  tool registry after a reload.
- LoadExplicit remembers --ext paths so Reload respawns them.
- subscribe accepts "tool_call", "turn_start", "assistant_message"
  under "intercept".

SDK (pkg/zotext):
- New handler types: ToolCallHandler, TurnStartHandler,
  AssistantMessageHandler, and their decision structs
  (ToolCallDecision with ModifiedArgs, AssistantMessageDecision
  with ReplaceText).
- New registration methods: InterceptToolCallX (rich variant of
  the existing InterceptToolCall), InterceptTurnStart,
  InterceptAssistantMessage.
- dispatchIntercept routes per-event with panic recovery and
  always emits exactly one event_intercept_response.

TUI:
- /reload-ext slash command registered in slashCatalog and
  runSlash. Added to slashCancelsTurn so it waits for idle like
  /compact does.
- runReloadExt shows a "reloading extensions..." status, runs the
  Manager.Reload on a goroutine, and reports the resulting stats.

Tests:
- internal/core/intercept_test.go: verifies args are actually
  rewritten on the way to Tool.Execute, malformed JSON is ignored,
  and block surfaces the reason as an error ToolResult.
- internal/agent/extensions/intercept_test.go: end-to-end with a
  bash extension subprocess that blocks rm -rf, rewrites other bash
  args to "echo GUARDED:", passes through read calls, allows
  turn_start, and redacts SECRET in assistant messages. Second test
  verifies Reload respawns the subprocess, re-registers its command,
  and fires the onReload callback.

Docs:
- docs/extensions.md: rewrote the intercept section to cover all
  three events, added a table of event_intercept_response fields,
  documented the /reload-ext hot-reload command, expanded the SDK
  section with examples of every handler, moved the old "future"
  items into a shipped Phase 4.
- README.md: extensions summary mentions intercept beyond tool_call,
  /reload-ext added to the slash-commands table and to the
  turn-cancel list in "Queued messages".
2026-04-19 17:02:04 +02:00
patriceckhart
9f031f941a docs: complete README, proper capitalization, add 130x130 logo
- Add the zot logo at the top of the README, sized to 130x130 via
  an HTML <img> tag so GitHub honours the dimensions. Reuses the
  existing internal/assets/zot-logo.png that's already embedded in
  the binary for the login callback pages, so the README is
  self-contained.
- Proper English capitalization in prose and section headings
  (intro bullets kept lowercase per the custom intro).
- Drop em-dashes in favour of periods, commas, semicolons.
- No emojis anywhere.
- Fill in the previously-missing flags: --ext / -e, --no-ext,
  --with-skills, --no-skill.
- Add /skills row to the slash-commands table, plus a dedicated
  /skills section describing the picker.
- New dedicated Extensions section documenting zot ext install /
  list / logs / enable / disable / remove, --ext for development,
  and pointing at examples/extensions + docs/extensions.md.
- New dedicated Skills section documenting the discovery layout,
  --with-skills opt-in, and the claude / agents compatibility
  paths.
- Updated $ZOT_HOME tree to include skills/ and extensions/.
- Read-only-while-busy slash list in "Queued messages" updated to
  include /skills.
- Source layout table expanded with internal/agent/extensions,
  internal/extproto, internal/skills, pkg/zotcore, pkg/zotext.
- JSON mode now links docs/rpc.md for the schema instead of the
  stale instructions.md §8 reference.
- ctrl+o description specifies which tools' output it collapses.
2026-04-19 16:25:48 +02:00
patriceckhart
b9e7517149 feat(tui): show github release notes once after upgrading
The first time a user launches a newer zot binary, the tui pops
a dismissible overlay with the release notes for that version.
Press any key to close; the version goes into config.json's
last_changelog_shown so the same notes never reappear.

Lifecycle:
  - dev builds (version "" / "dev" / "0.0.0"): no fetch ever
  - first-ever launch (no LastChangelogShown stored): seed it
    silently with the current version so fresh installs don't
    get release notes dumped at them
  - subsequent launches with the same version: skipped (config
    already records that version was shown)
  - launch with a different version: fetch the release page from
    https://api.github.com/repos/patriceckhart/zot/releases/tags/v<ver>
    and open the dialog if the body is non-empty
  - dismiss writes LastChangelogShown so it never repeats
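
The lifecycle above reduces to a small pure decision. A sketch, with Action and changelogAction as hypothetical names (the real helpers are the Should/Mark/Seed functions in internal/agent/changelog.go):

```go
package main

import "fmt"

// Action is what a launch should do about release notes.
type Action int

const (
	Skip  Action = iota // nothing to do
	Seed                // record the current version silently
	Fetch               // fetch notes and show them if non-empty
)

// changelogAction maps the running version and the stored
// last-shown version to one of the lifecycle branches.
func changelogAction(version, lastShown string) Action {
	switch version {
	case "", "dev", "0.0.0":
		return Skip // dev builds never fetch
	}
	if lastShown == "" {
		return Seed // first-ever launch
	}
	if lastShown == version {
		return Skip // already shown for this version
	}
	return Fetch
}

func main() {
	fmt.Println(changelogAction("dev", ""))        // dev build
	fmt.Println(changelogAction("1.2.0", ""))      // fresh install
	fmt.Println(changelogAction("1.2.0", "1.2.0")) // same version
	fmt.Println(changelogAction("1.3.0", "1.2.0")) // upgraded
}
```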

Components:
  - internal/agent/changelog.go: FetchChangelog/Async, and the
    Should/Mark/Seed helpers around config.LastChangelogShown.
    Honours $GITHUB_TOKEN exactly like the install scripts and
    the existing update check, so private-repo fetches work
    with auth.
  - internal/agent/modes/changelog_dialog.go: the overlay.
    Markdown body via the existing RenderMarkdown pipeline,
    scrollable with up/down/pgup/pgdn, any other key dismisses.
  - internal/agent/modes/interactive.go: new ChangelogChan and
    OnChangelogDismiss config fields, single-shot select case
    in Run() that opens the dialog when a payload arrives.
  - internal/agent/cli.go: spawns the fetch goroutine, gates it
    on ShouldShowChangelog, wires OnChangelogDismiss to
    MarkChangelogShown so the version is persisted.

Best-effort: the fetch times out at 4s; a missing tag => silent
skip; a network failure => silent skip + retry on next launch
(LastChangelogShown is not updated if we never showed anything).

Documented in the README under the SYSTEM.md note.
2026-04-19 16:12:13 +02:00
patriceckhart
e425dbba59 feat(systemprompt): $ZOT_HOME/SYSTEM.md as a persistent override
Resolution order for the system prompt is now:

  1. --system-prompt <text>    (per-run override; highest)
  2. $ZOT_HOME/SYSTEM.md       (persistent user override; new)
  3. built-in defaultIdentity + defaultGuidelines

When SYSTEM.md exists and --system-prompt is empty, its contents
replace the entire identity + guidelines block (same semantics
as --system-prompt). The tool list + skill manifest + appended
sections + date/cwd footer are still added on top, so the file
should contain just the identity / behavior text.

readUserSystemPrompt swallows file errors on purpose: a missing
or unreadable SYSTEM.md falls back to the built-in default
rather than crashing the run. Cached on Resolved.systemCustom
so MergeExtensionTools' system-prompt rebuild path also picks
up the override.
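
A sketch of the resolution order and the swallow-errors read. The function bodies are assumptions for illustration: the real readUserSystemPrompt may differ, and treating an empty ZOT_HOME or an empty file as "no override" is this sketch's choice, not something the commit states.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// readUserSystemPrompt returns the contents of SYSTEM.md under home,
// or "" when the file is missing or unreadable. Errors are swallowed
// on purpose so the caller falls back to the built-in default.
func readUserSystemPrompt(home string) string {
	if home == "" {
		return "" // assumption: no home means no override
	}
	b, err := os.ReadFile(filepath.Join(home, "SYSTEM.md"))
	if err != nil {
		return ""
	}
	return string(b)
}

// resolveIdentity picks the identity/behavior text:
// flag first, then SYSTEM.md, then the built-in default.
func resolveIdentity(flagPrompt, userPrompt, builtin string) string {
	switch {
	case flagPrompt != "":
		return flagPrompt
	case userPrompt != "":
		return userPrompt
	default:
		return builtin
	}
}

func main() {
	user := readUserSystemPrompt(os.Getenv("ZOT_HOME"))
	fmt.Println(resolveIdentity("", user, "built-in identity + guidelines"))
}
```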

End-to-end verified live:
  with SYSTEM.md (pirate persona): "Arrr, I be Zot, a scallywag..."
  without:                         "I'm zot, a lightweight terminal..."

Documented in README's $ZOT_HOME tree + the --system-prompt
flag note.
2026-04-19 15:47:57 +02:00
patriceckhart
463c62daf3 docs: clarify that no extensions are installed by default
Two-line addition to docs/extensions.md and a tightening of the
README bullet point. examples/extensions/* are reference code; a
fresh `zot install` gives you a clean agent. Users opt in by
copying examples (or any other extension) via `zot ext install`
or by pointing `zot --ext PATH` at a working directory for one
session.

No code changes.
2026-04-19 15:23:48 +02:00
patriceckhart
222a62c70f feat: skills — reusable instructions discovered from SKILL.md files
A skill is a single SKILL.md file with a YAML frontmatter header,
discovered from well-known directories at startup. Two integration
points:

  1. The system prompt gains a short manifest listing each skill's
     name + one-line description. Cheap (a few dozen tokens).
  2. A built-in `skill` tool lets the model load any one skill's
     full body on demand and follow the instructions there.

The on-demand-load model keeps token usage cheap: only the
manifest goes into every request; the body is fetched as a tool
result in the one or two turns where the model actually needs it.

Discovery (priority order — first match wins per name):
  ./.zot/skills/<name>/SKILL.md            project (native)
  $ZOT_HOME/skills/<name>/SKILL.md         global (native)
  ./.claude/skills/<name>/SKILL.md         project (claude-compat)
  ~/.claude/skills/<name>/SKILL.md         global (claude-compat)
  ./.agents/skills/<name>/SKILL.md         project (agent-compat)
  ~/.agents/skills/<name>/SKILL.md         global (agent-compat)

Compat paths are deliberate: any SKILL.md written for a related
ecosystem works in zot unchanged.
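
A sketch of "first match wins per name", assuming roots are scanned in priority order and a skill's name is its directory name. Skill, firstWins, and discover are illustrative names; the real internal/skills package also reads the frontmatter name, which this sketch skips.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Skill is a discovered SKILL.md, named after its directory.
type Skill struct{ Name, Path string }

// firstWins keeps the first candidate for each name, so earlier
// (higher-priority) roots shadow later ones.
func firstWins(candidates []Skill) []Skill {
	seen := make(map[string]bool)
	var out []Skill
	for _, s := range candidates {
		if seen[s.Name] {
			continue
		}
		seen[s.Name] = true
		out = append(out, s)
	}
	return out
}

// discover scans each root for <name>/SKILL.md in the order given.
func discover(roots []string) []Skill {
	var all []Skill
	for _, root := range roots {
		// Glob only errors on a malformed pattern; a missing root
		// simply yields no matches.
		matches, _ := filepath.Glob(filepath.Join(root, "*", "SKILL.md"))
		for _, p := range matches {
			all = append(all, Skill{Name: filepath.Base(filepath.Dir(p)), Path: p})
		}
	}
	return firstWins(all)
}

func main() {
	// Project-local roots only; the global roots would be added
	// in the priority order listed above.
	roots := []string{".zot/skills", ".claude/skills", ".agents/skills"}
	for _, s := range discover(roots) {
		fmt.Println(s.Name, s.Path)
	}
}
```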

Frontmatter fields:
  name           optional; defaults to directory name
  description    required; shown in the system prompt
  allowed-tools  optional list; informational (no enforcement)
  permissions    optional per-tool patterns; informational

allowed-tools and permissions are parsed but not enforced in this
version. They render in the body so the model can self-regulate.

What landed:

- internal/skills: discovery + frontmatter parsing (no yaml dep —
  hand-rolled subset for the limited shape skills use), the on-
  demand `skill` tool implementing core.Tool, system-prompt
  addendum, FindByName lookup helper. Real unit tests cover all
  six locations + dedup priority + parser corner cases.

- internal/agent/build.go: Resolve discovers skills, registers the
  skill tool when at least one was found, appends the manifest to
  the system prompt's append list. Resolved gains a SkillTool
  field so the tui can read the live set.

- internal/agent/modes/skills_dialog.go: /skills picker with two
  modes — list view (cursor + paging) and body view (markdown-
  rendered with scroll). Refreshes its snapshot each open via
  cfg.SkillSnapshot so edits to a SKILL.md during a session are
  reflected immediately.

- /skills slash command + entry in slashCatalog.

- examples/skills/code-review and examples/skills/test-fix as
  starter skills demonstrating procedural style + frontmatter.

- docs/skills.md: full reference covering discovery, frontmatter,
  inspection, authoring tips, and ecosystem compat.

End-to-end verified against the live anthropic backend:

  prompt: "What skills do you have available?"
  -> "- code-review\n- test-fix"

  prompt: "Use the skill tool to load the code-review skill,
           then summarize step 1."
  -> [tool_call] skill({"name":"code-review"})
  -> [tool_result] body returned
  -> "Step 1 is to establish what changed by running git status..."
2026-04-19 14:32:30 +02:00
patriceckhart
0c92d6e914 feat: extension system (subprocess + json-rpc, any language)
Phase 1: extensions can register slash commands and push chat
notifications. Tools and event subscriptions land in later phases.

Architecture: each extension is its own subprocess. Zot launches
it on startup, completes a hello/hello_ack handshake over its
stdin/stdout, then routes slash commands the extension registered.
Crash isolation, language agnostic, works with any executable
that can read/write json lines.

What lands here:

- internal/extproto: shared wire-format types (Frame, HelloFromExt,
  RegisterCommandFromExt, CommandResponseFromExt, NotifyFromExt,
  HelloAckFromHost, CommandInvokedFromHost, ShutdownFromHost...).
  Both the host and the SDK marshal/unmarshal the same types.

- internal/agent/extensions: discovery + lifecycle manager.
  - Discover() walks $ZOT_HOME/extensions and ./.zot/extensions
    (project-local first, global second; first wins for duplicates)
  - Spawns each enabled extension, captures stderr to
    $ZOT_HOME/logs/ext-<name>.log
  - Reads frames in a goroutine, dispatches register_command and
    notify, correlates command_response by id
  - Stop() sends shutdown, waits 2s, then SIGTERM/SIGKILL
  - HostHooks abstracts the tui callbacks (Notify/Submit/Insert/Display)

- Interactive bridge: extensions slot into the slash dispatcher
  *after* the built-in catalog, so built-ins always win on conflict.
  Extension-registered commands also flow into the autocomplete
  popup and /help via slashSuggester.SetExtra. NotifyFromExt frames
  render as muted [ext-name] notes above the editor.

- internal/agent/extcmd: `zot ext` CLI.
    list / install <path|git-url> / remove / enable / disable / logs

- pkg/zotext: public Go SDK. Construct an Extension, register
  Command(name, desc, fn), call Run(). Fn returns a Response built
  with Prompt(), Insert(), Display(), Noop(), or Errorf(). Stderr
  via Logf() so stdout stays clean for the protocol.

- examples/extensions/hello: working Go example registering /hello
  and /summon, plus README + extension.json.

- docs/extensions.md: full protocol reference, including a
  ~30-line raw-Python example for users who don't want the SDK.

Tests: internal/agent/extensions/manager_test.go spawns a mock
extension via /bin/sh and exercises the full handshake -> register
-> invoke -> response cycle. Verifies the hello frame ordering,
correlation-by-id, and graceful shutdown.

Verified manually: built and installed the example, drove it via
stdin pipes, confirmed clean handshake + correct frame ordering
and shutdown_ack. Builds vet-clean on darwin / linux / windows.

Editor.Insert exported (was Editor.insert) so the extension hooks
can drop text into the input.
2026-04-19 14:09:43 +02:00