Commit graph

230 commits

Author SHA1 Message Date
patriceckhart
63e28ad156 Close swarm event log in follower test 2026-05-16 14:18:12 +02:00
patriceckhart
d7fe461910 Fix swarm tests on Windows CI 2026-05-16 14:12:03 +02:00
patriceckhart
37526a6286 Format swarm code 2026-05-16 14:01:23 +02:00
patriceckhart
36f190af31 Deliver sliding-in messages during agent loop 2026-05-16 12:47:38 +02:00
patriceckhart
467ad0a990 modes: adapt slide-back chord hint to terminal
The "Press Option+up to slide back into input" hint shown under the
sliding-in queue was correct on Ghostty, iTerm2 (Meta=Option),
Terminal.app (Use Option as Meta), Alacritty, and Kitty -- all of
which send CSI 1;3A for Option+Up, which the input parser reads as
KeyUp + Alt.

VS Code's integrated terminal (xterm.js on macOS) swallows plain
Option as a compose modifier by default, so Option+Up never reaches
zot as an Alt-modified arrow. Option+Shift+Up does work there:
xterm.js emits CSI 1;4A (Shift+Alt), which the parser already
accepts as alt=true. The binding has always worked in VS Code;
only the displayed hint was misleading.

Fix: a small slideBackChordHint() helper that returns
"Option+Shift+up" when TERM_PROGRAM=vscode and "Option+up"
otherwise. interactive.go's queue-hint row calls it instead of
hardcoding the chord. The binding itself is unchanged -- both
chords work on every terminal -- the hint just adapts to what the
host actually delivers.

README.md gains a one-sentence note under Queued messages
documenting both chords and that the hint adapts via
$TERM_PROGRAM.

Tests cover the VS Code branch, case-insensitive detection
(VSCode / VSCODE / VsCode), and the default for "", "ghostty",
"iTerm.app", "Apple_Terminal", "alacritty", "kitty".
2026-05-16 12:01:05 +02:00
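For readers skimming 467ad0a990 above, a minimal sketch of what such a hint helper could look like. The commit names slideBackChordHint() and the $TERM_PROGRAM check; passing the value in as a parameter is an assumption of this sketch (it keeps the function trivially testable), and the real helper may read the environment itself.

```go
package main

import "strings"

// slideBackChordHint returns the chord to advertise in the queue hint.
// termProgram is assumed to be the value of $TERM_PROGRAM; the parameter
// is a choice made for this sketch, not necessarily the real signature.
func slideBackChordHint(termProgram string) string {
	// Case-insensitive match, so "VSCode", "VSCODE" and "vscode" all hit.
	if strings.EqualFold(termProgram, "vscode") {
		return "Option+Shift+up"
	}
	return "Option+up"
}
```

A caller would typically pass os.Getenv("TERM_PROGRAM") once when building the hint row.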
patriceckhart
b11e6ed4e4 swarm: introduce /swarm dashboard, /btw-style transcript view, and per-session scope
A /swarm subsystem for long-running parallel subagents. Each agent runs
in its own subprocess against a fresh git worktree (branch swarm/<id>)
with its own persistent session file and unix-socket inbox; the parent
zot stays in the main session and pokes / observes them via the
dashboard.

Highlights:

- New internal/swarm package: Agent, Spawn/Resume/Kill/Remove, event log
  (events.jsonl), inbox protocol (listen/dial), worktree manager, exec
  runner that spawns "zot --swarm-agent ...".
- New internal/agent/swarm_agent.go: daemon-mode child entry point.
  Reuses the standard agent loop but persists turns to the supervisor-
  chosen session.json and streams events as JSONL on stdout. Mirror to
  events.jsonl is dormant while the supervisor's stdout pipe is alive so
  events do not get double-written.
- Resume reattaches in place: reuses the same worktree, session, branch
  and inbox path; carries forward the prior transcript replayed from
  events.jsonl. Resume no longer re-fires the original Task as a fresh
  user turn -- that was producing "agent busy; send cancel first" races.
- core.NewSessionAtPath plus an openOrCreateSession fallback so the
  child actually persists its session.json at the supervisor-chosen path
  on first spawn instead of running with sess==nil.
- Dashboard in internal/agent/modes/swarm_dialog.go + swarm_slash.go:
  list / new / kill / remove / resume / logs / send subcommands plus an
  interactive picker. Transcript view is /btw-style: an always-on
  inline editor at the bottom, streaming auto-follow, inline busy
  spinner with the agent's current activity such as "thinking" or
  "tool: edit". /model inside the spawn editor pops the global model
  picker.
- Per-session scope: each spawn is stamped with the host session's id
  and only shows in that session's /swarm dashboard. Pre-upgrade agents
  -- empty session_id -- remain visible everywhere as a safety net. The
  active scope is re-applied whenever loadSession swaps sessions.
- Resolve falls back to the provider's default model when the persisted
  cfg.Model is no longer in the catalogue, warns on stderr, and rewrites
  config.json so the next launch is silent.
- ReadEventLog folds back-to-back same-type identical-payload events
  within 250ms so events.jsonl files polluted by the old supervisor +
  mirror double-write read back cleanly.
- DrawLog gains an idle no-op fast path: identical buffer plus identical
  cursor = emit nothing, so the terminal's cursor blink keeps ticking in
  dialogs whose underlying agent is idle.

Slash UX:

- New /swarm command with subcommands; the suggester picks it up.
- README.md documents the full dashboard, CLI, and persistence story,
  and explicitly notes that /session export does NOT bundle subagents
  -- their worktree and unix-socket inbox cannot round-trip through a
  .zotsession.

Tests cover: SpawnReq + Resume lifecycle, session-id scoping + persistence,
default-child-args spawn vs resume contract, NewSessionAtPath at a fixed
path, model fallback when the configured model is gone, swarm dialog
behaviour -- auto-open editor, /model in spawn editor, transcript grows
without internal scroll, busy spinner, multi-message send -- event-log
dedup, swarm emitter dormant-until-orphan, and the DrawLog idle no-op +
change-breaks-fast-path invariants.
2026-05-16 11:53:20 +02:00
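The event-log dedup mentioned in b11e6ed4e4 above (folding back-to-back identical events within 250ms) can be sketched roughly as follows. The Event type, its field names, and the choice to compare against the last kept event are all assumptions made for illustration; only the 250ms window and the "same type, identical payload" rule come from the commit message.

```go
package main

// Event is a hypothetical stand-in for one decoded events.jsonl row.
type Event struct {
	Type    string
	Payload string
	AtMs    int64 // timestamp in milliseconds (assumed representation)
}

// dedupWindowMs is the 250ms window named in the commit message.
const dedupWindowMs = 250

// foldDuplicates drops an event when it has the same type and payload as
// the previously kept event and is at most dedupWindowMs later than it.
// It assumes timestamps are non-decreasing.
func foldDuplicates(events []Event) []Event {
	out := make([]Event, 0, len(events))
	for _, ev := range events {
		if n := len(out); n > 0 {
			prev := out[n-1]
			if prev.Type == ev.Type && prev.Payload == ev.Payload &&
				ev.AtMs-prev.AtMs <= dedupWindowMs {
				continue // treat as a double-write of prev
			}
		}
		out = append(out, ev)
	}
	return out
}
```

Comparing only neighbouring events keeps the pass linear and leaves genuinely repeated events that are further apart untouched.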
patriceckhart
797041288d tui: strip ANSI from prompt before wrapping so cursor lands at end
When the interactive editor's Prompt carried ANSI styling — the
themed "▌ " glyph that interactive mode builds via
cfg.Theme.AccentBar — the raw escape bytes leaked into wrapLine's
per-rune width counter. Each byte of the SGR sequence (ESC, '[',
digits, ';', 'm') reported runewidth 1, so an 11-byte color escape
inflated the perceived prompt width by ~10 cells. wrapLine then
made wrap decisions against that inflated width, locateCursor
walked the inflated bodies, and Render() returned a cursor column
that landed inside the wrapped row instead of at its visible end.

The bug was intermittent because, depending on the typed buffer's
length, the geometry sometimes aligned by accident. Drag-and-dropping
a long screencaptureui temp path into VS Code's terminal reliably
triggered it, since the path stays inline (the temp file is already
gone by paste time so collapseOrQuoteFilePaths -> pathExists
returns false -> falls through to verbatim insert).

Fix: in Editor.Render, do all wrap and cursor math against a
plain-text prompt (ANSI stripped via stripANSI), then re-attach
the styled original to the very first wrapped row's leading
substring before returning. Continuation rows already use an
indent of spaces only, so they need no styling fixup. wrapLine
itself stays ANSI-unaware on purpose: the rest of the codebase
relies on its simple rune-based behaviour for plain text and
making it ANSI-aware would be a much bigger change with regression
risk elsewhere.

Adds editor_ansi_prompt_test.go which reproduces the exact captured
live scenario (the ANSI-themed prompt + the verbatim screencaptureui
path + ' hello' typed afterwards) and asserts the cursor lands at
the visible end of the last wrapped row.
2026-05-13 11:13:20 +02:00
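To make the width inflation described in 797041288d above concrete, here is a small sketch that strips SGR escapes before measuring. It is not the project's stripANSI; the regular expression only covers colour/style sequences, and visibleWidth is a hypothetical helper that assumes one terminal cell per rune.

```go
package main

import (
	"regexp"
	"unicode/utf8"
)

// sgrPattern matches CSI sequences ending in 'm' (SGR colour/style codes).
// Other escape sequences are deliberately out of scope for this sketch.
var sgrPattern = regexp.MustCompile(`\x1b\[[0-9;]*m`)

// stripANSI removes SGR sequences so width math sees only visible text.
func stripANSI(s string) string {
	return sgrPattern.ReplaceAllString(s, "")
}

// visibleWidth counts runes after stripping, assuming one cell per rune
// (wide and zero-width characters are ignored here).
func visibleWidth(s string) int {
	return utf8.RuneCountInString(stripANSI(s))
}
```

With a styled prompt such as "\x1b[38;5;208m▌ \x1b[0m", a naive per-rune count sees the 11 bytes of the colour escape as extra cells, while visibleWidth reports only the two visible ones.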
patriceckhart
8b4b62f240 tui: suppress \x1b[3J scrollback-clear on VS Code's terminal
VS Code's integrated terminal (xterm.js) interprets the
erase-in-display-3 escape (\x1b[3J) as "drop scrollback rows AND
snap the viewport to the top of the remaining buffer." Once the
user has reopened a terminal with VS Code's persistent-sessions
feature on, there is real replayed scrollback above the live
cursor, so the snap is visible: the host scrollbar yanks to the
top on every full repaint — first paint, Ctrl+L (Renderer.Clear),
and any writeFull(true) shrink.

Every other terminal we tested (iTerm, Ghostty, Kitty, Alacritty,
Apple Terminal) treats \x1b[3J as "drop scrollback rows without
moving the viewport," which is what we want.

Detect VS Code (and Cursor, which shares xterm.js) via
$TERM_PROGRAM == "vscode" in NewRenderer and stash the result on
the Renderer as keepScrollback. Gate all three emission sites
(resize handler, Clear(), writeFull(true)) through a single
helper clearScrollbackSeq() that returns "" when keepScrollback
is true and SeqClearScrollback otherwise.

Trade-off on VS Code: stale zot frames remain visible if you
scroll up in the terminal's scrollback. That is strictly less
disruptive than the scrollbar yanking on every Ctrl+L, and it is
limited to the one terminal that actually has the bug.
2026-05-13 10:37:48 +02:00
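The gating described in 8b4b62f240 above might be sketched like this. The names keepScrollback, clearScrollbackSeq and SeqClearScrollback come from the commit message; the reduced Renderer struct and the NewRenderer parameter are assumptions made so the example stands alone.

```go
package main

// SeqClearScrollback is the erase-in-display-3 escape named in the commit.
const SeqClearScrollback = "\x1b[3J"

// Renderer is a reduced, hypothetical stand-in for the real renderer.
type Renderer struct {
	keepScrollback bool
}

// NewRenderer takes the value of $TERM_PROGRAM as a parameter; that is an
// assumption of this sketch so the detection can be exercised directly.
func NewRenderer(termProgram string) *Renderer {
	return &Renderer{keepScrollback: termProgram == "vscode"}
}

// clearScrollbackSeq is the single gate for every emission site: it
// yields nothing on hosts where the escape would snap the viewport.
func (r *Renderer) clearScrollbackSeq() string {
	if r.keepScrollback {
		return ""
	}
	return SeqClearScrollback
}
```

Routing every site through one helper means a future terminal quirk only needs a change in the detection, not in the resize, Clear, or full-repaint paths.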
patriceckhart
f47996bc82 agent: add 'zot update' subcommand for in-place self-update
Adds a CLI subcommand that downloads the latest GitHub release for
the current GOOS/GOARCH, verifies its sha256 against checksums.txt,
extracts the archive, and atomically replaces the running binary.

  zot update           install the latest release
  zot update --check   show whether one is available, install nothing
  zot update --help    usage

Dispatch follows the same router shape as runBotCommand /
runExtCommand in cli.go. Asset naming stays in sync with the
archives.name_template in .goreleaser.yaml (zot_<ver>_<os>_<arch>).
Reuses fetchLatestRelease + versionLess from update.go so the
"what's latest" answer is identical to the in-TUI banner.

Refuses to operate on dev builds (version 0.0.0): the version
comparison is meaningless there, and an update would otherwise
downgrade a freshly compiled local build to whatever ships on
GitHub. $GITHUB_TOKEN is honoured so private-repo releases work.

Unix: atomic os.Rename in place (the kernel keeps the running
binary's inode alive until exit). Windows: rename current aside
to .old, drop the new exe in, leave the .old for next-update
cleanup since the running process has it locked.
2026-05-12 21:27:11 +02:00
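The checksum step in f47996bc82 above could look roughly like the following. verifyChecksum is a hypothetical name, and the checksums.txt layout (one "<hex sha256>  <file name>" pair per line) is an assumption based on the common convention, not something the commit spells out.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// verifyChecksum finds assetName in a checksums.txt body and compares the
// listed digest with the sha256 of data. The line format is assumed to be
// "<hex digest>  <file name>".
func verifyChecksum(checksums, assetName string, data []byte) error {
	sum := sha256.Sum256(data)
	got := hex.EncodeToString(sum[:])
	for _, line := range strings.Split(checksums, "\n") {
		fields := strings.Fields(line)
		if len(fields) != 2 || fields[1] != assetName {
			continue // blank line, malformed line, or a different asset
		}
		if strings.EqualFold(fields[0], got) {
			return nil
		}
		return fmt.Errorf("checksum mismatch for %s: got %s, want %s",
			assetName, got, fields[0])
	}
	return fmt.Errorf("no checksum entry for %s", assetName)
}
```

Treating a missing entry as an error, rather than skipping verification, keeps a renamed or absent asset from being installed unchecked.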
patriceckhart
43da5e5249 tui: reset auto-follow baseline on new turn to stop viewport jump
startTurnWithImages clears the previous turn's tool-call overlay and
pins scrollOffset to 0. Without also resetting prevChatLen/prevChatCols,
the auto-follow guard on the next render sees a synthetic negative
delta equal to the number of overlay rows that were cleared, and
nudges scrollOffset by that amount. On terminals that mirror zot's
chat-pane scroll into their native scrollbar this is visible as a
viewport jump the instant the user presses enter on a follow-up
prompt.

Zero them out in the same locked block so the guard short-circuits
on the very next render, the same way it already does on column
resize. The legitimate "user scrolled up while content streams in"
case is unaffected because prevChatLen is repopulated on that first
post-submit render.
2026-05-12 20:43:43 +02:00
patriceckhart
1030ae584d Hide /jail when sandbox is already jailed
/jail and /unjail are mutually exclusive actions, so only show the
one that actually applies in the current state.
2026-05-11 18:11:18 +02:00
patriceckhart
21b88f6a6b tui: drop cursor-home from forced-full-repaint clear too 2026-05-11 11:44:22 +02:00
patriceckhart
6dbe0b7b47 gofmt: realign const block in tui/terminal.go 2026-05-11 09:31:21 +02:00
patriceckhart
33a865b782 tui: drop cursor-home from clear-screen to stop vscode scrollbar jump 2026-05-11 09:27:08 +02:00
patriceckhart
ce78a49b87 add deepseek provider (api-key, openai-compatible v4 catalog) 2026-05-10 16:49:31 +02:00
patriceckhart
f4d678e61e Trigger full redraw on overlay close and new turn submit 2026-05-10 10:59:57 +02:00
patriceckhart
c01e026961 Stabilize chat rendering: remove forced full repaints, fix scroll/diff edge cases 2026-05-10 10:36:05 +02:00
patriceckhart
c98e701843 Persist compaction checkpoints in sessions 2026-05-09 23:02:54 +02:00
patriceckhart
facc709060 fix: session dialog highlight doubling, row overflow, and resize flicker
- DrawLog: invalidate cached bottom rows when selection-highlight
  escapes are present so VS Code's terminal doesn't leave stale
  background colors on the previous cursor row
- session dialog: hard-clamp row text to terminal width so long
  session summaries don't soft-wrap into adjacent rows
- Resize: clear scrollback alongside screen so stale wider content
  doesn't bleed through when the terminal is narrowed
2026-05-09 22:55:45 +02:00
patriceckhart
9f0629bcaf Support AGENTS.md context files 2026-05-09 18:37:27 +02:00
patriceckhart
25cb7d5003 remove /yolo slash command
The runtime escape hatch was redundant: --no-yolo's per-call dialog
already exposes 'yes-always-this-session' which flips ConfirmGate
into allow-all mode without a separate command. Once a session
starts with --no-yolo, the only way to disable confirmations is now
to either pick the always-this-session option in the dialog or exit
and relaunch.

- slash_suggest.go: drop the /yolo entry from the autocomplete list.
- interactive.go: remove the case "/yolo" dispatch (falls through
  to 'unknown command') and delete the orphaned runYoloOn method.
- README: drop the /yolo row from the slash-command table and the
  trailing reference in the --no-yolo flag description.
2026-05-08 08:18:16 +02:00
patriceckhart
09acdbd1d4 lowercase funny working lines 2026-05-08 08:13:30 +02:00
patriceckhart
9b4f4da559 spinner: capitalize working-line phrases
Sentence-case looks better in the TUI status area than all-lowercase.
2026-05-08 08:04:18 +02:00
patriceckhart
ef93175bf9 add Google Gemini provider
- internal/provider/gemini.go: REST client against
  generativelanguage.googleapis.com/v1beta/models/{id}:streamGenerateContent
  ?alt=sse, mapping our message/tool format to Gemini's Content/Part schema
  and translating SSE chunks into the existing assistant-message event
  stream. Handles text, tool calls, thought-summary parts, and per-model
  thinking config (thinkingBudget for 2.5, thinkingLevel for 3.x with
  Gemini-3-Pro pinned to LOW minimum).
- internal/provider/discover.go: DiscoverGoogle pages /v1beta/models and
  filters to chat-capable ids (skips embeddings, AQA).
- internal/provider/models.go: catalog entries for gemini-2.5-pro,
  2.5-flash, 2.5-flash-lite, 2.0-flash, 2.0-flash-lite.
- internal/auth: 'google' is a recognized provider; API-key probe hits
  /v1beta/models with x-goog-api-key. OAuth flows reject google with a
  clear 'API-key only' error since Gemini Advanced subscriptions don't
  issue API tokens.
- internal/agent: env lookup for GEMINI_API_KEY / GOOGLE_API_KEY,
  default model gemini-2.5-pro, NewClient wires provider.NewGemini,
  background model discovery, /login + /logout + rescue dialog all
  include google.
- README: new ### Google Gemini section with auth model, free-tier
  limits, and reasoning-config notes.
2026-05-07 21:15:34 +02:00
patriceckhart
90351066b1 Recover queued messages with Option+Up
Pressing Option+Up while the agent is busy now pops the most recently
queued ('sliding in') message back into the editor so the user can
edit and resend it. Repeated presses keep peeling messages off the
tail of the queue, newest first; each press replaces the editor
contents rather than appending. When the queue is empty the keypress
falls through to the existing scroll-up behavior.

A muted hint row underneath the chips advertises the shortcut, using
the same color as the model info on the status bar so it reads as
ambient metadata.
2026-05-07 19:41:27 +02:00
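The "peel messages off the tail, newest first" behaviour in 90351066b1 above amounts to a pop from the end of a slice. A minimal sketch, with popLastQueued as a hypothetical name and the queue modelled as plain strings:

```go
package main

// popLastQueued removes and returns the most recently queued message.
// ok is false when the queue is empty, which is the case where the
// keypress would fall through to scroll-up.
func popLastQueued(queue []string) (rest []string, msg string, ok bool) {
	if len(queue) == 0 {
		return queue, "", false
	}
	last := len(queue) - 1
	return queue[:last], queue[last], true
}
```

On each press the caller would replace the editor contents with msg (not append to them) and keep rest as the new queue.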
patriceckhart
63694afce8 Improve Telegram status and stop commands 2026-05-07 19:05:57 +02:00
patriceckhart
caac4915ed Fix PowerShell checksum parsing 2026-05-07 18:54:45 +02:00
patriceckhart
528ecf2db3 Hide unjail command unless jailed 2026-05-07 18:41:13 +02:00
patriceckhart
380eca9615 drop rescue status bar claim from readme 2026-05-06 20:46:29 +02:00
patriceckhart
0a85be3c33 add model rescue picker and silent transient retries 2026-05-06 20:40:31 +02:00
patriceckhart
0f332faf66 fix kimi empty assistant messages 2026-05-06 18:49:43 +02:00
patriceckhart
f37f42b189 fix windows installer checksum lookup 2026-05-06 18:26:11 +02:00
patriceckhart
25a4cafb4e fix kimi oauth token refresh 2026-05-06 18:19:42 +02:00
patriceckhart
168941f09e tui: separate live prose from tool boxes 2026-05-05 18:06:40 +02:00
patriceckhart
bfaaaa3dd9 provider: set explicit kimi pricing 2026-05-05 17:48:04 +02:00
patriceckhart
6c5c4f213a tui: make main-screen renderer viewport-safe 2026-05-05 17:33:34 +02:00
patriceckhart
ea10cbad79 openai: replay reasoning_content for chat completions 2026-05-05 15:15:07 +02:00
patriceckhart
7134fb7c2a provider: replay reasoning items and repair orphan tool results 2026-05-05 15:00:41 +02:00
patriceckhart
ae4e019ee4 tui: use DECSC/DECRC for bottom-band anchor; normalize \r in editor input
DrawLog now saves/restores the cursor at the top of the bottom band
instead of relying on relative up-N math that drifted when the
terminal naturally scrolled between frames. This fixes duplicated
transcript blocks with empty gaps (previously only ctrl+l recovered).

Also strip literal carriage returns from pasted/typed editor text
before rendering. A bare \r moves the terminal cursor to column 0
and overwrites the left side of the input row, which looked like
missing highlight segments on continuation lines.
2026-05-05 14:35:40 +02:00
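For reference alongside ae4e019ee4 above: DECSC and DECRC are the two-byte sequences ESC 7 and ESC 8, and the carriage-return normalisation is a plain removal. The constant and function names below are hypothetical, and the sketch makes no claim about how DrawLog actually composes its frames.

```go
package main

import "strings"

const (
	seqSaveCursor    = "\x1b7" // DECSC: save cursor position
	seqRestoreCursor = "\x1b8" // DECRC: restore the saved position
)

// stripCarriageReturns removes every literal '\r' so that pasted CRLF or
// bare-CR text cannot move the terminal cursor back to column 0.
func stripCarriageReturns(s string) string {
	return strings.ReplaceAll(s, "\r", "")
}
```

Note that a bare "\r" between two words simply disappears here, so "a\rb" becomes "ab"; mapping it to a newline instead would be a different policy than the one the commit describes.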
patriceckhart
07c073055d Document Kimi provider support 2026-05-05 11:04:07 +02:00
patriceckhart
a41cda5093 Add built-in Kimi provider support 2026-05-05 08:40:37 +02:00
patriceckhart
ff1af01fd7 tui: tighten live tool/streaming spacing 2026-05-04 16:55:33 +02:00
patriceckhart
b5ab07f9a0 tui: strip syntax highlight backgrounds 2026-05-04 15:54:13 +02:00
patriceckhart
17dec30b36 interactive: load resumed sessions asynchronously 2026-05-04 15:47:55 +02:00
patriceckhart
aceaffdec2 tui: keep live output outside scrollback 2026-05-04 15:05:30 +02:00
patriceckhart
76ca012170 tui: replay scrollback on transcript rewrites 2026-05-04 14:33:27 +02:00
patriceckhart
7ff6942f71 tui: render interactive mode in main scrollback 2026-05-04 14:12:20 +02:00
patriceckhart
471fdea0c0 interactive: smarter auto-compact triggers and resume scroll
- Pre-turn auto-compact: when the previous turn already pushed
  context past the threshold, condense before sending. The user's
  prompt is re-queued and fired automatically once compaction
  succeeds.
- HTTP 413 handling: a 'payload too large' from the provider is no
  longer surfaced as a status_err. Instead the request is retried
  after a transparent auto-compact pass.
- Both inline auto-compacts surface a yellow chat note above the
  status bar so the user sees the spinner *and* the reason; on
  success a status_ok like 'context auto-compacted; sending your
  last message' confirms the retry.
- Resume picker (/sessions and startup) now scrolls to the bottom
  of the loaded transcript instead of parking at the last user
  turn, so the most recent reply is always fully visible.
- Drop the VS Code mouse-capture path: native click-drag selection
  beats the wheel-speed boost there.
2026-05-04 12:46:53 +02:00
patriceckhart
1a49277130 core: read sessions with line reader instead of bufio.Scanner
Loading or exporting a session containing very large JSONL rows
(image blocks, big tool outputs, compacted history) failed with
'bufio.Scanner: token too long'. Scanner rejects any token larger
than its maximum buffer size, so rows above the cap still failed
even after it was bumped to 20 MiB. A single oversized row blocked
OpenSession entirely, so an existing long session could not be
resumed.

Switch session readers (OpenSession, SessionUsage, describeSession,
sessionHasNoMessages, ExportSession, ImportSession, BranchSession,
firstUserPrompt) to a shared bufio.Reader.ReadBytes-based JSONL
helper that handles arbitrarily long lines. Add a regression test
that opens and exports a session containing a >20 MiB row.
2026-05-04 12:14:12 +02:00
patriceckhart
b45f10718f interactive: cache idle chat render
The redraw path rebuilt the full transcript on every key event:
filtered the agent's full message slice, refreshed tool path maps,
walked every message through the per-message render cache, and
re-assembled the entire chat line buffer. With a long session, the
O(N) work per keystroke made typing visibly lag.

Add an idle render cache: the previously built chat is reused when
nothing relevant changed (terminal width, transcript revision,
status notes, help/update banners, expand-all). The agent now
exposes a cheap monotonically increasing Revision() that ticks
whenever messages are appended or replaced, so the cache key stays
trivial. Live turns (busy/streaming/tool-call mutations) keep the
old rebuild path.
2026-05-04 12:14:12 +02:00
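The idle render cache from b45f10718f above can be pictured as a comparable key plus a memoised result. The field names in renderKey and the chatCache type are hypothetical; the set of inputs (width, transcript revision, status notes, banners, expand-all) follows the commit message.

```go
package main

// renderKey gathers the inputs the commit lists as cache-relevant.
// All fields are comparable, so two keys can be checked with ==.
type renderKey struct {
	width     int
	revision  uint64 // assumed to mirror the agent's Revision() counter
	notes     string
	banners   string
	expandAll bool
}

// chatCache remembers the last built chat lines and the key they
// were built for.
type chatCache struct {
	key   renderKey
	lines []string
	valid bool
}

// get returns the cached lines when key is unchanged and otherwise
// rebuilds them via build and stores the result.
func (c *chatCache) get(key renderKey, build func() []string) []string {
	if c.valid && c.key == key {
		return c.lines
	}
	c.key, c.lines, c.valid = key, build(), true
	return c.lines
}
```

Because the revision counter only ticks when messages are appended or replaced, an idle keystroke compares a handful of scalars instead of walking the whole transcript. Live turns would bypass the cache entirely, as the commit notes.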