Insert one blank row between the chat area and the status bar
(spinner / model / cwd). Previously the status block sat flush
against the chat content above; the blank gives it air whether
the agent is busy (spinner first) or idle (model line first).
Cursor row math bumped to skip both the new top blank and the
existing blank between statusLines and edLines so the input
cursor still lands on the right cell.
The bridge already mirrored the assistant's text reply into the
paired Telegram chat but had no way to push real attachments. A
turn that came in over Telegram could only ever produce a textual
description of an image, never the image itself.
Add two model-facing tools, registered on the running agent only
while the bridge is connected:
- telegram_send_image(path, caption?) uploads a local image
(png/jpg/gif/webp) as an inline Telegram photo. Telegram
compresses for preview, which is what you usually want for a
screenshot or chart.
- telegram_send_file(path, caption?) uploads any local file as a
document attachment with no compression. Use for non-images or
when the recipient needs the original bytes.
Plumbing:
- Client.SendPhoto multipart upload mirrors SendDocument, hitting
sendPhoto so Telegram renders the image inline.
- Bridge.SendImage / SendDocument resolve the paired chat id and
return a clear error when the bridge is not running or no user
has paired yet.
- A small TelegramSender interface in package tools keeps the
tools package free of any telegram dependency; an adapter in
interactive.go forwards to the live *telegram.Bridge.
- applyTelegramTools mutates the running agent's tool registry on
/telegram connect / disconnect, on /model swaps, and on login
rebuilds. Walks the live registry rather than restoring from a
snapshot so extension or /reload-ext additions survive a later
disconnect; we only add or strip the two telegram entries.
Both tools respect the sandbox, refuse non-image inputs in
send_image, and reject directories. They return a one-line text
result the model can use to confirm the upload ("sent /path/foo.png
to telegram (1.2 MB)").
Two fixes for image-block rendering after the box refactor.
clipBottomClippedImages decides whether to suppress an image escape
this frame by scanning forward for the matching "image - ..." info
line, treating any non-blank intervening row as a blocker. The image
block layout is: escape row + N blank reservation rows + info row.
With tool boxes, the reservation rows are wrapped as "\u2502 \u2026 \u2502"
and the naive whitespace check now sees them as non-blank \u2014 the scan
broke before reaching the info line, decided the image was clipped,
and zeroed out the escape sequence. Result: a tall blank gap inside
the box where the image should be, with only the metadata caption
visible at the bottom (visible on iTerm2 / Kitty / WezTerm).
Add isBoxBlankLine() that strips ANSI escapes, surrounding whitespace,
and \u2502 box edges before checking for emptiness. Use it in both
clipBottomClippedImages and snapViewportStartToImageBlock so the
viewport snap-back also recognises wrapped reservation rows.
Also add one extra blank reservation row between the image footprint
and the muted "image - mime - WxH - sizeKB" caption, so the caption
doesn't sit flush against the last pixel row of the image.
Until now an interactive session's turns lived only in the running
agent's in-memory transcript; the session file on disk got nothing
new until iv.Run returned and the deferred WriteNewTranscript at
the bottom of runInteractive fired. That meant any process death
that bypassed the deferred flush \u2014 closed terminal window (SIGHUP),
system shutdown (SIGTERM), kill -9, an OS / power crash \u2014 took the
entire session with it. The window of data-at-risk was "everything
since the TUI started or last switched session."
Two changes that close the window:
- Add OnMessageAppended / OnUsage hooks on core.Agent, fired right
after each transcript message is appended (user prompt, finalised
assistant turn, tool-results message, the OpenAI image mirror)
and after each usage event arrives. The interactive runtime in
cli.go binds them to sess.AppendMessage / sess.AppendUsage so
every finished turn lands in the session JSONL the moment it's
durable in memory. A persistMu mutex coordinates the agent
goroutine's per-message writes with the TUI goroutine's session
swap (/sessions) and explicit flushes (/session export).
sessBaselineMsgs advances in lock-step so the exit-time
WriteNewTranscript no longer double-writes rows already on disk.
- Install a SIGTERM / SIGHUP handler in runInteractive that flushes
any not-yet-persisted in-flight turn before exiting. SIGINT is
intentionally NOT handled \u2014 the TUI consumes Ctrl+C as a regular
key event for cancel/clear semantics, so installing a SIGINT
notifier here would swallow it. The handler exits with os.Exit(0)
rather than re-raising, because re-raising would skip the chance
to flush and the only at-risk state we care about (the session
file) is already flushed by the time we get there.
Build closures (BuildAgent / BuildAgentFor) are re-wrapped after
the persistence wiring exists so any agent the TUI constructs on
login or /model swap also gets the hooks; otherwise switching
provider would silently revert to the old in-memory-only behaviour.
Print and JSON modes are unaffected: they run a single turn and
already flush via WriteNewTranscript at the end of their handler.
UI polish:
- Add Theme.AccentBar(c) helper that returns the half-block leader
("\u258c ") in colour c. Use it everywhere a speaker / prompt bar is
drawn: main editor prompt, /btw editor + speaker labels, login
code editor, welcome banner, --help headline, and the chat side
speaker headers (you / zot, including the streaming overlay).
Single source of truth for the bar style across the UI.
- Insert one blank row between the status bar and the editor and
one trailing blank below the editor so the input has breathing
room from the surrounding chrome instead of sitting flush against
the status line and the terminal edge. Cursor row math is bumped
+1 to account for the inserted row.
- Status bar narrow split: when the idle status line would exceed
the terminal width, split it into provider/model on one row,
token+cost+context stats on the next, then cwd, instead of
letting the terminal hard-wrap mid-line. Mirrors the existing
busy-prefix split.
Session cost restoration:
- Add core.SessionUsage(path) that scans a session file for the
latest "usage" row and returns its cumulative usage (the running
session total). Old sessions with no usage rows return zero.
- Seed the agent with that cumulative usage on every load path:
/sessions picker, --continue, --resume, --session. Previously
loading a session restored the messages but not the cost, so the
status bar showed \/bin/bash.000 until the next turn produced a fresh
EvUsage event.
- Mirror the seeded cost into i.cumUsage on NewInteractive (CLI
startup loads) and applySessionSelection (in-tui /sessions load)
so the status bar reflects the historical total immediately.
renderToolCall used to prepend a blank line at the top of each of
its three branches (streaming-with-body, finished-no-result,
finished-with-result) so the live box wouldn't sit flush against
the prose / previous tool block above it. That made sense before
each tool result owned its own complete box \u2014 back then the
assistant message also injected a blank before its embedded top
edge, and the two paths matched.
After the box-ownership refactor the transcript path no longer
emits a leading blank (Build()'s natural inter-message separator
provides one), but renderToolCall still did, so a streaming tool
sat two blank rows below the previous content while the same
call, once finalised, sat only one. The visible "snap tighter"
when streaming ended was the doubled gap collapsing.
Drop the three leading blanks. Build()'s inter-message blank is
the single source of vertical spacing now, so streaming and
finalised forms have identical gaps.
Body content used to start on the row immediately under the top
edge and end on the row immediately above the bottom edge, leaving
the box visually cramped against its corners.
Insert a blank box-side row (a \u2502 \u2026 \u2502 with no content) after the
top edge and before the bottom edge in both the transcript path
(renderMessage RoleTool branch) and the live overlay path
(renderToolCall, both streaming and finished-with-body). Empty-
bodied boxes (finished call with no result) stay compact \u2014 the top
and bottom edges sit directly adjacent so a no-output tool doesn't
render as a tall hollow frame.
Each tool result owns its own box, but the assistant message that
batched the originating tool_use blocks now renders to nothing (the
box edges moved to the tool result). Build() was still emitting an
inter-message blank for that empty render, which compounded with
the blank from the next real message and produced two blank rows
between adjacent tool boxes \u2014 visible whenever the model batched
multiple tool calls without prose between them.
Skip the inter-message blank when a message rendered to zero lines
so an empty assistant message no longer contributes a separator
row. Adjacent tool boxes now sit one blank row apart, matching the
spacing between every other pair of chat blocks.
Loading a session via the /sessions dialog used to park the viewport
on the last user turn, intended to show "where you left off". In
practice users open /sessions to resume work and expect to land at
the live tail of the conversation, not somewhere mid-scroll.
Replace the scrollToLastTurn call in applySessionSelection with
scrollToBottom so a /sessions resume snaps to the latest message.
The CLI startup paths (--continue, --resume, --session) still park
on the last turn since they're a different entry point with a
different mental model (boot into a saved session vs. switch
sessions during an active run).
Each tool block now renders as a labelled box:
\u250c\u2500 bash ls -la \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510
\u2502 $ ls -la \u2502
\u2502 total ... \u2502
\u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518
instead of the open top/bottom horizontal rules used previously.
The tool name + short args are embedded in the top edge so each
call has a self-contained frame; body lines get vertical edges
with a 1-cell inner gutter; image escapes are passed through
un-bordered so the graphics layer isn't smeared.
Each tool_result message owns its complete box (top + body +
bottom). When the model batches multiple tool_use blocks in one
assistant message, the assistant message emits no box pieces \u2014
each result renders its own adjacent box, looking up the call's
label from a new toolCallLabels map. This avoids the prior bug
where two tool_use blocks in a row produced two unclosed top
edges with no body between them.
The streaming live overlay (renderToolCall) draws the same
top/body/bottom directly so an in-flight write/edit looks the
same as the finalised transcript form. Body lines have their
first 2 leading-indent cells stripped so content lands at a
consistent column inside the box rather than drifting right.
Inter-message blanks come from Build()'s natural separator;
no extra blanks are emitted around boxes, so two consecutive
boxes are separated by exactly one blank row.
normalisePathToken's heuristic only fires for unix-shaped paths
(leading /, ~, or file:// with a / path). On Windows, t.TempDir()
returns paths like C:\Users\RUNNER~1\AppData\Local\Temp\...
which never match those prefixes, so the test inputs never get
quoted and every assertion fails. The function is unix-only by
design (drag-and-drop on macOS/Linux); skip the test on Windows
rather than carry a parallel Windows test that would just
duplicate platform-specific quoting logic.
normalisePathToken used a purely syntactic heuristic ("starts with
/ or ~ and contains / or .") to decide whether a pasted token was
a drag-dropped file path. URL path segments pasted from a browser
address bar (e.g. /de/downloads/dokumentenarchiv) matched that
shape and got collapsed to a [file:N:basename] chip or wrapped in
single quotes, which is never what the user meant.
Add a pathExists check (with ~ / ~/ expansion via os.UserHomeDir)
and gate normalisePathToken on it. Real drag-and-drop from a file
manager still works because those paths exist; URL fragments and
fictional paths fall through untouched.
Rewrote quote_paste_test.go to build fixtures under t.TempDir() so
the existence check has something real to stat, plus regression
cases for URL-segment and non-existent-path pastes.
scrollOffset is measured from the bottom of the chat buffer, so when
the agent appends new lines while the user has scrolled up to read
history, the visible window slides down through the buffer and the
content the user was reading drifts off the top.
Track the previous chat line count and column width across redraws.
While the user is in free-scroll (scrollOffset > 0) and the terminal
hasn't been resized, bump scrollOffset by the chat-length delta so
the visible content stays pinned. Compensation is skipped on resize
(line counts aren't comparable across reflows) and when following
the tail (scrollOffset == 0), where new content should keep pushing
the viewport as before.
Each assistant API message used to draw its own "zot" header, so a
turn that round-tripped through several tool_use / tool_result pairs
showed multiple headers (or, when the first assistant message was
pure tool_use, the header appeared late and earlier tool blocks
visibly snapped underneath it once the model started emitting text).
Track whether the previous non-compaction message in the transcript
belongs to the same turn (assistant or tool role) and suppress the
header on subsequent assistant messages and on the streaming overlay
while a turn is already open. Header now marks the speaker boundary
(you <-> agent), not the underlying API message boundary.
SubmitOrQueue was discarding the images parameter. Now passes them through to startTurnWithImages so photos sent via Telegram are included in the prompt.
Forces full attribute reset before clearing each row when selection highlights are present. VS Code's xterm.js doesn't reliably clear background colors on row overwrite without an explicit reset.
Press r to rename the selected session. Uses append-based rename lines so the active session is not corrupted. Native terminal cursor in rename input. Title shown in picker if set.
Login and logout dialogs show descriptive labels (Anthropic Claude Pro/Max, OpenAI ChatGPT Plus/Pro). Unknown saved providers fall back to an available one instead of crashing. Model picker only shows models from logged-in providers.
Version strings like '0.1.12 (25b2bd4, ...)' were used as-is for GitHub API lookups and comparisons, causing changelog to never show. Now strips to semver only.
Changelog dialog now shows only the changelog section from release notes with headings in accent color. Works for local 0.0.0 builds (fetches latest release). Full-width highlight bars fixed everywhere via erase-to-EOL and trailing ANSI preservation in truncateToWidth. Session ops dialog fixed. README documents the @ file picker.
Type @ to browse files in the working directory. Up/down navigate, right arrow opens directories, left arrow goes back, enter selects. Files insert as [file:name], directories as [dir:name/]. Drag-dropped folders also show as [dir:] chips. Separate counters for file and dir chips.
Drag-dropped or pasted file paths are shown as compact [file:name] chips in the editor instead of the full path. Expanded back to the full quoted path on submit. URLs are never collapsed.
Wraps OAuth clients with a RefreshingClient that checks token expiry before every Stream call. Refreshes transparently and rebuilds the underlying client with the fresh token. Fixes sessions silently dying after the 1-hour token lifetime.
Adds --provider ollama with auto-detection of local ollama at localhost:11434. No API key required for local models. Optional --api-key and --base-url for remote/authenticated instances. Uses the OpenAI chat completions client internally. Unknown models are accepted without catalog entries. Updated README with ollama documentation.
Calls Invalidate() on tool result, assistant message commit, and turn end to prevent stale code fragments from bleeding into the status bar when the diff renderer misses row changes.
Adds baseUrl support in models.json for local models (ollama, vLLM, etc). Migrates all install URLs and references from zot.patriceckhart.com to www.zot.sh.
Reads $ZOT_HOME/models.json at startup and merges user-defined models into the active catalog with highest precedence. Provider keys like openai-codex are normalized. Documented in README.
Reduces gutter padding between line numbers and code, replaces tab characters with 4 spaces in code panels so Go and other tab-indented languages render at consistent width.
Watches for context cancellation in a goroutine and kills the entire process group plus closes the pipe immediately, instead of waiting for cmd.Wait() which deadlocks when child processes hold the pipe open.
Compaction no longer streams the summary into the chat. The spinner shows while compacting and users can queue prompts that fire after completion. The compaction info appears in the status line with ctrl+o to expand. Orphaned tool_result blocks in the preserved tail are stripped to prevent Anthropic rejection.
Sets Setpgid on bash commands and sends SIGTERM/SIGKILL to the negative pgid so backgrounded children are cleaned up on esc. Also splits the status bar onto multiple lines on narrow terminals when a spinner is active.
Makes zot --help use zot's TUI palette and section layout, pads columns consistently across terminals, expands section rules to terminal width, and adds extension install guidance in the help output.
Adds the todo panel example under examples/extensions, updates example manifests and READMEs to match the current extension API, and surfaces extension install/load commands in zot --help.
Runs the local callback server and the manual copy-code flow in parallel. Displays a real input field (with blinking cursor) for pasting the authorization code / redirect URL / code#state. Anthropic manual variant uses the console copy-code redirect URI to bypass localhost.
Add open_panel / panel_render / panel_close / panel_key to the extension protocol, expose extension_dir + data_dir in hello_ack, wire panel rendering and key routing through the interactive TUI, extend the Go SDK, and document the new capability. Also fix doubled user-message indent and redundant assistant wrap.
Mirror image-bearing tool output into an OpenAI-only user message so GPT vision models receive image bytes, and hide that synthetic message from the TUI transcript.