iTerm's OSC 1337 and kitty/ghostty's APC G escapes both advance the
cursor by the image's actual rendered cell count, which depends on
the terminal's font cell aspect ratio. Without knowing that ratio,
zot's box renderer couldn't reliably place the closing │ at the box's
right column: spaces-based padding either overshot (cutting the bar
off the screen) or undershot (drawing the bar mid-image).
Use the protocols' "don't move cursor" flags to keep the terminal
cursor at the start of the image after rendering:
- iTerm2: doNotMoveCursor=1
- kitty/ghostty: C=1
With cursor pinned, the row's logical advance is just the leading
indent spaces, and toolBoxSideWithImage can pad to the right edge
column without caring how the image was scaled. The image's graphics
layer paints over the padding spaces visually.
Re-enable inline images in iTerm.app since the new approach removes
the alignment hack that previously misbehaved there.
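As a rough illustration of the two "don't move cursor" flags named above, the escapes could be assembled like this. The function names are hypothetical, and the kitty variant assumes the payload fits in one chunk (real transmissions split large base64 payloads into chunks), so treat this as a sketch rather than zot's renderer:

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// itermInlineImage builds an iTerm2 OSC 1337 inline-image escape.
// doNotMoveCursor=1 asks the terminal to leave the cursor where it
// was before the image was drawn. (Hypothetical helper name.)
func itermInlineImage(data []byte) string {
	payload := base64.StdEncoding.EncodeToString(data)
	return fmt.Sprintf("\x1b]1337;File=inline=1;size=%d;doNotMoveCursor=1:%s\a",
		len(data), payload)
}

// kittyInlineImage builds a kitty graphics APC G escape for PNG data.
// a=T transmits and displays, f=100 marks the payload as PNG, and
// C=1 keeps the cursor in place. Assumes a single-chunk payload.
func kittyInlineImage(png []byte) string {
	payload := base64.StdEncoding.EncodeToString(png)
	return fmt.Sprintf("\x1b_Ga=T,f=100,C=1;%s\x1b\\", payload)
}

func main() {
	// Placeholder bytes stand in for real image data.
	fmt.Printf("%q\n", itermInlineImage([]byte("hi")))
	fmt.Printf("%q\n", kittyInlineImage([]byte("hi")))
}
```

With the cursor pinned, the caller only has to account for the indent it printed itself when padding out to the right box edge.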
User message bubble redesign:
- No more "▌ you" / "▌ zot" speaker labels; turns are
delimited by a tinted bubble panel for the user side and plain
prose for the assistant side.
- The bubble is a full-width tinted row prefixed by an accent "▌ "
bar (BG behind the bar matches the panel so bar and panel read as
one continuous coloured strip). One blank tinted row above and
below the message gives the bubble vertical breathing room, the
closest a terminal can get to CSS padding-top / padding-bottom.
- Theme.UserBubble + Theme.UserBubbleBG/FG carry the panel colours.
- /btw side-chat reuses the same bubble helper so it matches the
main chat layout.
- One blank line above the slash popup, the dialog block, and the
sliding-in chips so they don't sit flush against the chat above.
- One trailing blank inside the /btw frame after the editor.
Tool box outer/inner geometry:
- toolBoxOuterMargin pulls the box frame in from the terminal
edges so user bubble, assistant prose, and box frames all share
the same left/right column.
- Image-footprint rows (escape + reservation blanks + caption gap)
are now tagged with imageFootprintSentinel by renderImageBlock
and the three box-side wrappers strip the tag and still wrap the
rows in │ … │ so the box edges stay continuous around the
image. The escape is indented inside the frame instead of
kissing the │.
- toolBoxSide bypasses width measurement / truncation when the row
carries an iTerm OSC 1337 or Kitty APC G escape: visibleWidth
doesn't recognise OSC payloads and was destroying the image
bytes when wrapping turned them into thousand-cell-wide rows.
Markdown / code fences:
- RenderMarkdown emits a FlushLeftSentinel byte at the start of
every fenced code-block line so a future caller can opt those
rows out of prose indent. The current consumer doesn't strip
the sentinel, but stale callers in skills / changelog / btw /
compaction summaries do, so leftover sentinels never leak into
rendered output.
Theme detection:
- New tui.DetectThemeFromBackground(timeout) probes the terminal
via OSC 11, parses rgb:RRRR/GGGG/BBBB, and returns Light if
Rec. 709 luma >= 0.5 else Dark. ZOT_THEME=dark|light overrides
the probe. Falls back to Dark when the terminal doesn't
respond within the timeout (Linux console, certain VS Code
configs, tmux without pass-through).
- cli.go now picks the theme via DetectThemeFromBackground
instead of hard-coding tui.Dark.
- Light theme bubble palette tuned (UserBubbleBG 254 / FG 240) so
user rows stay legible on a light terminal.
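The luma decision described above can be sketched as follows. This is not the body of tui.DetectThemeFromBackground; the function names are hypothetical, the terminal I/O (sending the OSC 11 query and reading the reply under a timeout) is left out, and only the parsing and the Rec. 709 threshold are shown:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseOSC11 extracts R, G, B in [0,1] from a reply such as
// "\x1b]11;rgb:ffff/ffff/ffff\x07". Each channel may carry one to
// four hex digits, so it is scaled by its own maximum.
func parseOSC11(reply string) ([3]float64, error) {
	var ch [3]float64
	i := strings.Index(reply, "rgb:")
	if i < 0 {
		return ch, fmt.Errorf("no rgb: payload in %q", reply)
	}
	// Drop a trailing BEL or ST (ESC \) terminator.
	body := strings.TrimRight(reply[i+len("rgb:"):], "\x07\x1b\\")
	parts := strings.Split(body, "/")
	if len(parts) != 3 {
		return ch, fmt.Errorf("want 3 channels, got %d", len(parts))
	}
	for k, p := range parts {
		if len(p) == 0 || len(p) > 4 {
			return ch, fmt.Errorf("bad channel %q", p)
		}
		v, err := strconv.ParseUint(p, 16, 16)
		if err != nil {
			return ch, err
		}
		max := float64(uint64(1)<<(4*uint(len(p))) - 1)
		ch[k] = float64(v) / max
	}
	return ch, nil
}

// isLightBackground applies the Rec. 709 luma weights and the 0.5
// threshold mentioned in the text. Unparseable replies count as dark.
func isLightBackground(reply string) bool {
	c, err := parseOSC11(reply)
	if err != nil {
		return false
	}
	luma := 0.2126*c[0] + 0.7152*c[1] + 0.0722*c[2]
	return luma >= 0.5
}

func main() {
	fmt.Println(isLightBackground("\x1b]11;rgb:ffff/ffff/ffff\x07"))
	fmt.Println(isLightBackground("\x1b]11;rgb:1e1e/1e1e/2e2e\x1b\\"))
}
```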
Editor / spinner polish:
- Editor prompt remains the AccentBar in Theme.Accent.
- One "funny working line" entry softened from "go generics" to
"the code".
Slash command popup and dialogs (login / model / sessions / jump /
btw / changelog / etc.) used to sit flush against the chat or
welcome content above. Add one blank row before the suggest popup
when it is non-empty and one above the dialog block when a dialog
is showing.
Cursor-row math gains a dialogLead offset so popup-side cursor
branches (btwDialog, login dialog, sessionDialog) land in the
right cell now that the dialog starts one row lower.
ListSessions used to sort by filename descending. Filenames embed the
creation timestamp, so /sessions, --continue, and the resume
picker were effectively ordered by when each session was first
created, not when the user last worked in it.
Sort by filesystem ModTime descending instead, with filename desc
as a stable tie-break. The session you most recently touched now
sits at the top of every picker and is what --continue resumes,
even when newer-but-idle sessions exist.
When the user types while a turn is in flight the queued messages
render as "sliding in:" chips above the status bar. Previously
they sat flush against the streaming chat content above. Prepend
a single blank row so the chips have air on top.
The trailer blank inside the queue slice is removed because the
bottom-region assembly already inserts a blank above statusLines;
keeping it doubled the gap below the chips.
A drag-drop into VS Code's terminal arrives as a stream of single
KeyRune events (no bracketed paste). The main loop redrew between
every rune, so on a long session the path appeared to type in over
several seconds instead of landing instantly.
Bump the input channel buffer to 256 and, after handling one key,
non-blockingly drain every other key already queued before
issuing one invalidate. The whole burst processes in a single
main-loop pass and the screen repaints once at the end.
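The drain step can be sketched with a select/default loop. The key type and handleBurst name are illustrative only, not the main loop's real identifiers:

```go
package main

import "fmt"

// key is a hypothetical stand-in for the TUI's key event type.
type key rune

// handleBurst handles the key that woke the loop, then drains every
// key already buffered without blocking. It returns how many keys
// were processed so the caller can issue a single invalidate.
func handleBurst(first key, keys <-chan key, handle func(key)) int {
	handle(first)
	n := 1
	for {
		select {
		case k, ok := <-keys:
			if !ok {
				return n // channel closed
			}
			handle(k)
			n++
		default:
			return n // nothing else queued right now
		}
	}
}

func main() {
	keys := make(chan key, 256)
	for _, r := range "/tmp/shot.png" {
		keys <- key(r)
	}
	var buf []rune
	first := <-keys
	n := handleBurst(first, keys, func(k key) { buf = append(buf, rune(k)) })
	// One repaint for the whole burst instead of one per rune.
	fmt.Printf("handled %d keys, buffer=%q, invalidate once\n", n, string(buf))
}
```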
Anthropic returns 'At least one of the image dimensions exceed max
allowed size for many-image requests' when a request contains more
than one image and any of them is larger than 2000 px on the
longest side. Single-image requests get a more lenient 8000 px
cap. OpenAI has no such limit, so a session that worked fine on
gpt-* breaks the moment /model swaps to a Claude family while the
transcript still holds high-res screenshots.
Add anthShrinkImageBytesIfTooBig that decodes any outbound image
exceeding 2000 px on the longest side, resamples it with Catmull-
Rom from golang.org/x/image/draw, and re-encodes in the same
format. JPEG stays JPEG, PNG stays PNG, GIF gets re-encoded as
PNG (we'd otherwise need to compose a single-frame gif by hand).
Decode/encode failures fall back to the original bytes so the
caller's flow is untouched and Anthropic's own error message
surfaces if the image is genuinely unusable.
Both Anthropic encode sites (the top-level message converter
and the tool-result inner converter) run every image through
the helper, so screenshots from earlier turns also get resized
on subsequent requests instead of failing forever once the
transcript captured them at full size.
Adds golang.org/x/image v0.18.0 (compatible with go 1.22).
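The size arithmetic behind the shrink step could look like the sketch below. It covers only the target dimensions and the output-format choice; the decode, Catmull-Rom resample, and re-encode are omitted, and both function names are hypothetical rather than taken from anthShrinkImageBytesIfTooBig:

```go
package main

import "fmt"

// fitLongestSide scales (w, h) so the longest side is at most max,
// preserving aspect ratio with rounding. Images already within the
// limit are returned unchanged.
func fitLongestSide(w, h, max int) (int, int) {
	if w <= max && h <= max {
		return w, h
	}
	if w >= h {
		nh := (h*max + w/2) / w
		if nh < 1 {
			nh = 1
		}
		return max, nh
	}
	nw := (w*max + h/2) / h
	if nw < 1 {
		nw = 1
	}
	return nw, max
}

// outputFormat mirrors the rule in the text: JPEG stays JPEG, PNG
// stays PNG, GIF is re-encoded as PNG. The names follow the format
// strings returned by Go's image.Decode.
func outputFormat(in string) string {
	if in == "gif" {
		return "png"
	}
	return in
}

func main() {
	w, h := fitLongestSide(4000, 3000, 2000)
	fmt.Println(w, h, outputFormat("gif"))
}
```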
Build() emits a trailing blank after every message, so the final
assistant reply (and any optional addendum like /help, the OK
line, or extension notes) ended its block with one blank row.
After we added a leading blank above the status block, that gave
a doubled gap between the last conversation row and the status.
Strip trailing blank rows from the assembled chat slice once
every optional addition has been appended, so the conversation
sits flush against the single leading blank of the status block.
Insert one blank row between the chat area and the status bar
(spinner / model / cwd). Previously the status block sat flush
against the chat content above; the blank gives it air whether
the agent is busy (spinner first) or idle (model line first).
Cursor row math bumped to skip both the new top blank and the
existing blank between statusLines and edLines so the input
cursor still lands on the right cell.
The bridge already mirrored the assistant's text reply into the
paired Telegram chat but had no way to push real attachments. A
turn that came in over Telegram could only ever produce a textual
description of an image, never the image itself.
Add two model-facing tools, registered on the running agent only
while the bridge is connected:
- telegram_send_image(path, caption?) uploads a local image
(png/jpg/gif/webp) as an inline Telegram photo. Telegram
compresses for preview, which is what you usually want for a
screenshot or chart.
- telegram_send_file(path, caption?) uploads any local file as a
document attachment with no compression. Use for non-images or
when the recipient needs the original bytes.
Plumbing:
- Client.SendPhoto multipart upload mirrors SendDocument, hitting
sendPhoto so Telegram renders the image inline.
- Bridge.SendImage / SendDocument resolve the paired chat id and
return a clear error when the bridge is not running or no user
has paired yet.
- A small TelegramSender interface in package tools keeps the
tools package free of any telegram dependency; an adapter in
interactive.go forwards to the live *telegram.Bridge.
- applyTelegramTools mutates the running agent's tool registry on
/telegram connect / disconnect, on /model swaps, and on login
rebuilds. Walks the live registry rather than restoring from a
snapshot so extension or /reload-ext additions survive a later
disconnect; we only add or strip the two telegram entries.
Both tools respect the sandbox, refuse non-image inputs in
send_image, and reject directories. They return a one-line text
result the model can use to confirm the upload ("sent /path/foo.png
to telegram (1.2 MB)").
Two fixes for image-block rendering after the box refactor.
clipBottomClippedImages decides whether to suppress an image escape
this frame by scanning forward for the matching "image - ..." info
line, treating any non-blank intervening row as a blocker. The image
block layout is: escape row + N blank reservation rows + info row.
With tool boxes, the reservation rows are wrapped as "│ … │"
and the naive whitespace check now sees them as non-blank, so the scan
broke before reaching the info line, decided the image was clipped,
and zeroed out the escape sequence. Result: a tall blank gap inside
the box where the image should be, with only the metadata caption
visible at the bottom (visible on iTerm2 / Kitty / WezTerm).
Add isBoxBlankLine() that strips ANSI escapes, surrounding whitespace,
and │ box edges before checking for emptiness. Use it in both
clipBottomClippedImages and snapViewportStartToImageBlock so the
viewport snap-back also recognises wrapped reservation rows.
Also add one extra blank reservation row between the image footprint
and the muted "image - mime - WxH - sizeKB" caption, so the caption
doesn't sit flush against the last pixel row of the image.
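A minimal sketch of the blank-row check described above. The regular expression only covers CSI sequences (colour and style codes), which is an assumption about what wraps the box edges; the real isBoxBlankLine may handle more:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// ansiCSI matches CSI escapes such as "\x1b[2m" or "\x1b[38;5;240m".
var ansiCSI = regexp.MustCompile(`\x1b\[[0-9;?]*[ -/]*[@-~]`)

// isBoxBlankLine reports whether a row is empty once ANSI escapes,
// surrounding whitespace, and the │ box edges are removed.
func isBoxBlankLine(line string) bool {
	s := ansiCSI.ReplaceAllString(line, "")
	s = strings.TrimSpace(s)
	s = strings.TrimPrefix(s, "│")
	s = strings.TrimSuffix(s, "│")
	return strings.TrimSpace(s) == ""
}

func main() {
	fmt.Println(isBoxBlankLine("\x1b[2m│\x1b[0m        \x1b[2m│\x1b[0m")) // reservation row
	fmt.Println(isBoxBlankLine("│ image - image/png - 800x600 - 42KB │"))  // caption row
}
```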
Until now an interactive session's turns lived only in the running
agent's in-memory transcript; the session file on disk got nothing
new until iv.Run returned and the deferred WriteNewTranscript at
the bottom of runInteractive fired. That meant any process death
that bypassed the deferred flush, such as a closed terminal window (SIGHUP),
system shutdown (SIGTERM), kill -9, or an OS / power crash, took the
entire session with it. The window of data-at-risk was "everything
since the TUI started or last switched session."
Two changes that close the window:
- Add OnMessageAppended / OnUsage hooks on core.Agent, fired right
after each transcript message is appended (user prompt, finalised
assistant turn, tool-results message, the OpenAI image mirror)
and after each usage event arrives. The interactive runtime in
cli.go binds them to sess.AppendMessage / sess.AppendUsage so
every finished turn lands in the session JSONL the moment it's
durable in memory. A persistMu mutex coordinates the agent
goroutine's per-message writes with the TUI goroutine's session
swap (/sessions) and explicit flushes (/session export).
sessBaselineMsgs advances in lock-step so the exit-time
WriteNewTranscript no longer double-writes rows already on disk.
- Install a SIGTERM / SIGHUP handler in runInteractive that flushes
any not-yet-persisted in-flight turn before exiting. SIGINT is
intentionally NOT handled: the TUI consumes Ctrl+C as a regular
key event for cancel/clear semantics, so installing a SIGINT
notifier here would swallow it. The handler exits with os.Exit(0)
rather than re-raising, because re-raising would skip the chance
to flush and the only at-risk state we care about (the session
file) is already flushed by the time we get there.
Build closures (BuildAgent / BuildAgentFor) are re-wrapped after
the persistence wiring exists so any agent the TUI constructs on
login or /model swap also gets the hooks; otherwise switching
provider would silently revert to the old in-memory-only behaviour.
Print and JSON modes are unaffected: they run a single turn and
already flush via WriteNewTranscript at the end of their handler.
UI polish:
- Add Theme.AccentBar(c) helper that returns the half-block leader
("▌ ") in colour c. Use it everywhere a speaker / prompt bar is
drawn: main editor prompt, /btw editor + speaker labels, login
code editor, welcome banner, --help headline, and the chat side
speaker headers (you / zot, including the streaming overlay).
Single source of truth for the bar style across the UI.
- Insert one blank row between the status bar and the editor and
one trailing blank below the editor so the input has breathing
room from the surrounding chrome instead of sitting flush against
the status line and the terminal edge. Cursor row math is bumped
+1 to account for the inserted row.
- Status bar narrow split: when the idle status line would exceed
the terminal width, split it into provider/model on one row,
token+cost+context stats on the next, then cwd, instead of
letting the terminal hard-wrap mid-line. Mirrors the existing
busy-prefix split.
Session cost restoration:
- Add core.SessionUsage(path) that scans a session file for the
latest "usage" row and returns its cumulative usage (the running
session total). Old sessions with no usage rows return zero.
- Seed the agent with that cumulative usage on every load path:
/sessions picker, --continue, --resume, --session. Previously
loading a session restored the messages but not the cost, so the
status bar showed $0.000 until the next turn produced a fresh
EvUsage event.
- Mirror the seeded cost into i.cumUsage on NewInteractive (CLI
startup loads) and applySessionSelection (in-tui /sessions load)
so the status bar reflects the historical total immediately.
renderToolCall used to prepend a blank line at the top of each of
its three branches (streaming-with-body, finished-no-result,
finished-with-result) so the live box wouldn't sit flush against
the prose / previous tool block above it. That made sense before
each tool result owned its own complete box: back then the
assistant message also injected a blank before its embedded top
edge, and the two paths matched.
After the box-ownership refactor the transcript path no longer
emits a leading blank (Build()'s natural inter-message separator
provides one), but renderToolCall still did, so a streaming tool
sat two blank rows below the previous content while the same
call, once finalised, sat only one. The visible "snap tighter"
when streaming ended was the doubled gap collapsing.
Drop the three leading blanks. Build()'s inter-message blank is
the single source of vertical spacing now, so streaming and
finalised forms have identical gaps.
Body content used to start on the row immediately under the top
edge and end on the row immediately above the bottom edge, leaving
the box visually cramped against its corners.
Insert a blank box-side row (a │ … │ with no content) after the
top edge and before the bottom edge in both the transcript path
(renderMessage RoleTool branch) and the live overlay path
(renderToolCall, both streaming and finished-with-body). Empty-bodied
boxes (finished call with no result) stay compact: the top
and bottom edges sit directly adjacent so a no-output tool doesn't
render as a tall hollow frame.
Each tool result owns its own box, but the assistant message that
batched the originating tool_use blocks now renders to nothing (the
box edges moved to the tool result). Build() was still emitting an
inter-message blank for that empty render, which compounded with
the blank from the next real message and produced two blank rows
between adjacent tool boxes, visible whenever the model batched
multiple tool calls without prose between them.
Skip the inter-message blank when a message rendered to zero lines
so an empty assistant message no longer contributes a separator
row. Adjacent tool boxes now sit one blank row apart, matching the
spacing between every other pair of chat blocks.
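In outline, the separator rule amounts to the sketch below. The build function here is a simplified stand-in that takes pre-rendered line slices, not the real Build():

```go
package main

import "fmt"

// build concatenates rendered messages, adding one blank separator
// after each message that produced at least one line. Messages that
// rendered to nothing contribute no separator either.
func build(rendered [][]string) []string {
	var out []string
	for _, lines := range rendered {
		if len(lines) == 0 {
			continue // e.g. an assistant message holding only tool_use blocks
		}
		out = append(out, lines...)
		out = append(out, "")
	}
	return out
}

func main() {
	rows := build([][]string{
		{"┌─ bash ─┐", "└────────┘"},
		{}, // empty assistant render
		{"┌─ read ─┐", "└────────┘"},
	})
	for _, r := range rows {
		fmt.Printf("%q\n", r)
	}
}
```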
Loading a session via the /sessions dialog used to park the viewport
on the last user turn, intended to show "where you left off". In
practice users open /sessions to resume work and expect to land at
the live tail of the conversation, not somewhere mid-scroll.
Replace the scrollToLastTurn call in applySessionSelection with
scrollToBottom so a /sessions resume snaps to the latest message.
The CLI startup paths (--continue, --resume, --session) still park
on the last turn since they're a different entry point with a
different mental model (boot into a saved session vs. switch
sessions during an active run).
Each tool block now renders as a labelled box:
    ┌─ bash ls -la ──────────────────────────────────┐
    │ $ ls -la                                       │
    │ total ...                                      │
    └────────────────────────────────────────────────┘
instead of the open top/bottom horizontal rules used previously.
The tool name + short args are embedded in the top edge so each
call has a self-contained frame; body lines get vertical edges
with a 1-cell inner gutter; image escapes are passed through
un-bordered so the graphics layer isn't smeared.
Each tool_result message owns its complete box (top + body +
bottom). When the model batches multiple tool_use blocks in one
assistant message, the assistant message emits no box pieces \u2014
each result renders its own adjacent box, looking up the call's
label from a new toolCallLabels map. This avoids the prior bug
where two tool_use blocks in a row produced two unclosed top
edges with no body between them.
The streaming live overlay (renderToolCall) draws the same
top/body/bottom directly so an in-flight write/edit looks the
same as the finalised transcript form. Body lines have their
first 2 leading-indent cells stripped so content lands at a
consistent column inside the box rather than drifting right.
Inter-message blanks come from Build()'s natural separator;
no extra blanks are emitted around boxes, so two consecutive
boxes are separated by exactly one blank row.
normalisePathToken's heuristic only fires for unix-shaped paths
(leading /, ~, or file:// with a / path). On Windows, t.TempDir()
returns paths like C:\Users\RUNNER~1\AppData\Local\Temp\...
which never match those prefixes, so the test inputs never get
quoted and every assertion fails. The function is unix-only by
design (drag-and-drop on macOS/Linux); skip the test on Windows
rather than carry a parallel Windows test that would just
duplicate platform-specific quoting logic.
normalisePathToken used a purely syntactic heuristic ("starts with
/ or ~ and contains / or .") to decide whether a pasted token was
a drag-dropped file path. URL path segments pasted from a browser
address bar (e.g. /de/downloads/dokumentenarchiv) matched that
shape and got collapsed to a [file:N:basename] chip or wrapped in
single quotes, which is never what the user meant.
Add a pathExists check (with ~ / ~/ expansion via os.UserHomeDir)
and gate normalisePathToken on it. Real drag-and-drop from a file
manager still works because those paths exist; URL fragments and
fictional paths fall through untouched.
Rewrote quote_paste_test.go to build fixtures under t.TempDir() so
the existence check has something real to stat, plus regression
cases for URL-segment and non-existent-path pastes.
scrollOffset is measured from the bottom of the chat buffer, so when
the agent appends new lines while the user has scrolled up to read
history, the visible window slides down through the buffer and the
content the user was reading drifts off the top.
Track the previous chat line count and column width across redraws.
While the user is in free-scroll (scrollOffset > 0) and the terminal
hasn't been resized, bump scrollOffset by the chat-length delta so
the visible content stays pinned. Compensation is skipped on resize
(line counts aren't comparable across reflows) and when following
the tail (scrollOffset == 0), where new content should keep pushing
the viewport as before.
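The compensation can be sketched as a small state update per redraw. The viewState type and field names are hypothetical, and this version assumes only growth of the buffer is compensated:

```go
package main

import "fmt"

// viewState is a hypothetical holder for the values tracked across
// redraws.
type viewState struct {
	scrollOffset int // rows above the bottom of the chat buffer
	prevLines    int // chat line count at the previous redraw
	prevCols     int // terminal width at the previous redraw
}

// compensate pins the visible content while the user is in
// free-scroll. It is skipped on resize (line counts are not
// comparable across reflows) and when following the tail.
func (v *viewState) compensate(lines, cols int) {
	resized := cols != v.prevCols
	if v.scrollOffset > 0 && !resized {
		if delta := lines - v.prevLines; delta > 0 {
			v.scrollOffset += delta
		}
	}
	v.prevLines, v.prevCols = lines, cols
}

func main() {
	v := viewState{scrollOffset: 5, prevLines: 100, prevCols: 80}
	v.compensate(103, 80) // agent appended three lines
	fmt.Println(v.scrollOffset)
}
```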
Each assistant API message used to draw its own "zot" header, so a
turn that round-tripped through several tool_use / tool_result pairs
showed multiple headers (or, when the first assistant message was
pure tool_use, the header appeared late and earlier tool blocks
visibly snapped underneath it once the model started emitting text).
Track whether the previous non-compaction message in the transcript
belongs to the same turn (assistant or tool role) and suppress the
header on subsequent assistant messages and on the streaming overlay
while a turn is already open. Header now marks the speaker boundary
(you <-> agent), not the underlying API message boundary.
SubmitOrQueue was discarding the images parameter. Now passes them through to startTurnWithImages so photos sent via Telegram are included in the prompt.
Forces full attribute reset before clearing each row when selection highlights are present. VS Code's xterm.js doesn't reliably clear background colors on row overwrite without an explicit reset.
Press r to rename the selected session. Uses append-based rename lines so the active session is not corrupted. Native terminal cursor in rename input. Title shown in picker if set.
Login and logout dialogs show descriptive labels (Anthropic Claude Pro/Max, OpenAI ChatGPT Plus/Pro). Unknown saved providers fall back to an available one instead of crashing. Model picker only shows models from logged-in providers.
Version strings like '0.1.12 (25b2bd4, ...)' were used as-is for GitHub API lookups and comparisons, causing changelog to never show. Now strips to semver only.
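A minimal sketch of the strip, assuming the version string begins with MAJOR.MINOR.PATCH (optionally prefixed by "v"); semverOnly is a hypothetical name:

```go
package main

import (
	"fmt"
	"regexp"
)

// semverRe captures a leading MAJOR.MINOR.PATCH, tolerating a "v".
var semverRe = regexp.MustCompile(`^v?(\d+\.\d+\.\d+)`)

// semverOnly returns just the semver core of a build version string,
// or "" when the string does not start with one.
func semverOnly(version string) string {
	m := semverRe.FindStringSubmatch(version)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	fmt.Println(semverOnly("0.1.12 (25b2bd4, ...)"))
}
```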
Changelog dialog now shows only the changelog section from release notes with headings in accent color. Works for local 0.0.0 builds (fetches latest release). Full-width highlight bars fixed everywhere via erase-to-EOL and trailing ANSI preservation in truncateToWidth. Session ops dialog fixed. README documents the @ file picker.
Type @ to browse files in the working directory. Up/down navigate, right arrow opens directories, left arrow goes back, enter selects. Files insert as [file:name], directories as [dir:name/]. Drag-dropped folders also show as [dir:] chips. Separate counters for file and dir chips.
Drag-dropped or pasted file paths are shown as compact [file:name] chips in the editor instead of the full path. Expanded back to the full quoted path on submit. URLs are never collapsed.
Wraps OAuth clients with a RefreshingClient that checks token expiry before every Stream call. Refreshes transparently and rebuilds the underlying client with the fresh token. Fixes sessions silently dying after the 1-hour token lifetime.
Adds --provider ollama with auto-detection of local ollama at localhost:11434. No API key required for local models. Optional --api-key and --base-url for remote/authenticated instances. Uses the OpenAI chat completions client internally. Unknown models are accepted without catalog entries. Updated README with ollama documentation.
Calls Invalidate() on tool result, assistant message commit, and turn end to prevent stale code fragments from bleeding into the status bar when the diff renderer misses row changes.
Adds baseUrl support in models.json for local models (ollama, vLLM, etc). Migrates all install URLs and references from zot.patriceckhart.com to www.zot.sh.
Reads $ZOT_HOME/models.json at startup and merges user-defined models into the active catalog with highest precedence. Provider keys like openai-codex are normalized. Documented in README.
Reduces gutter padding between line numbers and code, replaces tab characters with 4 spaces in code panels so Go and other tab-indented languages render at consistent width.
Watches for context cancellation in a goroutine and kills the entire process group plus closes the pipe immediately, instead of waiting for cmd.Wait() which deadlocks when child processes hold the pipe open.
Compaction no longer streams the summary into the chat. The spinner shows while compacting and users can queue prompts that fire after completion. The compaction info appears in the status line with ctrl+o to expand. Orphaned tool_result blocks in the preserved tail are stripped to prevent Anthropic rejection.