- DrawLog: invalidate cached bottom rows when selection-highlight
escapes are present so VS Code's terminal doesn't leave stale
background colors on the previous cursor row
- session dialog: hard-clamp row text to terminal width so long
session summaries don't soft-wrap into adjacent rows
- Resize: clear scrollback alongside screen so stale wider content
doesn't bleed through when the terminal is narrowed
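
The hard clamp for session-dialog rows can be sketched as below. This is only an illustration: clampToWidth is a hypothetical name, and it counts runes rather than display cells, so wide characters and ANSI escapes would need extra handling in real code.

```go
package main

import "fmt"

// clampToWidth truncates s to at most width runes so a row can never
// soft-wrap into the next one. A trailing ellipsis marks the cut.
// Assumption: one rune occupies one terminal cell.
func clampToWidth(s string, width int) string {
	if width <= 0 {
		return ""
	}
	r := []rune(s)
	if len(r) <= width {
		return s
	}
	if width == 1 {
		return "…"
	}
	return string(r[:width-1]) + "…"
}

func main() {
	fmt.Println(clampToWidth("a very long session summary", 10))
}
```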
The /yolo runtime escape hatch was redundant: --no-yolo's per-call
dialog already exposes 'yes-always-this-session', which flips
ConfirmGate into allow-all mode without a separate command. Once a
session starts with --no-yolo, the only ways to disable confirmations
are to pick the always-this-session option in the dialog or to exit
and relaunch.
- slash_suggest.go: drop the /yolo entry from the autocomplete list.
- interactive.go: remove the case "/yolo" dispatch (falls through
to 'unknown command') and delete the orphaned runYoloOn method.
- README: drop the /yolo row from the slash-command table and the
trailing reference in the --no-yolo flag description.
- internal/provider/gemini.go: REST client against
generativelanguage.googleapis.com/v1beta/models/{id}:streamGenerateContent
?alt=sse, mapping our message/tool format to Gemini's Content/Part schema
and translating SSE chunks into the existing assistant-message event
stream. Handles text, tool calls, thought-summary parts, and per-model
thinking config (thinkingBudget for 2.5, thinkingLevel for 3.x with
Gemini-3-Pro pinned to LOW minimum).
- internal/provider/discover.go: DiscoverGoogle pages /v1beta/models and
filters to chat-capable ids (skips embeddings, AQA).
- internal/provider/models.go: catalog entries for gemini-2.5-pro,
2.5-flash, 2.5-flash-lite, 2.0-flash, 2.0-flash-lite.
- internal/auth: 'google' is a recognized provider; API-key probe hits
/v1beta/models with x-goog-api-key. OAuth flows reject google with a
clear 'API-key only' error since Gemini Advanced subscriptions don't
issue API tokens.
- internal/agent: env lookup for GEMINI_API_KEY / GOOGLE_API_KEY,
default model gemini-2.5-pro, NewClient wires provider.NewGemini,
background model discovery, /login + /logout + rescue dialog all
include google.
- README: new ### Google Gemini section with auth model, free-tier
limits, and reasoning-config notes.
Pressing Option+Up while the agent is busy now pops the most recently
queued ('sliding in') message back into the editor so the user can
edit and resend it. Repeated presses keep peeling messages off the
tail of the queue, newest first; each press replaces the editor
contents rather than appending. When the queue is empty the keypress
falls through to the existing scroll-up behavior.
A muted hint row underneath the chips advertises the shortcut, using
the same color as the model info on the status bar so it reads as
ambient metadata.
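
A minimal sketch of the pop-from-tail behavior described above, assuming the queue is a plain string slice. popNewest is a hypothetical helper, not the actual handler.

```go
package main

import "fmt"

// popNewest removes and returns the most recently queued message.
// ok is false when the queue is empty, which tells the caller to fall
// through to the regular scroll-up handling.
func popNewest(queue []string) (rest []string, msg string, ok bool) {
	if len(queue) == 0 {
		return queue, "", false
	}
	last := len(queue) - 1
	return queue[:last], queue[last], true
}

func main() {
	queue := []string{"first", "second", "third"}
	editor := "draft"
	for {
		var msg string
		var ok bool
		queue, msg, ok = popNewest(queue)
		if !ok {
			fmt.Println("queue empty: scroll up instead")
			break
		}
		editor = msg // each press replaces the editor contents
		fmt.Println("editor:", editor)
	}
}
```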
DrawLog now saves/restores the cursor at the top of the bottom band
instead of relying on relative up-N math that drifted when the
terminal naturally scrolled between frames. This fixes duplicated
transcript blocks with empty gaps (previously only ctrl+l recovered).
Also strip literal carriage returns from pasted/typed editor text
before rendering. A bare \r moves the terminal cursor to column 0
and overwrites the left side of the input row, which looked like
missing highlight segments on continuation lines.
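
The carriage-return stripping can be as small as the sketch below. stripCR is a hypothetical name. Note that it also turns "\r\n" into "\n".

```go
package main

import (
	"fmt"
	"strings"
)

// stripCR removes every literal carriage return so the terminal cursor
// is never sent back to column 0 while the input row is being drawn.
func stripCR(s string) string {
	return strings.ReplaceAll(s, "\r", "")
}

func main() {
	fmt.Printf("%q\n", stripCR("line one\r\nline two\rtail"))
}
```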
- Pre-turn auto-compact: when the previous turn already pushed
context past the threshold, condense before sending. The user's
prompt is re-queued and fired automatically once compaction
succeeds.
- HTTP 413 handling: a 'payload too large' from the provider is no
longer surfaced as a status_err. Instead the request is retried
after a transparent auto-compact pass.
- Both inline auto-compacts surface a yellow chat note above the
status bar so the user sees the spinner *and* the reason; on
success a status_ok like 'context auto-compacted; sending your
last message' confirms the retry.
- Resume picker (/sessions and startup) now scrolls to the bottom
of the loaded transcript instead of parking at the last user
turn, so the most recent reply is always fully visible.
- Drop the VS Code mouse-capture path: native click-drag selection
beats the wheel-speed boost there.
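
The HTTP 413 retry from the list above might look roughly like this. It is a sketch under assumptions: errPayloadTooLarge, sendWithAutoCompact, and the send/compact callbacks are hypothetical stand-ins for the real provider and compaction calls, and it retries only once.

```go
package main

import (
	"errors"
	"fmt"
)

// errPayloadTooLarge stands in for however the provider layer reports
// an HTTP 413 (assumed sentinel error).
var errPayloadTooLarge = errors.New("413 payload too large")

// sendWithAutoCompact sends once; on a 413 it compacts the context and
// retries a single time. Any other error is returned unchanged.
func sendWithAutoCompact(send func() error, compact func() error) error {
	err := send()
	if !errors.Is(err, errPayloadTooLarge) {
		return err
	}
	if cerr := compact(); cerr != nil {
		return fmt.Errorf("auto-compact failed: %w", cerr)
	}
	return send()
}

func main() {
	compacted := false
	send := func() error {
		if !compacted {
			return errPayloadTooLarge
		}
		return nil
	}
	compact := func() error {
		compacted = true
		return nil
	}
	fmt.Println("result:", sendWithAutoCompact(send, compact))
}
```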
Loading or exporting a session containing very large JSONL rows
(image blocks, big tool outputs, compacted history) failed with
'bufio.Scanner: token too long'. Scanner caps each token at its
maximum buffer size, so even with the cap raised to 20 MiB a larger
row still fails. A single oversized row blocked OpenSession
entirely, so an existing long session could not be resumed.
Switch session readers (OpenSession, SessionUsage, describeSession,
sessionHasNoMessages, ExportSession, ImportSession, BranchSession,
firstUserPrompt) to a shared bufio.Reader.ReadBytes-based JSONL
helper that handles arbitrarily long lines. Add a regression test
that opens and exports a session containing a >20 MiB row.
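
A minimal sketch of a ReadBytes-based JSONL reader in the spirit of the helper described above. readJSONL and lineLengths are hypothetical names; the real helper's signature may differ.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// readJSONL calls fn once per non-empty line. bufio.Reader.ReadBytes
// grows its result as needed, so there is no per-line size cap.
func readJSONL(r io.Reader, fn func(line []byte) error) error {
	br := bufio.NewReader(r)
	for {
		line, err := br.ReadBytes('\n')
		line = bytes.TrimRight(line, "\r\n")
		if len(line) > 0 {
			if ferr := fn(line); ferr != nil {
				return ferr
			}
		}
		if err == io.EOF {
			return nil // a final line without a trailing newline was handled above
		}
		if err != nil {
			return err
		}
	}
}

// lineLengths is a small demo wrapper that records each row's length.
func lineLengths(input string) ([]int, error) {
	var out []int
	err := readJSONL(strings.NewReader(input), func(line []byte) error {
		out = append(out, len(line))
		return nil
	})
	return out, err
}

func main() {
	big := strings.Repeat("x", 1<<20) // well past Scanner's 64 KiB default cap
	lens, err := lineLengths("{\"a\":1}\n" + big + "\n{\"b\":2}")
	fmt.Println(lens, err)
}
```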
The redraw path rebuilt the full transcript on every key event:
filtered the agent's full message slice, refreshed tool path maps,
walked every message through the per-message render cache, and
re-assembled the entire chat line buffer. With a long session, the
O(N) work per keystroke made typing visibly lag.
Add an idle render cache: the previously built chat is reused when
nothing relevant changed (terminal width, transcript revision,
status notes, help/update banners, expand-all). The agent now
exposes a cheap monotonically increasing Revision() that ticks
whenever messages are appended or replaced, so the cache key stays
trivial. Live turns (busy/streaming/tool-call mutations) keep the
old rebuild path.
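
The idle render cache boils down to a comparable key plus a rebuild callback. The sketch below assumes a reduced key (width, revision, expand-all); cacheKey, renderCache, and the builds counter are hypothetical and exist only to show the hit/miss behavior.

```go
package main

import "fmt"

// cacheKey holds the inputs that invalidate the cached chat lines.
// The real key also covers status notes and banners (omitted here).
type cacheKey struct {
	width     int
	revision  uint64
	expandAll bool
}

type renderCache struct {
	key    cacheKey
	lines  []string
	valid  bool
	builds int // number of full rebuilds, for demonstration only
}

// get returns the cached lines when the key is unchanged and otherwise
// runs build once and stores the result.
func (c *renderCache) get(k cacheKey, build func() []string) []string {
	if c.valid && c.key == k {
		return c.lines
	}
	c.lines = build()
	c.key, c.valid = k, true
	c.builds++
	return c.lines
}

func main() {
	var c renderCache
	build := func() []string { return []string{"hello", "world"} }
	c.get(cacheKey{width: 80, revision: 1}, build)
	c.get(cacheKey{width: 80, revision: 1}, build) // keystroke, nothing changed
	c.get(cacheKey{width: 80, revision: 2}, build) // a message was appended
	fmt.Println("rebuilds:", c.builds)
}
```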
iTerm's OSC 1337 and kitty/ghostty's APC G escapes both advance the
cursor by the image's actual rendered cell count, which depends on
the terminal's font cell aspect ratio. Without knowing that ratio,
zot's box renderer couldn't reliably place the closing │ at the box's
right column: spaces-based padding either overshot (cutting the bar
off the screen) or undershot (drawing the bar mid-image).
Use the protocols' "don't move cursor" flags to keep the terminal
cursor at the start of the image after rendering:
- iTerm2: doNotMoveCursor=1
- kitty/ghostty: C=1
With the cursor pinned, the row's logical advance is just the leading
indent spaces, and toolBoxSideWithImage can pad to the right edge
column without caring how the image was scaled. The image's graphics
layer paints over the padding spaces visually.
Re-enable inline images in iTerm.app since the new approach removes
the alignment hack that previously misbehaved there.
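
For illustration, the two escapes with their "don't move cursor" flags can be built as below. Only doNotMoveCursor=1 and C=1 come from the text above; the other keys (inline=1, a=T, f=100) follow the protocols' public documentation and are assumptions about what zot actually sends. The function names are hypothetical, and chunking of large kitty payloads is omitted.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// itermInlineImage wraps image bytes in an OSC 1337 File escape that
// leaves the cursor where it was.
func itermInlineImage(img []byte) string {
	return "\x1b]1337;File=inline=1;doNotMoveCursor=1:" +
		base64.StdEncoding.EncodeToString(img) + "\a"
}

// kittyInlineImage wraps PNG bytes in an APC G escape.
// a=T transmits and displays, f=100 declares PNG, C=1 pins the cursor.
// Real payloads above 4096 bytes must be split into chunks.
func kittyInlineImage(png []byte) string {
	return "\x1b_Ga=T,f=100,C=1;" +
		base64.StdEncoding.EncodeToString(png) + "\x1b\\"
}

func main() {
	data := []byte("hi") // placeholder bytes, not a real image
	fmt.Printf("%q\n", itermInlineImage(data))
	fmt.Printf("%q\n", kittyInlineImage(data))
}
```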
User message bubble redesign:
- No more "▌ you" / "▌ zot" speaker labels; turns are
delimited by a tinted bubble panel for the user side and plain
prose for the assistant side.
- The bubble is a full-width tinted row prefixed by an accent "▌ "
bar (BG behind the bar matches the panel so bar and panel read as
one continuous coloured strip). One blank tinted row above and
below the message gives the bubble vertical breathing room, the
closest a terminal can get to CSS padding-top / padding-bottom.
- Theme.UserBubble + Theme.UserBubbleBG/FG carry the panel colours.
- /btw side-chat reuses the same bubble helper so it matches the
main chat layout.
- One blank line above the slash popup, the dialog block, and the
sliding-in chips so they don't sit flush against the chat above.
- One trailing blank inside the /btw frame after the editor.
Tool box outer/inner geometry:
- toolBoxOuterMargin pulls the box frame in from the terminal
edges so user bubble, assistant prose, and box frames all share
the same left/right column.
- Image-footprint rows (escape + reservation blanks + caption gap)
are now tagged with imageFootprintSentinel by renderImageBlock,
and the three box-side wrappers strip the tag and still wrap the
rows in │ … │ so the box edges stay continuous around the
image. The escape is indented inside the frame instead of
kissing the │.
- toolBoxSide bypasses width measurement / truncation when the row
carries an iTerm OSC 1337 or Kitty APC G escape: visibleWidth
doesn't recognise OSC payloads and was destroying the image
bytes when wrapping turned them into thousand-cell-wide rows.
Markdown / code fences:
- RenderMarkdown emits a FlushLeftSentinel byte at the start of
every fenced code-block line so a future caller can opt those
rows out of prose indent. The current consumer doesn't strip
the sentinel, but stale callers in skills / changelog / btw /
compaction summaries do, so leftover sentinels never leak into
rendered output.
Theme detection:
- New tui.DetectThemeFromBackground(timeout) probes the terminal
via OSC 11, parses rgb:RRRR/GGGG/BBBB, and returns Light if
Rec. 709 luma >= 0.5 else Dark. ZOT_THEME=dark|light overrides
the probe. Falls back to Dark when the terminal doesn't
respond within the timeout (Linux console, certain VS Code
configs, tmux without pass-through).
- cli.go now picks the theme via DetectThemeFromBackground
instead of hard-coding tui.Dark.
- Light theme bubble palette tuned (UserBubbleBG 254 / FG 240) so
user rows stay legible on a light terminal.
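
The parsing and luma steps of the background probe can be sketched as below. parseOSC11 and isLightReply are hypothetical names; the sketch only handles an already-received reply string and leaves out the terminal query, the timeout, and the ZOT_THEME override.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseOSC11 extracts the rgb:R/G/B body of an OSC 11 reply and scales
// each channel to [0,1]. Channels may be 1 to 4 hex digits wide.
func parseOSC11(reply string) (r, g, b float64, ok bool) {
	i := strings.Index(reply, "rgb:")
	if i < 0 {
		return 0, 0, 0, false
	}
	// Drop the terminator, which is either BEL or ESC \.
	body := strings.TrimRight(reply[i+len("rgb:"):], "\a\x1b\\")
	parts := strings.Split(body, "/")
	if len(parts) != 3 {
		return 0, 0, 0, false
	}
	var ch [3]float64
	for n, p := range parts {
		if len(p) == 0 || len(p) > 4 {
			return 0, 0, 0, false
		}
		v, err := strconv.ParseUint(p, 16, 16)
		if err != nil {
			return 0, 0, 0, false
		}
		full := float64(uint64(1)<<(4*uint(len(p))) - 1)
		ch[n] = float64(v) / full
	}
	return ch[0], ch[1], ch[2], true
}

// isLightReply applies Rec. 709 luma weights; an unparseable reply is
// treated as dark, matching the fallback described above.
func isLightReply(reply string) bool {
	r, g, b, ok := parseOSC11(reply)
	if !ok {
		return false
	}
	return 0.2126*r+0.7152*g+0.0722*b >= 0.5
}

func main() {
	fmt.Println(isLightReply("\x1b]11;rgb:ffff/ffff/ffff\a"))
	fmt.Println(isLightReply("\x1b]11;rgb:1e1e/1e1e/2e2e\x1b\\"))
}
```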
Editor / spinner polish:
- Editor prompt remains the AccentBar in Theme.Accent.
- One "funny working line" entry softened from "go generics" to
"the code".
Slash command popup and dialogs (login / model / sessions / jump /
btw / changelog / etc.) used to sit flush against the chat or
welcome content above. Add one blank row before the suggest popup
when it is non-empty and one above the dialog block when a dialog
is showing.
Cursor-row math gains a dialogLead offset so popup-side cursor
branches (btwDialog, login dialog, sessionDialog) land in the
right cell now that the dialog starts one row lower.
ListSessions previously sorted by filename descending. Filenames
embed the creation timestamp, so /sessions, --continue, and the
resume picker were effectively ordered by when each session was
first created, not when the user last worked in it.
Sort by filesystem ModTime descending instead, with filename desc
as a stable tie-break. The session you most recently touched now
sits at the top of every picker and is what --continue resumes,
even when newer-but-idle sessions exist.
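
A minimal sketch of the new ordering, assuming each entry carries its name and modification time as Unix seconds. sessionFile and sortSessions are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// sessionFile is a reduced stand-in for a session directory entry.
type sessionFile struct {
	Name string
	Mod  int64 // modification time, Unix seconds
}

// sortSessions orders by modification time descending and breaks ties
// by filename descending.
func sortSessions(files []sessionFile) {
	sort.SliceStable(files, func(i, j int) bool {
		if files[i].Mod != files[j].Mod {
			return files[i].Mod > files[j].Mod
		}
		return files[i].Name > files[j].Name
	})
}

func main() {
	files := []sessionFile{
		{"2024-03-01.jsonl", 100}, // newest file, but idle
		{"2024-01-01.jsonl", 300}, // oldest file, touched most recently
		{"2024-02-01.jsonl", 200},
	}
	sortSessions(files)
	for _, f := range files {
		fmt.Println(f.Name, f.Mod)
	}
}
```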
When the user types while a turn is in flight the queued messages
render as "sliding in:" chips above the status bar. Previously
they sat flush against the streaming chat content above. Prepend
a single blank row so the chips have air on top.
The trailing blank inside the queue slice is removed because the
bottom-region assembly already inserts a blank above statusLines;
keeping it doubled the gap below the chips.
A drag-drop into VS Code's terminal arrives as a stream of single
KeyRune events (no bracketed paste). The main loop redrew between
every rune, so on a long session the path appeared to type in over
several seconds instead of landing instantly.
Bump the input channel buffer to 256 and, after handling one key,
non-blockingly drain every other key already queued before
issuing one invalidate. The whole burst processes in a single
main-loop pass and the screen repaints once at the end.
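
The non-blocking drain can be sketched with a select/default loop. drainBurst is a hypothetical helper and plain runes stand in for key events.

```go
package main

import "fmt"

// drainBurst handles first, then every key already buffered in ch,
// without ever blocking. It returns the number of keys handled so the
// caller can issue a single invalidate afterwards.
func drainBurst(ch <-chan rune, first rune, handle func(rune)) int {
	handle(first)
	n := 1
	for {
		select {
		case k, ok := <-ch:
			if !ok {
				return n // channel closed
			}
			handle(k)
			n++
		default:
			return n // nothing else queued right now
		}
	}
}

func main() {
	ch := make(chan rune, 256)
	for _, r := range "/tmp/dropped/file.png" {
		ch <- r
	}
	first := <-ch
	var buf []rune
	n := drainBurst(ch, first, func(r rune) { buf = append(buf, r) })
	fmt.Printf("handled %d keys, one repaint: %s\n", n, string(buf))
}
```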
Anthropic returns 'At least one of the image dimensions exceed max
allowed size for many-image requests' when a request contains more
than one image and any of them is larger than 2000 px on the
longest side. Single-image requests get a more lenient 8000 px
cap. OpenAI has no such limit, so a session that worked fine on
gpt-* breaks the moment /model switches to a Claude-family model
while the transcript still holds high-res screenshots.
Add anthShrinkImageBytesIfTooBig that decodes any outbound image
exceeding 2000 px on the longest side, resamples it with Catmull-
Rom from golang.org/x/image/draw, and re-encodes in the same
format. JPEG stays JPEG, PNG stays PNG, GIF gets re-encoded as
PNG (we'd otherwise need to compose a single-frame gif by hand).
Decode/encode failures fall back to the original bytes so the
caller's flow is untouched and Anthropic's own error message
surfaces if the image is genuinely unusable.
Both Anthropic encode sites (the top-level message converter
and the tool-result inner converter) run every image through
the helper, so screenshots from earlier turns also get resized
on subsequent requests instead of failing forever once the
transcript captured them at full size.
Adds golang.org/x/image v0.18.0 (compatible with go 1.22).
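
The size calculation behind the shrink step can be sketched with the standard library alone. fitWithin is a hypothetical helper that only computes the target dimensions; the actual decode, Catmull-Rom resample, and re-encode are left out.

```go
package main

import "fmt"

// maxSide is the many-image limit on the longest side, in pixels.
const maxSide = 2000

// fitWithin scales (w, h) down so the longest side equals limit while
// keeping the aspect ratio. Images already within the limit are
// returned unchanged.
func fitWithin(w, h, limit int) (int, int) {
	if w <= limit && h <= limit {
		return w, h
	}
	if w >= h {
		nh := h * limit / w
		if nh < 1 {
			nh = 1 // never collapse a dimension to zero
		}
		return limit, nh
	}
	nw := w * limit / h
	if nw < 1 {
		nw = 1
	}
	return nw, limit
}

func main() {
	w, h := fitWithin(3840, 2160, maxSide)
	fmt.Println(w, h)
}
```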