In flat (non-recursive) mode, typing a filter to locate a directory and
then opening it with Right re-applied that same filter inside the
directory. Typing "@eda" then Right to open eda/ showed nothing,
because no child of eda/ matches "eda". The filter the user typed
selected the directory at the current level; it has no meaning one
level deeper.
Clear the text after the last "@" (keeping the bare "@" so the picker
stays open) whenever Right or Left successfully changes the browse
level. The filter was scoped to the level just left, so dropping it
shows the new directory's full contents.
Adds a regression test that opens eda/ after an "@eda" filter and
asserts the directory's contents are listed, where the stale filter
would have matched nothing.
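
A minimal sketch of the filter reset described above, assuming the
input is a plain string; clearFilterAfterAt is a hypothetical name, not
the picker's actual function.

```go
package main

import (
	"fmt"
	"strings"
)

// clearFilterAfterAt drops everything typed after the last "@" but keeps
// the "@" itself, so a picker keyed on it stays open with an empty filter.
// Input without an "@" is returned unchanged.
func clearFilterAfterAt(input string) string {
	i := strings.LastIndex(input, "@")
	if i < 0 {
		return input
	}
	return input[:i+1]
}

func main() {
	fmt.Printf("%q\n", clearFilterAfterAt("look at @eda")) // prints "look at @"
}
```
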
The recursive @-picker only read the repo's root .gitignore, so a
nested .gitignore (e.g. .opencode/.gitignore ignoring its own
node_modules) was invisible. WalkDir visits entries in lexical order, so a
dot-prefixed vendored tree got walked first and its node_modules
flooded the 5000-entry budget before the walk ever reached deeply
nested source files. The picker then fuzzy-matched against junk and
never surfaced the real target.
- Add ignore.Stack: a per-directory .gitignore chain pushed/popped as
the recursive walk descends, with git-style nearest-file-wins
semantics including nested negations. scanRecursive now prunes
nested-ignored trees like node_modules.
- Raise maxRecursiveEntries 5000 -> 50000 and maxRecursiveDepth
12 -> 24. The bottleneck is per-keystroke fuzzy.Find, not memory:
a fileEntry is ~120 bytes (~6 MB at 50k), and benchmarked
fuzzy.Find latency is ~2ms @ 5k, ~13ms @ 50k, ~21ms @ 100k, so 50k
keeps ranking under one 60Hz frame while holding a large monorepo
once nested-gitignore pruning has done its job.
Verified against the reporting monorepo: the fully-pruned tree is
4397 entries (node_modules=0), scan ~360ms once (cached after),
match ~2.5ms per keystroke, and @pipeline.py now finds
eda/rjg/enk-1150/pipeline.py.
Adds regression tests at both the ignore.Stack layer and the
file_suggest layer, including a repro of the nested-node_modules +
deep-file scenario.
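
A simplified sketch of the push/pop chain with nearest-file-wins
lookup. It is not the real ignore.Stack: patterns are assumed to be
basename globs only (no anchored or "**" patterns), and all names here
are hypothetical.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// rule is one simplified ignore pattern: a basename glob, optionally negated.
type rule struct {
	glob   string
	negate bool
}

// Stack holds one rule set per directory level, outermost first.
type Stack struct {
	levels [][]rule
}

// Push parses the lines of one directory's .gitignore and adds them as the
// innermost level. Blank lines and comments are skipped.
func (s *Stack) Push(lines []string) {
	var rs []rule
	for _, ln := range lines {
		ln = strings.TrimSpace(ln)
		if ln == "" || strings.HasPrefix(ln, "#") {
			continue
		}
		r := rule{glob: strings.TrimSuffix(ln, "/")}
		if strings.HasPrefix(r.glob, "!") {
			r.negate = true
			r.glob = r.glob[1:]
		}
		rs = append(rs, r)
	}
	s.levels = append(s.levels, rs)
}

// Pop removes the innermost level when the walk leaves a directory.
func (s *Stack) Pop() {
	if len(s.levels) > 0 {
		s.levels = s.levels[:len(s.levels)-1]
	}
}

// Ignored checks the innermost level first; within a level the last matching
// rule wins, so a nested "!name" can re-include what a parent ignored.
func (s *Stack) Ignored(name string) bool {
	for i := len(s.levels) - 1; i >= 0; i-- {
		rs := s.levels[i]
		for j := len(rs) - 1; j >= 0; j-- {
			if ok, _ := path.Match(rs[j].glob, name); ok {
				return !rs[j].negate
			}
		}
	}
	return false
}

func main() {
	var s Stack
	s.Push([]string{"*.log"})        // root .gitignore
	s.Push([]string{"node_modules/"}) // nested .gitignore
	fmt.Println(s.Ignored("node_modules"), s.Ignored("main.go"))
}
```

A walker would call Push on entering a directory that has a .gitignore
and Pop on leaving it, pruning any child for which Ignored is true.
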
Four entries (bare, us., eu., global.) with 1M context, 128k output,
adaptive thinking, and Bedrock pricing (10/50, cache 1/12.5). The bare
id resolves through the cross-region inference profile logic like the
other anthropic.claude- models. Remove once Bedrock model discovery
picks the id up. Note: the Bedrock Converse client has no thinking-mode
wiring yet, so AdaptiveThinking is informational on this route for now.
Previously gitignore filtering ran only in recursive mode; the default
flat directory browse showed .git/, node_modules/, etc. Apply it in
both modes and make it user-controllable.
- Flat scan() now also skips .git and gitignored entries.
- New respectGitignore flag on the suggester (default on), persisted as
respect_gitignore in config.json, surfaced as a /settings checkbox,
and plumbed through SettingsStore/InteractiveConfig/cli. Toggling
flips the picker live.
- .git is always pruned in recursive mode regardless of the toggle, to
protect the entry budget.
- Tests for flat-mode filtering and the toggle across both modes.
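
One way to get a default-on flag out of config.json, sketched under
the assumption that the config is decoded with encoding/json. Only the
key name respect_gitignore comes from the text above; pickerConfig and
loadPickerConfig are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pickerConfig is a hypothetical slice of config.json.
type pickerConfig struct {
	RespectGitignore bool `json:"respect_gitignore"`
}

// loadPickerConfig pre-fills the default (on) before decoding, so a config
// file that omits the key keeps gitignore filtering enabled.
func loadPickerConfig(data []byte) (pickerConfig, error) {
	cfg := pickerConfig{RespectGitignore: true}
	err := json.Unmarshal(data, &cfg)
	return cfg, err
}

func main() {
	absent, _ := loadPickerConfig([]byte(`{}`))
	off, _ := loadPickerConfig([]byte(`{"respect_gitignore": false}`))
	fmt.Println(absent.RespectGitignore, off.RespectGitignore) // prints true false
}
```
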
Replace the static recursiveSkipDirs list (which would inevitably drift
as new tools appear) with the project's root .gitignore. Most caches
that bloat a recursive walk (build outputs, dependency dirs, and IaC
caches like .terraform/.terragrunt-cache) are already gitignored in
real projects.
- Extract the existing .gitignore matcher from agent/extcmd.go into a
new leaf package, packages/ignore, so packages/agent/modes can share
it without an import cycle. extcmd keeps thin aliases for its tests.
- scanRecursive now loads the root .gitignore and prunes ignored
entries, and always skips .git (which is rarely listed in .gitignore itself).
- Tests: gitignore-driven pruning in the picker, plus unit tests for
the extracted matcher.
No new dependencies.
Add Terraform/Terragrunt/Pulumi/Serverless/CDK provider and module
caches to the recursive walk skip list. These hold copies of
downloaded providers and generated module trees that would otherwise
dominate the entry budget with non-source files.
The @-mention file picker previously did a plain case-insensitive
substring match within a single directory; nested files were reachable
only via arrow-key drill-down.
- Rank matches with sahilm/fuzzy (pinned v0.1.1 to avoid the go 1.24.5
directive in v0.1.2, which would exceed CI's Go 1.23).
- Add a recursive mode that walks the whole project tree below cwd,
matching cwd-relative paths (e.g. @foobar finds src/foo/bar.go),
skipping heavy dirs (.git, node_modules, ...) and bounded by entry
and depth caps. Arrow drill-down is disabled in this mode.
- Persist as recursive_file_suggest in config.json, surfaced as a
/settings checkbox, plumbed through SettingsStore/InteractiveConfig/
cli. Toggling flips the picker live, without a restart.
- Tests for fuzzy ranking, recursive cross-dir match, heavy-dir
pruning, and cache reset on toggle.
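
To illustrate why @foobar can find src/foo/bar.go, here is a stdlib-only
subsequence check. This is not sahilm/fuzzy's algorithm and does no
ranking; subsequenceMatch is a hypothetical helper that only shows the
matching idea on cwd-relative paths.

```go
package main

import (
	"fmt"
	"strings"
)

// subsequenceMatch reports whether every rune of pattern appears in target
// in the same order (not necessarily adjacent), ignoring case.
func subsequenceMatch(pattern, target string) bool {
	p := []rune(strings.ToLower(pattern))
	if len(p) == 0 {
		return true
	}
	i := 0
	for _, r := range strings.ToLower(target) {
		if r == p[i] {
			i++
			if i == len(p) {
				return true
			}
		}
	}
	return false
}

func main() {
	fmt.Println(subsequenceMatch("foobar", "src/foo/bar.go")) // prints true
}
```
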
Speculative Anthropic entry (1M context, 128k output, adaptive thinking,
10/50 pricing) so the model resolves on both the api-key and OAuth route
with correct cost tracking and thinking mode. AdaptiveThinking cannot be
expressed via models.json, hence the catalog entry. Remove once the id
is live and discoverable upstream.
OpenRouter enforces input + max_output <= served context_length and
rejects requests where max_tokens equals the whole window, which happens
for models whose catalog MaxOutput is set equal to ContextWindow (e.g.
nemotron-3-super-120B). Two parts:
- discover.go (from #24): prefer top_provider.context_length when it is
smaller than the inflated model-level context_length, so ContextWindow
reflects the limit OpenRouter actually serves.
- openai.go: clamp max_tokens to ContextWindow minus a reserve. The
reserve is derived from the window (window/8, capped at 4096), never
from MaxOutput, so models whose output already fits the window are
untouched and small-window models (gpt-4) are not over-penalized.
Adds buildRequest clamp tests (fits-window no-op, large-window cap,
small-window proportional reserve, floor, explicit-request passthrough)
and an httptest-based DiscoverOpenRouter test for the served-context
preference.
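
The clamp rule above can be sketched as follows. clampMaxTokens is a
hypothetical name; the floor case mentioned in the tests is not
specified in this message, so it is left out rather than guessed.

```go
package main

import "fmt"

// clampMaxTokens caps maxTokens at window minus a reserve. The reserve is
// window/8, capped at 4096, and is derived only from the window.
// A non-positive window or maxTokens is passed through untouched.
func clampMaxTokens(maxTokens, window int) int {
	if window <= 0 || maxTokens <= 0 {
		return maxTokens
	}
	reserve := window / 8
	if reserve > 4096 {
		reserve = 4096
	}
	if limit := window - reserve; maxTokens > limit {
		return limit
	}
	return maxTokens
}

func main() {
	fmt.Println(clampMaxTokens(262144, 262144)) // large window: 262144 - 4096 = 258048
	fmt.Println(clampMaxTokens(8192, 8192))     // small window: 8192 - 1024 = 7168
	fmt.Println(clampMaxTokens(4096, 200000))   // already fits: 4096
}
```
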
Co-authored-by: Neil-urk12 <neil-urk12@users.noreply.github.com>
Turns omitted MaxTokens on the provider request, so Bedrock applied its
conservative 4096 default and silently truncated long writes/edits with
stopReason=length. In the TUI this read like the interaction timed out.
Thread the resolved model's catalog MaxOutput through to the request:
catalog Model.MaxOutput -> Resolved.MaxOutput -> Agent.MaxTokens
-> provider.Request.MaxTokens
Zero still falls back to each provider's own default, so models without a
catalog MaxOutput are unaffected. The SDK path inherits this via NewAgent.
Also surface StopLength explicitly in the TUI ('response hit the output
limit -- ask it to continue') instead of ending silently.
Tests: TestAgentPropagatesMaxTokens (Agent.MaxTokens reaches the wire) and
TestBedrockBuildRequestMaxTokens (non-zero flows through; zero -> 4096).
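
A reduced sketch of the propagation chain. The struct and field names
follow the arrows above, but the types are stand-ins with everything
else stripped, and the helper functions are hypothetical.

```go
package main

import "fmt"

// Stand-in types for the chain catalog Model -> Resolved -> Agent -> Request.
type Model struct{ MaxOutput int }
type Resolved struct{ MaxOutput int }
type Agent struct{ MaxTokens int }
type Request struct{ MaxTokens int }

// bedrockDefaultMaxTokens is the conservative default named in the text.
const bedrockDefaultMaxTokens = 4096

func resolve(m Model) Resolved   { return Resolved{MaxOutput: m.MaxOutput} }
func newAgent(r Resolved) Agent  { return Agent{MaxTokens: r.MaxOutput} }
func (a Agent) request() Request { return Request{MaxTokens: a.MaxTokens} }

// bedrockMaxTokens applies the provider default only when nothing was set.
func bedrockMaxTokens(req Request) int {
	if req.MaxTokens == 0 {
		return bedrockDefaultMaxTokens
	}
	return req.MaxTokens
}

func main() {
	withCatalog := newAgent(resolve(Model{MaxOutput: 128000})).request()
	without := newAgent(resolve(Model{})).request()
	fmt.Println(bedrockMaxTokens(withCatalog), bedrockMaxTokens(without)) // prints 128000 4096
}
```
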
- it currently rejects requests where input + max_output exceeds the
serving provider's context limit (which may be tighter than the
model-level value)
- use the smaller of ContextWindow and MaxOutput as the cap, with a
4096-token input reserve
A turn aborted mid-flight (cancel, connection drop, dev-server
ECONNREFUSED) can leave an assistant tool_use block with no matching
tool_result in the live transcript. repairToolUseResultPairs already
fixes this, but only ran in OpenSession (load time), so an in-process
abort left the transcript broken until restart. The next request was
then rejected by Anthropic/OpenAI with 'tool_use ids were found without
tool_result blocks'.
Run the same repair on the outbound messages in oneTurn. It is pure and
a no-op on valid transcripts, so there is no hot-path cost beyond a
single linear scan and no behavior change for healthy sessions.
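
A sketch of what such a repair pass could look like. The message does
not say how repairToolUseResultPairs fixes an orphan, so this version
assumes it inserts a synthetic tool_result; the types and the
placeholder text are hypothetical.

```go
package main

import "fmt"

// Block and Message are simplified stand-ins for transcript types.
type Block struct {
	Type string // "text", "tool_use", or "tool_result"
	ID   string // tool call id for tool_use / tool_result
	Text string
}

type Message struct {
	Role   string
	Blocks []Block
}

// repairToolPairs returns a copy of msgs in which every assistant tool_use
// is answered by a tool_result in the following user message. It does not
// modify its input and returns an equivalent transcript when nothing is missing.
func repairToolPairs(msgs []Message) []Message {
	out := make([]Message, 0, len(msgs)+1)
	for i := 0; i < len(msgs); i++ {
		m := msgs[i]
		out = append(out, m)
		if m.Role != "assistant" {
			continue
		}
		hasNext := i+1 < len(msgs) && msgs[i+1].Role == "user"
		answered := map[string]bool{}
		if hasNext {
			for _, b := range msgs[i+1].Blocks {
				if b.Type == "tool_result" {
					answered[b.ID] = true
				}
			}
		}
		var missing []Block
		for _, b := range m.Blocks {
			if b.Type == "tool_use" && !answered[b.ID] {
				missing = append(missing, Block{Type: "tool_result", ID: b.ID, Text: "aborted before a result was recorded"})
			}
		}
		if len(missing) == 0 {
			continue
		}
		if hasNext {
			// Merge into the existing user message, synthetic results first.
			missing = append(missing, msgs[i+1].Blocks...)
			i++
		}
		out = append(out, Message{Role: "user", Blocks: missing})
	}
	return out
}

func main() {
	broken := []Message{
		{Role: "user", Blocks: []Block{{Type: "text", Text: "run it"}}},
		{Role: "assistant", Blocks: []Block{{Type: "tool_use", ID: "t1"}}},
	}
	fmt.Println(len(repairToolPairs(broken))) // prints 3
}
```
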
Allow extensions to emit an open_panel frame at any time, not just as
the action of a command_response. This makes it possible to build
approval gates, secret collection, and freeform user-input prompts
directly inside tool handlers.
Changes:
- extproto: add OpenPanelFromExt wire type
- extensions/manager: route spontaneous open_panel frames to hooks.OpenPanel
- ext/ext.go: add Extension.OpenPanel() SDK method
- tests: TestSpontaneousOpenPanel (manager), TestOpenPanelEmitsCorrectFrame,
TestBlockingToolWaitsForPanelKey, TestBlockingToolDenied (SDK)
- docs/plans: add spontaneous-panel.md design doc
The blocking tool pattern (open panel → block on channel → key event →
tool_result) requires no additional wire changes; it falls out of
standard Go concurrency on the extension side.
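
The blocking tool pattern could look roughly like this on the
extension side. It deliberately avoids the real SDK: openPanel stands
in for Extension.OpenPanel(), keys for the stream of key events, and
the "y" approval key is an assumption.

```go
package main

import "fmt"

// awaitPanelKey opens a panel, then blocks the tool handler until a key
// event arrives or done is closed. The returned string stands in for the
// text that would be sent back as the tool_result.
func awaitPanelKey(openPanel func(), keys <-chan string, done <-chan struct{}) string {
	openPanel()
	select {
	case k := <-keys:
		if k == "y" {
			return "approved"
		}
		return "denied"
	case <-done:
		return "cancelled"
	}
}

func main() {
	keys := make(chan string, 1)
	keys <- "y" // simulate the user pressing the approval key
	result := awaitPanelKey(func() { fmt.Println("panel opened") }, keys, nil)
	fmt.Println(result)
}
```
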
Part 3 (intercept timeout for built-in tool gating) is out of scope
and tracked separately.
Bedrock's Converse API returns HTTP 400 with "toolConfig field must be
defined when using toolUse and toolResult content blocks" whenever the
message history contains toolUse or toolResult blocks but toolConfig is
absent from the request.
The /btw side-chat sends the frozen main transcript as context with no
tools defined. If the main conversation included tool calls the serialised
messages will contain toolUse/toolResult blocks, triggering the 400.
Fix: add bedrockMessagesHaveToolBlocks() to detect this case and, when
req.Tools is empty but tool blocks are present in the history, inject a
minimal stub toolConfig with an inert placeholder tool. Bedrock accepts
the request and the stub can never be invoked since no tool_use stop
reason can fire when the advertised tool list is effectively empty.
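
A sketch of the detection and stub, with messages modelled as generic
maps. The toolSpec/inputSchema nesting follows the Converse API's
toolConfig shape as far as I know it; the placeholder tool name, the
description, and the helper names are assumptions.

```go
package main

import "fmt"

// contentBlock and message are simplified stand-ins for Converse messages.
type contentBlock map[string]interface{}

type message struct {
	Role    string
	Content []contentBlock
}

// messagesHaveToolBlocks reports whether any message carries a toolUse or
// toolResult content block.
func messagesHaveToolBlocks(msgs []message) bool {
	for _, m := range msgs {
		for _, b := range m.Content {
			if _, ok := b["toolUse"]; ok {
				return true
			}
			if _, ok := b["toolResult"]; ok {
				return true
			}
		}
	}
	return false
}

// stubToolConfig builds a minimal toolConfig with one inert placeholder tool.
func stubToolConfig() map[string]interface{} {
	return map[string]interface{}{
		"tools": []interface{}{
			map[string]interface{}{
				"toolSpec": map[string]interface{}{
					"name":        "noop", // hypothetical placeholder name
					"description": "Placeholder; never meant to be called.",
					"inputSchema": map[string]interface{}{
						"json": map[string]interface{}{"type": "object"},
					},
				},
			},
		},
	}
}

// toolConfigFor returns the stub only when no tools are advertised but the
// history still contains tool blocks; otherwise it returns nil.
func toolConfigFor(toolCount int, msgs []message) map[string]interface{} {
	if toolCount == 0 && messagesHaveToolBlocks(msgs) {
		return stubToolConfig()
	}
	return nil
}

func main() {
	history := []message{{Role: "assistant", Content: []contentBlock{
		{"toolUse": map[string]interface{}{"toolUseId": "t1"}},
	}}}
	fmt.Println(toolConfigFor(0, history) != nil) // prints true
}
```
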
Adds Converse API cachePoint blocks at the system prompt boundary and on
the last user message for Bedrock Claude models (PriceCacheWrite > 0),
mirroring the Anthropic provider's caching strategy. Nova models are
excluded (automatic caching).
Co-authored-by: Raymond Gasper <raymondgasper@fastmail.com>
Place Bedrock Converse API cachePoint blocks at the system prompt
boundary and after the last user message on every request to Claude
models (those with PriceCacheWrite > 0 in the catalog).
This mirrors the existing Anthropic provider strategy (cache_control:
ephemeral on system, tools, and last user message) using Bedrock's
equivalent syntax: a {"cachePoint":{"type":"default"}} content block
appended to the relevant arrays.
Changes:
- bedrockRequest.System widened from []map[string]string to
[]map[string]interface{} to accommodate mixed text/cachePoint blocks
- bedrockCachePoint: shared sentinel content block var
- bedrockModelSupportsCaching: gates on PriceCacheWrite > 0; strips
geo prefixes before catalog lookup; falls back to anthropic.claude-
prefix check for unknown models (cachePoint is silently ignored by
the API if unsupported)
- buildRequest: resolves model ID before caching check; injects
cachePoint into system array and calls bedrockTagLastUserCache
- bedrockTagLastUserCache: appends cachePoint to last user message
Nova models (PriceCacheWrite == 0) are excluded — they use Bedrock's
automatic caching and don't need explicit markers.
Tests: 8 new cases covering model detection, Claude vs Nova presence/
absence, multi-turn last-message targeting, no-system safety,
nil/empty panic safety, and JSON wire shape.
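
A sketch of the two injection points and the resulting wire shape. The
cachePoint block literal is taken from the text above; the message
type and helper names are simplified stand-ins, not the provider's
actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// message is a simplified stand-in for a Converse message.
type message struct {
	Role    string                   `json:"role"`
	Content []map[string]interface{} `json:"content"`
}

// newCachePoint returns a fresh {"cachePoint":{"type":"default"}} block.
func newCachePoint() map[string]interface{} {
	return map[string]interface{}{"cachePoint": map[string]string{"type": "default"}}
}

// systemWithCache builds the system array, appending a cachePoint after the
// prompt when caching is enabled. An empty prompt yields no system array.
func systemWithCache(prompt string, cache bool) []map[string]interface{} {
	if prompt == "" {
		return nil
	}
	sys := []map[string]interface{}{{"text": prompt}}
	if cache {
		sys = append(sys, newCachePoint())
	}
	return sys
}

// tagLastUserCache appends a cachePoint to the last user message, if any.
// It is safe to call with a nil or empty slice.
func tagLastUserCache(msgs []message) {
	for i := len(msgs) - 1; i >= 0; i-- {
		if msgs[i].Role == "user" {
			msgs[i].Content = append(msgs[i].Content, newCachePoint())
			return
		}
	}
}

func main() {
	out, _ := json.Marshal(systemWithCache("be brief", true))
	fmt.Println(string(out)) // prints [{"text":"be brief"},{"cachePoint":{"type":"default"}}]
}
```
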
A long provider error (e.g. a Bedrock 400 with an embedded JSON body)
was appended as one line and clipped at the terminal edge. Wrap v.Err to
the build width via wrapLine (which hard-breaks unbreakable blobs), keep
the marker on the first row, and indent continuation rows so the whole
message is readable.
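
A sketch of the wrapping behaviour, not the TUI's actual wrapLine:
this version counts bytes rather than display cells (so it assumes
ASCII input), and renderErr is a hypothetical wrapper for the
marker-plus-indent layout.

```go
package main

import (
	"fmt"
	"strings"
)

// wrapLine breaks s into rows of at most width bytes. It wraps at spaces
// where possible and hard-breaks any single token wider than a row.
func wrapLine(s string, width int) []string {
	if width < 1 {
		width = 1
	}
	var rows []string
	cur := ""
	for _, word := range strings.Fields(s) {
		for len(word) > 0 {
			sep := ""
			if cur != "" {
				sep = " "
			}
			if len(cur)+len(sep)+len(word) <= width {
				cur += sep + word
				word = ""
				continue
			}
			if cur != "" {
				// Row is full: flush it and retry the word on a fresh row.
				rows = append(rows, cur)
				cur = ""
				continue
			}
			// The word alone is wider than a row: hard-break it.
			rows = append(rows, word[:width])
			word = word[width:]
		}
	}
	if cur != "" {
		rows = append(rows, cur)
	}
	return rows
}

// renderErr keeps the marker on the first row and indents continuation rows
// by the marker's width.
func renderErr(marker, msg string, width int) []string {
	indent := strings.Repeat(" ", len(marker))
	rows := wrapLine(msg, width-len(marker))
	for i, r := range rows {
		if i == 0 {
			rows[i] = marker + r
		} else {
			rows[i] = indent + r
		}
	}
	return rows
}

func main() {
	for _, row := range renderErr("! ", `bedrock 400: {"message":"toolConfig field must be defined"}`, 24) {
		fmt.Println(row)
	}
}
```
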
Newer Bedrock models (Anthropic Claude 4.x, DeepSeek) reject invocation
by their bare foundation-model ID with on-demand throughput, demanding a
cross-region inference-profile ID instead (HTTP 400). Rewrite such IDs
at request time by prepending the region-matched geo prefix
(us/eu/apac/us-gov), so selecting anthropic.claude-sonnet-4-5-... in a
us-east-1 setup invokes us.anthropic.claude-sonnet-4-5-...
Already-prefixed IDs, ARNs, and families that don't need a profile are
left untouched, preserving explicit choices and custom application
inference profiles.
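
A sketch of the rewrite rule. The geo prefixes come from the text
above; the list of families that need a profile, the ap-* to apac
mapping, and the shortened model id used in the example are
illustrative assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// geoPrefix maps an AWS region to a cross-region profile prefix, or "" when
// no mapping is known.
func geoPrefix(region string) string {
	switch {
	case strings.HasPrefix(region, "us-gov-"):
		return "us-gov"
	case strings.HasPrefix(region, "us-"):
		return "us"
	case strings.HasPrefix(region, "eu-"):
		return "eu"
	case strings.HasPrefix(region, "ap-"):
		return "apac"
	}
	return ""
}

// profileFamilies is a hypothetical list of id prefixes that need a profile.
var profileFamilies = []string{"anthropic.claude-sonnet-4", "anthropic.claude-opus-4", "deepseek."}

// toProfileID prepends the geo prefix to bare ids of families that need an
// inference profile. ARNs, already-prefixed ids, and other families are
// returned unchanged.
func toProfileID(modelID, region string) string {
	if strings.HasPrefix(modelID, "arn:") {
		return modelID
	}
	for _, p := range []string{"us.", "eu.", "apac.", "us-gov.", "global."} {
		if strings.HasPrefix(modelID, p) {
			return modelID
		}
	}
	geo := geoPrefix(region)
	if geo == "" {
		return modelID
	}
	for _, fam := range profileFamilies {
		if strings.HasPrefix(modelID, fam) {
			return geo + "." + modelID
		}
	}
	return modelID
}

func main() {
	fmt.Println(toProfileID("anthropic.claude-sonnet-4-5", "us-east-1")) // prints us.anthropic.claude-sonnet-4-5
}
```
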
On VS Code's xterm.js the transcript is taller than the viewport, so an
in-place clear (home + erase-to-end) only wipes the visible rows and the
scrolled-away part lingers in retained scrollback, stacking a duplicate
on the next full repaint.
- Clear() (Ctrl+L) now emits \x1b[3J under keepScrollback to actually
drop that scrollback, then homes and repaints. Accepts VS Code's
viewport-snap since the user explicitly asked for a clean screen.
- Overlay close (esc on a dialog, slash/file popup dismissal) now runs
the same Clear() so closing a picker purges the stale overlay rows
instead of leaving them in scrollback.
- Resize() does the same purge under keepScrollback; previously it
skipped the wipe and left a half-repainted old-width frame until the
user pressed Ctrl+L.
Other terminals keep their no-snap clear path.
- Shell escape: typing "!cmd" runs it via the bash tool's shell in the
session cwd, honoring the /jail sandbox. Output is parked below the
transcript as a styled terminal-log block until the next prompt or
/clear, so it never enters the model conversation. Shares busy state
with the agent: esc cancels it and no turn or other escape can start
while one is in flight.
- VS Code terminal: full repaints used \x1b[2J, which xterm.js scrolls
into scrollback, duplicating the frame. Clear in place via cursor
home + erase-to-end under keepScrollback; Clear()/Resize() no longer
eagerly wipe. Force a viewport-safe Invalidate on slash/file popup
open and close transitions there.
- Restore the live tool-call overlay behavior (keep in-flight boxes
visible until the tool_result reaches the transcript) and drop the
forced repaint at turn start.
- Document the shell escape in the README.
Map short/alternate provider names (bedrock -> amazon-bedrock, vertex,
gemini, azure, copilot, codex, ...) to their canonical ids in Resolve so
an alias is never treated as unknown and silently downgraded to
anthropic. Add a region-aware hint to Bedrock 403 responses on the
bearer route.
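
A sketch of alias resolution that fails loudly instead of falling back
to anthropic. Only the bedrock -> amazon-bedrock pair is spelled out
above, so it is the only alias listed; the known-provider set and
function name are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// providerAliases maps short names to canonical ids.
var providerAliases = map[string]string{
	"bedrock": "amazon-bedrock",
}

// knownProviders is an illustrative subset of canonical ids.
var knownProviders = map[string]bool{
	"anthropic":      true,
	"amazon-bedrock": true,
	"openrouter":     true,
}

// canonicalProvider resolves an alias or canonical id. The second result is
// false for unknown names, so the caller can report them rather than
// silently substituting a default provider.
func canonicalProvider(name string) (string, bool) {
	n := strings.ToLower(strings.TrimSpace(name))
	if id, ok := providerAliases[n]; ok {
		return id, true
	}
	if knownProviders[n] {
		return n, true
	}
	return "", false
}

func main() {
	id, ok := canonicalProvider("bedrock")
	fmt.Println(id, ok) // prints amazon-bedrock true
}
```
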
- User themes from $ZOT_HOME/themes/*.json with partial overrides
(colors, syntax, spinner) and dark/light fallback.
- /settings color-theme picker; selection persisted in config.json.
- Theme-only extensions: extension.json plus theme.json (or
themes/theme.json) load without spawning a subprocess.
- write-zot-themes built-in skill and docs/themes.md.
- README, extensions docs, and embedded docs index updated.
On the openai-codex (Responses API) route a tool result serialized to a
string-only function_call_output, dropping ImageBlock content, and the
agent loop's tool-image mirror only fired for provider "openai". So
images returned by read reached the TUI but never the model, which then
correctly reported it received no image content.
Extend the mirror to fire for "openai-codex" too (its client already
serializes user-message images as input_image, so the bytes arrive),
and have the codex tool-result serializer emit a short placeholder for
an image-only result instead of an empty output the API may reject.
Adds a test covering both behaviors.
Opus 4.7+ models support only adaptive thinking: explicit thinking budgets
(thinking:{type:enabled,budget_tokens:N}) and non-default sampling
params return 400. The Anthropic client now sends thinking:{type:
adaptive} plus output_config.effort and omits temperature for these
models, while older models keep the budget-based path. Adaptive models
are detected via a new Model.AdaptiveThinking flag with an id-substring
fallback so the same family reached through an Anthropic-Messages proxy
is handled too.
For adaptive Anthropic models served over the OpenAI-compatible chat-
completions wire (openrouter, opencode, ...), reasoning_effort now maps
maximum -> xhigh instead of clamping to high, preserving the model's
full reasoning ceiling. Adds AnthropicAdaptiveEffort and
OpenAICompatAnthropicEffort with tests.
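
A sketch of the two request shapes and the effort mapping. Field names
follow the text above; the helper names are hypothetical and are not
the AnthropicAdaptiveEffort / OpenAICompatAnthropicEffort functions
themselves.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// thinkingFields returns the thinking-related request fields. Adaptive models
// get thinking:{type:adaptive} plus output_config.effort and no temperature;
// older models keep the explicit budget.
func thinkingFields(adaptive bool, effort string, budget int) map[string]interface{} {
	if adaptive {
		return map[string]interface{}{
			"thinking":      map[string]interface{}{"type": "adaptive"},
			"output_config": map[string]interface{}{"effort": effort},
		}
	}
	return map[string]interface{}{
		"thinking": map[string]interface{}{"type": "enabled", "budget_tokens": budget},
	}
}

// openAICompatEffort maps the top level to xhigh instead of clamping it to
// high; other levels pass through unchanged.
func openAICompatEffort(level string) string {
	if level == "maximum" {
		return "xhigh"
	}
	return level
}

func main() {
	out, _ := json.Marshal(thinkingFields(true, "high", 0))
	fmt.Println(string(out))
	fmt.Println(openAICompatEffort("maximum")) // prints xhigh
}
```
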
Active() captured Catalog into a package var initializer, which runs
before the init() functions in catalog_builtin.go/extra_models.go append
the extended catalog. The picker therefore only ever saw the curated
seed list, dropping openrouter and every other extra provider. Defer the
Catalog read to call time so Active() reflects the fully-assembled list.
Also make the model dialog filter strictly by logged-in providers: an
empty credential set now yields an empty picker (with a /login hint)
instead of dumping the entire ~900-model catalog.
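
The initialization-order trap can be reproduced in a few lines. Go
initializes package-level variables before running any init()
function, so a snapshot taken in a var initializer misses whatever
init() appends later. The names below are illustrative, not the real
catalog code.

```go
package main

import "fmt"

// Catalog starts with the curated seed list.
var Catalog = []string{"seed-model"}

// frozen is evaluated during package variable initialization, before any
// init() runs, so it only ever sees the seed list.
var frozen = Catalog

// init stands in for the files that append the extended catalog.
func init() {
	Catalog = append(Catalog, "extra-model")
}

// Active reads Catalog at call time and therefore sees the full list.
func Active() []string {
	return Catalog
}

func main() {
	fmt.Println(len(frozen), len(Active())) // prints 1 2
}
```
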
Anthropic shipped claude-opus-4-8 today (2026-05-28). Pricing and
limits are identical to the 4.7 line per models.dev:
- 1,000,000 token context window
- 128,000 token max output
- reasoning supported
- $5.00 / $25.00 per 1M input/output tokens
- $0.50 / $6.25 per 1M cache read/write tokens
Mirror the same provider topology zot already uses for 4.5, 4.6, and
4.7, so the new model shows up everywhere users have an existing
Opus route configured:
- packages/provider/models.go: anthropic (speculative block,
matching how 4.5/4.6/4.7 are listed)
- packages/provider/catalog_builtin.go:
* amazon-bedrock: anthropic.claude-opus-4-8 plus the five
regional cross-region inference profiles
(us./eu./global./jp./au.). AU keeps its 3.3x surcharge
($16.50 / $82.50) consistent with 4.6/4.7 AU rows.
* cloudflare-ai-gateway
* github-copilot (Copilot pricing is $0/$0, ctx 144k,
output 64k, matching the 4.7 Copilot row)
* opencode
* openrouter: standard route plus the 6x 'Fast' SKU
($30/$150/$3/$37.50) consistent with 4.6/4.7
* vercel-ai-gateway
Vertex (google-vertex / google-vertex-anthropic) is deliberately
skipped: zot's google-vertex provider is Gemini-only today and there
is no google-vertex-anthropic provider wired up. Earlier Opus
versions skip Vertex for the same reason, so 4.8 stays consistent.
Tests:
- go build ./... clean
- go vet ./packages/provider/... clean
- go test ./packages/provider/... pass
After the binary swap succeeds, zot update now walks
$ZOT_HOME/extensions/ and runs git pull --ff-only on every
extension that is a git checkout.
Per-extension behaviour:
- disabled extensions: skipped
- no .git/ directory: skipped (no remote to pull from)
- dirty worktree: stashed (--include-untracked) before the pull,
popped after; conflict on pop leaves markers in place with a
warning rather than discarding the runtime state
- diverged / offline / any git failure: reported as failed and the
next extension is processed
- timeout per extension: 60s
- no build step is ever executed; authors commit the runnable
artifact, or the user rebuilds manually and /reload-ext
zot update itself never aborts because of an extension. The
binary swap is the source of truth for success.
Implementation in packages/agent/extupdate.go (~150 LoC), 13 unit
tests covering each branch including stash+pop with untracked
runtime files, diverged history, unreachable remote, and the
mixed-state scenario. README's Extensions section documents the
new behaviour.
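
The per-extension decision table can be sketched as a pure planning
function. This is not the code in extupdate.go: extState and
updatePlan are hypothetical, and actually running the commands (for
example via exec.CommandContext with a 60s deadline) is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// extState is a hypothetical summary of one extension directory.
type extState struct {
	Disabled bool // extension is disabled
	HasGit   bool // directory contains .git/
	Dirty    bool // worktree has uncommitted or untracked changes
}

// updatePlan returns the git commands to run, in order. Disabled extensions
// and non-git directories yield an empty plan; a dirty worktree is stashed
// before the pull and popped afterwards.
func updatePlan(s extState) [][]string {
	if s.Disabled || !s.HasGit {
		return nil
	}
	pull := []string{"git", "pull", "--ff-only"}
	if !s.Dirty {
		return [][]string{pull}
	}
	return [][]string{
		{"git", "stash", "push", "--include-untracked"},
		pull,
		{"git", "stash", "pop"},
	}
}

func main() {
	for _, cmd := range updatePlan(extState{HasGit: true, Dirty: true}) {
		fmt.Println(strings.Join(cmd, " "))
	}
}
```
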
Single Go module, four top-level packages under packages/. Import
paths become github.com/patriceckhart/zot/packages/<name>; downstream
consumers can depend on individual packages without pulling the rest.
Layout:
packages/provider/ LLM clients + catalog
packages/provider/auth/ credential store + OAuth + login server
packages/core/ agent loop, sessions, cost
packages/tui/ terminal toolkit + chat view
packages/agent/ CLI wiring, system prompt
extensions/ extproto/ modes/ tools/ skills/ swarm/
sdk/ (was pkg/zotcore, package renamed zotcore -> sdk)
ext/ (was pkg/zotext, package renamed zotext -> ext)
internal/ and pkg/ removed. The internal/assets logo moved into
packages/provider/auth/assets.
Public Go SDK identifiers renamed:
pkg/zotcore (package zotcore) -> packages/agent/sdk (package sdk)
pkg/zotext (package zotext) -> packages/agent/ext (package ext)
This breaks Go-based extensions and embedders; the JSON wire protocol
for extensions and RPC is unchanged, so non-Go extensions, already-
built extension binaries, and zot rpc consumers are unaffected.
Docs, examples, and the built-in write-zot-extension skill updated
for the new paths and identifiers. Shadow-bug fixes in code samples
(ext := ext.New -> e := ext.New).