Commit graph

34 commits

Author SHA1 Message Date
patriceckhart
f7bf4a9d41 chore: gofmt spontaneous panel files
Co-authored-by: Raymond Gasper <raymondgasper@fastmail.com>
2026-06-08 19:41:28 +02:00
Raymond Gasper
2d46ef9b09 feat(panels): spontaneous open_panel frame for human-in-the-loop tool gates (#19)
Allow extensions to emit an open_panel frame at any time, not just as
the action of a command_response. This makes it possible to build
approval gates, secret collection, and freeform user-input prompts
directly inside tool handlers.

Changes:
- extproto: add OpenPanelFromExt wire type
- extensions/manager: route spontaneous open_panel frames to hooks.OpenPanel
- ext/ext.go: add Extension.OpenPanel() SDK method
- tests: TestSpontaneousOpenPanel (manager), TestOpenPanelEmitsCorrectFrame,
  TestBlockingToolWaitsForPanelKey, TestBlockingToolDenied (SDK)
- docs/plans: add spontaneous-panel.md design doc

The blocking tool pattern (open panel → block on channel → key event →
tool_result) requires no additional wire changes; it falls out of
standard Go concurrency on the extension side.

Part 3 (intercept timeout for built-in tool gating) is out of scope
and tracked separately.
2026-06-08 12:13:55 -04:00
Raymond Gasper
fec5ae0bf1 fix(bedrock): inject stub toolConfig when history has tool blocks
Bedrock's Converse API returns HTTP 400 with "toolConfig field must be
defined when using toolUse and toolResult content blocks" whenever the
message history contains toolUse or toolResult blocks but toolConfig is
absent from the request.

The /btw side-chat sends the frozen main transcript as context with no
tools defined. If the main conversation included tool calls the serialised
messages will contain toolUse/toolResult blocks, triggering the 400.

Fix: add bedrockMessagesHaveToolBlocks() to detect this case and, when
req.Tools is empty but tool blocks are present in the history, inject a
minimal stub toolConfig with an inert placeholder tool. Bedrock accepts
the request and the stub can never be invoked since no tool_use stop
reason can fire when the advertised tool list is effectively empty.
2026-06-08 10:56:03 -04:00
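
A minimal sketch of the detection and stub injection described above, assuming simplified request shapes. The `block` and `message` types, `toolConfigFor`, and the placeholder tool name are hypothetical; only the `toolUse`/`toolResult` block names and the `toolConfig` requirement come from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical, simplified shapes; the real provider types are not shown.
type block map[string]interface{}

type message struct {
	Role    string  `json:"role"`
	Content []block `json:"content"`
}

// hasToolBlocks reports whether any message carries a toolUse or
// toolResult content block.
func hasToolBlocks(msgs []message) bool {
	for _, m := range msgs {
		for _, b := range m.Content {
			if _, ok := b["toolUse"]; ok {
				return true
			}
			if _, ok := b["toolResult"]; ok {
				return true
			}
		}
	}
	return false
}

// stubToolConfig builds a placeholder toolConfig. The tool name and
// description are made up for this sketch.
func stubToolConfig() map[string]interface{} {
	spec := map[string]interface{}{
		"name":        "noop_placeholder",
		"description": "Inert placeholder; not meant to be called.",
		"inputSchema": map[string]interface{}{
			"json": map[string]interface{}{"type": "object"},
		},
	}
	return map[string]interface{}{
		"tools": []interface{}{map[string]interface{}{"toolSpec": spec}},
	}
}

// toolConfigFor returns the real tools when present, the stub when the
// history needs one, and nil otherwise.
func toolConfigFor(msgs []message, tools []interface{}) map[string]interface{} {
	if len(tools) > 0 {
		return map[string]interface{}{"tools": tools}
	}
	if hasToolBlocks(msgs) {
		return stubToolConfig()
	}
	return nil
}

func main() {
	history := []message{
		{Role: "assistant", Content: []block{{"toolUse": map[string]interface{}{"name": "read"}}}},
		{Role: "user", Content: []block{{"toolResult": map[string]interface{}{"status": "success"}}}},
	}
	out, _ := json.MarshalIndent(toolConfigFor(history, nil), "", "  ")
	fmt.Println(string(out))
}
```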
patriceckhart
7eb8a65637 Merge #17: bedrock prompt caching via cachePoint markers
Adds Converse API cachePoint blocks at the system prompt boundary and on
the last user message for Bedrock Claude models (PriceCacheWrite > 0),
mirroring the Anthropic provider's caching strategy. Nova models are
excluded (automatic caching).

Co-authored-by: Raymond Gasper <raymondgasper@fastmail.com>
2026-06-08 15:40:01 +02:00
Raymond Gasper
cc03a4c18a provider/bedrock: add prompt caching via cachePoint markers
Place Bedrock Converse API cachePoint blocks at the system prompt
boundary and after the last user message on every request to Claude
models (those with PriceCacheWrite > 0 in the catalog).

This mirrors the existing Anthropic provider strategy (cache_control:
ephemeral on system, tools, and last user message) using Bedrock's
equivalent syntax: a {"cachePoint":{"type":"default"}} content block
appended to the relevant arrays.

Changes:
- bedrockRequest.System widened from []map[string]string to
  []map[string]interface{} to accommodate mixed text/cachePoint blocks
- bedrockCachePoint: shared sentinel content block var
- bedrockModelSupportsCaching: gates on PriceCacheWrite > 0; strips
  geo prefixes before catalog lookup; falls back to anthropic.claude-
  prefix check for unknown models (cachePoint is silently ignored by
  the API if unsupported)
- buildRequest: resolves model ID before caching check; injects
  cachePoint into system array and calls bedrockTagLastUserCache
- bedrockTagLastUserCache: appends cachePoint to last user message

Nova models (PriceCacheWrite == 0) are excluded — they use Bedrock's
automatic caching and don't need explicit markers.

Tests: 8 new cases covering model detection, Claude vs Nova presence/
absence, multi-turn last-message targeting, no-system safety,
nil/empty panic safety, and JSON wire shape.
2026-06-08 09:28:43 -04:00
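
The marker placement can be sketched as follows, assuming simplified message shapes. The `block` and `message` types and the helper names `systemWithCache` and `tagLastUser` are hypothetical; the `{"cachePoint":{"type":"default"}}` wire shape is taken from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical, simplified shapes for this sketch.
type block map[string]interface{}

type message struct {
	Role    string  `json:"role"`
	Content []block `json:"content"`
}

// cachePoint returns a fresh marker block so callers never share a map.
func cachePoint() block {
	return block{"cachePoint": map[string]interface{}{"type": "default"}}
}

// systemWithCache returns the system array with a trailing marker, or nil
// when there is no system prompt to cache.
func systemWithCache(prompt string) []block {
	if prompt == "" {
		return nil
	}
	return []block{block{"text": prompt}, cachePoint()}
}

// tagLastUser appends a marker to the most recent user message, if any.
func tagLastUser(msgs []message) {
	for i := len(msgs) - 1; i >= 0; i-- {
		if msgs[i].Role == "user" {
			msgs[i].Content = append(msgs[i].Content, cachePoint())
			return
		}
	}
}

func main() {
	out, _ := json.Marshal(cachePoint())
	fmt.Println(string(out)) // prints {"cachePoint":{"type":"default"}}
}
```

Walking the slice backwards keeps the marker on the last user turn even when the history ends with an assistant message, and a nil or empty history is a no-op.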
patriceckhart
956b0a24e2 Merge remote-tracking branch 'origin/main' into pr-11 2026-06-08 15:23:22 +02:00
patriceckhart
eef2714dea Scan all known providers in credential fallback (adopts #16) 2026-06-08 15:22:40 +02:00
patriceckhart
323df7f6d3 Discover env-only bedrock in credential fallback scan 2026-06-08 15:17:22 +02:00
Patric Eckhart
ab6d543626 Merge branch 'main' into openrouter-live-models 2026-06-08 07:47:42 +02:00
patriceckhart
a7ef8c22a1 Respect ollama model baseUrl before default
2026-06-08 07:41:05 +02:00
patriceckhart
7da9114a05 Gate live tool rendering behind preceding stream text
2026-06-07 16:58:39 +02:00
patriceckhart
63e33d9aa9 Add clear_notes extension frame and clear notes on new prompt
2026-06-07 11:10:02 +02:00
patriceckhart
30cff8843d Respect gitignore when installing extensions 2026-06-07 10:25:50 +02:00
patriceckhart
10fde8fd0e Fix ext install with relative path source 2026-06-07 10:18:41 +02:00
patriceckhart
84fd98ea74 Normalize Bedrock tool results
2026-06-05 16:05:38 +02:00
Dawid Piotrkowski
88b93da57f provider: discover OpenRouter models live, drop baked-in catalog 2026-06-05 15:20:54 +02:00
patriceckhart
7a7bf0b52c Fix Bedrock streaming and provider setup docs
2026-06-05 08:31:54 +02:00
patriceckhart
498b769c07 Add OpenRouter NVIDIA Nemotron Ultra models
2026-06-04 20:39:35 +02:00
patriceckhart
95a36c270e Remove stale OpenAI Codex model entries 2026-06-04 20:23:32 +02:00
patriceckhart
4dffc8529b Word-wrap provider error rows instead of truncating
A long provider error (e.g. a Bedrock 400 with an embedded JSON body)
was appended as one line and clipped at the terminal edge. Wrap v.Err to
the build width via wrapLine (which hard-breaks unbreakable blobs), keep
the marker on the first row, and indent continuation rows so the whole
message is readable.
2026-06-04 19:25:16 +02:00
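
A rough sketch of the wrapping behaviour described above: break at spaces where possible, hard-break tokens wider than a row, keep a marker on the first row, and indent the rest. The `wrap` and `renderErr` functions and the "! " marker are hypothetical; this is not the project's `wrapLine`, and it counts runes rather than terminal cell widths.

```go
package main

import (
	"fmt"
	"strings"
)

// wrap splits s into rows of at most width runes. It breaks at spaces
// where it can and hard-breaks any word longer than a whole row.
func wrap(s string, width int) []string {
	if width <= 0 {
		return []string{s}
	}
	var rows []string
	var cur []rune
	flush := func() {
		if len(cur) > 0 {
			rows = append(rows, string(cur))
			cur = cur[:0]
		}
	}
	for _, word := range strings.Fields(s) {
		w := []rune(word)
		// Start a new row if the word plus a separating space does not fit.
		if len(cur) > 0 && len(cur)+1+len(w) > width {
			flush()
		}
		if len(cur) > 0 {
			cur = append(cur, ' ')
		}
		// Hard-break blobs that are wider than the remaining row.
		for len(w) > width-len(cur) {
			n := width - len(cur)
			cur = append(cur, w[:n]...)
			w = w[n:]
			flush()
		}
		cur = append(cur, w...)
	}
	flush()
	return rows
}

// renderErr keeps the marker on the first row and indents continuations.
func renderErr(msg string, width int) []string {
	const marker, indent = "! ", "  "
	rows := wrap(msg, width-len(marker))
	for i := range rows {
		if i == 0 {
			rows[i] = marker + rows[i]
		} else {
			rows[i] = indent + rows[i]
		}
	}
	return rows
}

func main() {
	msg := `bedrock: 400 {"message":"toolConfig field must be defined"}`
	for _, row := range renderErr(msg, 24) {
		fmt.Println(row)
	}
}
```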
patriceckhart
d8976c94df Send Bedrock inference-profile IDs for on-demand models
Newer Bedrock models (Anthropic Claude 4.x, DeepSeek) reject invocation
by their bare foundation-model ID with on-demand throughput, demanding a
cross-region inference-profile ID instead (HTTP 400). Rewrite such IDs
at request time by prepending the region-matched geo prefix
(us/eu/apac/us-gov), so selecting anthropic.claude-sonnet-4-5-... in a
us-east-1 setup invokes us.anthropic.claude-sonnet-4-5-...

Already-prefixed IDs, ARNs, and families that don't need a profile are
left untouched, preserving explicit choices and custom application
inference profiles.
2026-06-04 19:25:16 +02:00
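
The rewrite rule can be sketched as below. The region-to-prefix mapping follows the prefixes listed in the commit message; `geoPrefix`, `needsProfile`, `profileID`, and the example model ID are hypothetical, and the family check is deliberately crude (the real code decides per model family).

```go
package main

import (
	"fmt"
	"strings"
)

// geoPrefix maps an AWS region to a cross-region profile prefix.
// The us-gov check must run before the plain us check.
func geoPrefix(region string) string {
	switch {
	case strings.HasPrefix(region, "us-gov-"):
		return "us-gov"
	case strings.HasPrefix(region, "us-"):
		return "us"
	case strings.HasPrefix(region, "eu-"):
		return "eu"
	case strings.HasPrefix(region, "ap-"):
		return "apac"
	}
	return ""
}

// needsProfile is an assumption for this sketch: it treats every
// anthropic.claude- ID as requiring a profile.
func needsProfile(modelID string) bool {
	return strings.HasPrefix(modelID, "anthropic.claude-")
}

// profileID leaves ARNs and already-prefixed IDs untouched and prepends
// the region-matched prefix otherwise.
func profileID(modelID, region string) string {
	if strings.HasPrefix(modelID, "arn:") {
		return modelID
	}
	for _, p := range []string{"us-gov.", "us.", "eu.", "apac."} {
		if strings.HasPrefix(modelID, p) {
			return modelID
		}
	}
	prefix := geoPrefix(region)
	if prefix == "" || !needsProfile(modelID) {
		return modelID
	}
	return prefix + "." + modelID
}

func main() {
	// "anthropic.claude-example-v1" is a made-up ID for illustration.
	fmt.Println(profileID("anthropic.claude-example-v1", "us-east-1"))
	fmt.Println(profileID("eu.anthropic.claude-example-v1", "us-east-1"))
}
```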
patriceckhart
4bcdf8804b Purge VS Code scrollback on clear, overlay close, and resize
On VS Code's xterm.js the transcript is taller than the viewport, so an
in-place clear (home + erase-to-end) only wipes the visible rows and the
scrolled-away part lingers in retained scrollback, stacking a duplicate
on the next full repaint.

- Clear() (Ctrl+L) now emits \x1b[3J under keepScrollback to actually
  drop that scrollback, then homes and repaints. Accepts VS Code's
  viewport-snap since the user explicitly asked for a clean screen.
- Overlay close (esc on a dialog, slash/file popup dismissal) now runs
  the same Clear() so closing a picker purges the stale overlay rows
  instead of leaving them in scrollback.
- Resize() does the same purge under keepScrollback; previously it
  skipped the wipe and left a half-repainted old-width frame until the
  user pressed Ctrl+L.

Other terminals keep their no-snap clear path.
2026-06-04 19:16:21 +02:00
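
The escape sequences involved can be sketched as two small helpers. The function names and the `keepScrollback` parameter mirror the wording of the commit but are otherwise hypothetical, and the fallback for other terminals (`\x1b[2J` plus home) is an assumption, since the log does not spell out that path.

```go
package main

import "fmt"

const (
	home            = "\x1b[H"  // move cursor to top-left
	eraseToEnd      = "\x1b[J"  // erase from cursor to end of screen
	eraseScreen     = "\x1b[2J" // erase the whole visible screen
	eraseScrollback = "\x1b[3J" // erase retained scrollback (xterm extension)
)

// repaintPrefix is written before an ordinary full repaint. Under
// keepScrollback it clears in place so nothing is pushed into scrollback.
func repaintPrefix(keepScrollback bool) string {
	if keepScrollback {
		return home + eraseToEnd
	}
	return eraseScreen + home // assumed path for other terminals
}

// clearPrefix is written for an explicit clear (Ctrl+L, overlay close,
// resize). Under keepScrollback it also drops the retained scrollback.
func clearPrefix(keepScrollback bool) string {
	if keepScrollback {
		return eraseScrollback + home + eraseToEnd
	}
	return eraseScreen + home // assumed path for other terminals
}

func main() {
	fmt.Printf("%q\n", repaintPrefix(true))
	fmt.Printf("%q\n", clearPrefix(true))
}
```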
patriceckhart
cde9298410 Add !command shell escape and fix VS Code terminal repaints
- Shell escape: typing "!cmd" runs it via the bash tool's shell in the
  session cwd, honoring the /jail sandbox. Output is parked below the
  transcript as a styled terminal-log block until the next prompt or
  /clear, so it never enters the model conversation. Shares busy state
  with the agent: esc cancels it and no turn or other escape can start
  while one is in flight.
- VS Code terminal: full repaints used \x1b[2J, which xterm.js scrolls
  into scrollback and duplicates the frame. Clear in place via cursor
  home + erase-to-end under keepScrollback; Clear()/Resize() no longer
  eagerly wipe. Force a viewport-safe Invalidate on slash/file popup
  open and close transitions there.
- Restore the live tool-call overlay behavior (keep in-flight boxes
  visible until the tool_result reaches the transcript) and drop the
  forced repaint at turn start.
- Document the shell escape in the README.
2026-06-04 18:05:17 +02:00
patriceckhart
ec5eb20ce9 feat(provider): alias common provider names and clarify Bedrock 403
Map short/alternate provider names (bedrock -> amazon-bedrock, vertex,
gemini, azure, copilot, codex, ...) to their canonical ids in Resolve so
an alias is never treated as unknown and silently downgraded to
anthropic. Add a region-aware hint to Bedrock 403 responses on the
bearer route.
2026-06-03 18:13:22 +02:00
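
A minimal sketch of alias resolution that never falls back silently. Only the `bedrock -> amazon-bedrock` pair is taken from the commit message; the `resolve` function, the tiny `canonical` set, and the boolean result are hypothetical and stand in for the project's `Resolve`.

```go
package main

import (
	"fmt"
	"strings"
)

// canonical holds known provider ids; a deliberately tiny set for the sketch.
var canonical = map[string]bool{
	"anthropic":      true,
	"amazon-bedrock": true,
}

// aliases maps short names to canonical ids. Only this pair is confirmed
// by the commit message; the real table has more entries.
var aliases = map[string]string{
	"bedrock": "amazon-bedrock",
}

// resolve returns the canonical id and true, or "" and false for an
// unknown name, so callers cannot silently downgrade to a default.
func resolve(name string) (string, bool) {
	n := strings.ToLower(strings.TrimSpace(name))
	if canonical[n] {
		return n, true
	}
	if id, ok := aliases[n]; ok {
		return id, true
	}
	return "", false
}

func main() {
	fmt.Println(resolve("bedrock"))
	fmt.Println(resolve("no-such-provider"))
}
```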
patriceckhart
ea58887bfa Fix login dialog cursor alignment
2026-05-31 13:51:13 +02:00
patriceckhart
917da8c414 Add optional theme background support 2026-05-30 19:01:55 +02:00
patriceckhart
16b95cb974 Retry transient provider stream errors 2026-05-30 15:25:33 +02:00
patriceckhart
dfd25012b6 Add JSON theming, theme-only extensions, and docs
- User themes from $ZOT_HOME/themes/*.json with partial overrides
  (colors, syntax, spinner) and dark/light fallback.
- /settings color-theme picker; selection persisted in config.json.
- Theme-only extensions: extension.json plus theme.json (or
  themes/theme.json) load without spawning a subprocess.
- write-zot-themes built-in skill and docs/themes.md.
- README, extensions docs, and embedded docs index updated.
2026-05-30 11:34:42 +02:00
patriceckhart
ecb3b022cc fix(provider): deliver tool-result images to the OpenAI Responses route
On the openai-codex (Responses API) route a tool result serialized to a
string-only function_call_output, dropping ImageBlock content, and the
agent loop's tool-image mirror only fired for provider "openai". So
images returned by read reached the TUI but never the model, which then
correctly reported it received no image content.

Extend the mirror to fire for "openai-codex" too (its client already
serializes user-message images as input_image, so the bytes arrive),
and have the codex tool-result serializer emit a short placeholder for
an image-only result instead of an empty output the API may reject.
Adds a test covering both behaviors.
2026-05-29 14:21:51 +02:00
patriceckhart
124d679982 feat(provider): adaptive thinking + xhigh effort for Opus 4.7/4.8
Opus 4.7+ models support only adaptive thinking: explicit thinking budgets
(thinking:{type:enabled,budget_tokens:N}) and non-default sampling
params return 400. The Anthropic client now sends thinking:{type:
adaptive} plus output_config.effort and omits temperature for these
models, while older models keep the budget-based path. Adaptive models
are detected via a new Model.AdaptiveThinking flag with an id-substring
fallback so the same family reached through an Anthropic-Messages proxy
is handled too.

For adaptive Anthropic models served over the OpenAI-compatible chat-
completions wire (openrouter, opencode, ...), reasoning_effort now maps
maximum -> xhigh instead of clamping to high, preserving the model's
full reasoning ceiling. Adds AnthropicAdaptiveEffort and
OpenAICompatAnthropicEffort with tests.
2026-05-29 14:21:38 +02:00
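
The two request shapes and the effort mapping can be sketched as below. The field names (`thinking`, `budget_tokens`, `output_config.effort`, `temperature`) come from the commit message; `applyThinking` and `compatEffort` are hypothetical helpers, and passing other effort levels through unchanged is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// applyThinking mutates a request body for either thinking mode.
// Adaptive models get thinking.type=adaptive plus output_config.effort and
// lose any temperature; older models keep the budget-based form.
func applyThinking(req map[string]interface{}, adaptive bool, effort string, budget int) {
	if adaptive {
		req["thinking"] = map[string]interface{}{"type": "adaptive"}
		req["output_config"] = map[string]interface{}{"effort": effort}
		delete(req, "temperature")
		return
	}
	req["thinking"] = map[string]interface{}{
		"type":          "enabled",
		"budget_tokens": budget,
	}
}

// compatEffort maps a level to reasoning_effort on the OpenAI-compatible
// wire. Only the "maximum" rule is from the commit message.
func compatEffort(level string, adaptive bool) string {
	if level == "maximum" {
		if adaptive {
			return "xhigh"
		}
		return "high" // previous behaviour: clamp to high
	}
	return level // assumption: other levels pass through
}

func main() {
	req := map[string]interface{}{"temperature": 0.2}
	applyThinking(req, true, "high", 0)
	out, _ := json.Marshal(req)
	fmt.Println(string(out))
	fmt.Println(compatEffort("maximum", true))
}
```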
patriceckhart
edbbcc1086 fix(provider): include full catalog in model picker, gate by credentials
Active() captured Catalog into a package var initializer, which runs
before the init() functions in catalog_builtin.go/extra_models.go append
the extended catalog. The picker therefore only ever saw the curated
seed list, dropping openrouter and every other extra provider. Defer the
Catalog read to call time so Active() reflects the fully-assembled list.

Also make the model dialog filter strictly by logged-in providers: an
empty credential set now yields an empty picker (with a /login hint)
instead of dumping the entire ~900-model catalog.
2026-05-29 11:26:10 +02:00
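
The initialization-order bug described above is easy to reproduce in isolation: Go evaluates package-level variable initializers before any `init()` function runs, so a snapshot taken in an initializer misses whatever `init()` appends later. The names below (`catalog`, `snapshot`, `activeEager`, `activeLazy`) are hypothetical stand-ins for the project's `Catalog` and `Active()`.

```go
package main

import "fmt"

// catalog starts with a seed entry and is extended in init().
var catalog = []string{"seed-model"}

// snapshot is evaluated during package variable initialization,
// which happens before init() runs.
var snapshot = append([]string(nil), catalog...)

func init() {
	catalog = append(catalog, "extra-model")
}

// activeEager returns the stale snapshot (the buggy shape).
func activeEager() []string { return snapshot }

// activeLazy reads catalog at call time (the fixed shape).
func activeLazy() []string { return catalog }

func main() {
	fmt.Println(len(activeEager()), len(activeLazy())) // prints "1 2"
}
```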
patriceckhart
5470345d15 feat(provider): add Claude Opus 4.8 to anthropic + bedrock + gateway catalogs
Anthropic shipped claude-opus-4-8 today (2026-05-28). Pricing and
limits are identical to the 4.7 line per models.dev:

  - 1,000,000 token context window
  - 128,000 token max output
  - reasoning supported
  - $5.00 / $25.00 per 1M input/output tokens
  - $0.50 / $6.25 per 1M cache read/write tokens

Mirror the same provider topology zot already uses for 4.5, 4.6, and
4.7, so the new model shows up everywhere users have an existing
Opus route configured:

  - packages/provider/models.go: anthropic (speculative block,
    matching how 4.5/4.6/4.7 are listed)
  - packages/provider/catalog_builtin.go:
      * amazon-bedrock: anthropic.claude-opus-4-8 plus the five
        regional cross-region inference profiles
        (us./eu./global./jp./au.). AU keeps its 3.3x surcharge
        ($16.50 / $82.50) consistent with 4.6/4.7 AU rows.
      * cloudflare-ai-gateway
      * github-copilot (Copilot pricing is $0/$0, ctx 144k,
        output 64k, matching the 4.7 Copilot row)
      * opencode
      * openrouter: standard route plus the 6x 'Fast' SKU
        ($30/$150/$3/$37.50) consistent with 4.6/4.7
      * vercel-ai-gateway

Vertex (google-vertex / google-vertex-anthropic) is deliberately
skipped: zot's google-vertex provider is Gemini-only today and there
is no google-vertex-anthropic provider wired up. Earlier Opus
versions skip Vertex for the same reason, so 4.8 stays consistent.

Tests:
  - go build ./... clean
  - go vet ./packages/provider/... clean
  - go test ./packages/provider/... pass
2026-05-28 21:34:12 +02:00
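
The surcharge figures quoted above can be checked with integer arithmetic on cents, which avoids floating-point rounding. The `scale` and `dollars` helpers are hypothetical; the base prices and multipliers are the ones stated in the commit message.

```go
package main

import "fmt"

// scale multiplies a price in cents by num/den using integer math.
func scale(cents, num, den int) int { return cents * num / den }

// dollars formats cents as a dollar amount.
func dollars(cents int) string {
	return fmt.Sprintf("$%d.%02d", cents/100, cents%100)
}

func main() {
	// Base prices in cents per 1M tokens, from the commit message.
	in, out, cacheRead, cacheWrite := 500, 2500, 50, 625

	// AU rows carry a 3.3x surcharge on input and output.
	fmt.Println("AU:", dollars(scale(in, 33, 10)), dollars(scale(out, 33, 10)))
	// prints "AU: $16.50 $82.50"

	// The OpenRouter 'Fast' SKU is 6x across all four prices.
	fmt.Println("Fast:",
		dollars(scale(in, 6, 1)), dollars(scale(out, 6, 1)),
		dollars(scale(cacheRead, 6, 1)), dollars(scale(cacheWrite, 6, 1)))
	// prints "Fast: $30.00 $150.00 $3.00 $37.50"
}
```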
patriceckhart
3ce114c8de feat(update): fast-forward installed extensions during zot update
After the binary swap succeeds, zot update now walks
$ZOT_HOME/extensions/ and runs git pull --ff-only on every
extension that is a git checkout.

Per-extension behaviour:
- disabled extensions: skipped
- no .git/ directory: skipped (no remote to pull from)
- dirty worktree: stashed (--include-untracked) before the pull,
  popped after; conflict on pop leaves markers in place with a
  warning rather than discarding the runtime state
- diverged / offline / any git failure: reported as failed and the
  next extension is processed
- timeout per extension: 60s
- no build step is ever executed; authors commit the runnable
  artifact, or the user rebuilds manually and /reload-ext

zot update itself never aborts because of an extension. The
binary swap is the source of truth for success.

Implementation in packages/agent/extupdate.go (~150 LoC), 13 unit
tests covering each branch including stash+pop with untracked
runtime files, diverged history, unreachable remote, and the
mixed-state scenario. README's Extensions section documents the
new behaviour.
2026-05-27 09:37:59 +02:00
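
The per-extension decision table can be sketched as a pure planning function plus a thin runner. `extState`, `plan`, and `update` are hypothetical names, not the contents of extupdate.go. The runner here stops at the first failing command, which is a simplification: the commit describes more careful handling of a failed stash pop.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"time"
)

// extState is a hypothetical summary of one installed extension.
type extState struct {
	Disabled bool
	HasGit   bool // a .git/ directory exists
	Dirty    bool // the worktree has local changes
}

// plan returns the git argument lists to run, or nil to skip.
func plan(s extState) [][]string {
	if s.Disabled || !s.HasGit {
		return nil
	}
	pull := []string{"pull", "--ff-only"}
	if !s.Dirty {
		return [][]string{pull}
	}
	return [][]string{
		{"stash", "push", "--include-untracked"},
		pull,
		{"stash", "pop"},
	}
}

// update runs the plan in dir under a single 60s budget per extension.
// It returns an error instead of aborting, so a caller can report the
// failure and move on to the next extension.
func update(dir string, s extState) error {
	ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
	defer cancel()
	for _, args := range plan(s) {
		cmd := exec.CommandContext(ctx, "git", args...)
		cmd.Dir = dir
		if out, err := cmd.CombinedOutput(); err != nil {
			return fmt.Errorf("git %v: %w: %s", args, err, out)
		}
	}
	return nil
}

func main() {
	// Only print the plan here; update() would shell out to git.
	fmt.Println(plan(extState{HasGit: true, Dirty: true}))
}
```

Keeping the decision logic separate from process execution makes each branch (disabled, no checkout, clean, dirty) testable without a real repository.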
patriceckhart
fa7d8d8be5 refactor: split source into packages/{provider,core,tui,agent}
Single Go module, four top-level packages under packages/. Import
paths become github.com/patriceckhart/zot/packages/<name>; downstream
consumers can depend on individual packages without pulling the rest.

Layout:
  packages/provider/     LLM clients + catalog
  packages/provider/auth/ credential store + OAuth + login server
  packages/core/         agent loop, sessions, cost
  packages/tui/          terminal toolkit + chat view
  packages/agent/        CLI wiring, system prompt
    extensions/ extproto/ modes/ tools/ skills/ swarm/
    sdk/  (was pkg/zotcore, package renamed zotcore -> sdk)
    ext/  (was pkg/zotext, package renamed zotext -> ext)

internal/ and pkg/ removed. The internal/assets logo moved into
packages/provider/auth/assets.

Public Go SDK identifiers renamed:
  pkg/zotcore (package zotcore) -> packages/agent/sdk (package sdk)
  pkg/zotext  (package zotext)  -> packages/agent/ext (package ext)

This breaks Go-based extensions and embedders; the JSON wire protocol
for extensions and RPC is unchanged, so non-Go extensions, already-
built extension binaries, and zot rpc consumers are unaffected.

Docs, examples, and the built-in write-zot-extension skill updated
for the new paths and identifiers. Shadow-bug fixes in code samples
(ext := ext.New -> e := ext.New).
2026-05-27 09:07:15 +02:00
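
The shadow bug fixed in the code samples is a general Go pitfall: naming a local variable after an imported package hides the package for the rest of the scope. The sketch below shows the same shape using the standard library `strings` package instead of the project's `ext` package, so it can run on its own; the `rename` helper is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// rename rewrites the old package qualifier to the new one.
func rename(s string) string {
	// Shadowing form, analogous to "ext := ext.New":
	//   strings := strings.NewReplacer("zotext", "ext")
	//   strings.ToUpper(s) // compile error: strings is now a *Replacer
	//
	// Fixed form, analogous to "e := ext.New": use a distinct name so
	// the package stays reachable.
	r := strings.NewReplacer("zotext", "ext")
	return r.Replace(s)
}

func main() {
	fmt.Println(strings.ToUpper(rename("zotext.New"))) // prints "EXT.NEW"
}
```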