clawdie/zot - Forgejo: Beyond coding. We Forge.

mirror of https://github.com/patriceckhart/zot.git synced 2026-06-26 21:36:31 +02:00

Author	SHA1	Message	Date
patriceckhart	ffca64d4fd	Merge #24 : clamp max_tokens to fit context window (proportional reserve) Fixes OpenRouter rejecting requests where max_tokens equals the served context window. Prefer top_provider.context_length on discovery and clamp max_tokens to ContextWindow minus a proportional reserve (window/8 capped at 4096). Reworked from the original PR so the reserve derives from the window, not MaxOutput: models whose output already fits are untouched and small-window models are not over-penalized. Co-authored-by: Neil-urk12 <neil-urk12@users.noreply.github.com>	2026-06-09 19:29:54 +02:00
patriceckhart	d2fa18270d	fix(provider): clamp max_tokens to fit context window with proportional reserve OpenRouter enforces input + max_output <= served context_length and rejects requests where max_tokens equals the whole window, which happens for models whose catalog MaxOutput is set equal to ContextWindow (e.g. nemotron-3-super-120B). Two parts: - discover.go (from #24): prefer top_provider.context_length when it is smaller than the inflated model-level context_length, so ContextWindow reflects the limit OpenRouter actually serves. - openai.go: clamp max_tokens to ContextWindow minus a reserve. The reserve is derived from the window (window/8, capped at 4096), never from MaxOutput, so models whose output already fits the window are untouched and small-window models (gpt-4) are not over-penalized. Adds buildRequest clamp tests (fits-window no-op, large-window cap, small-window proportional reserve, floor, explicit-request passthrough) and an httptest-based DiscoverOpenRouter test for the served-context preference. Co-authored-by: Neil-urk12 <neil-urk12@users.noreply.github.com>	2026-06-09 19:29:48 +02:00
patriceckhart	b68008327d	Merge remote-tracking branch 'origin/main' into pr-24	2026-06-09 19:22:05 +02:00
patriceckhart	c2c9a5ea28	Merge #23 : request model's full output-token budget per turn Thread the resolved model's catalog MaxOutput through to provider.Request.MaxTokens so each turn requests the model's full output capacity. Fixes Bedrock silently truncating long writes/edits at its 4096 default (stopReason=length). Other providers already defaulted to MaxOutput on a zero request, so this is a no-op for them. Also surfaces StopLength explicitly in the TUI instead of ending silently. Co-authored-by: Raymond Gasper <raymondgasper@fastmail.com>	2026-06-09 18:38:10 +02:00
patriceckhart	a373e82896	style: drop em-dashes from output-token-budget strings/comments Co-authored-by: Raymond Gasper <raymondgasper@fastmail.com>	2026-06-09 18:38:09 +02:00
Raymond Gasper	3cf22fc32b	fix: request model's full output-token budget per turn Turns omitted MaxTokens on the provider request, so Bedrock applied its conservative 4096 default and silently truncated long writes/edits with stopReason=length. In the TUI this read like the interaction timed out. Thread the resolved model's catalog MaxOutput through to the request: catalog Model.MaxOutput -> Resolved.MaxOutput -> Agent.MaxTokens -> provider.Request.MaxTokens Zero still falls back to each provider's own default, so models without a catalog MaxOutput are unaffected. The SDK path inherits this via NewAgent. Also surface StopLength explicitly in the TUI ('response hit the output limit -- ask it to continue') instead of ending silently. Tests: TestAgentPropagatesMaxTokens (Agent.MaxTokens reaches the wire) and TestBedrockBuildRequestMaxTokens (non-zero flows through; zero -> 4096).	2026-06-09 12:24:04 -04:00
Neil Vallecer	bd648be324	refactor: simplify OpenRouter context window selection - collapse if/else-if into a single condition - same behavior (no change in functionality)	2026-06-09 23:45:31 +08:00
Neil Vallecer	1425e68636	fix(provider): clamp max_tokens to fit OpenRouter provider context window - OpenRouter currently rejects requests where input + max_output exceeds the serving provider's context limit (which may be tighter than the model-level value) - use the smaller of ContextWindow and MaxOutput as the cap, with a 4096-token input reserve	2026-06-09 23:25:37 +08:00
patriceckhart	3d031dde26	Merge #21 : repair dangling tool_use on every request, not just load Run repairToolUseResultPairs on outbound messages in oneTurn so an in-process aborted turn (cancel, connection drop, ECONNREFUSED) no longer leaves a dangling tool_use that gets rejected by Anthropic/OpenAI on the next request. Pure and idempotent, no-op on valid transcripts.	2026-06-09 12:58:39 +02:00
patriceckhart	b25b860b09	fix(core): repair dangling tool_use on every request, not just load A turn aborted mid-flight (cancel, connection drop, dev-server ECONNREFUSED) can leave an assistant tool_use block with no matching tool_result in the live transcript. repairToolUseResultPairs already fixes this, but only ran in OpenSession (load time), so an in-process abort left the transcript broken until restart. The next request was then rejected by Anthropic/OpenAI with 'tool_use ids were found without tool_result blocks'. Run the same repair on the outbound messages in oneTurn. It is pure and a no-op on valid transcripts, so there is no hot-path cost beyond a single linear scan and no behavior change for healthy sessions.	2026-06-09 12:56:31 +02:00
patriceckhart	f7bf4a9d41	chore: gofmt spontaneous panel files Co-authored-by: Raymond Gasper <raymondgasper@fastmail.com>	2026-06-08 19:41:28 +02:00
patriceckhart	af6c526cd7	Merge #20 : spontaneous open_panel for extension panels Allow extensions to open panels outside slash-command responses, enabling human-in-the-loop tool gates and secret/input collection patterns. Removes the review-only spontaneous panel plan before merge. Co-authored-by: Raymond Gasper <raymondgasper@fastmail.com>	2026-06-08 19:37:16 +02:00
patriceckhart	e6d8408a4f	chore: remove spontaneous panel review plan Remove the review-only planning document before merging PR #20. Co-authored-by: Raymond Gasper <raymondgasper@fastmail.com>	2026-06-08 19:37:04 +02:00
Raymond Gasper	17fc959c41	docs(examples): remove phase number references from approve/secret READMEs	2026-06-08 12:39:59 -04:00
Raymond Gasper	e9f98b3578	fix(approve): clearer panel focus indicator and unhandled key feedback (#19 ) - Footer now shows '● this panel has focus' so users know keypresses are going to the panel, not the editor - Prompt line uses '› approve this action? [y/n]' cursor glyph - Unhandled keys re-render with '› unrecognised key — press y or n' instead of silently swallowing the input	2026-06-08 12:35:16 -04:00
Raymond Gasper	5ffdafa5d8	docs+examples: spontaneous open_panel docs and approve/secret example extensions (#19 ) - docs/extensions.md: add open_panel spontaneous frame section with blocking tool pattern explanation, concurrent-panel note, and references to new examples; add approve/secret to See also list; add roadmap entry - examples/extensions/approve/: approve_action tool — opens a y/n panel from inside the tool handler, blocks until user responds - examples/extensions/secret/: fetch_with_password tool — masked password input panel, secret never leaves the extension process	2026-06-08 12:18:06 -04:00
Raymond Gasper	2d46ef9b09	feat(panels): spontaneous open_panel frame for human-in-the-loop tool gates (#19 ) Allow extensions to emit an open_panel frame at any time, not just as the action of a command_response. This makes it possible to build approval gates, secret collection, and freeform user-input prompts directly inside tool handlers. Changes: - extproto: add OpenPanelFromExt wire type - extensions/manager: route spontaneous open_panel frames to hooks.OpenPanel - ext/ext.go: add Extension.OpenPanel() SDK method - tests: TestSpontaneousOpenPanel (manager), TestOpenPanelEmitsCorrectFrame, TestBlockingToolWaitsForPanelKey, TestBlockingToolDenied (SDK) - docs/plans: add spontaneous-panel.md design doc The blocking tool pattern (open panel → block on channel → key event → tool_result) requires no additional wire changes; it falls out of standard Go concurrency on the extension side. Part 3 (intercept timeout for built-in tool gating) is out of scope and tracked separately.	2026-06-08 12:13:55 -04:00
patriceckhart	6938d13e90	Merge #18 : bedrock /btw chats fail from invalid toolConfig Inject a stub toolConfig when the message history contains toolUse or toolResult blocks but req.Tools is empty (e.g. the /btw side-chat sends the frozen main transcript). Bedrock's Converse API otherwise rejects the request with HTTP 400. Bedrock-only; other providers unaffected. Co-authored-by: Raymond Gasper <raymondgasper@fastmail.com>	2026-06-08 17:25:02 +02:00
Raymond Gasper	fec5ae0bf1	fix(bedrock): inject stub toolConfig when history has tool blocks Bedrock's Converse API returns HTTP 400 with "toolConfig field must be defined when using toolUse and toolResult content blocks" whenever the message history contains toolUse or toolResult blocks but toolConfig is absent from the request. The /btw side-chat sends the frozen main transcript as context with no tools defined. If the main conversation included tool calls the serialised messages will contain toolUse/toolResult blocks, triggering the 400. Fix: add bedrockMessagesHaveToolBlocks() to detect this case and, when req.Tools is empty but tool blocks are present in the history, inject a minimal stub toolConfig with an inert placeholder tool. Bedrock accepts the request and the stub can never be invoked since no tool_use stop reason can fire when the advertised tool list is effectively empty.	2026-06-08 10:56:03 -04:00
patriceckhart	7eb8a65637	Merge #17 : bedrock prompt caching via cachePoint markers Adds Converse API cachePoint blocks at the system prompt boundary and on the last user message for Bedrock Claude models (PriceCacheWrite > 0), mirroring the Anthropic provider's caching strategy. Nova models are excluded (automatic caching). Co-authored-by: Raymond Gasper <raymondgasper@fastmail.com>	2026-06-08 15:40:01 +02:00
Raymond Gasper	cc03a4c18a	provider/bedrock: add prompt caching via cachePoint markers Place Bedrock Converse API cachePoint blocks at the system prompt boundary and after the last user message on every request to Claude models (those with PriceCacheWrite > 0 in the catalog). This mirrors the existing Anthropic provider strategy (cache_control: ephemeral on system, tools, and last user message) using Bedrock's equivalent syntax: a {"cachePoint":{"type":"default"}} content block appended to the relevant arrays. Changes: - bedrockRequest.System widened from []map[string]string to []map[string]interface{} to accommodate mixed text/cachePoint blocks - bedrockCachePoint: shared sentinel content block var - bedrockModelSupportsCaching: gates on PriceCacheWrite > 0; strips geo prefixes before catalog lookup; falls back to anthropic.claude- prefix check for unknown models (cachePoint is silently ignored by the API if unsupported) - buildRequest: resolves model ID before caching check; injects cachePoint into system array and calls bedrockTagLastUserCache - bedrockTagLastUserCache: appends cachePoint to last user message Nova models (PriceCacheWrite == 0) are excluded — they use Bedrock's automatic caching and don't need explicit markers. Tests: 8 new cases covering model detection, Claude vs Nova presence/ absence, multi-turn last-message targeting, no-system safety, nil/empty panic safety, and JSON wire shape.	2026-06-08 09:28:43 -04:00
patriceckhart	f209a339d0	Merge remote-tracking branch 'origin/main' into pr-6	2026-06-08 15:24:27 +02:00
patriceckhart	956b0a24e2	Merge remote-tracking branch 'origin/main' into pr-11	2026-06-08 15:23:22 +02:00
patriceckhart	eef2714dea	Scan all known providers in credential fallback (adopts #16 )	2026-06-08 15:22:40 +02:00
patriceckhart	323df7f6d3	Discover env-only bedrock in credential fallback scan	2026-06-08 15:17:22 +02:00
Patric Eckhart	ab6d543626	Merge branch 'main' into openrouter-live-models	2026-06-08 07:47:42 +02:00
Patric Eckhart	3bdfea48c3	Merge branch 'main' into feat/issue-templates	2026-06-08 07:47:17 +02:00
patriceckhart	a7ef8c22a1	Respect ollama model baseUrl before default	2026-06-08 07:41:05 +02:00
patriceckhart	7da9114a05	Gate live tool rendering behind preceding stream text	2026-06-07 16:58:39 +02:00
patriceckhart	63e33d9aa9	Add clear_notes extension frame and clear notes on new prompt	2026-06-07 11:10:02 +02:00
patriceckhart	30cff8843d	Respect gitignore when installing extensions	2026-06-07 10:25:50 +02:00
patriceckhart	10fde8fd0e	Fix ext install with relative path source	2026-06-07 10:18:41 +02:00
patriceckhart	84fd98ea74	Normalize Bedrock tool results	2026-06-05 16:05:38 +02:00
Dawid Piotrkowski	88b93da57f	provider: discover OpenRouter models live, drop baked-in catalog	2026-06-05 15:20:54 +02:00
patriceckhart	7a7bf0b52c	Fix Bedrock streaming and provider setup docs	2026-06-05 08:31:54 +02:00
patriceckhart	498b769c07	Add OpenRouter NVIDIA Nemotron Ultra models	2026-06-04 20:39:35 +02:00
patriceckhart	95a36c270e	Remove stale OpenAI Codex model entries	2026-06-04 20:23:32 +02:00
Patric Eckhart	4d6cbeb211	Merge branch 'main' into feat/issue-templates	2026-06-04 19:47:42 +02:00
patriceckhart	4dffc8529b	Word-wrap provider error rows instead of truncating A long provider error (e.g. a Bedrock 400 with an embedded JSON body) was appended as one line and clipped at the terminal edge. Wrap v.Err to the build width via wrapLine (which hard-breaks unbreakable blobs), keep the marker on the first row, and indent continuation rows so the whole message is readable.	2026-06-04 19:25:16 +02:00
patriceckhart	d8976c94df	Send Bedrock inference-profile IDs for on-demand models Newer Bedrock models (Anthropic Claude 4.x, DeepSeek) reject invocation by their bare foundation-model ID with on-demand throughput, demanding a cross-region inference-profile ID instead (HTTP 400). Rewrite such IDs at request time by prepending the region-matched geo prefix (us/eu/apac/us-gov), so selecting anthropic.claude-sonnet-4-5-... in a us-east-1 setup invokes us.anthropic.claude-sonnet-4-5-... Already-prefixed IDs, ARNs, and families that don't need a profile are left untouched, preserving explicit choices and custom application inference profiles.	2026-06-04 19:25:16 +02:00
patriceckhart	4bcdf8804b	Purge VS Code scrollback on clear, overlay close, and resize On VS Code's xterm.js the transcript is taller than the viewport, so an in-place clear (home + erase-to-end) only wipes the visible rows and the scrolled-away part lingers in retained scrollback, stacking a duplicate on the next full repaint. - Clear() (Ctrl+L) now emits \x1b[3J under keepScrollback to actually drop that scrollback, then homes and repaints. Accepts VS Code's viewport-snap since the user explicitly asked for a clean screen. - Overlay close (esc on a dialog, slash/file popup dismissal) now runs the same Clear() so closing a picker purges the stale overlay rows instead of leaving them in scrollback. - Resize() does the same purge under keepScrollback; previously it skipped the wipe and left a half-repainted old-width frame until the user pressed Ctrl+L. Other terminals keep their no-snap clear path.	2026-06-04 19:16:21 +02:00
patriceckhart	cde9298410	Add !command shell escape and fix VS Code terminal repaints - Shell escape: typing "!cmd" runs it via the bash tool's shell in the session cwd, honoring the /jail sandbox. Output is parked below the transcript as a styled terminal-log block until the next prompt or /clear, so it never enters the model conversation. Shares busy state with the agent: esc cancels it and no turn or other escape can start while one is in flight. - VS Code terminal: full repaints used \x1b[2J, which xterm.js scrolls into scrollback and duplicates the frame. Clear in place via cursor home + erase-to-end under keepScrollback; Clear()/Resize() no longer eagerly wipe. Force a viewport-safe Invalidate on slash/file popup open and close transitions there. - Restore the live tool-call overlay behavior (keep in-flight boxes visible until the tool_result reaches the transcript) and drop the forced repaint at turn start. - Document the shell escape in the README.	2026-06-04 18:05:17 +02:00
patriceckhart	ec5eb20ce9	feat(provider): alias common provider names and clarify Bedrock 403 Map short/alternate provider names (bedrock -> amazon-bedrock, vertex, gemini, azure, copilot, codex, ...) to their canonical ids in Resolve so an alias is never treated as unknown and silently downgraded to anthropic. Add a region-aware hint to Bedrock 403 responses on the bearer route.	2026-06-03 18:13:22 +02:00
Neil Vallecer	c7ca5aedfc	feat: add issue templates	2026-06-02 02:41:32 +08:00
patriceckhart	ea58887bfa	Fix login dialog cursor alignment	2026-05-31 13:51:13 +02:00
patriceckhart	917da8c414	Add optional theme background support	2026-05-30 19:01:55 +02:00
patriceckhart	16b95cb974	Retry transient provider stream errors	2026-05-30 15:25:33 +02:00
patriceckhart	dfd25012b6	Add JSON theming, theme-only extensions, and docs - User themes from $ZOT_HOME/themes/*.json with partial overrides (colors, syntax, spinner) and dark/light fallback. - /settings color-theme picker; selection persisted in config.json. - Theme-only extensions: extension.json plus theme.json (or themes/theme.json) load without spawning a subprocess. - write-zot-themes built-in skill and docs/themes.md. - README, extensions docs, and embedded docs index updated.	2026-05-30 11:34:42 +02:00
patriceckhart	ecb3b022cc	fix(provider): deliver tool-result images to the OpenAI Responses route On the openai-codex (Responses API) route a tool result serialized to a string-only function_call_output, dropping ImageBlock content, and the agent loop's tool-image mirror only fired for provider "openai". So images returned by read reached the TUI but never the model, which then correctly reported it received no image content. Extend the mirror to fire for "openai-codex" too (its client already serializes user-message images as input_image, so the bytes arrive), and have the codex tool-result serializer emit a short placeholder for an image-only result instead of an empty output the API may reject. Adds a test covering both behaviors.	2026-05-29 14:21:51 +02:00
patriceckhart	124d679982	feat(provider): adaptive thinking + xhigh effort for Opus 4.7/4.8 Opus 4.7+ only support adaptive thinking: explicit thinking budgets (thinking:{type:enabled,budget_tokens:N}) and non-default sampling params return 400. The Anthropic client now sends thinking:{type: adaptive} plus output_config.effort and omits temperature for these models, while older models keep the budget-based path. Adaptive models are detected via a new Model.AdaptiveThinking flag with an id-substring fallback so the same family reached through an Anthropic-Messages proxy is handled too. For adaptive Anthropic models served over the OpenAI-compatible chat- completions wire (openrouter, opencode, ...), reasoning_effort now maps maximum -> xhigh instead of clamping to high, preserving the model's full reasoning ceiling. Adds AnthropicAdaptiveEffort and OpenAICompatAnthropicEffort with tests.	2026-05-29 14:21:38 +02:00

319 commits
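
The max_tokens clamp described in commits ffca64d4fd and d2fa18270d (reserve = window/8, capped at 4096, derived from the window rather than from MaxOutput) can be sketched roughly as below. This is a minimal illustration, not the code in openai.go: the names reserveFor and clampMaxTokens are hypothetical, treating a non-positive value as "unknown" is an assumption, and the floor and explicit-request passthrough cases mentioned in the commit's tests are not reproduced here.

```go
package main

import "fmt"

// reserveFor (hypothetical name) returns the headroom kept free for input
// tokens: one eighth of the context window, capped at 4096 tokens.
func reserveFor(window int) int {
	reserve := window / 8
	if reserve > 4096 {
		reserve = 4096
	}
	return reserve
}

// clampMaxTokens (hypothetical name) lowers maxTokens only when it would not
// leave the reserve free inside the window. Assumption: a non-positive window
// or maxTokens means "unknown" and is passed through unchanged.
func clampMaxTokens(maxTokens, window int) int {
	if window <= 0 || maxTokens <= 0 {
		return maxTokens
	}
	limit := window - reserveFor(window)
	if maxTokens > limit {
		return limit
	}
	return maxTokens
}

func main() {
	fmt.Println(clampMaxTokens(262144, 262144)) // large window: reserve capped at 4096 -> 258048
	fmt.Println(clampMaxTokens(8192, 8192))     // small window: reserve is 8192/8 = 1024 -> 7168
	fmt.Println(clampMaxTokens(4096, 128000))   // output already fits the window -> 4096
}
```

Deriving the reserve from the window means a request whose max_tokens already leaves enough room is returned untouched, while a small window gives up only one eighth of its size instead of a fixed 4096 tokens.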