zot/packages at v0.2.24 - clawdie/zot

mirror of https://github.com/patriceckhart/zot.git synced 2026-06-26 21:36:31 +02:00

patriceckhart d2fa18270d	2026-06-09 19:29:48 +02:00

fix(provider): clamp max_tokens to fit context window with proportional reserve

OpenRouter enforces input + max_output <= served context_length and rejects requests where max_tokens equals the whole window, which happens for models whose catalog MaxOutput is set equal to ContextWindow (e.g. nemotron-3-super-120B).

Two parts:

- discover.go (from #24): prefer top_provider.context_length when it is smaller than the inflated model-level context_length, so ContextWindow reflects the limit OpenRouter actually serves.
- openai.go: clamp max_tokens to ContextWindow minus a reserve. The reserve is derived from the window (window/8, capped at 4096), never from MaxOutput, so models whose output already fits the window are untouched and small-window models (gpt-4) are not over-penalized.

Adds buildRequest clamp tests (fits-window no-op, large-window cap, small-window proportional reserve, floor, explicit-request passthrough) and an httptest-based DiscoverOpenRouter test for the served-context preference.

Co-authored-by: Neil-urk12 <neil-urk12@users.noreply.github.com>
agent	style: drop em-dashes from output-token-budget strings/comments	2026-06-09 18:38:09 +02:00
core	style: drop em-dashes from output-token-budget strings/comments	2026-06-09 18:38:09 +02:00
provider	fix(provider): clamp max_tokens to fit context window with proportional reserve	2026-06-09 19:29:48 +02:00
tui	Word-wrap provider error rows instead of truncating	2026-06-04 19:25:16 +02:00