fix(anthropic): drop claude-code identity cache marker

oauth requests now exceed anthropic's 4-breakpoint cache_control
limit when the conversation has 2+ user messages. previous layout
emitted 5 markers: identity + system + tools + 2 user messages.

drop the marker on the small claude-code identity line. it's only a
few tokens, and because caching is prefix-based it is still covered
by the later breakpoints whenever the request prefix matches turn
over turn. budget now: system + tools + last 2 user messages = 4. fits.

fixes the user-reported error:
  anthropic: http 400 ... A maximum of 4 blocks with cache_control
  may be provided. Found 5.

verified by sending two consecutive prompts through zot rpc on an
oauth credential -- first turn returns the assistant message
cleanly, second turn does too instead of 400ing.
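
the budget arithmetic above can be sketched as a small standalone check. this is only an illustration, not the code in this repo: `block`, `layout`, `countBreakpoints` and `checkBudget` are hypothetical names, and the only fact taken from the commit is the 4-marker limit quoted in the 400 error.

```go
package main

import "fmt"

// maxCacheBreakpoints is the per-request limit quoted in the 400 error.
const maxCacheBreakpoints = 4

// block is a hypothetical stand-in for any request element that can carry
// a cache_control marker (system block, tools tail, user message).
type block struct {
	Name   string
	Cached bool
}

// countBreakpoints returns how many blocks carry a cache marker.
func countBreakpoints(blocks []block) int {
	n := 0
	for _, b := range blocks {
		if b.Cached {
			n++
		}
	}
	return n
}

// checkBudget returns an error when the layout exceeds the limit.
func checkBudget(blocks []block) error {
	if n := countBreakpoints(blocks); n > maxCacheBreakpoints {
		return fmt.Errorf("%d cache breakpoints, limit is %d", n, maxCacheBreakpoints)
	}
	return nil
}

// layout models an oauth request with two or more user messages.
// cacheIdentity toggles the marker this commit drops.
func layout(cacheIdentity bool) []block {
	return []block{
		{Name: "identity", Cached: cacheIdentity},
		{Name: "system", Cached: true},
		{Name: "tools", Cached: true},
		{Name: "user[n-1]", Cached: true},
		{Name: "user[n]", Cached: true},
	}
}

func main() {
	fmt.Println(checkBudget(layout(true)))  // old layout: 5 markers, over the limit
	fmt.Println(checkBudget(layout(false))) // new layout: 4 markers, within the limit
}
```

a check like this could run before the request is serialized, so an over-budget layout fails locally instead of coming back as an http 400.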
patriceckhart 2026-04-19 12:39:33 +02:00
parent ebc5dad18c
commit 3ff6d9e6b7

@@ -184,11 +184,16 @@ func (c *anthropicClient) buildRequest(req Request) (*anthRequest, error) {
 	// System prompt assembly differs between api-key and OAuth modes.
 	// OAuth requests MUST begin with the Claude Code identity line or
 	// Anthropic rejects them (429 rate_limit_error with zero tokens used).
+	//
+	// Cache budget: anthropic caps cache_control to 4 breakpoints per
+	// request. We spend them on (system prompt) + (tools tail) + (last
+	// two user messages). The claude-code identity line stays uncached
+	// because it's a few tokens and gets folded into the larger prefix
+	// implicitly anyway.
 	if c.oauthTok != "" {
 		out.System = []anthSystemBlock{{
-			Type: "text",
-			Text: claudeCodeIdentity,
-			CacheControl: &anthCacheCtrl{Type: "ephemeral"},
+			Type: "text",
+			Text: claudeCodeIdentity,
 		}}
 		if req.System != "" {
 			out.System = append(out.System, anthSystemBlock{