hermes-bsd

Zeejay f8ba265340 fix(aux): trigger fallback on 429 rate-limit errors in auxiliary client	2026-05-05 10:15:57 -07:00

When a provider returns a 429 rate-limit error (not billing-related), the auxiliary client's call_llm/async_call_llm previously did NOT trigger the fallback chain. This caused auxiliary tasks like session_search to exhaust all 3 retries against the same rate-limited endpoint, losing session metadata that depended on the summarization completing.

Root cause: `_is_payment_error()` only matched 429s containing billing keywords ("credits", "insufficient funds", etc.). Provider-specific rate-limit messages like Nous's "Hold up for a bit, you've exceeded the rate limit on your API key" didn't match, so `_is_payment_error` returned False, `_is_connection_error` returned False, and `should_fallback` was False, so all retries hit the same rate-limited provider.

Fix:
- New `_is_rate_limit_error()` function that detects 429 + rate-limit keywords, generic 429 without billing keywords, and OpenAI SDK `RateLimitError` class instances (which may omit .status_code).
- Updated `should_fallback` in both `call_llm` and `async_call_llm` to include `_is_rate_limit_error`.
- Updated the max_tokens retry path to also check for rate-limit errors.
- Updated the reason string to include "rate limit".

This complements the Nous rate guard (PR #10568), which prevents new calls to Nous when already rate-limited; this fix handles the case where a request is already in flight when the 429 arrives.

Related: #8023, #12554, #11034

Co-authored-by: Zeejay <zjtan1@gmail.com>
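
The detection rules listed in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code in `auxiliary_client.py`: the keyword tuples, the `ProviderError` stand-in class, the public function names, and the connection-error check are all assumptions made for the example. Only the three detection cases (429 plus rate-limit keywords, generic 429 without billing keywords, and a `RateLimitError` class that may lack `.status_code`) come from the commit message.

```python
# Hypothetical keyword lists; the commit message only names
# "credits" and "insufficient funds" as billing examples.
BILLING_KEYWORDS = ("credits", "insufficient funds")
RATE_LIMIT_KEYWORDS = ("rate limit", "rate-limit", "too many requests")


class ProviderError(Exception):
    """Hypothetical stand-in for an SDK error carrying an HTTP status."""

    def __init__(self, message, status_code=None):
        super().__init__(message)
        self.status_code = status_code


def is_payment_error(exc):
    """429 whose message mentions billing (the pre-fix behaviour)."""
    message = str(exc).lower()
    return getattr(exc, "status_code", None) == 429 and any(
        keyword in message for keyword in BILLING_KEYWORDS
    )


def is_rate_limit_error(exc):
    """Sketch of the three cases described in the commit message."""
    # Case 3: an SDK RateLimitError instance, which may omit .status_code.
    # Matching on the class name avoids importing the SDK in this sketch.
    if type(exc).__name__ == "RateLimitError":
        return True
    if getattr(exc, "status_code", None) != 429:
        return False
    message = str(exc).lower()
    # Case 1: 429 with explicit rate-limit wording.
    if any(keyword in message for keyword in RATE_LIMIT_KEYWORDS):
        return True
    # Case 2: generic 429 that carries no billing keywords.
    return not any(keyword in message for keyword in BILLING_KEYWORDS)


def is_connection_error(exc):
    """Assumed check; the real helper's logic is not shown in the source."""
    return isinstance(exc, (ConnectionError, TimeoutError))


def should_fallback(exc):
    """Fall back on billing, connection, or (newly) rate-limit errors."""
    return (
        is_payment_error(exc)
        or is_connection_error(exc)
        or is_rate_limit_error(exc)
    )
```

Under these assumptions, the Nous message quoted in the commit (`ProviderError("Hold up for a bit, you've exceeded the rate limit on your API key", 429)`) is not a payment error but is a rate-limit error, so `should_fallback` returns True and the caller can move on to the next provider instead of retrying the same endpoint.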
..
transports	fix(codex-transport): preserve request override headers for xai responses	2026-05-03 15:25:45 -07:00
__init__.py
account_usage.py
anthropic_adapter.py	fix(anthropic): restrict fast mode to Opus 4.6 (Anthropic API contract)	2026-05-04 06:23:52 -07:00
auxiliary_client.py	fix(aux): trigger fallback on 429 rate-limit errors in auxiliary client	2026-05-05 10:15:57 -07:00
bedrock_adapter.py
codex_responses_adapter.py
context_compressor.py	fix(compaction): mark end of context summary in role=user fallback	2026-05-05 04:51:29 -07:00
context_engine.py
context_references.py
copilot_acp_client.py	fix(ci): stabilize main test suite regressions (#17660)	2026-04-29 23:18:55 -07:00
credential_pool.py	fix: prefer ~/.hermes/.env over os.environ when seeding credential pool	2026-05-02 02:00:32 -07:00
credential_sources.py	feat(minimax-oauth): full integration with peer OAuth providers	2026-04-29 09:53:42 -07:00
curator.py	fix(curator): prevent false-positive consolidation from substring matching	2026-05-04 01:21:23 -07:00
curator_backup.py	fix(curator): authoritative absorbed_into on delete + restore cron skill links on rollback (#18671) (#18731)	2026-05-02 01:29:57 -07:00
display.py	fix(guardrails): preserve display _detect_tool_failure semantics	2026-04-30 20:43:15 -07:00
error_classifier.py	fix(tool-schemas): reactive strip of pattern/format on llama.cpp grammar 400s	2026-05-05 04:25:18 -07:00
file_safety.py
gemini_cloudcode_adapter.py	chore: remove unused imports and dead locals (ruff F401, F841) (#17010)	2026-04-28 06:46:45 -07:00
gemini_native_adapter.py	fix(gemini): extract usageMetadata from streaming chunks for token tracking	2026-05-04 02:33:30 -07:00
gemini_schema.py	chore: remove unused imports and dead locals (ruff F401, F841) (#17010)	2026-04-28 06:46:45 -07:00
google_code_assist.py	chore: remove unused imports and dead locals (ruff F401, F841) (#17010)	2026-04-28 06:46:45 -07:00
google_oauth.py	fix(google_oauth): close TOCTOU window when saving credentials	2026-05-04 03:16:19 -07:00
i18n.py	feat(i18n): add display.language for static message translation (zh/ja/de/es) (#20231)	2026-05-05 08:03:07 -07:00
image_gen_provider.py
image_gen_registry.py
image_routing.py
insights.py
lmstudio_reasoning.py	feat(agent): add lmstudio integration	2026-04-28 12:27:36 -07:00
manual_compression_feedback.py	fix(compression): include system prompt + tool schemas in token estimates (#18265)	2026-04-30 23:03:54 -07:00
memory_manager.py	feat(memory): notify providers on mid-process session_id rotation (#17409)	2026-04-29 04:57:22 -07:00
memory_provider.py	feat(memory): notify providers on mid-process session_id rotation (#17409)	2026-04-29 04:57:22 -07:00
model_metadata.py	fix(context): honor model.context_length for Ollama num_ctx and all display paths	2026-04-30 04:31:23 -07:00
models_dev.py	feat(minimax-oauth): full integration with peer OAuth providers	2026-04-29 09:53:42 -07:00
moonshot_schema.py	fix(moonshot): also strip nullable/enum after anyOf collapse	2026-04-30 23:14:31 -07:00
nous_rate_guard.py
onboarding.py	docs(onboarding): lead OpenClaw residue banner with migrate, warn that cleanup breaks OpenClaw (#17507)	2026-04-29 08:08:36 -07:00
prompt_builder.py	fix: add PLATFORM_HINTS entry for api_server platform	2026-05-05 05:46:16 -07:00
prompt_caching.py
rate_limit_tracker.py
redact.py	fix(redact): add code_file param to skip false-positive ENV/JSON patterns	2026-05-04 04:56:28 -07:00
retry_utils.py
shell_hooks.py
skill_commands.py	fix(skills): rescan skill_commands cache when platform scope changes (#18739)	2026-05-02 01:36:53 -07:00
skill_preprocessing.py
skill_utils.py	fix(skills): exclude .archive from skill index walk	2026-04-30 04:59:22 -07:00
subdirectory_hints.py
think_scrubber.py	fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924) (#20184)	2026-05-05 04:33:38 -07:00
title_generator.py	fix: improve telegram topic mode setup	2026-05-04 12:07:17 -07:00
tool_guardrails.py	fix(guardrails): preserve display _detect_tool_failure semantics	2026-04-30 20:43:15 -07:00
trajectory.py
usage_pricing.py	fix(usage_pricing): add MiniMax-M2.7 pricing for minimax and minimax-cn providers	2026-04-29 04:56:50 -07:00