hermes-bsd/tests/run_agent
Tranquil-Flow b1adb95038 fix(codex): surface actionable hint when stale-call detector fires on known silent-reject pattern
The ChatGPT Codex backend (chatgpt.com/backend-api/codex) has historically
silently dropped certain model requests: the connection is accepted but no
stream events are emitted and no error is raised. PR #31967 lowered the
implicit stale-call default from 300s to 90s so fallbacks kick in faster,
but users still see an opaque "No response from provider for 90s
(non-streaming, ...)" message that gives no path forward.

This patch adds a narrow heuristic — gpt-5.5 family on the Codex backend
via codex_responses api_mode — that substitutes the generic timeout
message with actionable text naming the gpt-5.4-codex workaround and
pointing at #21444 for symptom history.

Changes:

- run_agent.py — new ``AIAgent._codex_silent_hang_hint(model=...)`` method.
  Returns ``None`` for any request that does not match all three guards
  (codex_responses api_mode, openai-codex provider or chatgpt.com Codex
  base URL, gpt-5.5-family model name with word-boundary regex anchoring
  to avoid false-positives on e.g. ``gpt-5.50``).
- agent/chat_completion_helpers.py — the non-stream stale-call site
  consults the hint via ``getattr(...)`` so the call site stays robust
  if the helper is ever removed or stubbed in tests. Hint is appended to
  both the ``_emit_status`` warning and the ``TimeoutError`` message so
  the user sees it in their terminal AND it lands in any retry-loop
  diagnostics.
- tests/run_agent/test_codex_silent_hang_hint.py — 10 regression tests
  covering positive cases (bare gpt-5.5, vendor-prefixed openai/gpt-5.5,
  gpt-5.5-codex SKU, model=None fallback to self.model) and negative
  cases (gpt-5.4-codex workaround, gpt-5.50 false-positive guard,
  non-codex api_mode, non-codex provider, empty/None model, unrelated
  models on Codex).

Does NOT fix the backend-side issue (that's an upstream OpenAI/ChatGPT
problem we cannot patch from here). Only converts an opaque timeout into
text that names the workaround so users do not have to dig through logs
or wait for a forum post to learn what to do.

Closes #22046
2026-05-25 04:49:22 -07:00
..
__init__.py
conftest.py ci(tests): add pytest-timeout 60s hard cap to break suite-teardown deadlock (#28861) 2026-05-19 17:27:24 -07:00
test_413_compression.py test+polish(compression): pin anti-thrash gate and gateway session_id persistence 2026-05-25 01:44:46 -07:00
test_860_dedup.py refactor(gateway): stop writing JSONL in append_to_transcript / rewrite_transcript 2026-05-20 13:00:57 -07:00
test_1630_context_overflow_loop.py
test_31273_402_not_retried.py fix(agent): abort on HTTP 402 after pool rotation and fallback fail (#31443) 2026-05-24 15:14:13 -07:00
test_agent_guardrails.py
test_anthropic_prompt_cache_policy.py fix(cache): kill long-lived prefix layout — system prompt is now byte-static within a session (#24778) 2026-05-12 20:46:04 -07:00
test_anthropic_third_party_oauth_guard.py
test_anthropic_truncation_continuation.py chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
test_api_max_retries_config.py
test_async_httpx_del_neuter.py fix(dashboard): UI polish — modals, layout, consistency, test fixes 2026-05-12 13:59:22 -04:00
test_background_review.py fix(run_agent): isolate background review fork from external memory plugins (#27190) 2026-05-16 20:33:38 -07:00
test_background_review_cache_parity.py chore: trim verbose comments/docstrings, add AUTHOR_MAP entry 2026-05-21 12:49:21 +05:30
test_background_review_summary.py
test_background_review_toolset_restriction.py chore: trim verbose comments/docstrings, add AUTHOR_MAP entry 2026-05-21 12:49:21 +05:30
test_callable_api_key.py feat(azure-foundry): add Microsoft Entra ID auth 2026-05-18 10:14:38 -07:00
test_codex_app_server_integration.py fix(codex-runtime): retire wedged sessions + post-tool watchdog + OAuth refresh classify (#25769) 2026-05-14 07:55:09 -07:00
test_codex_multimodal_tool_result.py
test_codex_silent_hang_hint.py fix(codex): surface actionable hint when stale-call detector fires on known silent-reject pattern 2026-05-25 04:49:22 -07:00
test_codex_xai_oauth_recovery.py test(xai-oauth): regression coverage for the bad-credentials disambiguator (#29344) 2026-05-23 02:48:13 -07:00
test_commit_memory_session_context_engine.py
test_compress_focus_plugin_fallback.py
test_compression_boundary.py
test_compression_boundary_hook.py fix(tests): catch up six stale tests after compression/aux/kanban changes (#28465) 2026-05-18 21:43:59 -07:00
test_compression_feasibility.py perf(compression): defer feasibility check to first compression attempt (#28957) 2026-05-19 17:27:17 -07:00
test_compression_persistence.py
test_compression_trigger_excludes_reasoning.py
test_compressor_fallback_update.py
test_concurrent_interrupt.py
test_context_token_tracking.py refactor(session-log): delete _save_session_log and all callers 2026-05-20 11:44:10 -07:00
test_copilot_native_vision_headers.py
test_create_openai_client_kwargs_isolation.py
test_create_openai_client_proxy_env.py
test_create_openai_client_reuse.py fix(force_close_tcp_sockets): shutdown only, do not release FD (#29507) 2026-05-23 02:31:10 -07:00
test_deepseek_reasoning_content_echo.py
test_deepseek_v4_thinking_live.py
test_dict_tool_call_args.py
test_empty_response_recovery_persistence.py refactor(session-log): delete _save_session_log and all callers 2026-05-20 11:44:10 -07:00
test_exit_cleanup_interrupt.py
test_file_mutation_verifier.py fix: classify landed file mutations with diagnostics 2026-05-13 06:46:23 -07:00
test_image_rejection_fallback.py fix(agent): catch ChatGPT-account Codex data-URL rejection so images are stripped instead of cascading to compression (#23602) 2026-05-11 07:37:22 -07:00
test_image_shrink_recovery.py
test_init_fallback_on_exhausted_pool.py
test_interactive_interrupt.py
test_interrupt_propagation.py
test_invalid_context_length_warning.py
test_iteration_budget_race.py
test_jsondecodeerror_retryable.py refactor(run_agent): review fixes — keyword-forward __init__, drop dead code, tighten guards 2026-05-16 22:55:49 -07:00
test_last_reasoning_per_turn.py
test_long_context_tier_429.py
test_materialize_data_url_cleanup.py fix(misc): three small defensive fixes from PR #1974 2026-05-10 22:28:01 -07:00
test_memory_nudge_counter_hydration.py refactor(run_agent): review fixes — keyword-forward __init__, drop dead code, tighten guards 2026-05-16 22:55:49 -07:00
test_memory_provider_init.py
test_memory_sync_interrupted.py
test_message_sequence_repair.py
test_multimodal_tool_content_recovery.py fix(agent): recover from providers rejecting list-type tool content (#27344) (#30259) 2026-05-21 23:40:16 -07:00
test_openai_client_lifecycle.py fix(codex): size and propagate timeouts for Responses-API requests; lower stale defaults 2026-05-25 01:47:55 -07:00
test_partial_stream_finish_reason.py test(streaming): pin partial-stream-stub finish_reason + continuation contract 2026-05-24 04:35:15 -07:00
test_percentage_clamp.py
test_plugin_context_engine_init.py fix(compressor): ABC compliance — total_tokens, api_mode, logger consistency 2026-05-23 17:38:19 -07:00
test_primary_runtime_restore.py fix(agent): reset _fallback_index at turn start even when no fallback activated 2026-05-16 17:12:48 -07:00
test_provider_attribution_headers.py feat(nvidia): add NIM billing origin header 2026-05-15 14:06:51 -07:00
test_provider_fallback.py
test_provider_parity.py fix(tests): stabilize xai env and provider parity 2026-05-17 11:55:25 -07:00
test_real_interrupt_subagent.py
test_redirect_stdout_issue.py
test_repair_tool_call_arguments.py
test_repair_tool_call_name.py
test_review_prompt_class_first.py fix(review): tell background reviewer not to capture transient env failures as skills (#23004) 2026-05-09 22:51:25 -07:00
test_run_agent.py fix(security): redact credentials before persistence in session capture 2026-05-24 17:58:25 -07:00
test_run_agent_codex_responses.py fix(codex): size and propagate timeouts for Responses-API requests; lower stale defaults 2026-05-25 01:47:55 -07:00
test_run_agent_multimodal_prologue.py
test_sequential_chats_live.py
test_session_id_env.py feat: expose HERMES_SESSION_ID to agent tools via ContextVar + env (#23847) 2026-05-12 00:16:45 +05:30
test_session_meta_filtering.py
test_session_reset_fix.py
test_steer.py
test_stream_drop_logging.py feat(stream-retry): add upstream + timing diagnostics to drop log (#23005) 2026-05-09 22:49:35 -07:00
test_stream_interrupt_retry.py
test_streaming.py fix(xai): surface provider 'error' SSE frame in Codex fallback stream (#27184) 2026-05-16 17:09:41 -07:00
test_streaming_tool_call_repair.py chore: remove Atropos RL environments and tinker-atropos integration (#26106) 2026-05-15 10:36:38 +05:30
test_strict_api_validation.py
test_strip_reasoning_tags_cli.py
test_switch_model_context.py test(ci): stabilize shared optional dependency baselines 2026-05-13 17:32:22 -07:00
test_switch_model_fallback_prune.py
test_thinking_only_sanitizer.py
test_tls_fd_recycle_corruption.py test(tls-fd-recycle): pin shutdown-only + thread-aware close contract (#29507) 2026-05-23 02:31:10 -07:00
test_token_persistence_non_cli.py
test_tool_arg_coercion.py
test_tool_call_args_sanitizer.py ci(tests): add pytest-timeout 60s hard cap to break suite-teardown deadlock (#28861) 2026-05-19 17:27:24 -07:00
test_tool_call_guardrail_runtime.py test(guardrail): assert halt message reaches stream_delta_callback 2026-05-24 07:38:24 -07:00
test_tool_executor_contextvar_propagation.py refactor(run_agent): extract tool execution to agent/tool_executor.py 2026-05-16 18:24:05 -07:00
test_tool_name_db_persistence.py fix(agent): set tool_name on tool-result messages at construction time 2026-05-19 20:49:11 +01:00
test_unicode_ascii_codec.py
test_vision_aware_preprocessing.py fix(agent): resolve supports_vision override for named custom providers 2026-05-20 23:27:10 -07:00