hermes-bsd/tests
Teknium d6785dc4d4
fix: empty response recovery for reasoning models (mimo, qwen, GLM) (#8609)
Three fixes for the (empty) response bug affecting open reasoning models:

1. Allow retries after prefill exhaustion — models like mimo-v2-pro always
   populate reasoning fields via OpenRouter, so the old 'not _has_structured'
   guard on the retry path blocked retries for EVERY reasoning model after
   the 2 prefill attempts.  Now: 2 prefills + 3 retries = 6 total attempts
   before (empty).

2. Reset prefill/retry counters on tool-call recovery — the counters
   accumulated across the entire conversation, never resetting during
   tool-calling turns.  A model cycling empty→prefill→tools→empty burned
   both prefill attempts and the third empty got zero recovery.  Now
   counters reset when prefill succeeds with tool calls.

3. Strip think blocks before _truly_empty check — inline <think> content
   made the string non-empty, skipping both retry paths.

Reported by users on Telegram with xiaomi/mimo-v2-pro and qwen3.5 models.
Reproduced: qwen3.5-9b emits tool calls as XML in reasoning field instead
of proper function calls, causing content=None + tool_calls=None + reasoning
with embedded <tool_call> XML.  Prefill recovery works but counter
accumulation caused permanent (empty) in long sessions.
2026-04-12 15:38:11 -07:00
..
acp fix(acp): declare session load and resume capabilities in initialize response (#6985) 2026-04-10 03:45:36 -07:00
agent feat: add WSL environment hint to system prompt (#8285) 2026-04-12 02:26:28 -07:00
cli fix(cli): restore stacked tool progress scrollback in TUI (#8201) 2026-04-11 23:22:34 -07:00
cron feat(cron): support Discord thread_id in deliver targets 2026-04-10 03:20:05 -07:00
e2e refactor: extract shared helpers to deduplicate repeated code patterns (#7917) 2026-04-11 13:59:52 -07:00
environments/benchmarks
fakes
gateway fix: salvage follow-ups for Feishu QR onboarding (#7706) 2026-04-12 13:05:56 -07:00
hermes_cli fix: improve profile creation UX — seed SOUL.md + credential warning (#8553) 2026-04-12 12:22:34 -07:00
honcho_plugin feat(honcho): add opt-in initOnSessionStart for tools mode and respect explicit peerName (#6995) 2026-04-11 00:43:27 -07:00
integration
plugins feat(hindsight): feature parity, setup wizard, and config improvements 2026-04-08 23:54:15 -07:00
run_agent fix: empty response recovery for reasoning models (mimo, qwen, GLM) (#8609) 2026-04-12 15:38:11 -07:00
skills fix(migration): don't auto-archive OpenClaw source directory 2026-04-12 00:33:54 -07:00
tools fix: suppress duplicate completion notifications when agent already consumed output via wait/poll/log (#8228) 2026-04-12 00:36:22 -07:00
__init__.py
conftest.py fix(tests): fix several failing/flaky tests on main (#6777) 2026-04-09 13:17:06 -07:00
run_interrupt_test.py
test_batch_runner_checkpoint.py
test_cli_file_drop.py fix(gateway): reject file paths in get_command() + file-drop tests (#7356) 2026-04-10 13:06:02 -07:00
test_cli_skin_integration.py fix: CLI/UX batch — ChatConsole errors, curses scroll, skin-aware banner, git state banner (#5974) 2026-04-07 17:59:42 -07:00
test_ctx_halving_fix.py fix(compaction): don't halve context_length on output-cap-too-large errors 2026-04-09 11:27:41 -07:00
test_empty_model_fallback.py fix: fall back to provider's default model when model config is empty (#8303) 2026-04-12 03:53:30 -07:00
test_evidence_store.py
test_hermes_constants.py fix: profile paths broken in Docker — profiles go to /root/.hermes instead of mounted volume (#7170) 2026-04-10 05:53:10 -07:00
test_hermes_logging.py feat: component-separated logging with session context and filtering (#7991) 2026-04-11 17:23:36 -07:00
test_hermes_state.py fix(state): orphan children instead of cascade-deleting in prune/delete (#6513) 2026-04-09 02:41:56 -07:00
test_honcho_client_config.py
test_ipv4_preference.py feat: add network.force_ipv4 config to fix IPv6 timeout issues (#8196) 2026-04-11 23:12:11 -07:00
test_mcp_serve.py
test_minisweagent_path.py
test_model_picker_scroll.py fix: CLI/UX batch — ChatConsole errors, curses scroll, skin-aware banner, git state banner (#5974) 2026-04-07 17:59:42 -07:00
test_model_tools.py
test_model_tools_async_bridge.py
test_ollama_num_ctx.py fix: provider/model resolution — salvage 4 PRs + MiniMax aux URL fix (#5983) 2026-04-07 22:23:28 -07:00
test_packaging_metadata.py
test_project_metadata.py refactor(matrix): swap matrix-nio for mautrix-python dependency 2026-04-10 21:15:59 -07:00
test_retry_utils.py feat(agent): add jittered retry backoff 2026-04-08 00:41:36 -07:00
test_sql_injection.py
test_subprocess_home_isolation.py fix: per-profile subprocess HOME isolation (#4426) (#7357) 2026-04-10 13:37:45 -07:00
test_timezone.py fix: remove 115 verified dead code symbols across 46 production files 2026-04-10 03:44:43 -07:00
test_toolset_distributions.py
test_toolsets.py
test_trajectory_compressor.py
test_trajectory_compressor_async.py
test_utils_truthy_values.py