hermes-bsd/tests
Teknium 80a676658c fix(cli): surface self-improvement review summaries from bg thread
When the self-improvement background review fires after a turn, it runs
in a bg thread and emits a '  💾 <summary>' line to announce what it
saved to memory or skills. Two problems made this invisible to users
even when the review successfully modified a skill:

1. The print went through `_cprint` (prompt_toolkit's print_formatted_text)
   on a bg thread while the CLI's PromptSession was live. Direct
   print_formatted_text races with the input-area redraw and the line
   can land behind/above the prompt, scrolled off without the user
   seeing it.

2. The message said only '💾 Skill created.' / '💾 Memory updated'
   with no indication that the self-improvement loop was the one doing
   this. Users who did catch the line couldn't tell the background
   review from some other agent action.

Fixes:

- `_cprint` now detects when it's called from a non-app thread with a
  running prompt_toolkit Application, and routes through
  `run_in_terminal` via `loop.call_soon_threadsafe`. That pauses the
  input, prints the line above the prompt, and redraws — the normal
  prompt_toolkit contract for bg-thread output. Direct-print fallback
  preserved for the no-app / same-thread / import-error paths. Affects
  every bg-thread emission, not just the review summary (curator
  summaries and auxiliary failure prints benefit too).

- The summary now reads '  💾 Self-improvement review: <summary>' in
  both the CLI and the gateway `background_review_callback` path, so
  the origin is unambiguous.

Tests:
- New `tests/cli/test_cprint_bg_thread.py` covers all five routing
  branches (no app, app-not-running, cross-thread schedule, same-thread
  direct, app-loop-attribute-error, import-error).
- New case in `tests/run_agent/test_background_review.py` asserts the
  attributed prefix shows up in both `_safe_print` and
  `background_review_callback`.

Live E2E: exercised _cprint from a bg thread inside a real Application
event loop; confirmed get_app_or_none() sees the app, call_soon_threadsafe
schedules run_in_terminal, and the inner _pt_print runs.
2026-04-30 14:07:22 -07:00
..
acp test(acp): accept prompt persistence kwargs in mocks 2026-04-30 10:47:23 -07:00
acp_adapter fix(acp): advertise and forward image prompts 2026-04-30 10:31:16 -07:00
agent fix(deepseek): preserve v4 reasoning_content on replay 2026-04-30 11:18:39 -07:00
cli fix(cli): surface self-improvement review summaries from bg thread 2026-04-30 14:07:22 -07:00
cron fix(cron): surface agent run_conversation failure flags as job failure 2026-04-30 03:27:37 -07:00
e2e
environments/benchmarks
fakes
gateway test(gateway): pin cleanup invariants for #17758 in-band drain hand-off 2026-04-30 05:00:25 -07:00
hermes_cli feat(kanban): durable multi-profile collaboration board (#17805) 2026-04-30 13:36:47 -07:00
hermes_state
honcho_plugin
integration
openviking_plugin fix(openviking): pre-check fs/stat to route file URIs before hitting directory-only endpoints 2026-04-30 02:35:29 -07:00
plugins feat(kanban): durable multi-profile collaboration board (#17805) 2026-04-30 13:36:47 -07:00
run_agent fix(cli): surface self-improvement review summaries from bg thread 2026-04-30 14:07:22 -07:00
skills
stress feat(kanban): durable multi-profile collaboration board (#17805) 2026-04-30 13:36:47 -07:00
tools feat(kanban): durable multi-profile collaboration board (#17805) 2026-04-30 13:36:47 -07:00
tui_gateway fix(tui): responsive /compress with live progress + CLI-parity feedback (#17661) 2026-04-29 18:01:18 -07:00
website
__init__.py
conftest.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
run_interrupt_test.py
test_account_usage.py
test_atomic_replace_symlinks.py
test_base_url_hostname.py
test_batch_runner_checkpoint.py
test_cli_file_drop.py
test_cli_manual_compress.py
test_cli_skin_integration.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_ctx_halving_fix.py
test_empty_model_fallback.py
test_evidence_store.py
test_get_tool_definitions_cache_isolation.py fix(tools): isolate get_tool_definitions quiet_mode cache + dedup LCM injection (#17335) 2026-04-30 04:32:06 -07:00
test_hermes_constants.py
test_hermes_logging.py
test_hermes_state.py
test_honcho_client_config.py
test_install_sh_setup_wizard_tty_probe.py
test_ipv4_preference.py
test_mcp_serve.py
test_mini_swe_runner.py
test_minimax_model_validation.py
test_minimax_oauth.py test(cli): cover minimax-oauth resolution, refresh, menu wiring 2026-04-29 09:53:42 -07:00
test_minisweagent_path.py
test_model_picker_scroll.py
test_model_tools.py fix(plugins): stop firing pre_tool_call hook twice per tool execution (#17611) 2026-04-29 12:43:39 -07:00
test_model_tools_async_bridge.py
test_ollama_num_ctx.py
test_packaging_metadata.py
test_plugin_skills.py
test_project_metadata.py
test_retry_utils.py
test_sql_injection.py
test_subprocess_home_isolation.py
test_timezone.py
test_toolset_distributions.py
test_toolsets.py
test_trajectory_compressor.py
test_trajectory_compressor_async.py
test_transform_tool_result_hook.py
test_tui_gateway_server.py fix(tui): respect max turns config 2026-04-30 12:26:57 -05:00
test_utils_truthy_values.py
test_yuanbao_integration.py
test_yuanbao_markdown.py
test_yuanbao_pipeline.py
test_yuanbao_proto.py