hermes-bsd/tests/tools
Chris Danis 28f4d6db63 fix(tool-schemas): reactive strip of pattern/format on llama.cpp grammar 400s
MCP servers commonly emit JSON Schema `pattern` (e.g. `\\d{4}-\\d{2}-\\d{2}`
for date-time params) and `format` keywords. llama.cpp's
`json-schema-to-grammar` converter rejects regex escape classes
(\\d/\\w/\\s) and most format values, returning HTTP 400
"parse: error parsing grammar: unknown escape at \\d" — the whole request
fails.

Cloud providers (OpenAI, Anthropic, OpenRouter, Gemini) accept these
keywords fine and use them as prompting hints. Stripping unconditionally
loses useful hints for every cloud user to fix a llama.cpp-only bug.

Approach: classify the llama.cpp grammar-parse 400 in the error
classifier, and on match do a one-shot in-place strip of pattern/format
from `self.tools`, then retry. Follows the existing
`thinking_signature` recovery pattern. Cloud users hit zero overhead;
llama.cpp users pay one failed request per session.

Changes
- agent/error_classifier.py: new `FailoverReason.llama_cpp_grammar_pattern`
  + narrow HTTP-400 branch matching "error parsing grammar",
  "json-schema-to-grammar", or "unable to generate parser ... template".
- tools/schema_sanitizer.py: new `strip_pattern_and_format()` helper —
  reactive, walks schema nodes, skips property names (search_files.pattern
  survives). Returns strip count for logging.
- run_agent.py: new one-shot recovery block in the retry loop. Strips,
  logs, continues. Falls through to normal retry if nothing to strip.
- tests: 4 classifier tests (3 variants + 1 non-400 negative), 7 strip
  tests including the property-name preservation and idempotency checks.

Co-authored-by: Chris Danis <cdanis@gmail.com>
2026-05-05 04:25:18 -07:00
..
__init__.py
test_accretion_caps.py fix(ci): recover 38 failing tests on main (#17642) 2026-04-29 20:05:32 -07:00
test_ansi_strip.py
test_approval.py fix(approval): close remaining prompt_toolkit deadlock vectors (#15216) 2026-04-27 06:42:32 -07:00
test_approval_heartbeat.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_approval_plugin_hooks.py feat(plugins): add pre_approval_request / post_approval_response hooks (#16776) 2026-04-27 20:08:33 -07:00
test_base_environment.py fix(env): pass -- to cd for hyphen-prefixed workdirs 2026-05-04 04:45:03 -07:00
test_browser_camofox.py fix(tests): resolve 17 persistent CI test failures (#15084) 2026-04-24 03:46:46 -07:00
test_browser_camofox_persistence.py docs: remove nonexistent CAMOFOX_PROFILE_DIR env var references (#10976) 2026-04-16 04:07:11 -07:00
test_browser_camofox_state.py test: stop testing mutable data — convert change-detectors to invariants (#13363) 2026-04-20 23:20:33 -07:00
test_browser_cdp_override.py Support browser CDP URL from config 2026-04-17 16:05:04 -07:00
test_browser_cdp_tool.py fix(tests): resolve 17 persistent CI test failures (#15084) 2026-04-24 03:46:46 -07:00
test_browser_chromium_check.py fix(browser): detect missing Chromium and fail fast with actionable error (#17039) 2026-04-28 07:03:44 -07:00
test_browser_cleanup.py
test_browser_cloud_fallback.py fix(browser): runtime fallback to local Chromium when cloud provider fails 2026-04-16 04:19:34 -07:00
test_browser_console.py fix(browser): honor auxiliary.vision.temperature for screenshot analysis\n\n- mirror the vision tool's config bridge in browser_vision 2026-04-20 00:32:09 -07:00
test_browser_content_none_guard.py
test_browser_hardening.py
test_browser_homebrew_paths.py fix(browser): allow CDP override to pass requirement checks 2026-05-04 03:12:30 -07:00
test_browser_hybrid_routing.py feat(browser): auto-spawn local Chromium for LAN/localhost URLs in cloud mode (#16136) 2026-04-26 09:57:58 -07:00
test_browser_orphan_reaper.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_browser_secret_exfil.py
test_browser_ssrf_local.py fix(security): treat quoted false as false in browser SSRF guards 2026-04-26 18:27:13 -07:00
test_browser_supervisor.py feat(browser): CDP supervisor — dialog detection + response + cross-origin iframe eval (#14540) 2026-04-23 22:23:37 -07:00
test_browser_supervisor_healthcheck.py test(browser_supervisor): cover cache-hit healthcheck on dead thread/loop 2026-04-30 20:33:33 -07:00
test_budget_config.py
test_checkpoint_manager.py feat(checkpoints): auto-prune orphan and stale shadow repos at startup (#16303) 2026-04-26 19:05:52 -07:00
test_clarify_tool.py
test_clipboard.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_code_execution.py fix(tools): serialize concurrent hermes_tools RPC calls from execute_code 2026-04-30 03:31:16 -07:00
test_code_execution_modes.py feat(execute_code): add project/strict execution modes, default to project (#11971) 2026-04-18 01:46:25 -07:00
test_command_guards.py feat: add Vercel Sandbox backend 2026-04-29 07:22:33 -07:00
test_config_null_guard.py
test_credential_files.py
test_credential_pool_env_fallback.py fix(auth): hoist get_env_value import + strengthen .env fallback tests 2026-04-26 08:32:09 -07:00
test_cron_approval_mode.py feat(approval): hardline blocklist for unrecoverable commands (#15878) 2026-04-25 22:07:12 -07:00
test_cron_prompt_injection.py
test_cronjob_tools.py fix(cron): accept list-form deliver values so deliver=['telegram'] works (#17456) 2026-04-29 06:35:34 -07:00
test_daytona_environment.py
test_debug_helpers.py
test_delegate.py fix(delegation): honor provider override for subagents 2026-05-04 05:22:35 -07:00
test_delegate_subagent_timeout_diagnostic.py feat(delegate): diagnostic dump when a subagent times out with 0 API calls (#15105) 2026-04-24 04:58:32 -07:00
test_delegate_toolset_scope.py
test_discord_tool.py test(discord_tool): add regression test for per-token capability cache 2026-04-30 20:37:12 -07:00
test_docker_environment.py feat(docker): run container as host user to avoid root-owned bind mounts 2026-04-29 16:16:43 +10:00
test_docker_find.py feat: entry-level Podman support — find_docker() + rootless entrypoint (#10066) 2026-04-14 21:20:37 -07:00
test_dockerfile_pid1_reaping.py chore(salvage): strip duplicated/merge-corrupted blocks from PR #17664 2026-04-29 21:56:51 -07:00
test_env_passthrough.py fix(env_passthrough): reject Hermes provider credentials from skill passthrough (#13523) 2026-04-21 06:14:25 -07:00
test_feishu_tools.py feat: add Feishu document comment intelligent reply with 3-tier access control 2026-04-17 19:04:11 -07:00
test_file_operations.py fix(file-ops): allow file search in hidden roots 2026-05-04 12:37:09 -07:00
test_file_operations_edge_cases.py fix(file_ops): resolve search_files path/line collision for hyphenated numeric filenames 2026-05-04 12:37:47 -07:00
test_file_ops_cwd_tracking.py fix(file-ops): follow terminal env's live cwd in _exec instead of init-time cached cwd (#11912) 2026-04-17 19:26:40 -07:00
test_file_read_guards.py fix(file-tools): escalate to BLOCKED on repeated read_file dedup stubs (#16382) 2026-04-27 00:17:26 -07:00
test_file_staleness.py fix(file_tools): resolve bookkeeping paths against live terminal cwd 2026-04-23 15:11:52 -07:00
test_file_state_registry.py feat(delegate): cross-agent file state coordination for concurrent subagents (#13718) 2026-04-21 16:41:26 -07:00
test_file_sync.py
test_file_sync_back.py fix(ci): stabilize current main test regressions 2026-04-30 06:36:50 -07:00
test_file_sync_perf.py
test_file_tools.py fix(tools): write_file handler now rejects missing 'content'/'path' args instead of silently writing zero-byte files (#19096) 2026-05-03 08:52:41 -07:00
test_file_tools_container_config.py fix(docker): pass docker_mount_cwd_to_workspace and docker_forward_env to container_config in file_tools 2026-04-20 00:58:16 -07:00
test_file_tools_live.py
test_file_write_safety.py
test_force_dangerous_override.py
test_fuzzy_match.py fix(patch): gate 'did you mean?' to no-match + extend to v4a/skill_manage 2026-04-21 02:03:46 -07:00
test_hardline_blocklist.py fix: address self-review findings for Vercel Sandbox salvage 2026-04-29 07:22:33 -07:00
test_heartbeat_stale_thresholds.py test: add unit tests for heartbeat stale threshold increase 2026-05-04 05:08:51 -07:00
test_hidden_dir_filter.py
test_homeassistant_tool.py
test_image_generation.py feat(image-gen): add GPT Image 2 to FAL catalog (#13677) 2026-04-21 13:35:31 -07:00
test_image_generation_env.py Normalize FAL_KEY env handling (ignore whitespace-only values) 2026-04-21 02:04:21 -07:00
test_image_generation_plugin_dispatch.py fix(image-gen): force-refresh plugin providers in long-lived sessions 2026-04-23 03:01:18 -07:00
test_init_session_cwd_respect.py fix(cli): respect terminal.cwd config in local terminal backend 2026-04-28 22:16:08 -07:00
test_interrupt.py fix: resolve remaining 4 CI test failures (#9543) 2026-04-14 02:18:38 -07:00
test_kanban_tools.py revert: auto-subscribe gateway chat on tool-driven kanban_create (#19718) (#19721) 2026-05-04 05:04:01 -07:00
test_llm_content_none_guard.py
test_local_background_child_hang.py fix(environments): use incremental UTF-8 decoder in select-based drain 2026-04-19 11:27:50 -07:00
test_local_env_blocklist.py feat: add Vercel Sandbox backend 2026-04-29 07:22:33 -07:00
test_local_env_cwd_recovery.py fix(local): test root as ancestor candidate; use real pipe for fake stdout 2026-05-04 15:31:47 -07:00
test_local_interrupt_cleanup.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_local_shell_init.py fix(terminal): auto-source ~/.profile and ~/.bash_profile so n/nvm PATH survives (#14534) 2026-04-23 05:15:37 -07:00
test_local_tempdir.py
test_managed_browserbase_and_modal.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
test_managed_media_gateways.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
test_managed_modal_environment.py fix: add activity heartbeats to prevent false gateway inactivity timeouts (#10501) 2026-04-15 13:29:05 -07:00
test_managed_server_tool_support.py
test_managed_tool_gateway.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
test_mcp_circuit_breaker.py test(mcp): add failing tests for circuit-breaker recovery 2026-04-21 05:19:03 -07:00
test_mcp_dynamic_discovery.py fix(ci): recover 38 failing tests on main (#17642) 2026-04-29 20:05:32 -07:00
test_mcp_oauth.py fix(mcp): decouple AnyUrl import from mcp dependency 2026-05-04 04:42:18 -07:00
test_mcp_oauth_bidirectional.py fix(mcp-oauth): bidirectional auth_flow bridge + absolute expires_at (salvage #12025) (#12717) 2026-04-19 16:31:07 -07:00
test_mcp_oauth_cold_load_expiry.py fix(mcp-oauth): bidirectional auth_flow bridge + absolute expires_at (salvage #12025) (#12717) 2026-04-19 16:31:07 -07:00
test_mcp_oauth_integration.py fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383) 2026-04-16 21:57:10 -07:00
test_mcp_oauth_manager.py fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383) 2026-04-16 21:57:10 -07:00
test_mcp_probe.py
test_mcp_reconnect_signal.py fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383) 2026-04-16 21:57:10 -07:00
test_mcp_stability.py fix(cron): reap orphaned MCP stdio subprocesses after each tick 2026-04-26 18:21:20 -07:00
test_mcp_structured_content.py fix(ci): recover 38 failing tests on main (#17642) 2026-04-29 20:05:32 -07:00
test_mcp_tool.py fix(mcp): preserve nullable schema coercion 2026-04-28 04:58:03 -07:00
test_mcp_tool_401_handling.py fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383) 2026-04-16 21:57:10 -07:00
test_mcp_tool_issue_948.py
test_mcp_tool_session_expired.py fix(mcp): reconnect on terminated sessions 2026-05-03 15:23:33 -07:00
test_memory_tool.py
test_memory_tool_import_fallback.py fix(tools): keep memory tool available when fcntl is unavailable 2026-04-14 10:18:05 -07:00
test_mixture_of_agents_tool.py chore(release): map devorun author + convert MoA defaults test to invariant 2026-04-23 15:14:11 -07:00
test_modal_bulk_upload.py
test_modal_sandbox_fixes.py feat: add Vercel Sandbox backend 2026-04-29 07:22:33 -07:00
test_modal_snapshot_isolation.py
test_notify_on_complete.py
test_osv_check.py
test_parse_env_var.py guard terminal_tool import-time env parsing 2026-04-22 14:45:50 -07:00
test_patch_parser.py
test_process_registry.py fix(process): reconcile session.exited against real child exit in poll/wait (#17430) 2026-04-29 04:59:21 -07:00
test_read_loop_detection.py
test_registry.py test(toolsets): include kanban in expected post-#17805 toolset assertions 2026-04-30 19:43:03 -07:00
test_resolve_path.py fix(file_tools): resolve bookkeeping paths against live terminal cwd 2026-04-23 15:11:52 -07:00
test_rl_training_tool.py
test_schema_sanitizer.py fix(tool-schemas): reactive strip of pattern/format on llama.cpp grammar 400s 2026-05-05 04:25:18 -07:00
test_search_hidden_dirs.py
test_send_message_missing_platforms.py fix(send_message): deliver Matrix media via adapter 2026-04-15 17:37:43 -07:00
test_send_message_tool.py feat(gateway/signal): add support for multiple images sending 2026-04-30 04:28:08 -07:00
test_session_search.py fix(session-search): report source from resolved parent, not FTS5 child session (#15909) 2026-05-04 05:07:40 -07:00
test_shared_container_task_id.py feat(terminal): collapse subagent task_ids to shared container (#16177) 2026-04-26 11:55:02 -07:00
test_signal_media.py feat(send_message): add media delivery support for Signal 2026-04-20 13:24:15 -07:00
test_singularity_preflight.py
test_skill_env_passthrough.py
test_skill_improvements.py
test_skill_manager_tool.py fix(curator): only mark agent-created for background-review sediment (#19621) 2026-05-04 02:42:16 -07:00
test_skill_provenance.py fix(curator): only mark agent-created for background-review sediment (#19621) 2026-05-04 02:42:16 -07:00
test_skill_size_limits.py
test_skill_usage.py fix(skills): keep manual skills out of curator 2026-05-04 02:19:28 -07:00
test_skill_view_path_check.py
test_skill_view_traversal.py
test_skills_guard.py feat(skills-guard): gate agent-created scanner on config.skills.guard_agent_created (default off) 2026-04-23 06:20:47 -07:00
test_skills_hub.py test(skills): add bytes-vs-str equivalence and on-disk hash parity tests 2026-05-04 01:28:12 -07:00
test_skills_hub_clawhub.py
test_skills_sync.py feat(skills_sync): surface collision with reset-hint 2026-04-23 05:09:08 -07:00
test_skills_tool.py fix: address self-review findings for Vercel Sandbox salvage 2026-04-29 07:22:33 -07:00
test_slash_confirm.py feat(gateway,cli): confirm /reload-mcp to warn about prompt cache invalidation 2026-04-29 21:56:47 -07:00
test_spotify_client.py refactor(spotify): convert to built-in bundled plugin under plugins/spotify (#15174) 2026-04-24 07:06:11 -07:00
test_ssh_bulk_upload.py test(ssh): update tar pipe assertion for --no-overwrite-dir 2026-04-30 04:32:28 -07:00
test_ssh_environment.py fix(tools): keep SSH ControlMaster socket path under macOS 104-byte limit 2026-04-20 03:07:32 -07:00
test_symlink_prefix_confusion.py
test_sync_back_backends.py fix: harden sync_back — PID-suffix temp path, size cap, lifecycle guards 2026-04-16 19:39:21 -07:00
test_terminal_compound_background.py fix(terminal): rewrite A && B & to A && { B & } to prevent subshell leak 2026-04-19 16:53:11 -07:00
test_terminal_config_env_sync.py feat(docker): run container as host user to avoid root-owned bind mounts 2026-04-29 16:16:43 +10:00
test_terminal_exit_semantics.py
test_terminal_foreground_timeout_cap.py terminal: steer long-lived server commands to background mode 2026-04-19 16:47:20 -07:00
test_terminal_none_command_guard.py
test_terminal_output_transform_hook.py test: stop testing mutable data — convert change-detectors to invariants (#13363) 2026-04-20 23:20:33 -07:00
test_terminal_requirements.py feat: add Vercel Sandbox backend 2026-04-29 07:22:33 -07:00
test_terminal_timeout_output.py
test_terminal_tool.py fix(terminal): skip sudo prompt when local NOPASSWD sudo works 2026-04-30 20:38:09 -07:00
test_terminal_tool_pty_fallback.py
test_terminal_tool_requirements.py feat: add Vercel Sandbox backend 2026-04-29 07:22:33 -07:00
test_threaded_process_handle.py
test_tirith_security.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_todo_tool.py
test_tool_backend_helpers.py fix(cli): coerce use_gateway config flags in tool routing 2026-04-26 19:02:55 -07:00
test_tool_call_parsers.py
test_tool_output_limits.py feat(skills): add design-md skill for Google's DESIGN.md spec (#14876) 2026-04-23 21:51:19 -07:00
test_tool_result_storage.py fix(file-tools): cap read_file result size to prevent context window overflow 2026-05-04 03:14:59 -07:00
test_transcription.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_transcription_dotenv_fallback.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_transcription_tools.py fix(test): add skip marker for transcription tests requiring faster_whisper 2026-05-04 04:41:36 -07:00
test_tts_command_providers.py feat(tts): add Piper as a native local TTS provider (closes #8508) (#17885) 2026-04-30 02:53:20 -07:00
test_tts_dotenv_fallback.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_tts_gemini.py feat(tts): add Google Gemini TTS provider (#11229) 2026-04-16 14:23:16 -07:00
test_tts_kittentts.py feat(tts): complete KittenTTS integration (tools/setup/docs/tests) 2026-04-21 01:28:32 -07:00
test_tts_max_text_length.py fix(tts): use per-provider input-character caps instead of global 4000 (#13743) 2026-04-21 17:49:39 -07:00
test_tts_mistral.py feat(tts): add Piper as a native local TTS provider (closes #8508) (#17885) 2026-04-30 02:53:20 -07:00
test_tts_piper.py feat(tts): add Piper as a native local TTS provider (closes #8508) (#17885) 2026-04-30 02:53:20 -07:00
test_tts_speed.py fix(tts): update MiniMax API endpoint to v1/text_to_speech 2026-05-04 12:36:09 -07:00
test_url_safety.py fix(security): treat quoted false as false in browser SSRF guards 2026-04-26 18:27:13 -07:00
test_vercel_sandbox_environment.py feat: add Vercel Sandbox backend 2026-04-29 07:22:33 -07:00
test_video_analyze.py feat: add video_analyze tool for native video understanding (#19301) 2026-05-04 00:04:36 +05:30
test_vision_tools.py test: cover vision config temperature wiring\n\n- add regression tests for auxiliary.vision.temperature and timeout\n- add bugkill3r to AUTHOR_MAP for the salvaged commit 2026-04-20 00:32:09 -07:00
test_voice_cli_integration.py fix(cli): avoid voice TTS restart race 2026-05-04 01:36:07 -07:00
test_voice_mode.py
test_watch_patterns.py fix(terminal): three-layer defense against watch_patterns notification spam (#15642) 2026-04-25 06:41:58 -07:00
test_web_tools_config.py feat(web): expose search result limit 2026-04-28 02:09:30 -07:00
test_web_tools_tavily.py
test_website_policy.py
test_windows_compat.py
test_write_deny.py fix(tests): resolve 17 persistent CI test failures (#15084) 2026-04-24 03:46:46 -07:00
test_yolo_mode.py fix(approval): harden YOLO mode env parsing against quoted-bool strings 2026-04-30 20:37:37 -07:00
test_zombie_process_cleanup.py fix(tests): resolve 17 persistent CI test failures (#15084) 2026-04-24 03:46:46 -07:00