hermes-bsd/tests/tools
Teknium d3c167b644
fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint (#31290)
* fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint

Adds a soft guard so an agent running under one Hermes profile cannot
silently edit a different profile's skills/plugins/cron/memories.
Three layers:

A. agent/file_safety.classify_cross_profile_target
   Classifies a write target against the active HERMES_HOME. Returns
   a {active_profile, target_profile, area, target_path} dict when the
   path lands in another profile's scoped area. PROFILE_SCOPED_AREAS =
   (skills, plugins, cron, memories). get_cross_profile_warning()
   wraps it into a model-facing error string that names both profiles,
   names the area, and points at the cross_profile=True bypass.

   Defense-in-depth, NOT a security boundary — the terminal tool runs
   as the same OS user and can write any of these paths directly. The
   guard exists to prevent confused-agent corruption, not to stop a
   determined attacker. SECURITY.md §3.2 (terminal-bypass posture)
   still applies.

   Wired into tools/file_tools.write_file_tool and patch_tool with a
   cross_profile=False kwarg. WRITE_FILE_SCHEMA and PATCH_SCHEMA both
   advertise cross_profile so the model can pass it after explicit
   user direction. patch_tool extracts target paths from V4A patch
   bodies before checking (same shape as the existing sensitive-path
   check).
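
   The path-extraction step can be sketched as follows. This is a
   minimal illustration, not the code in tools/file_tools: it assumes
   the V4A file headers take the form "*** Add File: ", "*** Update File: "
   and "*** Delete File: ", and the helper name extract_patch_targets
   is hypothetical.

```python
import re

# Assumption: V4A patch bodies mark each file with one of these header lines.
_V4A_FILE_HEADER = re.compile(
    r"^\*\*\* (?:Add|Update|Delete) File: (.+)$", re.MULTILINE
)


def extract_patch_targets(patch_body: str) -> list:
    """Return every file path named in a V4A-style patch body, in order."""
    return [path.strip() for path in _V4A_FILE_HEADER.findall(patch_body)]


example_patch = """*** Begin Patch
*** Update File: /home/u/.hermes/skills/dev/SKILL.md
@@
-old line
+new line
*** Add File: notes/todo.md
+first line
*** End Patch
"""

targets = extract_patch_targets(example_patch)
```

   Each extracted path would then go through the same cross-profile
   check as a plain write_file target.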

   skill_manage is already scoped to the active profile's SKILLS_DIR
   by construction, so no extra guard wiring is needed there. The
   D-side error message (below) still names other profiles when the
   skill exists elsewhere.
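
   For illustration, the classification in layer A might look roughly
   like the sketch below. It is not the code in agent/file_safety: the
   three-argument signature, the _owner_and_area helper, and the use of
   pure (non-resolving) paths are assumptions made to keep the example
   self-contained. The layout it assumes (default profile at the Hermes
   root, named profiles under profiles/<name>/) is taken from the
   incident paths quoted later in this message.

```python
from pathlib import PurePosixPath
from typing import Optional

PROFILE_SCOPED_AREAS = ("skills", "plugins", "cron", "memories")


def _owner_and_area(path: PurePosixPath, root: PurePosixPath) -> Optional[tuple]:
    """Return (profile, area) if path sits in a profile-scoped area under root."""
    try:
        parts = path.relative_to(root).parts
    except ValueError:
        return None  # not under the Hermes root at all
    # Assumed layout for named profiles: <root>/profiles/<name>/<area>/...
    if len(parts) >= 3 and parts[0] == "profiles" and parts[2] in PROFILE_SCOPED_AREAS:
        return parts[1], parts[2]
    # Assumed layout for the default profile: <root>/<area>/...
    if parts and parts[0] in PROFILE_SCOPED_AREAS:
        return "default", parts[0]
    return None  # e.g. root-level config files


def classify_cross_profile_target(target: str, hermes_home: str, root: str) -> Optional[dict]:
    """Hypothetical signature: describe a write that lands in another profile."""
    root_p = PurePosixPath(root)
    try:
        home_parts = PurePosixPath(hermes_home).relative_to(root_p).parts
    except ValueError:
        return None  # unknown layout: do not guess
    named = len(home_parts) == 2 and home_parts[0] == "profiles"
    active = home_parts[1] if named else "default"
    hit = _owner_and_area(PurePosixPath(target), root_p)
    if hit is None or hit[0] == active:
        return None  # non-Hermes path, unscoped file, or in-profile write
    return {
        "active_profile": active,
        "target_profile": hit[0],
        "area": hit[1],
        "target_path": target,
    }
```

   A real implementation would presumably resolve symlinks and ".."
   segments before comparing; the sketch skips that step.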

B. agent/system_prompt
   One deterministic line near the environment-hints block names the
   active profile and tells the model not to modify another profile's
   skills/plugins/cron/memories without explicit direction. Profile
   name is stable for the lifetime of the AIAgent, so the line is
   prompt-cache-safe.
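
   As a rough sketch of why such a line is cache-safe: if it is a pure
   function of the profile name, it renders identically on every turn.
   The function name profile_hint_line and the exact wording below are
   hypothetical, not the text used in agent/system_prompt.

```python
def profile_hint_line(active_profile: str) -> str:
    """Build the hint from the profile name only, so the output never varies
    within a session (no timestamps, counters, or per-turn state)."""
    return (
        f"Active Hermes profile: {active_profile}. "
        "Do not modify another profile's skills/plugins/cron/memories "
        "(for example under ~/.hermes/profiles/) without explicit user "
        "direction; pass cross_profile=True only when the user asks for it."
    )


first_turn = profile_hint_line("hermes-security")
later_turn = profile_hint_line("hermes-security")
```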

D. tools/skill_manager_tool._skill_not_found_error
   Replaces the bare "Skill 'X' not found." with a message that:
     - names the active profile,
     - searches OTHER profiles' skills dirs for the same name,
     - names the profile(s) where the skill exists and the path,
     - suggests `hermes -p <name>` to switch profiles, or
       cross_profile=True for an explicit edit.

   All 5 "not found" sites in skill_manager_tool (edit, patch, delete,
   write_file, remove_file) now go through the helper.
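
   The helper's behaviour can be approximated as below. This is a
   sketch under stated assumptions, not the code in
   tools/skill_manager_tool: the function takes an explicit mapping of
   profile name to skills directory (hypothetical; the real helper
   presumably discovers these itself), treats a skill as a directory
   named after it, and uses message wording invented for the example.

```python
from pathlib import Path


def skill_not_found_error(name: str, active_profile: str, profile_skill_dirs: dict) -> str:
    """Build a 'not found' message that points at other profiles holding the skill.

    profile_skill_dirs maps profile name -> skills directory (hypothetical input).
    """
    hits = [
        (profile, Path(skills_dir) / name)
        for profile, skills_dir in sorted(profile_skill_dirs.items())
        if profile != active_profile and (Path(skills_dir) / name).is_dir()
    ]
    msg = f"Skill '{name}' not found in active profile '{active_profile}'."
    if not hits:
        # Missing everywhere: fall back to a listing hint.
        return msg + " Use skills_list to see available skills."
    found = "; ".join(f"'{profile}' at {path}" for profile, path in hits)
    return (
        f"{msg} It exists in other profile(s): {found}. "
        "Run `hermes -p <name>` to switch profiles, or pass "
        "cross_profile=True for an explicit edit."
    )
```

   Routing every "not found" site through one function keeps the five
   call sites consistent when the wording changes.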

Reference incident (May 2026): a hermes-security profile session
edited skills under both ~/.hermes/profiles/hermes-security/skills/
AND ~/.hermes/skills/ (the default profile's skills) without
realizing the second path belonged to a different profile. Three of
the four skill files needed manual restoration afterward.

What this PR does NOT do:

  * No hard block. The terminal tool can still touch any of these
    paths with no guard — same posture as the dangerous-command
    approval flow. SECURITY.md §3.2 applies.
  * No regex sweep on terminal commands for cross-profile paths.
    That direction is a Skills-Guard-style arms race (cd + relative
    paths, base64, etc.) and would false-positive on legitimate
    cross-profile reads. Filed as a follow-up.
  * No on-disk path migration. ~/.hermes/skills/ remains the
    default profile's skills dir; this PR is about telling the
    agent about that boundary, not changing the layout.

Tests:
  tests/agent/test_file_safety_cross_profile.py (16 tests)
    - _resolve_active_profile_name covers default/named/failure paths
    - classify_cross_profile_target covers all four scoped areas,
      all three directions (default → named, named → default, named → named),
      non-Hermes paths, and root-level config files
    - get_cross_profile_warning covers in-profile no-op, cross-profile
      message shape, and the defense-in-depth self-documentation

  tests/tools/test_cross_profile_guard.py (12 tests)
    - write_file: in-profile allow, cross-profile block, cross_profile=True
      bypass, non-Hermes pass-through
    - patch: replace-mode block, cross_profile=True bypass, V4A patch
      path extraction
    - skill_manage: error names the other profile (single + multiple),
      missing-everywhere falls back to skills_list hint
    - system prompt: contract-level checks (both branches present,
      cross_profile=True mentioned, ~/.hermes/profiles/ referenced)

All 207 existing tests in file_safety/file_operations/skill_manager
still pass. 10 system-prompt tests still pass.

E2E verified: the exact incident scenario (security profile editing
default's hermes-agent-dev skill) is now blocked with the warning
message; cross_profile=True unblocks.

* fix(code_execution): add cross_profile to write_file/patch stubs

The cross_profile kwarg added to write_file_tool/patch_tool needs to
flow through the execute_code sandbox stubs in _TOOL_STUBS so the
test_stubs_cover_all_schema_params drift test passes. Without this,
scripts running inside execute_code couldn't pass cross_profile=True
through hermes_tools.write_file().
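
A drift check of this kind can be sketched as follows. The schema shape,
the stub function, and the helper missing_stub_params are all
hypothetical stand-ins; the real _TOOL_STUBS structure and
test_stubs_cover_all_schema_params may look quite different.

```python
import inspect

# Hypothetical, trimmed-down schema: only the parameter names matter here.
WRITE_FILE_SCHEMA = {
    "name": "write_file",
    "parameters": {
        "properties": {"path": {}, "content": {}, "cross_profile": {}},
    },
}


def write_file_stub(path, content, cross_profile=False):
    """Hypothetical sandbox stub; a real one would forward the call to the host."""
    return {"path": path, "content": content, "cross_profile": cross_profile}


def stale_write_file_stub(path, content):
    """What the stub would look like before the kwarg was added."""
    return {"path": path, "content": content}


def missing_stub_params(schema: dict, stub) -> set:
    """Return schema parameters that the stub's signature does not accept."""
    schema_params = set(schema["parameters"]["properties"])
    stub_params = set(inspect.signature(stub).parameters)
    return schema_params - stub_params
```

A test built on this would assert that the returned set is empty for
every tool, which is what fails when a schema gains a parameter and the
stub is not updated.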

Caught by CI on PR #31290.
2026-05-24 00:38:17 -07:00
__init__.py
conftest.py test(tools): centralize disable_lazy_stt_install fixture in conftest 2026-05-22 03:33:01 -07:00
test_accretion_caps.py
test_ansi_strip.py
test_approval.py fix(approval): pin 'silence is not consent' contract on timeout/deny (#24912) (#30879) 2026-05-23 02:59:13 -07:00
test_approval_heartbeat.py
test_approval_plugin_hooks.py test: use subprocesses for each test file (#29016) 2026-05-21 16:40:04 +05:30
test_base_environment.py
test_browser_camofox.py
test_browser_camofox_persistence.py
test_browser_camofox_state.py
test_browser_cdp_override.py
test_browser_cdp_tool.py
test_browser_chromium_check.py fix(install): skip browser download when system chromium exists 2026-05-13 22:07:02 -07:00
test_browser_cleanup.py
test_browser_cloud_fallback.py
test_browser_cloud_provider_cache.py
test_browser_console.py
test_browser_content_none_guard.py
test_browser_eval_supervisor_path.py
test_browser_hardening.py
test_browser_homebrew_paths.py chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
test_browser_hybrid_routing.py
test_browser_lightpanda.py
test_browser_orphan_reaper.py fix(browser): use process-tree termination for daemon cleanup 2026-05-23 20:30:29 -07:00
test_browser_secret_exfil.py fix(test): deflake two intermittent CI failures 2026-05-22 19:46:18 -07:00
test_browser_ssrf_local.py
test_browser_supervisor.py test: use subprocesses for each test file (#29016) 2026-05-21 16:40:04 +05:30
test_browser_supervisor_healthcheck.py
test_budget_config.py
test_checkpoint_manager.py
test_clarify_gateway.py fix(gateway): enable text-intercept for multi-choice clarify fallback (#25567) 2026-05-14 07:59:12 -07:00
test_clarify_tool.py
test_clipboard.py fix(clipboard): reject non-png clipboard images when png normalization fails 2026-05-13 22:54:21 -07:00
test_code_execution.py
test_code_execution_modes.py chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
test_code_execution_windows_env.py
test_command_guards.py
test_computer_use.py fix(computer-use): skip capture_after when action failed (ok=False) 2026-05-22 01:19:01 -07:00
test_computer_use_capture_routing.py test(computer_use): end-to-end regression for capture routing (#24015) 2026-05-21 17:38:19 -07:00
test_computer_use_vision_routing.py test(computer_use): cover capture vision-routing helper 2026-05-21 17:38:19 -07:00
test_config_null_guard.py
test_credential_files.py
test_credential_pool_env_fallback.py
test_cron_approval_mode.py
test_cron_prompt_injection.py
test_cronjob_tools.py fix(cron): allow emoji ZWJ sequences in prompts 2026-05-19 00:10:43 -07:00
test_cross_profile_guard.py fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint (#31290) 2026-05-24 00:38:17 -07:00
test_daytona_environment.py
test_debug_helpers.py
test_delegate.py test(delegation): add regression test for runtime missing 'provider' key 2026-05-17 11:40:05 -07:00
test_delegate_composite_toolsets.py
test_delegate_subagent_timeout_diagnostic.py
test_delegate_toolset_scope.py
test_discord_tool.py test: use subprocesses for each test file (#29016) 2026-05-21 16:40:04 +05:30
test_docker_environment.py
test_docker_find.py
test_dockerfile_node_modules_perms.py
test_dockerfile_pid1_reaping.py fix(docker): preload messaging gateway deps 2026-05-17 11:51:46 -07:00
test_env_passthrough.py
test_feishu_tools.py
test_file_operations.py fix(file-safety): also write-deny <root>/control-files in profile mode 2026-05-22 04:32:14 -07:00
test_file_operations_edge_cases.py
test_file_ops_cwd_tracking.py
test_file_read_guards.py
test_file_staleness.py
test_file_state_registry.py
test_file_sync.py
test_file_sync_back.py
test_file_sync_perf.py
test_file_tools.py
test_file_tools_container_config.py
test_file_tools_live.py
test_file_write_safety.py
test_force_dangerous_override.py
test_fuzzy_match.py
test_hardline_blocklist.py
test_heartbeat_stale_thresholds.py
test_hidden_dir_filter.py chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
test_homeassistant_tool.py test: use subprocesses for each test file (#29016) 2026-05-21 16:40:04 +05:30
test_image_generation.py
test_image_generation_env.py feat(image-gen): actionable setup message when no FAL backend is reachable (#26222) 2026-05-15 01:33:13 -07:00
test_image_generation_plugin_dispatch.py
test_init_session_cwd_respect.py
test_interrupt.py
test_kanban_codex_lane_skill.py docs: add kanban codex lane skill 2026-05-18 21:01:14 -07:00
test_kanban_tools.py test: use subprocesses for each test file (#29016) 2026-05-21 16:40:04 +05:30
test_lazy_deps.py fix(update): refresh lazy-installed backends on hermes update (#25766) 2026-05-14 08:03:40 -07:00
test_llm_content_none_guard.py feat(session_search): single-shape tool with discovery, scroll, browse — no LLM (#27590) 2026-05-17 23:28:45 -07:00
test_local_background_child_hang.py
test_local_env_blocklist.py
test_local_env_cwd_recovery.py
test_local_env_windows_msys.py fix(windows): stop spamming cwd-missing + tirith-spawn warnings on every terminal call 2026-05-15 16:25:31 -07:00
test_local_interrupt_cleanup.py
test_local_shell_init.py
test_local_tempdir.py
test_managed_browserbase_and_modal.py fix(browser): self-review pass — dead-import, log levels, future-proofing 2026-05-17 04:04:15 -07:00
test_managed_media_gateways.py
test_managed_modal_environment.py chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
test_managed_tool_gateway.py
test_mcp_cancelled_error_propagation.py chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
test_mcp_circuit_breaker.py
test_mcp_dynamic_discovery.py
test_mcp_empty_error_message.py
test_mcp_image_content.py
test_mcp_invalid_url.py fix(mcp): validate remote URLs up-front with a clear error (#27105) 2026-05-16 13:06:56 -07:00
test_mcp_oauth.py fix(mcp-oauth): print SSH tunnel hint in _redirect_handler 2026-05-17 02:29:37 -07:00
test_mcp_oauth_bidirectional.py
test_mcp_oauth_cold_load_expiry.py
test_mcp_oauth_integration.py
test_mcp_oauth_manager.py
test_mcp_oauth_metadata.py
test_mcp_probe.py fix(async): close unscheduled coroutines in all threadsafe bridges (#26584) 2026-05-15 14:00:01 -07:00
test_mcp_reconnect_signal.py
test_mcp_sse_transport.py
test_mcp_stability.py fix(mcp): use module-level time so test patches do not race background sleepers 2026-05-17 13:33:26 -07:00
test_mcp_structured_content.py fix(async): close unscheduled coroutines in all threadsafe bridges (#26584) 2026-05-15 14:00:01 -07:00
test_mcp_tool.py fix(mcp): prevent parallel-safe prefix collisions 2026-05-17 11:41:26 -07:00
test_mcp_tool_401_handling.py
test_mcp_tool_issue_948.py
test_mcp_tool_session_expired.py
test_mcp_utility_capability_gating.py
test_memory_tool.py fix(memory): guard against external drift in MEMORY.md/USER.md (#26045) (#30877) 2026-05-23 02:51:29 -07:00
test_memory_tool_import_fallback.py
test_memory_tool_schema.py
test_microsoft_graph_auth.py
test_microsoft_graph_client.py
test_mixture_of_agents_tool.py
test_modal_bulk_upload.py
test_modal_sandbox_fixes.py
test_modal_snapshot_isolation.py
test_notify_on_complete.py fix(terminal): warn at call time when background=true runs silently (#31289) 2026-05-23 21:02:14 -07:00
test_osv_check.py
test_parse_env_var.py
test_patch_parser.py fix(lint): skip per-file shell linter when LSP will handle the file (#29054) 2026-05-20 01:46:40 -05:00
test_pr_6656_regressions.py fix(skills): make content_hash filename-sensitive too (symmetric with bundle_content_hash) 2026-05-22 19:59:24 -07:00
test_process_registry.py fix(process_registry): use taskkill /T /F for tree-kill on Windows 2026-05-23 20:30:29 -07:00
test_read_loop_detection.py
test_registry.py test(ci): stabilize shared optional dependency baselines 2026-05-13 17:32:22 -07:00
test_resolve_path.py
test_schema_sanitizer.py fix(xai-responses): strip enum values containing '/' from tool schemas 2026-05-18 10:37:35 -07:00
test_search_hidden_dirs.py
test_send_message_missing_platforms.py
test_send_message_telegram_proxy.py test+release: align send_message mocks for MessageEntity import; map @fonhal 2026-05-18 22:19:50 -07:00
test_send_message_tool.py Fix unsafe gateway media path delivery 2026-05-23 01:40:35 -07:00
test_session_search.py feat(session_search): single-shape tool with discovery, scroll, browse — no LLM (#27590) 2026-05-17 23:28:45 -07:00
test_shared_container_task_id.py
test_signal_media.py
test_singularity_preflight.py chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
test_skill_env_passthrough.py
test_skill_improvements.py
test_skill_manager_tool.py chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
test_skill_provenance.py
test_skill_size_limits.py
test_skill_usage.py
test_skill_view_path_check.py
test_skill_view_traversal.py
test_skills_ast_audit.py refactor(skills): slim AST diagnostic to single entry point 2026-05-23 17:47:26 -07:00
test_skills_guard.py fix(skills_guard): explain why --force is rejected on dangerous verdicts 2026-05-23 02:37:30 -07:00
test_skills_hub.py fix(skills-hub): deduplicate search results by identifier, not name 2026-05-20 15:04:01 -07:00
test_skills_hub_browse_sh.py fix(browse-sh): fetch SKILL.md via /api/skills/{slug}+skillMdUrl 2026-05-19 14:17:38 -07:00
test_skills_hub_clawhub.py
test_skills_sync.py
test_skills_tool.py fix(skills): prune dependency/venv dirs from all skill scanners (#30042) 2026-05-21 14:18:02 -07:00
test_slash_confirm.py
test_spotify_client.py
test_ssh_bulk_upload.py fix(ssh): keep bulk sync extraction scoped to .hermes 2026-05-21 19:17:51 -07:00
test_ssh_environment.py
test_symlink_prefix_confusion.py
test_sync_back_backends.py
test_terminal_compound_background.py
test_terminal_config_env_sync.py
test_terminal_exit_semantics.py
test_terminal_foreground_timeout_cap.py
test_terminal_none_command_guard.py
test_terminal_output_transform_hook.py
test_terminal_requirements.py
test_terminal_task_cwd.py
test_terminal_timeout_output.py
test_terminal_tool.py
test_terminal_tool_pty_fallback.py
test_terminal_tool_requirements.py test: use subprocesses for each test file (#29016) 2026-05-21 16:40:04 +05:30
test_threaded_process_handle.py
test_tirith_security.py test: keep tirith checks hermetic 2026-05-23 02:20:14 -07:00
test_todo_tool.py
test_tool_backend_helpers.py
test_tool_output_limits.py
test_tool_result_storage.py
test_transcription.py test(tools): centralize disable_lazy_stt_install fixture in conftest 2026-05-22 03:33:01 -07:00
test_transcription_dotenv_fallback.py test(tools): centralize disable_lazy_stt_install fixture in conftest 2026-05-22 03:33:01 -07:00
test_transcription_tools.py test(tools): centralize disable_lazy_stt_install fixture in conftest 2026-05-22 03:33:01 -07:00
test_tts_command_providers.py
test_tts_dotenv_fallback.py fix(xai-http): preserve ~/.hermes/.env fallback and XAI_STT_BASE_URL precedence 2026-05-15 12:11:32 -07:00
test_tts_gemini.py
test_tts_kittentts.py test(ci): stabilize shared optional dependency baselines 2026-05-13 17:32:22 -07:00
test_tts_max_text_length.py
test_tts_mistral.py
test_tts_opus_routing.py fix(tts): keep native audio outside Telegram voice delivery 2026-05-18 22:29:45 -07:00
test_tts_piper.py
test_tts_speed.py fix(tts): align MiniMax TTS defaults with current API and add GroupId support 2026-05-13 22:04:28 -07:00
test_tts_xai_speech_tags.py Add opt-in xAI TTS speech tag pauses 2026-05-20 09:22:28 -07:00
test_url_safety.py fix(url_safety): block IPv4-mapped IPv6 addresses to prevent SSRF bypass 2026-05-18 10:51:15 -07:00
test_vercel_sandbox_environment.py
test_video_analyze.py
test_video_generation_dispatch.py feat(video_gen): unified video_generate tool with pluggable provider backends (#25126) 2026-05-13 16:39:41 -07:00
test_video_generation_dynamic_schema.py feat(video_gen): unified video_generate tool with pluggable provider backends (#25126) 2026-05-13 16:39:41 -07:00
test_video_generation_tool_surface_matrix.py test: use subprocesses for each test file (#29016) 2026-05-21 16:40:04 +05:30
test_vision_native_fast_path.py
test_vision_tools.py
test_voice_cli_integration.py fix(voice): chunk oversized CLI recordings 2026-05-21 14:17:39 -07:00
test_voice_mode.py test: keep tirith checks hermetic 2026-05-23 02:20:14 -07:00
test_watch_patterns.py
test_web_providers.py test: use subprocesses for each test file (#29016) 2026-05-21 16:40:04 +05:30
test_web_providers_brave_free.py test: use subprocesses for each test file (#29016) 2026-05-21 16:40:04 +05:30
test_web_providers_ddgs.py test: use subprocesses for each test file (#29016) 2026-05-21 16:40:04 +05:30
test_web_providers_searxng.py test: use subprocesses for each test file (#29016) 2026-05-21 16:40:04 +05:30
test_web_providers_xai.py feat(web): add xAI Web Search provider plugin 2026-05-19 19:27:34 -07:00
test_web_tools_config.py refactor(web): dispatch all three tools through web_search_registry 2026-05-13 22:31:28 -07:00
test_web_tools_tavily.py test: use subprocesses for each test file (#29016) 2026-05-21 16:40:04 +05:30
test_website_policy.py test: use subprocesses for each test file (#29016) 2026-05-21 16:40:04 +05:30
test_windows_compat.py
test_windows_native_support.py
test_write_deny.py test: use subprocesses for each test file (#29016) 2026-05-21 16:40:04 +05:30
test_x_search_tool.py fix(x_search): surface degraded results + validate dates 2026-05-21 02:38:45 +05:30
test_yolo_mode.py
test_zombie_process_cleanup.py ci(tests): add pytest-timeout 60s hard cap to break suite-teardown deadlock (#28861) 2026-05-19 17:27:24 -07:00