hermes-bsd/tools
Teknium 0dee92df22
feat(security): promptware defense — shared threat patterns + memory load-time scan + tool-result delimiters (#32269)
Hardens the context window against Brainworm-class promptware attacks
(see #496). Three changes:

1. tools/threat_patterns.py — single source of truth for injection/promptware
   patterns. Replaces the duplicated pattern lists in prompt_builder.py and
   memory_tool.py. Adds ~15 new Brainworm/C2 patterns (node registration,
   heartbeat/beacon, pull tasking, anti-forensic disk avoidance, identity
   override, known framework names). Three scopes — 'all' (narrow, classic
   injection), 'context' (adds promptware/role-play, broader detection),
   'strict' (adds persistence/SSH-backdoor patterns for user-mediated writes).

2. MemoryStore.load_from_disk() now scans entries at snapshot-build time.
   Poisoned entries are replaced with [BLOCKED: ...] placeholders in the
   frozen system-prompt snapshot. Live state keeps the original so the
   user can still inspect + remove via memory(action=read/remove). Scan is
   deterministic from disk bytes — prefix-cache invariant holds.

3. make_tool_result_message() wraps results from high-risk tools
   (web_extract, web_search, browser_*, mcp_*) in
   <untrusted_tool_result source="...">...</untrusted_tool_result>
   delimiters with framing prose telling the model the content is data,
   not instructions. Architectural defense against indirect injection
   from poisoned web pages, GitHub issues, MCP responses — does NOT
   regex-scan tool results (pattern arms race + per-iteration latency).
   Multimodal content lists pass through unwrapped to preserve adapter
   compatibility.

Pattern philosophy: anchor on C2-specific vocabulary or unambiguous attack
behavior, NOT on bossy English. Dropped patterns suggested in #496 that
would have tripped legitimate content: standalone 'you are obligated to',
'do not respond immediately', 'you must X' without a C2-verb anchor.

Validation:
- 257/257 targeted tests pass (test_threat_patterns + test_memory_tool +
  test_tool_dispatch_helpers + test_prompt_builder)
- E2E run with real Brainworm payload: blocked from AGENTS.md context-file
  path, blocked from MEMORY.md snapshot, wrapped in delimiters when
  arriving via web_extract. Legitimate 'you must follow conventions'
  phrasing not flagged.

Explicitly NOT in this PR (per #496 discussion):
- Per-tool-result regex scanning (pattern arms race)
- SessionBehaviorMonitor / polling-loop detection (wrong layer)
- Outbound network gating (Docker backend already covers this)
- security.context_scanning warn|block knob (current behavior is always
  block-with-placeholder — there's no warn mode that makes sense)

Closes #496 for Phase 1 + the architectural delimiter piece of Phase 2.
Phase 3 stays in tracking issue territory.
2026-05-25 14:52:24 -07:00
..
computer_use fix(computer-use): skip capture_after when action failed (ok=False) 2026-05-22 01:19:01 -07:00
environments feat(docker): remove gosu from bundled image; s6-setuidgid handles privilege drop 2026-05-24 18:05:33 -07:00
neutts_samples
__init__.py
ansi_strip.py
approval.py fix(approval): harden YOLO bypass, LLM parsing, auto-approve audit, pipe pattern (#23835) 2026-05-25 03:35:33 -07:00
binary_extensions.py
browser_camofox.py feat: auto-launch Chromium-family browser for CDP 2026-05-19 22:34:05 -07:00
browser_camofox_state.py
browser_cdp_tool.py feat: auto-launch Chromium-family browser for CDP 2026-05-19 22:34:05 -07:00
browser_dialog_tool.py feat: auto-launch Chromium-family browser for CDP 2026-05-19 22:34:05 -07:00
browser_supervisor.py
browser_tool.py fix(vision): route auxiliary.vision.provider=openai to api.openai.com, skip text-only main (#31452) 2026-05-24 15:01:28 -07:00
budget_config.py
checkpoint_manager.py
clarify_gateway.py
clarify_tool.py
code_execution_tool.py fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint (#31290) 2026-05-24 00:38:17 -07:00
computer_use_tool.py
credential_files.py
cronjob_tools.py fix(cron): allow emoji ZWJ sequences in prompts 2026-05-19 00:10:43 -07:00
debug_helpers.py
delegate_tool.py
discord_tool.py
env_passthrough.py harden(env_passthrough): apply GHSA-rhgp-j443-p4rf filter to config.yaml path (#27794) 2026-05-25 03:35:23 -07:00
fal_common.py refactor(image_gen): port FAL backend to plugins/image_gen/fal 2026-05-22 04:10:45 -07:00
feishu_doc_tool.py
feishu_drive_tool.py
file_operations.py fix(lint): skip per-file shell linter when LSP will handle the file (#29054) 2026-05-20 01:46:40 -05:00
file_state.py
file_tools.py fix: reject read_file symlinks to blocking devices (#10133) 2026-05-25 05:07:38 -07:00
fuzzy_match.py
homeassistant_tool.py
image_generation_tool.py refactor(image_gen): port FAL backend to plugins/image_gen/fal 2026-05-22 04:10:45 -07:00
interrupt.py
kanban_tools.py feat(kanban): stamp originating ACP session_id on tasks 2026-05-18 21:15:21 -07:00
lazy_deps.py
managed_tool_gateway.py
mcp_oauth.py feat(mcp-oauth): accept 'skip' at paste prompt to bypass auth without disabling server (#32069) 2026-05-25 05:37:30 -07:00
mcp_oauth_manager.py
mcp_tool.py fix(mcp): raise ImportError instead of NameError when stdio SDK missing (#31450) 2026-05-24 04:44:59 -07:00
memory_tool.py feat(security): promptware defense — shared threat patterns + memory load-time scan + tool-result delimiters (#32269) 2026-05-25 14:52:24 -07:00
microsoft_graph_auth.py
microsoft_graph_client.py
mixture_of_agents_tool.py
neutts_synth.py
openrouter_client.py
osv_check.py
patch_parser.py fix(lint): skip per-file shell linter when LSP will handle the file (#29054) 2026-05-20 01:46:40 -05:00
path_security.py
process_registry.py feat(cli): show live background terminal-process count in status bar (#32061) 2026-05-25 05:35:02 -07:00
registry.py
schema_sanitizer.py
send_message_tool.py refactor(gateway): migrate Mattermost adapter to bundled plugin 2026-05-24 18:05:33 -07:00
session_search_tool.py
skill_manager_tool.py fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint (#31290) 2026-05-24 00:38:17 -07:00
skill_provenance.py
skill_usage.py fix(skills): prune dependency/venv dirs from all skill scanners (#30042) 2026-05-21 14:18:02 -07:00
skills_ast_audit.py refactor(skills): slim AST diagnostic to single entry point 2026-05-23 17:47:26 -07:00
skills_guard.py Harden Skills Guard multi-word prompt patterns (#26852) 2026-05-25 01:51:27 -07:00
skills_hub.py fix(skills): guard uninstall lock paths 2026-05-25 06:13:36 -07:00
skills_sync.py fix(skills): prune dependency/venv dirs from all skill scanners (#30042) 2026-05-21 14:18:02 -07:00
skills_tool.py fix(skills): prune dependency/venv dirs from all skill scanners (#30042) 2026-05-21 14:18:02 -07:00
slash_confirm.py
terminal_tool.py fix(terminal): warn at call time when background=true runs silently (#31289) 2026-05-23 21:02:14 -07:00
threat_patterns.py feat(security): promptware defense — shared threat patterns + memory load-time scan + tool-result delimiters (#32269) 2026-05-25 14:52:24 -07:00
tirith_security.py
todo_tool.py
tool_backend_helpers.py
tool_output_limits.py
tool_result_storage.py
transcription_tools.py fix(transcription): reject symlinked audio inputs (#10082) 2026-05-25 05:07:45 -07:00
tts_tool.py fix(tts): prevent double [pause] in xAI auto speech tags for multi-paragraph text 2026-05-25 14:30:06 -07:00
url_safety.py
video_generation_tool.py
vision_tools.py fix(vision): route auxiliary.vision.provider=openai to api.openai.com, skip text-only main (#31452) 2026-05-24 15:01:28 -07:00
voice_mode.py fix(voice): chunk oversized CLI recordings 2026-05-21 14:17:39 -07:00
web_tools.py feat(web): add xAI Web Search provider plugin 2026-05-19 19:27:34 -07:00
website_policy.py
x_search_tool.py fix(x_search): surface degraded results + validate dates 2026-05-21 02:38:45 +05:30
xai_http.py feat(web): add xAI Web Search provider plugin 2026-05-19 19:27:34 -07:00
yuanbao_tools.py Fix unsafe gateway media path delivery 2026-05-23 01:40:35 -07:00