hermes-bsd/tools
teknium1 d3ffbc6409 feat(stt): add stt.providers.<name> command-provider registry
Mirror of the TTS command-provider registry (PR #17843) for STT. Lets any
shell-driven ASR engine — Doubao ASR, NVIDIA Parakeet, whisper.cpp builds,
SenseVoice, curl pipelines — become an STT backend with zero Python.
Complements the legacy HERMES_LOCAL_STT_COMMAND escape hatch (preserved
untouched via the built-in local_command path) and the
register_transcription_provider() Python plugin hook also shipped in this
PR.

Resolution order (mirrors TTS exactly):

  1. Built-in (local, local_command, groq, openai, mistral, xai)
     → native handler. Always wins.
  2. stt.providers.<name>: type: command  → command-provider runner.
  3. Plugin-registered TranscriptionProvider → plugin dispatch.
  4. No match → 'No STT provider available'.

Files
-----
- tools/transcription_tools.py: BUILTIN_STT_PROVIDERS frozenset retained;
  added _resolve_command_stt_provider_config, _transcribe_command_stt,
  and local helpers for template rendering, shell-quote context, and
  process-tree termination. Helpers are documented as mirrors of their
  tts_tool.py counterparts (kept local to avoid cross-tool private
  import). Wire-in is one insertion point in transcribe_audio() after
  the xai elif and before the plugin dispatcher. Plugin dispatcher
  additionally defensively short-circuits when a same-name command
  config exists (command-wins-over-plugin invariant).

- tests/tools/test_transcription_command_providers.py: 50 new tests
  covering resolution (builtin precedence, type/command gating,
  case-insensitive lookup, legacy stt.<name> back-compat), helpers
  (timeout fallback, format validation, iter, has-any), template
  rendering (shell-quote contexts, doubled-brace preservation),
  end-to-end via _transcribe_command_stt (output_path read, stdout
  fallback, timeout, nonzero exit envelope, model override,
  language precedence), and dispatcher integration via the real
  transcribe_audio() including command-wins-over-plugin and
  builtin-shadow-rejection.

- tests/plugins/transcription/check_parity_vs_main.py: extended from
  10 to 13 scenarios. New cases: command-provider-installed,
  command-vs-plugin-same-name (verifies command wins precedence),
  explicit-openai-with-command-shadow (verifies built-in wins).
  Adds command_provider dispatch_kind detection via transcript prefix
  (CMD: vs PLUGIN:) so command-provider scenarios can be distinguished
  from plugin scenarios even when sharing a provider name.

- website/docs/user-guide/features/tts.md: new 'STT custom command
  providers' section symmetric to the TTS section — example config,
  placeholder grammar table (input_path / output_path / output_dir /
  format / language / model), transcript-read-back semantics (file
  first, then stdout fallback), optional keys table, behavior notes,
  security note. Updated 'Python plugin providers (STT)' to include
  the new 'When to pick which (STT)' decision table and updated
  resolution-order section (now 4 layers instead of 3).

Verification
------------
189/189 STT targeted tests + 50/50 new command-provider tests pass.
Combined sweep: tests/tools/ 5576/5576, tests/agent/ + tests/hermes_cli/
8623/8623 — zero regressions across 14,199 tests.

Parity harness: 13 scenarios, 9 OK + 4 expected diffs
(no_provider_error → plugin, plugin_unavailable, command_provider × 2).

E2E live-verified in an isolated HERMES_HOME with a real .wav file:

  command:                    → dispatched to stt.providers.my-fake-cli
  plugin:                     → dispatched to registered TranscriptionProvider
  command-wins-over-plugin:   → command provider beats same-name plugin
  builtin-wins-over-command:  → built-in OpenAI handler fires;
                                stt.providers.openai: type: command
                                does NOT hijack it.
2026-05-25 01:41:19 -07:00
..
computer_use fix(computer-use): skip capture_after when action failed (ok=False) 2026-05-22 01:19:01 -07:00
environments feat(docker): remove gosu from bundled image; s6-setuidgid handles privilege drop 2026-05-24 18:05:33 -07:00
neutts_samples
__init__.py
ansi_strip.py
approval.py fix(approval): pin 'silence is not consent' contract on timeout/deny (#24912) (#30879) 2026-05-23 02:59:13 -07:00
binary_extensions.py
browser_camofox.py feat: auto-launch Chromium-family browser for CDP 2026-05-19 22:34:05 -07:00
browser_camofox_state.py
browser_cdp_tool.py feat: auto-launch Chromium-family browser for CDP 2026-05-19 22:34:05 -07:00
browser_dialog_tool.py feat: auto-launch Chromium-family browser for CDP 2026-05-19 22:34:05 -07:00
browser_supervisor.py fix(async): close unscheduled coroutines in all threadsafe bridges (#26584) 2026-05-15 14:00:01 -07:00
browser_tool.py fix(vision): route auxiliary.vision.provider=openai to api.openai.com, skip text-only main (#31452) 2026-05-24 15:01:28 -07:00
budget_config.py chore: remove Atropos RL environments and tinker-atropos integration (#26106) 2026-05-15 10:36:38 +05:30
checkpoint_manager.py
clarify_gateway.py
clarify_tool.py
code_execution_tool.py fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint (#31290) 2026-05-24 00:38:17 -07:00
computer_use_tool.py
credential_files.py
cronjob_tools.py fix(cron): allow emoji ZWJ sequences in prompts 2026-05-19 00:10:43 -07:00
debug_helpers.py
delegate_tool.py fix(delegation): preserve configured_provider name when runtime returns 'custom' 2026-05-17 11:40:05 -07:00
discord_tool.py
env_passthrough.py
fal_common.py refactor(image_gen): port FAL backend to plugins/image_gen/fal 2026-05-22 04:10:45 -07:00
feishu_doc_tool.py
feishu_drive_tool.py
file_operations.py fix(lint): skip per-file shell linter when LSP will handle the file (#29054) 2026-05-20 01:46:40 -05:00
file_state.py
file_tools.py fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint (#31290) 2026-05-24 00:38:17 -07:00
fuzzy_match.py
homeassistant_tool.py
image_generation_tool.py refactor(image_gen): port FAL backend to plugins/image_gen/fal 2026-05-22 04:10:45 -07:00
interrupt.py
kanban_tools.py feat(kanban): stamp originating ACP session_id on tasks 2026-05-18 21:15:21 -07:00
lazy_deps.py feat(azure-foundry): add Microsoft Entra ID auth 2026-05-18 10:14:38 -07:00
managed_tool_gateway.py
mcp_oauth.py fix(security): guard os.chmod(parent) against / and top-level dirs 2026-05-20 22:56:55 -07:00
mcp_oauth_manager.py
mcp_tool.py fix(mcp): raise ImportError instead of NameError when stdio SDK missing (#31450) 2026-05-24 04:44:59 -07:00
memory_tool.py fix(memory): guard against external drift in MEMORY.md/USER.md (#26045) (#30877) 2026-05-23 02:51:29 -07:00
microsoft_graph_auth.py
microsoft_graph_client.py
mixture_of_agents_tool.py
neutts_synth.py
openrouter_client.py
osv_check.py
patch_parser.py fix(lint): skip per-file shell linter when LSP will handle the file (#29054) 2026-05-20 01:46:40 -05:00
path_security.py
process_registry.py fix(process_registry): use taskkill /T /F for tree-kill on Windows 2026-05-23 20:30:29 -07:00
registry.py security: sanitize tool error strings before injecting into model context (#26823) 2026-05-16 00:57:39 -07:00
schema_sanitizer.py fix(xai-responses): strip enum values containing '/' from tool schemas 2026-05-18 10:37:35 -07:00
send_message_tool.py refactor(gateway): migrate Mattermost adapter to bundled plugin 2026-05-24 18:05:33 -07:00
session_search_tool.py feat(session_search): single-shape tool with discovery, scroll, browse — no LLM (#27590) 2026-05-17 23:28:45 -07:00
skill_manager_tool.py fix(profiles): cross-profile soft guard on file-write tools + system-prompt hint (#31290) 2026-05-24 00:38:17 -07:00
skill_provenance.py
skill_usage.py fix(skills): prune dependency/venv dirs from all skill scanners (#30042) 2026-05-21 14:18:02 -07:00
skills_ast_audit.py refactor(skills): slim AST diagnostic to single entry point 2026-05-23 17:47:26 -07:00
skills_guard.py fix(skills_guard): explain why --force is rejected on dangerous verdicts 2026-05-23 02:37:30 -07:00
skills_hub.py fix(skills,pairing): path traversal guard in uninstall, lock list_pending, hash file paths 2026-05-22 19:59:24 -07:00
skills_sync.py fix(skills): prune dependency/venv dirs from all skill scanners (#30042) 2026-05-21 14:18:02 -07:00
skills_tool.py fix(skills): prune dependency/venv dirs from all skill scanners (#30042) 2026-05-21 14:18:02 -07:00
slash_confirm.py fix(async): close unscheduled coroutines in all threadsafe bridges (#26584) 2026-05-15 14:00:01 -07:00
terminal_tool.py fix(terminal): warn at call time when background=true runs silently (#31289) 2026-05-23 21:02:14 -07:00
tirith_security.py fix(tirith): suppress .app lookalike_tld false positives in warn verdicts 2026-05-18 10:20:07 -07:00
todo_tool.py
tool_backend_helpers.py
tool_output_limits.py
tool_result_storage.py
transcription_tools.py feat(stt): add stt.providers.<name> command-provider registry 2026-05-25 01:41:19 -07:00
tts_tool.py feat(tts): add register_tts_provider() plugin hook (closes #30398) 2026-05-24 18:04:54 -07:00
url_safety.py fix(url_safety): block IPv4-mapped IPv6 addresses to prevent SSRF bypass 2026-05-18 10:51:15 -07:00
video_generation_tool.py chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355) 2026-05-17 02:29:41 -07:00
vision_tools.py fix(vision): route auxiliary.vision.provider=openai to api.openai.com, skip text-only main (#31452) 2026-05-24 15:01:28 -07:00
voice_mode.py fix(voice): chunk oversized CLI recordings 2026-05-21 14:17:39 -07:00
web_tools.py feat(web): add xAI Web Search provider plugin 2026-05-19 19:27:34 -07:00
website_policy.py
x_search_tool.py fix(x_search): surface degraded results + validate dates 2026-05-21 02:38:45 +05:30
xai_http.py feat(web): add xAI Web Search provider plugin 2026-05-19 19:27:34 -07:00
yuanbao_tools.py Fix unsafe gateway media path delivery 2026-05-23 01:40:35 -07:00