hermes-bsd/website/docs/reference/tools-reference.md

271 lines
22 KiB
Markdown
Raw Normal View History

---
sidebar_position: 3
title: "Built-in Tools Reference"
description: "Authoritative reference for Hermes built-in tools, grouped by toolset"
---
# Built-in Tools Reference
docs: deep audit — fix stale config keys, missing commands, and registry drift (#22784) * docs: deep audit — fix stale config keys, missing commands, and registry drift Cross-checked ~80 high-impact docs pages (getting-started, reference, top-level user-guide, user-guide/features) against the live registries: hermes_cli/commands.py COMMAND_REGISTRY (slash commands) hermes_cli/auth.py PROVIDER_REGISTRY (providers) hermes_cli/config.py DEFAULT_CONFIG (config keys) toolsets.py TOOLSETS (toolsets) tools/registry.py get_all_tool_names() (tools) python -m hermes_cli.main <subcmd> --help (CLI args) reference/ - cli-commands.md: drop duplicate hermes fallback row + duplicate section, add stepfun/lmstudio to --provider enum, expand auth/mcp/curator subcommand lists to match --help output (status/logout/spotify, login, archive/prune/ list-archived). - slash-commands.md: add missing /sessions and /reload-skills entries + correct the cross-platform Notes line. - tools-reference.md: drop bogus '68 tools' headline, drop fictional 'browser-cdp toolset' (these tools live in 'browser' and are runtime-gated), add missing 'kanban' and 'video' toolset sections, fix MCP example to use the real mcp_<server>_<tool> prefix. - toolsets-reference.md: list browser_cdp/browser_dialog inside the 'browser' row, add missing 'kanban' and 'video' toolset rows, drop the stale '38 tools' count for hermes-cli. - profile-commands.md: add missing install/update/info subcommands, document fish completion. - environment-variables.md: dedupe GMI_API_KEY/GMI_BASE_URL rows (kept the one with the correct gmi-serving.com default). - faq.md: Anthropic/Google/OpenAI examples — direct providers exist (not just via OpenRouter), refresh the OpenAI model list. getting-started/ - installation.md: PortableGit (not MinGit) is what the Windows installer fetches; document the 32-bit MinGit fallback. - installation.md / termux.md: installer prefers .[termux-all] then falls back to .[termux]. - nix-setup.md: Python 3.12 (not 3.11), Node.js 22 (not 20); fix invalid 'nix flake update --flake' invocation. - updating.md: 'hermes backup restore --state pre-update' doesn't exist — point at the snapshot/quick-snapshot flow; correct config key 'updates.pre_update_backup' (was 'update.backup'). user-guide/ - configuration.md: api_max_retries default 3 (not 2); display.runtime_footer is the real key (not display.runtime_metadata_footer); checkpoints defaults enabled=false / max_snapshots=20 (not true / 50). - configuring-models.md: 'hermes model list' / 'hermes model set ...' don't exist — hermes model is interactive only. - tui.md: busy_indicator -> tui_status_indicator with values kaomoji|emoji|unicode|ascii (not kawaii|minimal|dots|wings|none). - security.md: SSH backend keys (TERMINAL_SSH_HOST/USER/KEY) live in .env, not config.yaml. - windows-wsl-quickstart.md: there is no 'hermes api' subcommand — the OpenAI-compatible API server runs inside hermes gateway. user-guide/features/ - computer-use.md: approvals.mode (not security.approval_level); fix broken ./browser-use.md link to ./browser.md. - fallback-providers.md: top-level fallback_providers (not model.fallback_providers); the picker is subcommand-based, not modal. - api-server.md: API_SERVER_* are env vars — write to per-profile .env, not 'hermes config set' which targets YAML. - web-search.md: drop web_crawl as a registered tool (it isn't); deep-crawl modes are exposed through web_extract. - kanban.md: failure_limit default is 2, not '~5'. - plugins.md: drop hard-coded '33 providers' count. - honcho.md: fix unclosed quote in echo HONCHO_API_KEY snippet; document that 'hermes honcho' subcommand is gated on memory.provider=honcho; reconcile subcommand list with actual --help output. - memory-providers.md: legacy 'hermes honcho setup' redirect documented. Verified via 'npm run build' — site builds cleanly; broken-link count went from 149 to 146 (no regressions, fixed a few in passing). * docs: round 2 audit fixes + regenerate skill catalogs Follow-up to the previous commit on this branch: Round 2 manual fixes: - quickstart.md: KIMI_CODING_API_KEY mentioned alongside KIMI_API_KEY; voice-mode and ACP install commands rewritten — bare 'pip install ...' doesn't work for curl-installed setups (no pip on PATH, not in repo dir); replaced with 'cd ~/.hermes/hermes-agent && uv pip install -e ".[voice]"'. ACP already ships in [all] so the curl install includes it. - cli.md / configuration.md: 'auxiliary.compression.model' shown as 'google/gemini-3-flash-preview' (the doc's own claimed default); actual default is empty (= use main model). Reworded as 'leave empty (default) or pin a cheap model'. - built-in-plugins.md: added the bundled 'kanban/dashboard' plugin row that was missing from the table. Regenerated skill catalogs: - ran website/scripts/generate-skill-docs.py to refresh all 163 per-skill pages and both reference catalogs (skills-catalog.md, optional-skills-catalog.md). This adds the entries that were genuinely missing — productivity/teams-meeting-pipeline (bundled), optional/finance/* (entire category — 7 skills: 3-statement-model, comps-analysis, dcf-model, excel-author, lbo-model, merger-model, pptx-author), creative/hyperframes, creative/kanban-video-orchestrator, devops/watchers, productivity/shop-app, research/searxng-search, apple/macos-computer-use — and rewrites every other per-skill page from the current SKILL.md. Most diffs are tiny (one line of refreshed metadata). Validation: - 'npm run build' succeeded. - Broken-link count moved 146 -> 155 — the +9 are zh-Hans translation shells that lag every newly-added skill page (pre-existing pattern). No regressions on any en/ page.
2026-05-09 13:19:51 -07:00
This page documents Hermes' built-in tools, grouped by toolset. Availability varies by platform, credentials, and enabled toolsets.
docs: comprehensive documentation audit — fix stale info, expand thin pages, add depth (#5393) Major changes across 20 documentation pages: Staleness fixes: - Fix FAQ: wrong import path (hermes.agent → run_agent) - Fix FAQ: stale Gemini 2.0 model → Gemini 3 Flash - Fix integrations/index: missing MiniMax TTS provider - Fix integrations/index: web_crawl is not a registered tool - Fix sessions: add all 19 session sources (was only 5) - Fix cron: add all 18 delivery targets (was only telegram/discord) - Fix webhooks: add all delivery targets - Fix overview: add missing MCP, memory providers, credential pools - Fix all line-number references → use function name searches instead - Update file size estimates (run_agent ~9200, gateway ~7200, cli ~8500) Expanded thin pages (< 150 lines → substantial depth): - honcho.md: 43 → 108 lines — added feature comparison, tools, config, CLI - overview.md: 49 → 55 lines — added MCP, memory providers, credential pools - toolsets-reference.md: 57 → 175 lines — added explanations, config examples, custom toolsets, wildcards, platform differences table - optional-skills-catalog.md: 74 → 153 lines — added 25+ missing skills across communication, devops, mlops (18!), productivity, research categories - integrations/index.md: 82 → 115 lines — added messaging, HA, plugins sections - cron-internals.md: 90 → 195 lines — added job JSON example, lifecycle states, tick cycle, delivery targets, script-backed jobs, CLI interface - gateway-internals.md: 111 → 250 lines — added architecture diagram, message flow, two-level guard, platform adapters, token locks, process management - agent-loop.md: 112 → 235 lines — added entry points, API mode resolution, turn lifecycle detail, message alternation rules, tool execution flow, callback table, budget tracking, compression details - architecture.md: 152 → 295 lines — added system overview diagram, data flow diagrams, design principles table, dependency chain Other depth additions: - context-references.md: added platform availability, compression interaction, common patterns sections - slash-commands.md: added quick commands config example, alias resolution - image-generation.md: added platform delivery table - tools-reference.md: added tool counts, MCP tools note - index.md: updated platform count (5 → 14+), tool count (40+ → 47)
2026-04-05 19:45:50 -07:00
**Quick counts (current registry):** ~70 tools — 10 browser tools (core) + 2 CDP-gated browser tools, 4 file tools, 10 RL tools, 4 Home Assistant tools, 2 terminal tools, 2 web tools, 5 Feishu tools, 7 Spotify tools (registered by the bundled `spotify` plugin), 5 Yuanbao tools, 7 kanban tools (registered when the kanban dispatcher spawns the agent), 2 Discord tools, and a handful of standalone tools (`memory`, `clarify`, `delegate_task`, `execute_code`, `cronjob`, `session_search`, `skill_view`/`skill_manage`/`skills_list`, `text_to_speech`, `image_generate`, `video_generate`, `vision_analyze`, `video_analyze`, `mixture_of_agents`, `send_message`, `todo`, `computer_use`, `process`).
docs: comprehensive documentation audit — fix stale info, expand thin pages, add depth (#5393) Major changes across 20 documentation pages: Staleness fixes: - Fix FAQ: wrong import path (hermes.agent → run_agent) - Fix FAQ: stale Gemini 2.0 model → Gemini 3 Flash - Fix integrations/index: missing MiniMax TTS provider - Fix integrations/index: web_crawl is not a registered tool - Fix sessions: add all 19 session sources (was only 5) - Fix cron: add all 18 delivery targets (was only telegram/discord) - Fix webhooks: add all delivery targets - Fix overview: add missing MCP, memory providers, credential pools - Fix all line-number references → use function name searches instead - Update file size estimates (run_agent ~9200, gateway ~7200, cli ~8500) Expanded thin pages (< 150 lines → substantial depth): - honcho.md: 43 → 108 lines — added feature comparison, tools, config, CLI - overview.md: 49 → 55 lines — added MCP, memory providers, credential pools - toolsets-reference.md: 57 → 175 lines — added explanations, config examples, custom toolsets, wildcards, platform differences table - optional-skills-catalog.md: 74 → 153 lines — added 25+ missing skills across communication, devops, mlops (18!), productivity, research categories - integrations/index.md: 82 → 115 lines — added messaging, HA, plugins sections - cron-internals.md: 90 → 195 lines — added job JSON example, lifecycle states, tick cycle, delivery targets, script-backed jobs, CLI interface - gateway-internals.md: 111 → 250 lines — added architecture diagram, message flow, two-level guard, platform adapters, token locks, process management - agent-loop.md: 112 → 235 lines — added entry points, API mode resolution, turn lifecycle detail, message alternation rules, tool execution flow, callback table, budget tracking, compression details - architecture.md: 152 → 295 lines — added system overview diagram, data flow diagrams, design principles table, dependency chain Other depth additions: - context-references.md: added platform availability, compression interaction, common patterns sections - slash-commands.md: added quick commands config example, alias resolution - image-generation.md: added platform delivery table - tools-reference.md: added tool counts, MCP tools note - index.md: updated platform count (5 → 14+), tool count (40+ → 47)
2026-04-05 19:45:50 -07:00
:::tip MCP Tools
In addition to built-in tools, Hermes can load tools dynamically from MCP servers. MCP tools appear with the prefix `mcp_<server>_` (e.g., `mcp_github_create_issue` for the `github` MCP server). See [MCP Integration](/user-guide/features/mcp) for configuration.
docs: comprehensive documentation audit — fix stale info, expand thin pages, add depth (#5393) Major changes across 20 documentation pages: Staleness fixes: - Fix FAQ: wrong import path (hermes.agent → run_agent) - Fix FAQ: stale Gemini 2.0 model → Gemini 3 Flash - Fix integrations/index: missing MiniMax TTS provider - Fix integrations/index: web_crawl is not a registered tool - Fix sessions: add all 19 session sources (was only 5) - Fix cron: add all 18 delivery targets (was only telegram/discord) - Fix webhooks: add all delivery targets - Fix overview: add missing MCP, memory providers, credential pools - Fix all line-number references → use function name searches instead - Update file size estimates (run_agent ~9200, gateway ~7200, cli ~8500) Expanded thin pages (< 150 lines → substantial depth): - honcho.md: 43 → 108 lines — added feature comparison, tools, config, CLI - overview.md: 49 → 55 lines — added MCP, memory providers, credential pools - toolsets-reference.md: 57 → 175 lines — added explanations, config examples, custom toolsets, wildcards, platform differences table - optional-skills-catalog.md: 74 → 153 lines — added 25+ missing skills across communication, devops, mlops (18!), productivity, research categories - integrations/index.md: 82 → 115 lines — added messaging, HA, plugins sections - cron-internals.md: 90 → 195 lines — added job JSON example, lifecycle states, tick cycle, delivery targets, script-backed jobs, CLI interface - gateway-internals.md: 111 → 250 lines — added architecture diagram, message flow, two-level guard, platform adapters, token locks, process management - agent-loop.md: 112 → 235 lines — added entry points, API mode resolution, turn lifecycle detail, message alternation rules, tool execution flow, callback table, budget tracking, compression details - architecture.md: 152 → 295 lines — added system overview diagram, data flow diagrams, design principles table, dependency chain Other depth additions: - context-references.md: added platform availability, compression interaction, common patterns sections - slash-commands.md: added quick commands config example, alias resolution - image-generation.md: added platform delivery table - tools-reference.md: added tool counts, MCP tools note - index.md: updated platform count (5 → 14+), tool count (40+ → 47)
2026-04-05 19:45:50 -07:00
:::
## `browser` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `browser_back` | Navigate back to the previous page in browser history. Requires browser_navigate to be called first. | — |
| `browser_click` | Click on an element identified by its ref ID from the snapshot (e.g., '@e5'). The ref IDs are shown in square brackets in the snapshot output. Requires browser_navigate and browser_snapshot to be called first. | — |
| `browser_console` | Get browser console output and JavaScript errors from the current page. Returns console.log/warn/error/info messages and uncaught JS exceptions. Use this to detect silent JavaScript errors, failed API calls, and application warnings. Requi… | — |
| `browser_get_images` | Get a list of all images on the current page with their URLs and alt text. Useful for finding images to analyze with the vision tool. Requires browser_navigate to be called first. | — |
| `browser_navigate` | Navigate to a URL in the browser. Initializes the session and loads the page. Must be called before other browser tools. For simple information retrieval, prefer web_search or web_extract (faster, cheaper). Use browser tools when you need… | — |
| `browser_press` | Press a keyboard key. Useful for submitting forms (Enter), navigating (Tab), or keyboard shortcuts. Requires browser_navigate to be called first. | — |
| `browser_scroll` | Scroll the page in a direction. Use this to reveal more content that may be below or above the current viewport. Requires browser_navigate to be called first. | — |
| `browser_snapshot` | Get a text-based snapshot of the current page's accessibility tree. Returns interactive elements with ref IDs (like @e1, @e2) for browser_click and browser_type. full=false (default): compact view with interactive elements. full=true: comp… | — |
| `browser_type` | Type text into an input field identified by its ref ID. Clears the field first, then types the new text. Requires browser_navigate and browser_snapshot to be called first. | — |
| `browser_vision` | Take a screenshot of the current page and analyze it with vision AI. Use this when you need to visually understand what's on the page - especially useful for CAPTCHAs, visual verification challenges, complex layouts, or when the text snaps… | — |
docs: deep audit — fix stale config keys, missing commands, and registry drift (#22784) * docs: deep audit — fix stale config keys, missing commands, and registry drift Cross-checked ~80 high-impact docs pages (getting-started, reference, top-level user-guide, user-guide/features) against the live registries: hermes_cli/commands.py COMMAND_REGISTRY (slash commands) hermes_cli/auth.py PROVIDER_REGISTRY (providers) hermes_cli/config.py DEFAULT_CONFIG (config keys) toolsets.py TOOLSETS (toolsets) tools/registry.py get_all_tool_names() (tools) python -m hermes_cli.main <subcmd> --help (CLI args) reference/ - cli-commands.md: drop duplicate hermes fallback row + duplicate section, add stepfun/lmstudio to --provider enum, expand auth/mcp/curator subcommand lists to match --help output (status/logout/spotify, login, archive/prune/ list-archived). - slash-commands.md: add missing /sessions and /reload-skills entries + correct the cross-platform Notes line. - tools-reference.md: drop bogus '68 tools' headline, drop fictional 'browser-cdp toolset' (these tools live in 'browser' and are runtime-gated), add missing 'kanban' and 'video' toolset sections, fix MCP example to use the real mcp_<server>_<tool> prefix. - toolsets-reference.md: list browser_cdp/browser_dialog inside the 'browser' row, add missing 'kanban' and 'video' toolset rows, drop the stale '38 tools' count for hermes-cli. - profile-commands.md: add missing install/update/info subcommands, document fish completion. - environment-variables.md: dedupe GMI_API_KEY/GMI_BASE_URL rows (kept the one with the correct gmi-serving.com default). - faq.md: Anthropic/Google/OpenAI examples — direct providers exist (not just via OpenRouter), refresh the OpenAI model list. getting-started/ - installation.md: PortableGit (not MinGit) is what the Windows installer fetches; document the 32-bit MinGit fallback. - installation.md / termux.md: installer prefers .[termux-all] then falls back to .[termux]. - nix-setup.md: Python 3.12 (not 3.11), Node.js 22 (not 20); fix invalid 'nix flake update --flake' invocation. - updating.md: 'hermes backup restore --state pre-update' doesn't exist — point at the snapshot/quick-snapshot flow; correct config key 'updates.pre_update_backup' (was 'update.backup'). user-guide/ - configuration.md: api_max_retries default 3 (not 2); display.runtime_footer is the real key (not display.runtime_metadata_footer); checkpoints defaults enabled=false / max_snapshots=20 (not true / 50). - configuring-models.md: 'hermes model list' / 'hermes model set ...' don't exist — hermes model is interactive only. - tui.md: busy_indicator -> tui_status_indicator with values kaomoji|emoji|unicode|ascii (not kawaii|minimal|dots|wings|none). - security.md: SSH backend keys (TERMINAL_SSH_HOST/USER/KEY) live in .env, not config.yaml. - windows-wsl-quickstart.md: there is no 'hermes api' subcommand — the OpenAI-compatible API server runs inside hermes gateway. user-guide/features/ - computer-use.md: approvals.mode (not security.approval_level); fix broken ./browser-use.md link to ./browser.md. - fallback-providers.md: top-level fallback_providers (not model.fallback_providers); the picker is subcommand-based, not modal. - api-server.md: API_SERVER_* are env vars — write to per-profile .env, not 'hermes config set' which targets YAML. - web-search.md: drop web_crawl as a registered tool (it isn't); deep-crawl modes are exposed through web_extract. - kanban.md: failure_limit default is 2, not '~5'. - plugins.md: drop hard-coded '33 providers' count. - honcho.md: fix unclosed quote in echo HONCHO_API_KEY snippet; document that 'hermes honcho' subcommand is gated on memory.provider=honcho; reconcile subcommand list with actual --help output. - memory-providers.md: legacy 'hermes honcho setup' redirect documented. Verified via 'npm run build' — site builds cleanly; broken-link count went from 149 to 146 (no regressions, fixed a few in passing). * docs: round 2 audit fixes + regenerate skill catalogs Follow-up to the previous commit on this branch: Round 2 manual fixes: - quickstart.md: KIMI_CODING_API_KEY mentioned alongside KIMI_API_KEY; voice-mode and ACP install commands rewritten — bare 'pip install ...' doesn't work for curl-installed setups (no pip on PATH, not in repo dir); replaced with 'cd ~/.hermes/hermes-agent && uv pip install -e ".[voice]"'. ACP already ships in [all] so the curl install includes it. - cli.md / configuration.md: 'auxiliary.compression.model' shown as 'google/gemini-3-flash-preview' (the doc's own claimed default); actual default is empty (= use main model). Reworded as 'leave empty (default) or pin a cheap model'. - built-in-plugins.md: added the bundled 'kanban/dashboard' plugin row that was missing from the table. Regenerated skill catalogs: - ran website/scripts/generate-skill-docs.py to refresh all 163 per-skill pages and both reference catalogs (skills-catalog.md, optional-skills-catalog.md). This adds the entries that were genuinely missing — productivity/teams-meeting-pipeline (bundled), optional/finance/* (entire category — 7 skills: 3-statement-model, comps-analysis, dcf-model, excel-author, lbo-model, merger-model, pptx-author), creative/hyperframes, creative/kanban-video-orchestrator, devops/watchers, productivity/shop-app, research/searxng-search, apple/macos-computer-use — and rewrites every other per-skill page from the current SKILL.md. Most diffs are tiny (one line of refreshed metadata). Validation: - 'npm run build' succeeded. - Broken-link count moved 146 -> 155 — the +9 are zh-Hans translation shells that lag every newly-added skill page (pre-existing pattern). No regressions on any en/ page.
2026-05-09 13:19:51 -07:00
## `browser` toolset (CDP-gated tools)
docs: resync reference, user-guide, developer-guide, and messaging pages against code (#17738) Broad drift audit against origin/main (b52b63396). Reference pages (most user-visible drift): - slash-commands: add /busy, /curator, /footer, /indicator, /redraw, /steer that were missing; drop non-existent /terminal-setup; fix /q footnote (resolves to /queue, not /quit); extend CLI-only list with all 24 CLI-only commands in the registry - cli-commands: add dedicated sections for hermes curator / fallback / hooks (new subcommands not previously documented); remove stale hermes honcho standalone section (the plugin registers dynamically via hermes memory); list curator/fallback/hooks in top-level table; fix completion to include fish - toolsets-reference: document the real 52-toolset count; split browser vs browser-cdp; add discord / discord_admin / spotify / yuanbao; correct hermes-cli tool count from 36 to 38; fix misleading claim that hermes-homeassistant adds tools (it's identical to hermes-cli) - tools-reference: bump tool count 55 -> 68; add 7 Spotify, 5 Yuanbao, 2 Discord toolsets; move browser_cdp/browser_dialog to their own browser-cdp toolset section - environment-variables: add 40+ user-facing HERMES_* vars that were undocumented (--yolo, --accept-hooks, --ignore-*, inference model override, agent/stream/checkpoint timeouts, OAuth trace, per-platform batch tuning for Telegram/Discord/Matrix/Feishu/WeCom, cron knobs, gateway restart/connect timeouts); dedupe the Cron Scheduler section; replace stale QQ_SANDBOX with QQ_PORTAL_HOST User-guide (top level): - cli.md: compression preserves last 20 turns, not 4 (protect_last_n: 20) - configuration.md: display.platforms is the canonical per-platform override key; tool_progress_overrides is deprecated and auto-migrated - profiles.md: model.default is the config key, not model.model - sessions.md: CLI/TUI session IDs use 6-char hex, gateway uses 8 - checkpoints-and-rollback.md: destructive-command list now matches _DESTRUCTIVE_PATTERNS (adds rmdir, cp, install, dd) - docker.md: the container runs as non-root hermes (UID 10000) via gosu; fix install command (uv pip); add missing --insecure on the dashboard compose example (required for non-loopback bind) - security.md: systemctl danger pattern also matches 'restart' - index.md: built-in tool count 47 -> 68 - integrations/index.md: 6 STT providers, 8 memory providers - integrations/providers.md: drop fictional dashscope/qwen aliases Features: - overview.md: 9 image models (not 8), 9 TTS providers (not 5), 8 memory providers (Supermemory was missing) - tool-gateway.md: 9 image models - tools.md: extend common-toolsets list with search / messaging / spotify / discord / debugging / safe - fallback-providers.md: add 6 real providers from PROVIDER_REGISTRY (lmstudio, kimi-coding-cn, stepfun, alibaba-coding-plan, tencent-tokenhub, azure-foundry) - plugins.md: Available Hooks table now includes on_session_finalize, on_session_reset, subagent_stop - built-in-plugins.md: add the 7 bundled plugins the page didn't mention (spotify, google_meet, three image_gen providers, two dashboard examples) - web-dashboard.md: add --insecure and --tui flags - cron.md: hermes cron create takes positional schedule/prompt, not flags Messaging: - telegram.md: TELEGRAM_WEBHOOK_SECRET is now REQUIRED when TELEGRAM_WEBHOOK_URL is set (gateway refuses to start without it per GHSA-3vpc-7q5r-276h). Biggest user-visible drift in the batch. - discord.md: HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS default is 2.0, not 0.1 - dingtalk.md: document DINGTALK_REQUIRE_MENTION / FREE_RESPONSE_CHATS / MENTION_PATTERNS / HOME_CHANNEL / ALLOW_ALL_USERS that the adapter supports - bluebubbles.md: drop fictional BLUEBUBBLES_SEND_READ_RECEIPTS env var; the setting lives in platforms.bluebubbles.extra only - qqbot.md: drop dead QQ_SANDBOX; add real QQ_PORTAL_HOST and QQ_GROUP_ALLOWED_USERS - wecom-callback.md: replace 'hermes gateway start' (service-only) with 'hermes gateway' for first-time setup Developer-guide: - architecture.md: refresh tool/toolset counts (61/52), terminal backend count (7), line counts for run_agent.py (~13.7k), cli.py (~11.5k), main.py (~10.4k), setup.py (~3.5k), gateway/run.py (~12.2k), mcp_tool.py (~3.1k); add yuanbao adapter, bump platform adapter count 18 -> 20 - agent-loop.md: run_agent.py line count 10.7k -> 13.7k - tools-runtime.md: add vercel_sandbox backend - adding-tools.md: remove stale 'Discovery import added to model_tools.py' checklist item (registry auto-discovery) - adding-platform-adapters.md: mark send_typing / get_chat_info as concrete base methods; only connect/disconnect/send are abstract - acp-internals.md: ACP sessions now persist to SessionDB (~/.hermes/state.db); acp.run_agent call uses use_unstable_protocol=True - cron-internals.md: gateway runs scheduler in a dedicated background thread via _start_cron_ticker, not on a maintenance cycle; locking is cross-process via fcntl.flock (Unix) / msvcrt.locking (Windows) - gateway-internals.md: gateway/run.py ~12k lines - provider-runtime.md: cron DOES support fallback (run_job reads fallback_providers from config) - session-storage.md: SCHEMA_VERSION = 11 (not 9); add migrations 10 and 11 (trigram FTS, inline-mode FTS5 re-index); add api_call_count column to Sessions DDL; document messages_fts_trigram and state_meta in the architecture tree - context-compression-and-caching.md: remove the obsolete 'context pressure warnings' section (warnings were removed for causing models to give up early) - context-engine-plugin.md: compress() signature now includes focus_topic param - extending-the-cli.md: _build_tui_layout_children signature now includes model_picker_widget; add to default layout Also fixed three pre-existing broken links/anchors the build warned about (docker.md -> api-server.md, yuanbao.md -> cron-jobs.md and tips#background-tasks, nix-setup.md -> #container-aware-cli). Regenerated per-skill pages via website/scripts/generate-skill-docs.py so catalog tables and sidebar are consistent with current SKILL.md frontmatter. docusaurus build: clean, no broken links or anchors.
2026-04-29 20:55:59 -07:00
docs: deep audit — fix stale config keys, missing commands, and registry drift (#22784) * docs: deep audit — fix stale config keys, missing commands, and registry drift Cross-checked ~80 high-impact docs pages (getting-started, reference, top-level user-guide, user-guide/features) against the live registries: hermes_cli/commands.py COMMAND_REGISTRY (slash commands) hermes_cli/auth.py PROVIDER_REGISTRY (providers) hermes_cli/config.py DEFAULT_CONFIG (config keys) toolsets.py TOOLSETS (toolsets) tools/registry.py get_all_tool_names() (tools) python -m hermes_cli.main <subcmd> --help (CLI args) reference/ - cli-commands.md: drop duplicate hermes fallback row + duplicate section, add stepfun/lmstudio to --provider enum, expand auth/mcp/curator subcommand lists to match --help output (status/logout/spotify, login, archive/prune/ list-archived). - slash-commands.md: add missing /sessions and /reload-skills entries + correct the cross-platform Notes line. - tools-reference.md: drop bogus '68 tools' headline, drop fictional 'browser-cdp toolset' (these tools live in 'browser' and are runtime-gated), add missing 'kanban' and 'video' toolset sections, fix MCP example to use the real mcp_<server>_<tool> prefix. - toolsets-reference.md: list browser_cdp/browser_dialog inside the 'browser' row, add missing 'kanban' and 'video' toolset rows, drop the stale '38 tools' count for hermes-cli. - profile-commands.md: add missing install/update/info subcommands, document fish completion. - environment-variables.md: dedupe GMI_API_KEY/GMI_BASE_URL rows (kept the one with the correct gmi-serving.com default). - faq.md: Anthropic/Google/OpenAI examples — direct providers exist (not just via OpenRouter), refresh the OpenAI model list. getting-started/ - installation.md: PortableGit (not MinGit) is what the Windows installer fetches; document the 32-bit MinGit fallback. - installation.md / termux.md: installer prefers .[termux-all] then falls back to .[termux]. - nix-setup.md: Python 3.12 (not 3.11), Node.js 22 (not 20); fix invalid 'nix flake update --flake' invocation. - updating.md: 'hermes backup restore --state pre-update' doesn't exist — point at the snapshot/quick-snapshot flow; correct config key 'updates.pre_update_backup' (was 'update.backup'). user-guide/ - configuration.md: api_max_retries default 3 (not 2); display.runtime_footer is the real key (not display.runtime_metadata_footer); checkpoints defaults enabled=false / max_snapshots=20 (not true / 50). - configuring-models.md: 'hermes model list' / 'hermes model set ...' don't exist — hermes model is interactive only. - tui.md: busy_indicator -> tui_status_indicator with values kaomoji|emoji|unicode|ascii (not kawaii|minimal|dots|wings|none). - security.md: SSH backend keys (TERMINAL_SSH_HOST/USER/KEY) live in .env, not config.yaml. - windows-wsl-quickstart.md: there is no 'hermes api' subcommand — the OpenAI-compatible API server runs inside hermes gateway. user-guide/features/ - computer-use.md: approvals.mode (not security.approval_level); fix broken ./browser-use.md link to ./browser.md. - fallback-providers.md: top-level fallback_providers (not model.fallback_providers); the picker is subcommand-based, not modal. - api-server.md: API_SERVER_* are env vars — write to per-profile .env, not 'hermes config set' which targets YAML. - web-search.md: drop web_crawl as a registered tool (it isn't); deep-crawl modes are exposed through web_extract. - kanban.md: failure_limit default is 2, not '~5'. - plugins.md: drop hard-coded '33 providers' count. - honcho.md: fix unclosed quote in echo HONCHO_API_KEY snippet; document that 'hermes honcho' subcommand is gated on memory.provider=honcho; reconcile subcommand list with actual --help output. - memory-providers.md: legacy 'hermes honcho setup' redirect documented. Verified via 'npm run build' — site builds cleanly; broken-link count went from 149 to 146 (no regressions, fixed a few in passing). * docs: round 2 audit fixes + regenerate skill catalogs Follow-up to the previous commit on this branch: Round 2 manual fixes: - quickstart.md: KIMI_CODING_API_KEY mentioned alongside KIMI_API_KEY; voice-mode and ACP install commands rewritten — bare 'pip install ...' doesn't work for curl-installed setups (no pip on PATH, not in repo dir); replaced with 'cd ~/.hermes/hermes-agent && uv pip install -e ".[voice]"'. ACP already ships in [all] so the curl install includes it. - cli.md / configuration.md: 'auxiliary.compression.model' shown as 'google/gemini-3-flash-preview' (the doc's own claimed default); actual default is empty (= use main model). Reworded as 'leave empty (default) or pin a cheap model'. - built-in-plugins.md: added the bundled 'kanban/dashboard' plugin row that was missing from the table. Regenerated skill catalogs: - ran website/scripts/generate-skill-docs.py to refresh all 163 per-skill pages and both reference catalogs (skills-catalog.md, optional-skills-catalog.md). This adds the entries that were genuinely missing — productivity/teams-meeting-pipeline (bundled), optional/finance/* (entire category — 7 skills: 3-statement-model, comps-analysis, dcf-model, excel-author, lbo-model, merger-model, pptx-author), creative/hyperframes, creative/kanban-video-orchestrator, devops/watchers, productivity/shop-app, research/searxng-search, apple/macos-computer-use — and rewrites every other per-skill page from the current SKILL.md. Most diffs are tiny (one line of refreshed metadata). Validation: - 'npm run build' succeeded. - Broken-link count moved 146 -> 155 — the +9 are zh-Hans translation shells that lag every newly-added skill page (pre-existing pattern). No regressions on any en/ page.
2026-05-09 13:19:51 -07:00
These two tools live in the `browser` toolset but only register when a Chrome DevTools Protocol endpoint is reachable at session start — via `/browser connect`, `browser.cdp_url` config, a Browserbase session, or Camofox.
docs: resync reference, user-guide, developer-guide, and messaging pages against code (#17738) Broad drift audit against origin/main (b52b63396). Reference pages (most user-visible drift): - slash-commands: add /busy, /curator, /footer, /indicator, /redraw, /steer that were missing; drop non-existent /terminal-setup; fix /q footnote (resolves to /queue, not /quit); extend CLI-only list with all 24 CLI-only commands in the registry - cli-commands: add dedicated sections for hermes curator / fallback / hooks (new subcommands not previously documented); remove stale hermes honcho standalone section (the plugin registers dynamically via hermes memory); list curator/fallback/hooks in top-level table; fix completion to include fish - toolsets-reference: document the real 52-toolset count; split browser vs browser-cdp; add discord / discord_admin / spotify / yuanbao; correct hermes-cli tool count from 36 to 38; fix misleading claim that hermes-homeassistant adds tools (it's identical to hermes-cli) - tools-reference: bump tool count 55 -> 68; add 7 Spotify, 5 Yuanbao, 2 Discord toolsets; move browser_cdp/browser_dialog to their own browser-cdp toolset section - environment-variables: add 40+ user-facing HERMES_* vars that were undocumented (--yolo, --accept-hooks, --ignore-*, inference model override, agent/stream/checkpoint timeouts, OAuth trace, per-platform batch tuning for Telegram/Discord/Matrix/Feishu/WeCom, cron knobs, gateway restart/connect timeouts); dedupe the Cron Scheduler section; replace stale QQ_SANDBOX with QQ_PORTAL_HOST User-guide (top level): - cli.md: compression preserves last 20 turns, not 4 (protect_last_n: 20) - configuration.md: display.platforms is the canonical per-platform override key; tool_progress_overrides is deprecated and auto-migrated - profiles.md: model.default is the config key, not model.model - sessions.md: CLI/TUI session IDs use 6-char hex, gateway uses 8 - checkpoints-and-rollback.md: destructive-command list now matches _DESTRUCTIVE_PATTERNS (adds rmdir, cp, install, dd) - docker.md: the container runs as non-root hermes (UID 10000) via gosu; fix install command (uv pip); add missing --insecure on the dashboard compose example (required for non-loopback bind) - security.md: systemctl danger pattern also matches 'restart' - index.md: built-in tool count 47 -> 68 - integrations/index.md: 6 STT providers, 8 memory providers - integrations/providers.md: drop fictional dashscope/qwen aliases Features: - overview.md: 9 image models (not 8), 9 TTS providers (not 5), 8 memory providers (Supermemory was missing) - tool-gateway.md: 9 image models - tools.md: extend common-toolsets list with search / messaging / spotify / discord / debugging / safe - fallback-providers.md: add 6 real providers from PROVIDER_REGISTRY (lmstudio, kimi-coding-cn, stepfun, alibaba-coding-plan, tencent-tokenhub, azure-foundry) - plugins.md: Available Hooks table now includes on_session_finalize, on_session_reset, subagent_stop - built-in-plugins.md: add the 7 bundled plugins the page didn't mention (spotify, google_meet, three image_gen providers, two dashboard examples) - web-dashboard.md: add --insecure and --tui flags - cron.md: hermes cron create takes positional schedule/prompt, not flags Messaging: - telegram.md: TELEGRAM_WEBHOOK_SECRET is now REQUIRED when TELEGRAM_WEBHOOK_URL is set (gateway refuses to start without it per GHSA-3vpc-7q5r-276h). Biggest user-visible drift in the batch. - discord.md: HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS default is 2.0, not 0.1 - dingtalk.md: document DINGTALK_REQUIRE_MENTION / FREE_RESPONSE_CHATS / MENTION_PATTERNS / HOME_CHANNEL / ALLOW_ALL_USERS that the adapter supports - bluebubbles.md: drop fictional BLUEBUBBLES_SEND_READ_RECEIPTS env var; the setting lives in platforms.bluebubbles.extra only - qqbot.md: drop dead QQ_SANDBOX; add real QQ_PORTAL_HOST and QQ_GROUP_ALLOWED_USERS - wecom-callback.md: replace 'hermes gateway start' (service-only) with 'hermes gateway' for first-time setup Developer-guide: - architecture.md: refresh tool/toolset counts (61/52), terminal backend count (7), line counts for run_agent.py (~13.7k), cli.py (~11.5k), main.py (~10.4k), setup.py (~3.5k), gateway/run.py (~12.2k), mcp_tool.py (~3.1k); add yuanbao adapter, bump platform adapter count 18 -> 20 - agent-loop.md: run_agent.py line count 10.7k -> 13.7k - tools-runtime.md: add vercel_sandbox backend - adding-tools.md: remove stale 'Discovery import added to model_tools.py' checklist item (registry auto-discovery) - adding-platform-adapters.md: mark send_typing / get_chat_info as concrete base methods; only connect/disconnect/send are abstract - acp-internals.md: ACP sessions now persist to SessionDB (~/.hermes/state.db); acp.run_agent call uses use_unstable_protocol=True - cron-internals.md: gateway runs scheduler in a dedicated background thread via _start_cron_ticker, not on a maintenance cycle; locking is cross-process via fcntl.flock (Unix) / msvcrt.locking (Windows) - gateway-internals.md: gateway/run.py ~12k lines - provider-runtime.md: cron DOES support fallback (run_job reads fallback_providers from config) - session-storage.md: SCHEMA_VERSION = 11 (not 9); add migrations 10 and 11 (trigram FTS, inline-mode FTS5 re-index); add api_call_count column to Sessions DDL; document messages_fts_trigram and state_meta in the architecture tree - context-compression-and-caching.md: remove the obsolete 'context pressure warnings' section (warnings were removed for causing models to give up early) - context-engine-plugin.md: compress() signature now includes focus_topic param - extending-the-cli.md: _build_tui_layout_children signature now includes model_picker_widget; add to default layout Also fixed three pre-existing broken links/anchors the build warned about (docker.md -> api-server.md, yuanbao.md -> cron-jobs.md and tips#background-tasks, nix-setup.md -> #container-aware-cli). Regenerated per-skill pages via website/scripts/generate-skill-docs.py so catalog tables and sidebar are consistent with current SKILL.md frontmatter. docusaurus build: clean, no broken links or anchors.
2026-04-29 20:55:59 -07:00
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `browser_cdp` | Send a raw Chrome DevTools Protocol command. Escape hatch for browser operations not covered by the higher-level `browser_*` tools. See https://chromedevtools.github.io/devtools-protocol/ | CDP endpoint |
| `browser_dialog` | Respond to a native JavaScript dialog (alert / confirm / prompt / beforeunload). Call `browser_snapshot` first — pending dialogs appear in its `pending_dialogs` field. Then call `browser_dialog(action='accept'\|'dismiss')`. | CDP endpoint |
## `clarify` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `clarify` | Ask the user a question when you need clarification, feedback, or a decision before proceeding. Supports two modes: 1. **Multiple choice** — provide up to 4 choices. The user picks one or types their own answer via a 5th 'Other' option. 2.… | — |
## `code_execution` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `execute_code` | Run a Python script that can call Hermes tools programmatically. Use this when you need 3+ tool calls with processing logic between them, need to filter/reduce large tool outputs before they enter your context, need conditional branching (… | — |
## `cronjob` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `cronjob` | Unified scheduled-task manager. Use `action="create"`, `"list"`, `"update"`, `"pause"`, `"resume"`, `"run"`, or `"remove"` to manage jobs. Supports skill-backed jobs with one or more attached skills, and `skills=[]` on update clears attached skills. Cron runs happen in fresh sessions with no current-chat context. | — |
## `delegation` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `delegate_task` | Spawn one or more subagents to work on tasks in isolated contexts. Each subagent gets its own conversation, terminal session, and toolset. Only the final summary is returned -- intermediate tool results never enter your context window. TWO… | — |
docs: correctness audit — fix wrong values, add missing coverage (#11972) Comprehensive audit of every reference/messaging/feature doc page against the live code registries (PROVIDER_REGISTRY, OPTIONAL_ENV_VARS, COMMAND_REGISTRY, TOOLSETS, tool registry, on-disk skills). Every fix was verified against code before writing. ### Wrong values fixed (users would paste-and-fail) - reference/environment-variables.md: - DASHSCOPE_BASE_URL default was `coding-intl.dashscope.aliyuncs.com/v1` \u2192 actual `dashscope-intl.aliyuncs.com/compatible-mode/v1`. - MINIMAX_BASE_URL and MINIMAX_CN_BASE_URL defaults were `/v1` \u2192 actual `/anthropic` (Hermes calls MiniMax via its Anthropic Messages endpoint). - reference/toolsets-reference.md MCP example used the non-existent nested `mcp: servers:` key \u2192 real key is the flat `mcp_servers:`. - reference/skills-catalog.md listed ~20 bundled skills that no longer exist on disk (all moved to `optional-skills/`). Regenerated the whole bundled section from `skills/**/SKILL.md` \u2014 79 skills, accurate paths and names. - messaging/slack.md ":::info" callout claimed Slack has no `free_response_channels` equivalent; both the env var and the yaml key are in fact read. - messaging/qqbot.md documented `QQ_MARKDOWN_SUPPORT` as an env var, but the adapter only reads `extra.markdown_support` from config.yaml. Removed the env var row and noted config-only nature. - messaging/qqbot.md `hermes setup gateway` \u2192 `hermes gateway setup`. ### Missing coverage added - Providers: AWS Bedrock and Qwen Portal (qwen-oauth) \u2014 both in PROVIDER_REGISTRY but undocumented everywhere. Added sections to integrations/providers.md, rows to quickstart.md and fallback-providers.md. - integrations/providers.md "Fallback Model" provider list now includes gemini, google-gemini-cli, qwen-oauth, xai, nvidia, ollama-cloud, bedrock. - reference/cli-commands.md `--provider` enum and HERMES_INFERENCE_PROVIDER enum in env-vars now include the same set. - reference/slash-commands.md: added `/agents` (alias `/tasks`) and `/copy`. Removed duplicate rows for `/snapshot`, `/fast` (\u00d72), `/debug`. - reference/tools-reference.md: fixed "47 built-in tools" \u2192 52. Added `feishu_doc` and `feishu_drive` toolset sections. - reference/toolsets-reference.md: added `feishu_doc` / `feishu_drive` core rows + all missing `hermes-<platform>` toolsets in the platform table (bluebubbles, dingtalk, feishu, qqbot, wecom, wecom-callback, weixin, homeassistant, webhook, gateway). Fixed the `debugging` composite to describe the actual `includes=[...]` mechanism. - reference/optional-skills-catalog.md: added `fitness-nutrition`. - reference/environment-variables.md: added NOUS_BASE_URL, NOUS_INFERENCE_BASE_URL, NVIDIA_API_KEY/BASE_URL, OLLAMA_API_KEY/BASE_URL, XAI_API_KEY/BASE_URL, MISTRAL_API_KEY, AWS_REGION/AWS_PROFILE, BEDROCK_BASE_URL, HERMES_QWEN_BASE_URL, DISCORD_ALLOWED_CHANNELS, DISCORD_PROXY, TELEGRAM_REPLY_TO_MODE, MATRIX_DEVICE_ID, MATRIX_REACTIONS, QQBOT_HOME_CHANNEL_NAME, QQ_SANDBOX. - messaging/discord.md: documented DISCORD_ALLOWED_CHANNELS, DISCORD_PROXY, HERMES_DISCORD_TEXT_BATCH_DELAY_SECONDS and HERMES_DISCORD_TEXT_BATCH_SPLIT _DELAY_SECONDS (all actively read by the adapter). - messaging/matrix.md: documented MATRIX_REACTIONS (default true). - messaging/telegram.md: removed the redundant second Webhook Mode section that invented a `telegram.webhook_mode: true` yaml key the adapter does not read. - user-guide/features/hooks.md: added `on_session_finalize` and `on_session_reset` (both emitted via invoke_hook but undocumented). - user-guide/features/api-server.md: documented GET /health/detailed, the `/api/jobs/*` CRUD surface, POST /v1/runs, and GET /v1/runs/{id}/events (10 routes that were live but undocumented). - user-guide/features/fallback-providers.md: added `approval` and `title_generation` auxiliary-task rows; added gemini, bedrock, qwen-oauth to the supported-providers table. - user-guide/features/tts.md: "seven providers" \u2192 "eight" (post-xAI add oversight in #11942). - user-guide/configuration.md: TTS provider enum gains `xai` and `gemini`; yaml example block gains `mistral:`, `gemini:`, `xai:` subsections. Auxiliary-provider enum now enumerates all real registry entries. - reference/faq.md: stale AIAgent/config examples bumped from `nous/hermes-3-llama-3.1-70b` and `claude-sonnet-4.6` to `claude-opus-4.7`. ### Docs-site integrity - guides/build-a-hermes-plugin.md referenced two nonexistent hooks (`pre_api_request`, `post_api_request`). Replaced with the real `on_session_finalize` / `on_session_reset` entries. - messaging/open-webui.md and features/api-server.md had pre-existing broken links to `/docs/user-guide/features/profiles` (actual path is `/docs/user-guide/profiles`). Fixed. - reference/skills-catalog.md had one `<1%` literal that MDX parsed as a JSX tag. Escaped to `&lt;1%`. ### False positives filtered out (not changed, verified correct) - `/set-home` is a registered alias of `/sethome` \u2014 docs were fine. - `hermes setup gateway` is valid syntax (`hermes setup \<section\>`); changed in qqbot.md for cross-doc consistency, not as a bug fix. - Telegram reactions "disabled by default" matches code (default `"false"`). - Matrix encryption "opt-in" matches code (empty env default \u2192 disabled). - `pre_api_request` / `post_api_request` hooks do NOT exist in current code; documented instead the real `on_session_finalize` / `on_session_reset`. - SIGNAL_IGNORE_STORIES is already in env-vars.md (subagent missed it). Validation: - `docusaurus build` \u2014 passes (only pre-existing nix-setup anchor warning). - `ascii-guard lint docs` \u2014 124 files, 0 errors. - 22 files changed, +317 / \u2212158.
2026-04-18 01:45:48 -07:00
## `feishu_doc` toolset
Scoped to the Feishu document-comment intelligent-reply handler (`gateway/platforms/feishu_comment.py`). Not exposed on `hermes-cli` or the regular Feishu chat adapter.
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `feishu_doc_read` | Read the full text content of a Feishu/Lark document (Docx, Doc, or Sheet) given its file_type and token. | Feishu app credentials |
## `feishu_drive` toolset
Scoped to the Feishu document-comment handler. Drives comment read/write operations on drive files.
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `feishu_drive_add_comment` | Add a top-level comment on a Feishu/Lark document or file. | Feishu app credentials |
| `feishu_drive_list_comments` | List whole-document comments on a Feishu/Lark file, most recent first. | Feishu app credentials |
| `feishu_drive_list_comment_replies` | List replies on a specific Feishu comment thread (whole-doc or local-selection). | Feishu app credentials |
| `feishu_drive_reply_comment` | Post a reply on a Feishu comment thread, with optional `@`-mention. | Feishu app credentials |
## `file` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `patch` | Targeted find-and-replace edits in files. Use this instead of sed/awk in terminal. Uses fuzzy matching (9 strategies) so minor whitespace/indentation differences won't break it. Returns a unified diff. Auto-runs syntax checks after editing… | — |
| `read_file` | Read a text file with line numbers and pagination. Use this instead of cat/head/tail in terminal. Output format: 'LINE_NUM\|CONTENT'. Suggests similar filenames if not found. Use offset and limit for large files. NOTE: Cannot read images o… | — |
| `search_files` | Search file contents or find files by name. Use this instead of grep/rg/find/ls in terminal. Ripgrep-backed, faster than shell equivalents. Content search (target='content'): Regex search inside files. Output modes: full matches with line… | — |
| `write_file` | Write content to a file, completely replacing existing content. Use this instead of echo/cat heredoc in terminal. Creates parent directories automatically. OVERWRITES the entire file — use 'patch' for targeted edits. | — |
## `homeassistant` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `ha_call_service` | Call a Home Assistant service to control a device. Use ha_list_services to discover available services and their parameters for each domain. | — |
| `ha_get_state` | Get the detailed state of a single Home Assistant entity, including all attributes (brightness, color, temperature setpoint, sensor readings, etc.). | — |
| `ha_list_entities` | List Home Assistant entities. Optionally filter by domain (light, switch, climate, sensor, binary_sensor, cover, fan, etc.) or by area name (living room, kitchen, bedroom, etc.). | — |
| `ha_list_services` | List available Home Assistant services (actions) for device control. Shows what actions can be performed on each device type and what parameters they accept. Use this to discover how to control devices found via ha_list_entities. | — |
feat(computer-use): cua-driver backend, universal any-model schema Background macOS desktop control via cua-driver MCP — does NOT steal the user's cursor or keyboard focus, works with any tool-capable model. Replaces the Anthropic-native `computer_20251124` approach from the abandoned #4562 with a generic OpenAI function-calling schema plus SOM (set-of-mark) captures so Claude, GPT, Gemini, and open models can all drive the desktop via numbered element indices. - `tools/computer_use/` package — swappable ComputerUseBackend ABC + CuaDriverBackend (stdio MCP client to trycua/cua's cua-driver binary). - Universal `computer_use` tool with one schema for all providers. Actions: capture (som/vision/ax), click, double_click, right_click, middle_click, drag, scroll, type, key, wait, list_apps, focus_app. - Multimodal tool-result envelope (`_multimodal=True`, OpenAI-style `content: [text, image_url]` parts) that flows through handle_function_call into the tool message. Anthropic adapter converts into native `tool_result` image blocks; OpenAI-compatible providers get the parts list directly. - Image eviction in convert_messages_to_anthropic: only the 3 most recent screenshots carry real image data; older ones become text placeholders to cap per-turn token cost. - Context compressor image pruning: old multimodal tool results have their image parts stripped instead of being skipped. - Image-aware token estimation: each image counts as a flat 1500 tokens instead of its base64 char length (~1MB would have registered as ~250K tokens before). - COMPUTER_USE_GUIDANCE system-prompt block — injected when the toolset is active. - Session DB persistence strips base64 from multimodal tool messages. - Trajectory saver normalises multimodal messages to text-only. - `hermes tools` post-setup installs cua-driver via the upstream script and prints permission-grant instructions. - CLI approval callback wired so destructive computer_use actions go through the same prompt_toolkit approval dialog as terminal commands. - Hard safety guards at the tool level: blocked type patterns (curl|bash, sudo rm -rf, fork bomb), blocked key combos (empty trash, force delete, lock screen, log out). - Skill `apple/macos-computer-use/SKILL.md` — universal (model-agnostic) workflow guide. - Docs: `user-guide/features/computer-use.md` plus reference catalog entries. 44 new tests in tests/tools/test_computer_use.py covering schema shape (universal, not Anthropic-native), dispatch routing, safety guards, multimodal envelope, Anthropic adapter conversion, screenshot eviction, context compressor pruning, image-aware token estimation, run_agent helpers, and universality guarantees. 469/469 pass across tests/tools/test_computer_use.py + the affected agent/ test suites. - `model_tools.py` provider-gating: the tool is available to every provider. Providers without multi-part tool message support will see text-only tool results (graceful degradation via `text_summary`). - Anthropic server-side `clear_tool_uses_20250919` — deferred; client-side eviction + compressor pruning cover the same cost ceiling without a beta header. - macOS only. cua-driver uses private SkyLight SPIs (SLEventPostToPid, SLPSPostEventRecordTo, _AXObserverAddNotificationAndCheckRemote) that can break on any macOS update. Pin with HERMES_CUA_DRIVER_VERSION. - Requires Accessibility + Screen Recording permissions — the post-setup prints the Settings path. Supersedes PR #4562 (pyautogui/Quartz foreground backend, Anthropic- native schema). Credit @0xbyt4 for the original #3816 groundwork whose context/eviction/token design is preserved here in generic form.
2026-04-23 16:44:24 -07:00
## `computer_use` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `computer_use` | Background macOS desktop control via cua-driver — screenshots (SOM / vision / AX), click / drag / scroll / type / key / wait, list_apps, focus_app. Does NOT steal the user's cursor or keyboard focus. Works with any tool-capable model. macOS only. | `cua-driver` on `$PATH` (install via `hermes tools`). |
:::note
feat(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation (#10619) Salvaged from PR #9884 by erosika. Cherry-picked plugin changes onto current main with minimal core modifications. Plugin changes (plugins/memory/honcho/): - New honcho_reasoning tool (5th tool, splits LLM calls from honcho_context) - Two-layer context injection: base context (summary + representation + card) on contextCadence, dialectic supplement on dialecticCadence - Multi-pass dialectic depth (1-3 passes) with early bail-out on strong signal - Cold/warm prompt selection based on session state - dialecticCadence defaults to 3 (was 1) — ~66% fewer Honcho LLM calls - Session summary injection for conversational continuity - Bidirectional peer targeting on all 5 tools - Correctness fixes: peer param fallback, None guard on set_peer_card, schema validation, signal_sufficient anchored regex, mid->medium level fix Core changes (~20 lines across 3 files): - agent/memory_manager.py: Enhanced sanitize_context() to strip full <memory-context> blocks and system notes (prevents leak from saveMessages) - run_agent.py: gateway_session_key param for stable per-chat Honcho sessions, on_turn_start() call before prefetch_all() for cadence tracking, sanitize_context() on user messages to strip leaked memory blocks - gateway/run.py: skip_memory=True on 2 temp agents (prevents orphan sessions), gateway_session_key threading to main agent Tests: 509 passed (3 skipped — honcho SDK not installed locally) Docs: Updated honcho.md, memory-providers.md, tools-reference.md, SKILL.md Co-authored-by: erosika <erosika@users.noreply.github.com>
2026-04-15 19:12:19 -07:00
**Honcho tools** (`honcho_profile`, `honcho_search`, `honcho_context`, `honcho_reasoning`, `honcho_conclude`) are no longer built-in. They are available via the Honcho memory provider plugin at `plugins/memory/honcho/`. See [Memory Providers](../user-guide/features/memory-providers.md) for installation and usage.
:::
## `image_gen` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
| `image_generate` | Generate high-quality images from text prompts using FAL.ai. The underlying model is user-configured (default: FLUX 2 Klein 9B, sub-1s generation) and is not selectable by the agent. Returns a single image URL. Display it using… | FAL_KEY |
docs: deep audit — fix stale config keys, missing commands, and registry drift (#22784) * docs: deep audit — fix stale config keys, missing commands, and registry drift Cross-checked ~80 high-impact docs pages (getting-started, reference, top-level user-guide, user-guide/features) against the live registries: hermes_cli/commands.py COMMAND_REGISTRY (slash commands) hermes_cli/auth.py PROVIDER_REGISTRY (providers) hermes_cli/config.py DEFAULT_CONFIG (config keys) toolsets.py TOOLSETS (toolsets) tools/registry.py get_all_tool_names() (tools) python -m hermes_cli.main <subcmd> --help (CLI args) reference/ - cli-commands.md: drop duplicate hermes fallback row + duplicate section, add stepfun/lmstudio to --provider enum, expand auth/mcp/curator subcommand lists to match --help output (status/logout/spotify, login, archive/prune/ list-archived). - slash-commands.md: add missing /sessions and /reload-skills entries + correct the cross-platform Notes line. - tools-reference.md: drop bogus '68 tools' headline, drop fictional 'browser-cdp toolset' (these tools live in 'browser' and are runtime-gated), add missing 'kanban' and 'video' toolset sections, fix MCP example to use the real mcp_<server>_<tool> prefix. - toolsets-reference.md: list browser_cdp/browser_dialog inside the 'browser' row, add missing 'kanban' and 'video' toolset rows, drop the stale '38 tools' count for hermes-cli. - profile-commands.md: add missing install/update/info subcommands, document fish completion. - environment-variables.md: dedupe GMI_API_KEY/GMI_BASE_URL rows (kept the one with the correct gmi-serving.com default). - faq.md: Anthropic/Google/OpenAI examples — direct providers exist (not just via OpenRouter), refresh the OpenAI model list. getting-started/ - installation.md: PortableGit (not MinGit) is what the Windows installer fetches; document the 32-bit MinGit fallback. - installation.md / termux.md: installer prefers .[termux-all] then falls back to .[termux]. - nix-setup.md: Python 3.12 (not 3.11), Node.js 22 (not 20); fix invalid 'nix flake update --flake' invocation. - updating.md: 'hermes backup restore --state pre-update' doesn't exist — point at the snapshot/quick-snapshot flow; correct config key 'updates.pre_update_backup' (was 'update.backup'). user-guide/ - configuration.md: api_max_retries default 3 (not 2); display.runtime_footer is the real key (not display.runtime_metadata_footer); checkpoints defaults enabled=false / max_snapshots=20 (not true / 50). - configuring-models.md: 'hermes model list' / 'hermes model set ...' don't exist — hermes model is interactive only. - tui.md: busy_indicator -> tui_status_indicator with values kaomoji|emoji|unicode|ascii (not kawaii|minimal|dots|wings|none). - security.md: SSH backend keys (TERMINAL_SSH_HOST/USER/KEY) live in .env, not config.yaml. - windows-wsl-quickstart.md: there is no 'hermes api' subcommand — the OpenAI-compatible API server runs inside hermes gateway. user-guide/features/ - computer-use.md: approvals.mode (not security.approval_level); fix broken ./browser-use.md link to ./browser.md. - fallback-providers.md: top-level fallback_providers (not model.fallback_providers); the picker is subcommand-based, not modal. - api-server.md: API_SERVER_* are env vars — write to per-profile .env, not 'hermes config set' which targets YAML. - web-search.md: drop web_crawl as a registered tool (it isn't); deep-crawl modes are exposed through web_extract. - kanban.md: failure_limit default is 2, not '~5'. - plugins.md: drop hard-coded '33 providers' count. - honcho.md: fix unclosed quote in echo HONCHO_API_KEY snippet; document that 'hermes honcho' subcommand is gated on memory.provider=honcho; reconcile subcommand list with actual --help output. - memory-providers.md: legacy 'hermes honcho setup' redirect documented. Verified via 'npm run build' — site builds cleanly; broken-link count went from 149 to 146 (no regressions, fixed a few in passing). * docs: round 2 audit fixes + regenerate skill catalogs Follow-up to the previous commit on this branch: Round 2 manual fixes: - quickstart.md: KIMI_CODING_API_KEY mentioned alongside KIMI_API_KEY; voice-mode and ACP install commands rewritten — bare 'pip install ...' doesn't work for curl-installed setups (no pip on PATH, not in repo dir); replaced with 'cd ~/.hermes/hermes-agent && uv pip install -e ".[voice]"'. ACP already ships in [all] so the curl install includes it. - cli.md / configuration.md: 'auxiliary.compression.model' shown as 'google/gemini-3-flash-preview' (the doc's own claimed default); actual default is empty (= use main model). Reworded as 'leave empty (default) or pin a cheap model'. - built-in-plugins.md: added the bundled 'kanban/dashboard' plugin row that was missing from the table. Regenerated skill catalogs: - ran website/scripts/generate-skill-docs.py to refresh all 163 per-skill pages and both reference catalogs (skills-catalog.md, optional-skills-catalog.md). This adds the entries that were genuinely missing — productivity/teams-meeting-pipeline (bundled), optional/finance/* (entire category — 7 skills: 3-statement-model, comps-analysis, dcf-model, excel-author, lbo-model, merger-model, pptx-author), creative/hyperframes, creative/kanban-video-orchestrator, devops/watchers, productivity/shop-app, research/searxng-search, apple/macos-computer-use — and rewrites every other per-skill page from the current SKILL.md. Most diffs are tiny (one line of refreshed metadata). Validation: - 'npm run build' succeeded. - Broken-link count moved 146 -> 155 — the +9 are zh-Hans translation shells that lag every newly-added skill page (pre-existing pattern). No regressions on any en/ page.
2026-05-09 13:19:51 -07:00
## `kanban` toolset
Registered when the agent is either (a) spawned by the kanban dispatcher (`HERMES_KANBAN_TASK` env set) or (b) running in a profile that explicitly enables the `kanban` toolset. Task-scoped workers use lifecycle tools for their assigned task; orchestrator profiles additionally get board-routing tools like `kanban_list` and `kanban_unblock`. See [Kanban Multi-Agent](/user-guide/features/kanban) for the full workflow.
docs: deep audit — fix stale config keys, missing commands, and registry drift (#22784) * docs: deep audit — fix stale config keys, missing commands, and registry drift Cross-checked ~80 high-impact docs pages (getting-started, reference, top-level user-guide, user-guide/features) against the live registries: hermes_cli/commands.py COMMAND_REGISTRY (slash commands) hermes_cli/auth.py PROVIDER_REGISTRY (providers) hermes_cli/config.py DEFAULT_CONFIG (config keys) toolsets.py TOOLSETS (toolsets) tools/registry.py get_all_tool_names() (tools) python -m hermes_cli.main <subcmd> --help (CLI args) reference/ - cli-commands.md: drop duplicate hermes fallback row + duplicate section, add stepfun/lmstudio to --provider enum, expand auth/mcp/curator subcommand lists to match --help output (status/logout/spotify, login, archive/prune/ list-archived). - slash-commands.md: add missing /sessions and /reload-skills entries + correct the cross-platform Notes line. - tools-reference.md: drop bogus '68 tools' headline, drop fictional 'browser-cdp toolset' (these tools live in 'browser' and are runtime-gated), add missing 'kanban' and 'video' toolset sections, fix MCP example to use the real mcp_<server>_<tool> prefix. - toolsets-reference.md: list browser_cdp/browser_dialog inside the 'browser' row, add missing 'kanban' and 'video' toolset rows, drop the stale '38 tools' count for hermes-cli. - profile-commands.md: add missing install/update/info subcommands, document fish completion. - environment-variables.md: dedupe GMI_API_KEY/GMI_BASE_URL rows (kept the one with the correct gmi-serving.com default). - faq.md: Anthropic/Google/OpenAI examples — direct providers exist (not just via OpenRouter), refresh the OpenAI model list. getting-started/ - installation.md: PortableGit (not MinGit) is what the Windows installer fetches; document the 32-bit MinGit fallback. - installation.md / termux.md: installer prefers .[termux-all] then falls back to .[termux]. - nix-setup.md: Python 3.12 (not 3.11), Node.js 22 (not 20); fix invalid 'nix flake update --flake' invocation. - updating.md: 'hermes backup restore --state pre-update' doesn't exist — point at the snapshot/quick-snapshot flow; correct config key 'updates.pre_update_backup' (was 'update.backup'). user-guide/ - configuration.md: api_max_retries default 3 (not 2); display.runtime_footer is the real key (not display.runtime_metadata_footer); checkpoints defaults enabled=false / max_snapshots=20 (not true / 50). - configuring-models.md: 'hermes model list' / 'hermes model set ...' don't exist — hermes model is interactive only. - tui.md: busy_indicator -> tui_status_indicator with values kaomoji|emoji|unicode|ascii (not kawaii|minimal|dots|wings|none). - security.md: SSH backend keys (TERMINAL_SSH_HOST/USER/KEY) live in .env, not config.yaml. - windows-wsl-quickstart.md: there is no 'hermes api' subcommand — the OpenAI-compatible API server runs inside hermes gateway. user-guide/features/ - computer-use.md: approvals.mode (not security.approval_level); fix broken ./browser-use.md link to ./browser.md. - fallback-providers.md: top-level fallback_providers (not model.fallback_providers); the picker is subcommand-based, not modal. - api-server.md: API_SERVER_* are env vars — write to per-profile .env, not 'hermes config set' which targets YAML. - web-search.md: drop web_crawl as a registered tool (it isn't); deep-crawl modes are exposed through web_extract. - kanban.md: failure_limit default is 2, not '~5'. - plugins.md: drop hard-coded '33 providers' count. - honcho.md: fix unclosed quote in echo HONCHO_API_KEY snippet; document that 'hermes honcho' subcommand is gated on memory.provider=honcho; reconcile subcommand list with actual --help output. - memory-providers.md: legacy 'hermes honcho setup' redirect documented. Verified via 'npm run build' — site builds cleanly; broken-link count went from 149 to 146 (no regressions, fixed a few in passing). * docs: round 2 audit fixes + regenerate skill catalogs Follow-up to the previous commit on this branch: Round 2 manual fixes: - quickstart.md: KIMI_CODING_API_KEY mentioned alongside KIMI_API_KEY; voice-mode and ACP install commands rewritten — bare 'pip install ...' doesn't work for curl-installed setups (no pip on PATH, not in repo dir); replaced with 'cd ~/.hermes/hermes-agent && uv pip install -e ".[voice]"'. ACP already ships in [all] so the curl install includes it. - cli.md / configuration.md: 'auxiliary.compression.model' shown as 'google/gemini-3-flash-preview' (the doc's own claimed default); actual default is empty (= use main model). Reworded as 'leave empty (default) or pin a cheap model'. - built-in-plugins.md: added the bundled 'kanban/dashboard' plugin row that was missing from the table. Regenerated skill catalogs: - ran website/scripts/generate-skill-docs.py to refresh all 163 per-skill pages and both reference catalogs (skills-catalog.md, optional-skills-catalog.md). This adds the entries that were genuinely missing — productivity/teams-meeting-pipeline (bundled), optional/finance/* (entire category — 7 skills: 3-statement-model, comps-analysis, dcf-model, excel-author, lbo-model, merger-model, pptx-author), creative/hyperframes, creative/kanban-video-orchestrator, devops/watchers, productivity/shop-app, research/searxng-search, apple/macos-computer-use — and rewrites every other per-skill page from the current SKILL.md. Most diffs are tiny (one line of refreshed metadata). Validation: - 'npm run build' succeeded. - Broken-link count moved 146 -> 155 — the +9 are zh-Hans translation shells that lag every newly-added skill page (pre-existing pattern). No regressions on any en/ page.
2026-05-09 13:19:51 -07:00
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `kanban_show` | Show the active kanban task assigned to this worker (title, description, comments, dependencies). | `HERMES_KANBAN_TASK` or `kanban` toolset |
| `kanban_list` | List board tasks with filters. Orchestrator-only; hidden from dispatcher-spawned task workers. | profile with `kanban` toolset |
| `kanban_complete` | Mark the current task done with a structured handoff payload (results, artifacts, follow-ups). | `HERMES_KANBAN_TASK` or `kanban` toolset |
| `kanban_block` | Block the current task on a question for the user — the dispatcher pauses, surfaces the question, and resumes once a human replies. | `HERMES_KANBAN_TASK` or `kanban` toolset |
| `kanban_heartbeat` | Send a progress heartbeat during a long-running operation so the dispatcher knows the worker is still alive. | `HERMES_KANBAN_TASK` or `kanban` toolset |
| `kanban_comment` | Add a comment to the task thread without changing its state — useful for surfacing intermediate findings. | `HERMES_KANBAN_TASK` or `kanban` toolset |
| `kanban_create` | Fan out child tasks from the current task. Used by orchestrators and follow-up-spawning workers. | `HERMES_KANBAN_TASK` or `kanban` toolset |
| `kanban_link` | Link tasks with a parent → child dependency edge. | `HERMES_KANBAN_TASK` or `kanban` toolset |
| `kanban_unblock` | Return a blocked task to `ready`. Orchestrator-only; hidden from dispatcher-spawned task workers. | profile with `kanban` toolset |
docs: deep audit — fix stale config keys, missing commands, and registry drift (#22784) * docs: deep audit — fix stale config keys, missing commands, and registry drift Cross-checked ~80 high-impact docs pages (getting-started, reference, top-level user-guide, user-guide/features) against the live registries: hermes_cli/commands.py COMMAND_REGISTRY (slash commands) hermes_cli/auth.py PROVIDER_REGISTRY (providers) hermes_cli/config.py DEFAULT_CONFIG (config keys) toolsets.py TOOLSETS (toolsets) tools/registry.py get_all_tool_names() (tools) python -m hermes_cli.main <subcmd> --help (CLI args) reference/ - cli-commands.md: drop duplicate hermes fallback row + duplicate section, add stepfun/lmstudio to --provider enum, expand auth/mcp/curator subcommand lists to match --help output (status/logout/spotify, login, archive/prune/ list-archived). - slash-commands.md: add missing /sessions and /reload-skills entries + correct the cross-platform Notes line. - tools-reference.md: drop bogus '68 tools' headline, drop fictional 'browser-cdp toolset' (these tools live in 'browser' and are runtime-gated), add missing 'kanban' and 'video' toolset sections, fix MCP example to use the real mcp_<server>_<tool> prefix. - toolsets-reference.md: list browser_cdp/browser_dialog inside the 'browser' row, add missing 'kanban' and 'video' toolset rows, drop the stale '38 tools' count for hermes-cli. - profile-commands.md: add missing install/update/info subcommands, document fish completion. - environment-variables.md: dedupe GMI_API_KEY/GMI_BASE_URL rows (kept the one with the correct gmi-serving.com default). - faq.md: Anthropic/Google/OpenAI examples — direct providers exist (not just via OpenRouter), refresh the OpenAI model list. getting-started/ - installation.md: PortableGit (not MinGit) is what the Windows installer fetches; document the 32-bit MinGit fallback. - installation.md / termux.md: installer prefers .[termux-all] then falls back to .[termux]. - nix-setup.md: Python 3.12 (not 3.11), Node.js 22 (not 20); fix invalid 'nix flake update --flake' invocation. - updating.md: 'hermes backup restore --state pre-update' doesn't exist — point at the snapshot/quick-snapshot flow; correct config key 'updates.pre_update_backup' (was 'update.backup'). user-guide/ - configuration.md: api_max_retries default 3 (not 2); display.runtime_footer is the real key (not display.runtime_metadata_footer); checkpoints defaults enabled=false / max_snapshots=20 (not true / 50). - configuring-models.md: 'hermes model list' / 'hermes model set ...' don't exist — hermes model is interactive only. - tui.md: busy_indicator -> tui_status_indicator with values kaomoji|emoji|unicode|ascii (not kawaii|minimal|dots|wings|none). - security.md: SSH backend keys (TERMINAL_SSH_HOST/USER/KEY) live in .env, not config.yaml. - windows-wsl-quickstart.md: there is no 'hermes api' subcommand — the OpenAI-compatible API server runs inside hermes gateway. user-guide/features/ - computer-use.md: approvals.mode (not security.approval_level); fix broken ./browser-use.md link to ./browser.md. - fallback-providers.md: top-level fallback_providers (not model.fallback_providers); the picker is subcommand-based, not modal. - api-server.md: API_SERVER_* are env vars — write to per-profile .env, not 'hermes config set' which targets YAML. - web-search.md: drop web_crawl as a registered tool (it isn't); deep-crawl modes are exposed through web_extract. - kanban.md: failure_limit default is 2, not '~5'. - plugins.md: drop hard-coded '33 providers' count. - honcho.md: fix unclosed quote in echo HONCHO_API_KEY snippet; document that 'hermes honcho' subcommand is gated on memory.provider=honcho; reconcile subcommand list with actual --help output. - memory-providers.md: legacy 'hermes honcho setup' redirect documented. Verified via 'npm run build' — site builds cleanly; broken-link count went from 149 to 146 (no regressions, fixed a few in passing). * docs: round 2 audit fixes + regenerate skill catalogs Follow-up to the previous commit on this branch: Round 2 manual fixes: - quickstart.md: KIMI_CODING_API_KEY mentioned alongside KIMI_API_KEY; voice-mode and ACP install commands rewritten — bare 'pip install ...' doesn't work for curl-installed setups (no pip on PATH, not in repo dir); replaced with 'cd ~/.hermes/hermes-agent && uv pip install -e ".[voice]"'. ACP already ships in [all] so the curl install includes it. - cli.md / configuration.md: 'auxiliary.compression.model' shown as 'google/gemini-3-flash-preview' (the doc's own claimed default); actual default is empty (= use main model). Reworded as 'leave empty (default) or pin a cheap model'. - built-in-plugins.md: added the bundled 'kanban/dashboard' plugin row that was missing from the table. Regenerated skill catalogs: - ran website/scripts/generate-skill-docs.py to refresh all 163 per-skill pages and both reference catalogs (skills-catalog.md, optional-skills-catalog.md). This adds the entries that were genuinely missing — productivity/teams-meeting-pipeline (bundled), optional/finance/* (entire category — 7 skills: 3-statement-model, comps-analysis, dcf-model, excel-author, lbo-model, merger-model, pptx-author), creative/hyperframes, creative/kanban-video-orchestrator, devops/watchers, productivity/shop-app, research/searxng-search, apple/macos-computer-use — and rewrites every other per-skill page from the current SKILL.md. Most diffs are tiny (one line of refreshed metadata). Validation: - 'npm run build' succeeded. - Broken-link count moved 146 -> 155 — the +9 are zh-Hans translation shells that lag every newly-added skill page (pre-existing pattern). No regressions on any en/ page.
2026-05-09 13:19:51 -07:00
## `memory` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `memory` | Save important information to persistent memory that survives across sessions. Your memory appears in your system prompt at session start -- it's how you remember things about the user and your environment between conversations. WHEN TO SA… | — |
## `messaging` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `send_message` | Send a message to a connected messaging platform, or list available targets. IMPORTANT: When the user asks to send to a specific channel or person (not just a bare platform name), call send_message(action='list') FIRST to see available tar… | — |
## `moa` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `mixture_of_agents` | Route a hard problem through multiple frontier LLMs collaboratively. Makes 5 API calls (4 reference models + 1 aggregator) with maximum reasoning effort — use sparingly for genuinely difficult problems. Best for: complex math, advanced alg… | OPENROUTER_API_KEY |
## `session_search` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
feat(session_search): single-shape tool with discovery, scroll, browse — no LLM (#27590) * feat(session_search): single-shape tool with discovery, scroll, browse — no LLM Replaces the LLM-summarized session_search with a single-shape tool that returns actual messages from the DB. Three calling shapes inferred from args (no mode parameter): 1. Discovery — pass query. FTS5 + anchored ±5 window + bookends per hit, all in one call. ~20ms on a real DB instead of ~90s for the previous three aux-LLM calls. 2. Scroll — pass session_id + around_message_id. Returns a window centered on the anchor. To paginate, re-anchor on the first/last id of the returned window. Boundary message appears in both windows as the orientation marker. ~1ms per scroll call. 3. Browse — no args. Recent sessions chronologically. Bookend_start (first 3 user+assistant msgs) and bookend_end (last 3) give the agent goal + resolution on every discovery hit, so a single tool call reconstructs a long session's arc without loading the whole transcript. The aux-LLM summary path is gone: it cost ~$0.30/call, took ~30s, and laundered FTS5 hits through a model that could confabulate when the right session wasn't in the hit list. The merged shape returns byte-for-byte content from SQLite. History: - PR #20238 (JabberELF) seeded the fast/summary dual-mode split. - PR #26419 (yoniebans) expanded to fast/guided/summary with bookends, multi-anchor drill-down, default-mode config, and a teaching skill. This PR collapses that toolkit into one shape with explicit scroll support, drops the summary path, drops the mode parameter, drops the config knob, drops the skill. JabberELF's seed work is acknowledged via the AUTHOR_MAP entry. Validation: - 38/38 tool tests pass (tests/tools/test_session_search.py) - 12/12 get_messages_around tests pass (tests/hermes_state/) - 11/11 get_anchored_view tests pass (tests/hermes_state/) - Full tests/tools/ run: 5168 passing, 2 failures pre-exist on main (test ordering in test_delegate.py, unrelated) - E2E against live state DB: discovery 20ms, scroll 1ms, browse 280ms; pagination forward+backward works with boundary-message orientation; error paths return clean tool_error responses Co-authored-by: JabberELF <abcdjmm970703@gmail.com> Co-authored-by: yoniebans <jonny@nousresearch.com> * chore(session_search): prune dead LLM-summary config and docs Companion to the single-shape rewrite. The auxiliary.session_search config block, max_concurrency / extra_body tunables, and matching docs sections all referenced the removed LLM summarization path. Removing them so users don't try to tune knobs that nothing reads. - hermes_cli/config.py: drop dead auxiliary.session_search block from DEFAULT_CONFIG. Leftover keys in user config.yaml are harmless and ignored. - hermes_cli/tips.py: drop two tips referencing the removed max_concurrency / extra_body knobs. - website/docs/user-guide/configuration.md: drop 'Session Search Tuning' section and the auxiliary.session_search block from the example. - website/docs/user-guide/features/fallback-providers.md: drop session_search rows from the auxiliary-tasks tables and the dedicated tuning subsection. - website/docs/reference/tools-reference.md: rewrite the session_search entry to describe the new three-shape behaviour. - CONTRIBUTING.md: update the file-tree description. - tests/tools/test_llm_content_none_guard.py: remove TestSessionSearchContentNone class and test_session_search_tool_guarded — both guard against an unguarded .content.strip() call site in _summarize_session() that no longer exists. Validation: 97/97 targeted tests still pass (hermes_state + session_search + llm_content_none_guard). Config tests 55/55. --------- Co-authored-by: JabberELF <abcdjmm970703@gmail.com> Co-authored-by: yoniebans <jonny@nousresearch.com>
2026-05-17 23:28:45 -07:00
| `session_search` | Search past sessions stored in the local session DB, or scroll inside one. FTS5-backed retrieval; returns actual messages from the DB (no LLM calls). Three shapes: discovery (pass `query`), scroll (pass `session_id` + `around_message_id`), browse (no args). | — |
## `skills` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `skill_manage` | Manage skills (create, update, delete). Skills are your procedural memory — reusable approaches for recurring task types. New skills go to ~/.hermes/skills/; existing skills can be modified wherever they live. Actions: create (full SKILL.m… | — |
| `skill_view` | Skills allow for loading information about specific tasks and workflows, as well as scripts and templates. Load a skill's full content or access its linked files (references, templates, scripts). First call returns SKILL.md content plus a… | — |
| `skills_list` | List available skills (name + description). Use skill_view(name) to load full content. | — |
## `terminal` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `process` | Manage background processes started with terminal(background=true). Actions: 'list' (show all), 'poll' (check status + new output), 'log' (full output with pagination), 'wait' (block until done or timeout), 'kill' (terminate), 'write' (sen… | — |
docs: comprehensive docs audit — cover 13 features from last week's PRs (#5815) Cover documentation gaps found by auditing all 50+ merged PRs from the past week: tools-reference.md: - Fix stale tool count (47→46, 11→10 browser tools) after browser_close removal - Document notify_on_complete parameter in terminal tool description telegram.md: - Add Interactive Model Picker section (inline keyboard, provider/model drill-down) discord.md: - Add Interactive Model Picker section (Select dropdowns, 120s timeout) - Add Native Slash Commands for Skills section (auto-registration at startup) signal.md: - Expand Attachments section with outgoing media delivery (send_image_file, send_voice, send_video, send_document via MEDIA: tags) webhooks.md: - Document {__raw__} special template token for full payload access - Document Forum Topic Delivery via message_thread_id in deliver_extra slack.md: - Fix stale/misleading thread reply docs — thread replies no longer require @mention when bot has active session (3 locations updated) security.md: - Add cross-session isolation (layer 6) and input sanitization (layer 7) to security layers overview feishu.md: - Add WebSocket Tuning section (ws_reconnect_interval, ws_ping_interval) - Add Per-Group Access Control section (group_rules with 5 policy types) credential-pools.md: - Add Delegation & Subagent Sharing section delegation.md: - Update key properties to mention credential pool inheritance providers.md: - Add Z.AI Endpoint Auto-Detection note - Add xAI (Grok) Prompt Caching section skills-catalog.md: - Add p5js to creative skills category
2026-04-07 10:21:03 -07:00
| `terminal` | Execute shell commands on a Linux environment. Filesystem persists between calls. Set `background=true` for long-running servers. Set `notify_on_complete=true` (with `background=true`) to get an automatic notification when the process finishes — no polling needed. Do NOT use cat/head/tail — use read_file. Do NOT use grep/rg/find — use search_files. | — |
## `todo` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `todo` | Manage your task list for the current session. Use for complex tasks with 3+ steps or when the user provides multiple tasks. Call with no parameters to read the current list. Writing: - Provide 'todos' array to create/update items - merge=… | — |
## `vision` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
docs: comprehensive 2-week sweep of feature/PR coverage gaps (#28497) Catch the website docs up to two weeks of merged work (May 4 – May 18, 2026, roughly 1,080 PRs). The audit found ~50 user-visible features that had landed in code with no docs footprint, plus a handful of stale pages. This PR closes every gap the scan turned up. New pages - user-guide/features/deliverable-mode.md — extension list, agent triggers, kanban_complete artifacts pattern, [[as_document]] override (PR #27813). - developer-guide/web-search-provider-plugin.md — authoring guide modeled on image-gen-provider-plugin, covering brave_free / ddgs / etc. (PR #25448). Providers / auth - Rename "Alibaba Cloud" → "Qwen Cloud (Alibaba DashScope)" everywhere the display label shows up; provider id stays `alibaba` (PR #24835). - Document OAuth refresh-token quarantine for xAI / MiniMax / Codex (PRs #28116 / #28118 / #28119). - Document Nous JWT minting from refresh token + invalid-refresh quarantine + cross-profile shared token store (PRs #27663 / #19712). - Add `## Microsoft Entra ID authentication (keyless)` section to azure-foundry guide — DefaultAzureCredential, RBAC, OpenAI + Anthropic routing details (PR #28101 / #9df9816da). - Custom providers `api_mode` is now prompted-and-persisted, not just URL autodetected (PR #25068). - Delegation honours `api_mode` + auto-detects anthropic_messages base URLs (PR #26824). - `x_search` auto-enables when xAI credentials are present (PR #27376). - Add `xAI Grok OAuth (SuperGrok)` row to providers headline table (PR #26534). - NVIDIA NIM billing-origin header is set automatically (PR #26585). Windows / installer - `install.ps1`: document `-Commit <sha>` and `-Tag <v>` pin params plus the BOM-strip / git-retry hardening (PR #28169). - Document Hermes Desktop thin installer + first-launch bootstrap (PR #27822). - Document `dep_ensure` Windows bootstrap (PR #27845). - Document install-method auto-detection (pip / git / homebrew / nixos) and the matching update command (PR #27843). Gateway / messaging - `/platform list|pause|resume` full description + circuit-breaker semantics (PR #26600). - Slack / Matrix / Mattermost get parallel `allowed_channels` / `allowed_rooms` allowlist sections matching Telegram/Discord/DingTalk (PR #21251). - Discord `allow_any_attachment` + `max_attachment_bytes` (config and env vars) (PR #27245). - Discord clarify-choice button rendering (PR #25485). - Telegram `guest_mode` @mention bypass for allowlisted groups (PR #22759). - Telegram `notifications` mode (`important` vs `all`) (PR #22793). - `[[as_document]]` skill / response directive for forcing document-style media delivery (PR #21210). CLI / TUI - `/new [name]` argument (PR #19637). - `/subgoal` user-supplied criteria appended to `/goal` (PR #25449). - `/exit --delete` flag confirmation prompts for destructive slash commands (PR #22687). - Status-bar additions: ▶ N background indicator (PR #27175), context compression count (PR #21218), YOLO mode banner+statusbar warning (PR #26238). - `display.timestamps` + `docker_extra_args` config keys (PR #23599). - TUI collapsible startup banner sections (PR #20625). - `HERMES_SESSION_ID` exported to tool subprocesses (PR #23847). i18n - Refresh display.language locale list from 8 → 16 (en, zh, zh-hant, ja, de, es, fr, tr, uk, af, ko, it, ga, pt, ru, hu) — matches `agent/i18n.py:SUPPORTED_LANGUAGES`. Tools / features - `vision_analyze` native-pixel passthrough for vision-capable callers, with auxiliary text-describer fallback (PR #22955). - `session_search` rewrite to the single-shape tool (discovery / scroll / browse modes) (PRs #27590 / #27840). - Clarify MCP transport scope: client supports stdio + SSE; embedded `hermes mcp serve` is stdio-only (PR #21227). - Web search backends table: add Brave Search (free tier) and DDGS rows (PR #21337). - ACP session-scoped edit auto-approval modes (PR #27862). - Curator rename map in the user-visible per-run summary (PR #22910). - Prompt caching feature page reference in features/overview.md — Claude cross-session 1-hour prefix cache on native Anthropic / OpenRouter / Nous Portal (PR #23828). - Cron per-job profile parameter (PR #28124). - `--no-skills` flag for `hermes profile create` (PR #20986). Build - Verified with `npm run build` in `website/`; both `en` and `zh-Hans` locales compile. Remaining broken-link/anchor warnings are pre-existing (`rl-training.md` from learning-path / overview; the zh-Hans translation lag the docs skill already calls out).
2026-05-18 23:55:25 -07:00
| `vision_analyze` | Analyze images using AI vision. On vision-capable main models, returns the raw image pixels as a multimodal tool result so the model sees them natively on its next turn. On text-only main models, falls back to an auxiliary vision model that describes the image and returns the description as text. Tool signature is identical either way. | — |
docs: deep audit — fix stale config keys, missing commands, and registry drift (#22784) * docs: deep audit — fix stale config keys, missing commands, and registry drift Cross-checked ~80 high-impact docs pages (getting-started, reference, top-level user-guide, user-guide/features) against the live registries: hermes_cli/commands.py COMMAND_REGISTRY (slash commands) hermes_cli/auth.py PROVIDER_REGISTRY (providers) hermes_cli/config.py DEFAULT_CONFIG (config keys) toolsets.py TOOLSETS (toolsets) tools/registry.py get_all_tool_names() (tools) python -m hermes_cli.main <subcmd> --help (CLI args) reference/ - cli-commands.md: drop duplicate hermes fallback row + duplicate section, add stepfun/lmstudio to --provider enum, expand auth/mcp/curator subcommand lists to match --help output (status/logout/spotify, login, archive/prune/ list-archived). - slash-commands.md: add missing /sessions and /reload-skills entries + correct the cross-platform Notes line. - tools-reference.md: drop bogus '68 tools' headline, drop fictional 'browser-cdp toolset' (these tools live in 'browser' and are runtime-gated), add missing 'kanban' and 'video' toolset sections, fix MCP example to use the real mcp_<server>_<tool> prefix. - toolsets-reference.md: list browser_cdp/browser_dialog inside the 'browser' row, add missing 'kanban' and 'video' toolset rows, drop the stale '38 tools' count for hermes-cli. - profile-commands.md: add missing install/update/info subcommands, document fish completion. - environment-variables.md: dedupe GMI_API_KEY/GMI_BASE_URL rows (kept the one with the correct gmi-serving.com default). - faq.md: Anthropic/Google/OpenAI examples — direct providers exist (not just via OpenRouter), refresh the OpenAI model list. getting-started/ - installation.md: PortableGit (not MinGit) is what the Windows installer fetches; document the 32-bit MinGit fallback. - installation.md / termux.md: installer prefers .[termux-all] then falls back to .[termux]. - nix-setup.md: Python 3.12 (not 3.11), Node.js 22 (not 20); fix invalid 'nix flake update --flake' invocation. - updating.md: 'hermes backup restore --state pre-update' doesn't exist — point at the snapshot/quick-snapshot flow; correct config key 'updates.pre_update_backup' (was 'update.backup'). user-guide/ - configuration.md: api_max_retries default 3 (not 2); display.runtime_footer is the real key (not display.runtime_metadata_footer); checkpoints defaults enabled=false / max_snapshots=20 (not true / 50). - configuring-models.md: 'hermes model list' / 'hermes model set ...' don't exist — hermes model is interactive only. - tui.md: busy_indicator -> tui_status_indicator with values kaomoji|emoji|unicode|ascii (not kawaii|minimal|dots|wings|none). - security.md: SSH backend keys (TERMINAL_SSH_HOST/USER/KEY) live in .env, not config.yaml. - windows-wsl-quickstart.md: there is no 'hermes api' subcommand — the OpenAI-compatible API server runs inside hermes gateway. user-guide/features/ - computer-use.md: approvals.mode (not security.approval_level); fix broken ./browser-use.md link to ./browser.md. - fallback-providers.md: top-level fallback_providers (not model.fallback_providers); the picker is subcommand-based, not modal. - api-server.md: API_SERVER_* are env vars — write to per-profile .env, not 'hermes config set' which targets YAML. - web-search.md: drop web_crawl as a registered tool (it isn't); deep-crawl modes are exposed through web_extract. - kanban.md: failure_limit default is 2, not '~5'. - plugins.md: drop hard-coded '33 providers' count. - honcho.md: fix unclosed quote in echo HONCHO_API_KEY snippet; document that 'hermes honcho' subcommand is gated on memory.provider=honcho; reconcile subcommand list with actual --help output. - memory-providers.md: legacy 'hermes honcho setup' redirect documented. Verified via 'npm run build' — site builds cleanly; broken-link count went from 149 to 146 (no regressions, fixed a few in passing). * docs: round 2 audit fixes + regenerate skill catalogs Follow-up to the previous commit on this branch: Round 2 manual fixes: - quickstart.md: KIMI_CODING_API_KEY mentioned alongside KIMI_API_KEY; voice-mode and ACP install commands rewritten — bare 'pip install ...' doesn't work for curl-installed setups (no pip on PATH, not in repo dir); replaced with 'cd ~/.hermes/hermes-agent && uv pip install -e ".[voice]"'. ACP already ships in [all] so the curl install includes it. - cli.md / configuration.md: 'auxiliary.compression.model' shown as 'google/gemini-3-flash-preview' (the doc's own claimed default); actual default is empty (= use main model). Reworded as 'leave empty (default) or pin a cheap model'. - built-in-plugins.md: added the bundled 'kanban/dashboard' plugin row that was missing from the table. Regenerated skill catalogs: - ran website/scripts/generate-skill-docs.py to refresh all 163 per-skill pages and both reference catalogs (skills-catalog.md, optional-skills-catalog.md). This adds the entries that were genuinely missing — productivity/teams-meeting-pipeline (bundled), optional/finance/* (entire category — 7 skills: 3-statement-model, comps-analysis, dcf-model, excel-author, lbo-model, merger-model, pptx-author), creative/hyperframes, creative/kanban-video-orchestrator, devops/watchers, productivity/shop-app, research/searxng-search, apple/macos-computer-use — and rewrites every other per-skill page from the current SKILL.md. Most diffs are tiny (one line of refreshed metadata). Validation: - 'npm run build' succeeded. - Broken-link count moved 146 -> 155 — the +9 are zh-Hans translation shells that lag every newly-added skill page (pre-existing pattern). No regressions on any en/ page.
2026-05-09 13:19:51 -07:00
## `video` toolset
Opt-in toolset (not loaded in the default `hermes-cli` set). Add via `--toolsets video` or include `video` in your `toolsets:` config.
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `video_analyze` | Analyze video content from a URL or file path — captions, scene breakdowns, key timestamps, and visual descriptions. | — |
## `video_gen` toolset
Opt-in toolset (not loaded in the default `hermes-cli` set). Add via `--toolsets video_gen` or enable it in `hermes tools` → Video Generation, which also walks you through picking a backend.
Backends ship as plugins under `plugins/video_gen/<name>/`:
- **xAI Grok-Imagine** — text-to-video and image-to-video (SuperGrok OAuth or `XAI_API_KEY`).
- **FAL.ai** — Veo 3.1, Pixverse v6, Kling O3 (requires `FAL_KEY`).
The single `video_generate` tool covers both modalities — pass `image_url` to animate a still, omit it to generate from text alone. The active backend auto-routes to the right endpoint. The tool's description is rebuilt at session start to reflect the active backend's actual capabilities (modalities, aspect ratios, resolutions, duration range, max reference images, audio support). See [Video Generation Provider Plugins](/developer-guide/video-gen-provider-plugin) for backend authoring.
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `video_generate` | Generate a video from a text prompt (text-to-video) or animate a still image (image-to-video) using the user's configured video generation backend. Pass `image_url` to animate that image; omit it to generate from text alone. The backend auto-routes to the right endpoint. Returns either an HTTP URL or an absolute file path in the `video` field. | Active `video_gen` plugin + its credential (e.g. `XAI_API_KEY`, `FAL_KEY`) |
## `web` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `web_search` | Search the web for information. Returns up to 5 results by default with titles, URLs, and descriptions. Accepts an optional `limit` (1-100, default 5). The query is passed through to the configured backend, so operators such as `site:domain`, `filetype:pdf`, `intitle:word`, `-term`, and `"exact phrase"` may work when the backend supports them. | EXA_API_KEY or PARALLEL_API_KEY or FIRECRAWL_API_KEY or TAVILY_API_KEY |
docs: comprehensive documentation audit — fix 9 HIGH, 20+ MEDIUM gaps (#4087) Reference docs fixes: - cli-commands.md: remove non-existent --provider alibaba, add hermes profile/completion/plugins/mcp to top-level table, add --profile/-p global flag, add --source chat option - slash-commands.md: add /yolo and /commands, fix /q alias conflict (resolves to /queue not /quit), add missing aliases (/bg, /set-home, /reload_mcp, /gateway) - toolsets-reference.md: fix hermes-api-server (not same as hermes-cli, omits clarify/send_message/text_to_speech) - profile-commands.md: fix show name required not optional, --clone-from not --from, add --remove/--name to alias, fix alias path, fix export/ import arg types, remove non-existent fish completion - tools-reference.md: add EXA_API_KEY to web tools requires_env - mcp-config-reference.md: add auth key for OAuth, tool name sanitization - environment-variables.md: add EXA_API_KEY, update provider values - plugins.md: remove non-existent ctx.register_command(), add ctx.inject_message() Feature docs additions: - security.md: add /yolo mode, approval modes (manual/smart/off), configurable timeout, expanded dangerous patterns table - cron.md: add wrap_response config, [SILENT] suppression - mcp.md: add dynamic tool discovery, MCP sampling support - cli.md: add Ctrl+Z suspend, busy_input_mode, tool_preview_length - docker.md: add skills/credential file mounting Messaging platform docs: - telegram.md: add webhook mode, DoH fallback IPs - slack.md: add multi-workspace OAuth support - discord.md: add DISCORD_IGNORE_NO_MENTION - matrix.md: add MSC3245 native voice messages - feishu.md: expand from 129 to 365 lines (encrypt key, verification token, group policy, card actions, media, rate limiting, markdown, troubleshooting) - wecom.md: expand from 86 to 264 lines (per-group allowlists, media, AES decryption, stream replies, reconnection, troubleshooting) Configuration docs: - quickstart.md: add DeepSeek, Copilot, Copilot ACP providers - configuration.md: add DeepSeek provider, Exa web backend, terminal env_passthrough/images, browser.command_timeout, compression params, discord config, security/tirith config, timezone, auxiliary models 21 files changed, ~1000 lines added
2026-03-30 17:15:21 -07:00
| `web_extract` | Extract content from web page URLs. Returns page content in markdown format. Also works with PDF URLs — pass the PDF link directly and it converts to markdown text. Pages under 5000 chars return full markdown; larger pages are LLM-summarized. | EXA_API_KEY or PARALLEL_API_KEY or FIRECRAWL_API_KEY or TAVILY_API_KEY |
feat(x_search): gated X (Twitter) search tool with OAuth-or-API-key auth (#26763) * feat(x_search): gated X (Twitter) search tool with OAuth-or-API-key auth Salvages tools/x_search_tool.py from the closed PR #10786 (originally by @Jaaneek) and reworks its credential resolution so the tool registers when EITHER xAI credential path is available: * XAI_API_KEY (paid xAI API key) is set in ~/.hermes/.env or the env, OR * The user is signed in via xAI Grok OAuth — SuperGrok subscription — i.e. hermes auth add xai-oauth has been run Both paths route through xAI's built-in x_search Responses tool at https://api.x.ai/v1/responses. When both credentials exist OAuth wins, matching tools/xai_http.py's existing preference order (uses SuperGrok quota instead of paid API spend). The check_fn calls resolve_xai_http_credentials() which auto-refreshes the OAuth access token if it's within the refresh skew window, so a True return means the bearer is fetchable AND non-empty. Wiring - tools/x_search_tool.py — new tool, ~370 LOC. Schema gated by check_fn, bearer resolved per-call so revoked OAuth surfaces a clean tool_error rather than an HTTP 401. - toolsets.py — "x_search" toolset def. NOT added to _HERMES_CORE_TOOLS; users opt in via hermes tools. - hermes_cli/tools_config.py — CONFIGURABLE_TOOLSETS entry + TOOL_CATEGORIES block with two provider options (OAuth + API key) sharing the existing xai_grok post_setup hook for credential bootstrap. - hermes_cli/config.py — DEFAULT_CONFIG["x_search"] with model / timeout_seconds / retries. Additive nested key; no version bump. - tests/tools/test_x_search_tool.py — 13 tests covering HTTP shape, handle validation, citation extraction, 4xx/5xx/timeout handling, and the full credential-resolution matrix (OAuth-only, API-key-only, both-set, neither-set, resolver-raises, config overrides, registry registration). - website/docs/guides/xai-grok-oauth.md — adds X Search to the direct-to-xAI tools section with off-by-default note. - website/docs/user-guide/features/tools.md — new row in the tools table. Off by default — users enable via `hermes tools` → 🐦 X (Twitter) Search. Schema only appears to the model when xAI credentials are configured. Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com> * docs(x_search): add dedicated feature page + reference entries - website/docs/user-guide/features/x-search.md (new) — full feature walkthrough: authentication, enablement, configuration, parameters, returned fields, example, troubleshooting, see-also links. - website/docs/reference/tools-reference.md — new "x_search" toolset section with parameter docs and credential gating note. - website/docs/reference/toolsets-reference.md — new row in the toolset catalog table. - website/sidebars.ts — wires the new feature page under Media & Web, after web-search. --------- Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com>
2026-05-16 00:58:27 -07:00
## `x_search` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `x_search` | Search X (Twitter) posts, profiles, and threads using xAI's built-in `x_search` Responses tool. Use this for current discussion, reactions, or claims on X rather than general web pages. Off by default — opt in via `hermes tools` → 🐦 X (Twitter) Search. Schema is only registered when xAI credentials are configured (check_fn-gated). | XAI_API_KEY **or** xAI Grok OAuth (SuperGrok / Premium+) login |
feat(x_search): gated X (Twitter) search tool with OAuth-or-API-key auth (#26763) * feat(x_search): gated X (Twitter) search tool with OAuth-or-API-key auth Salvages tools/x_search_tool.py from the closed PR #10786 (originally by @Jaaneek) and reworks its credential resolution so the tool registers when EITHER xAI credential path is available: * XAI_API_KEY (paid xAI API key) is set in ~/.hermes/.env or the env, OR * The user is signed in via xAI Grok OAuth — SuperGrok subscription — i.e. hermes auth add xai-oauth has been run Both paths route through xAI's built-in x_search Responses tool at https://api.x.ai/v1/responses. When both credentials exist OAuth wins, matching tools/xai_http.py's existing preference order (uses SuperGrok quota instead of paid API spend). The check_fn calls resolve_xai_http_credentials() which auto-refreshes the OAuth access token if it's within the refresh skew window, so a True return means the bearer is fetchable AND non-empty. Wiring - tools/x_search_tool.py — new tool, ~370 LOC. Schema gated by check_fn, bearer resolved per-call so revoked OAuth surfaces a clean tool_error rather than an HTTP 401. - toolsets.py — "x_search" toolset def. NOT added to _HERMES_CORE_TOOLS; users opt in via hermes tools. - hermes_cli/tools_config.py — CONFIGURABLE_TOOLSETS entry + TOOL_CATEGORIES block with two provider options (OAuth + API key) sharing the existing xai_grok post_setup hook for credential bootstrap. - hermes_cli/config.py — DEFAULT_CONFIG["x_search"] with model / timeout_seconds / retries. Additive nested key; no version bump. - tests/tools/test_x_search_tool.py — 13 tests covering HTTP shape, handle validation, citation extraction, 4xx/5xx/timeout handling, and the full credential-resolution matrix (OAuth-only, API-key-only, both-set, neither-set, resolver-raises, config overrides, registry registration). - website/docs/guides/xai-grok-oauth.md — adds X Search to the direct-to-xAI tools section with off-by-default note. - website/docs/user-guide/features/tools.md — new row in the tools table. Off by default — users enable via `hermes tools` → 🐦 X (Twitter) Search. Schema only appears to the model when xAI credentials are configured. Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com> * docs(x_search): add dedicated feature page + reference entries - website/docs/user-guide/features/x-search.md (new) — full feature walkthrough: authentication, enablement, configuration, parameters, returned fields, example, troubleshooting, see-also links. - website/docs/reference/tools-reference.md — new "x_search" toolset section with parameter docs and credential gating note. - website/docs/reference/toolsets-reference.md — new row in the toolset catalog table. - website/sidebars.ts — wires the new feature page under Media & Web, after web-search. --------- Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com>
2026-05-16 00:58:27 -07:00
## `tts` toolset
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `text_to_speech` | Convert text to speech audio. Returns a MEDIA: path that the platform delivers as a voice message. On Telegram it plays as a voice bubble, on Discord/WhatsApp as an audio attachment. In CLI mode, saves to ~/voice-memos/. Voice and provider… | — |
docs: resync reference, user-guide, developer-guide, and messaging pages against code (#17738) Broad drift audit against origin/main (b52b63396). Reference pages (most user-visible drift): - slash-commands: add /busy, /curator, /footer, /indicator, /redraw, /steer that were missing; drop non-existent /terminal-setup; fix /q footnote (resolves to /queue, not /quit); extend CLI-only list with all 24 CLI-only commands in the registry - cli-commands: add dedicated sections for hermes curator / fallback / hooks (new subcommands not previously documented); remove stale hermes honcho standalone section (the plugin registers dynamically via hermes memory); list curator/fallback/hooks in top-level table; fix completion to include fish - toolsets-reference: document the real 52-toolset count; split browser vs browser-cdp; add discord / discord_admin / spotify / yuanbao; correct hermes-cli tool count from 36 to 38; fix misleading claim that hermes-homeassistant adds tools (it's identical to hermes-cli) - tools-reference: bump tool count 55 -> 68; add 7 Spotify, 5 Yuanbao, 2 Discord toolsets; move browser_cdp/browser_dialog to their own browser-cdp toolset section - environment-variables: add 40+ user-facing HERMES_* vars that were undocumented (--yolo, --accept-hooks, --ignore-*, inference model override, agent/stream/checkpoint timeouts, OAuth trace, per-platform batch tuning for Telegram/Discord/Matrix/Feishu/WeCom, cron knobs, gateway restart/connect timeouts); dedupe the Cron Scheduler section; replace stale QQ_SANDBOX with QQ_PORTAL_HOST User-guide (top level): - cli.md: compression preserves last 20 turns, not 4 (protect_last_n: 20) - configuration.md: display.platforms is the canonical per-platform override key; tool_progress_overrides is deprecated and auto-migrated - profiles.md: model.default is the config key, not model.model - sessions.md: CLI/TUI session IDs use 6-char hex, gateway uses 8 - checkpoints-and-rollback.md: destructive-command list now matches _DESTRUCTIVE_PATTERNS (adds rmdir, cp, install, dd) - docker.md: the container runs as non-root hermes (UID 10000) via gosu; fix install command (uv pip); add missing --insecure on the dashboard compose example (required for non-loopback bind) - security.md: systemctl danger pattern also matches 'restart' - index.md: built-in tool count 47 -> 68 - integrations/index.md: 6 STT providers, 8 memory providers - integrations/providers.md: drop fictional dashscope/qwen aliases Features: - overview.md: 9 image models (not 8), 9 TTS providers (not 5), 8 memory providers (Supermemory was missing) - tool-gateway.md: 9 image models - tools.md: extend common-toolsets list with search / messaging / spotify / discord / debugging / safe - fallback-providers.md: add 6 real providers from PROVIDER_REGISTRY (lmstudio, kimi-coding-cn, stepfun, alibaba-coding-plan, tencent-tokenhub, azure-foundry) - plugins.md: Available Hooks table now includes on_session_finalize, on_session_reset, subagent_stop - built-in-plugins.md: add the 7 bundled plugins the page didn't mention (spotify, google_meet, three image_gen providers, two dashboard examples) - web-dashboard.md: add --insecure and --tui flags - cron.md: hermes cron create takes positional schedule/prompt, not flags Messaging: - telegram.md: TELEGRAM_WEBHOOK_SECRET is now REQUIRED when TELEGRAM_WEBHOOK_URL is set (gateway refuses to start without it per GHSA-3vpc-7q5r-276h). Biggest user-visible drift in the batch. - discord.md: HERMES_DISCORD_TEXT_BATCH_SPLIT_DELAY_SECONDS default is 2.0, not 0.1 - dingtalk.md: document DINGTALK_REQUIRE_MENTION / FREE_RESPONSE_CHATS / MENTION_PATTERNS / HOME_CHANNEL / ALLOW_ALL_USERS that the adapter supports - bluebubbles.md: drop fictional BLUEBUBBLES_SEND_READ_RECEIPTS env var; the setting lives in platforms.bluebubbles.extra only - qqbot.md: drop dead QQ_SANDBOX; add real QQ_PORTAL_HOST and QQ_GROUP_ALLOWED_USERS - wecom-callback.md: replace 'hermes gateway start' (service-only) with 'hermes gateway' for first-time setup Developer-guide: - architecture.md: refresh tool/toolset counts (61/52), terminal backend count (7), line counts for run_agent.py (~13.7k), cli.py (~11.5k), main.py (~10.4k), setup.py (~3.5k), gateway/run.py (~12.2k), mcp_tool.py (~3.1k); add yuanbao adapter, bump platform adapter count 18 -> 20 - agent-loop.md: run_agent.py line count 10.7k -> 13.7k - tools-runtime.md: add vercel_sandbox backend - adding-tools.md: remove stale 'Discovery import added to model_tools.py' checklist item (registry auto-discovery) - adding-platform-adapters.md: mark send_typing / get_chat_info as concrete base methods; only connect/disconnect/send are abstract - acp-internals.md: ACP sessions now persist to SessionDB (~/.hermes/state.db); acp.run_agent call uses use_unstable_protocol=True - cron-internals.md: gateway runs scheduler in a dedicated background thread via _start_cron_ticker, not on a maintenance cycle; locking is cross-process via fcntl.flock (Unix) / msvcrt.locking (Windows) - gateway-internals.md: gateway/run.py ~12k lines - provider-runtime.md: cron DOES support fallback (run_job reads fallback_providers from config) - session-storage.md: SCHEMA_VERSION = 11 (not 9); add migrations 10 and 11 (trigram FTS, inline-mode FTS5 re-index); add api_call_count column to Sessions DDL; document messages_fts_trigram and state_meta in the architecture tree - context-compression-and-caching.md: remove the obsolete 'context pressure warnings' section (warnings were removed for causing models to give up early) - context-engine-plugin.md: compress() signature now includes focus_topic param - extending-the-cli.md: _build_tui_layout_children signature now includes model_picker_widget; add to default layout Also fixed three pre-existing broken links/anchors the build warned about (docker.md -> api-server.md, yuanbao.md -> cron-jobs.md and tips#background-tasks, nix-setup.md -> #container-aware-cli). Regenerated per-skill pages via website/scripts/generate-skill-docs.py so catalog tables and sidebar are consistent with current SKILL.md frontmatter. docusaurus build: clean, no broken links or anchors.
2026-04-29 20:55:59 -07:00
## `discord` toolset
Registered on the `hermes-discord` platform toolset (gateway only). Uses the same bot token as the messaging adapter.
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `discord` | Read and participate in a Discord server. Actions include `search_members`, `fetch_messages`, `send_message`, `react`, `fetch_channel`, `list_channels`, and more. | `DISCORD_BOT_TOKEN` |
## `discord_admin` toolset
Registered on the `hermes-discord` platform toolset. Moderation actions require the bot to hold the matching Discord permissions.
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `discord_admin` | Manage a Discord server via the REST API: list guilds/channels/roles, create/edit/delete channels, manage role grants, timeouts, kicks, and bans. | `DISCORD_BOT_TOKEN` + bot permissions |
## `spotify` toolset
Registered by the bundled `spotify` plugin. Requires an OAuth token — run `hermes spotify setup` once to authorize.
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `spotify_playback` | Control Spotify playback, inspect the active playback state, or fetch recently played tracks. | Spotify OAuth |
| `spotify_devices` | List Spotify Connect devices or transfer playback to a different device. | Spotify OAuth |
| `spotify_queue` | Inspect the user's Spotify queue or add an item to it. | Spotify OAuth |
| `spotify_search` | Search the Spotify catalog for tracks, albums, artists, playlists, shows, or episodes. | Spotify OAuth |
| `spotify_playlists` | List, inspect, create, update, and modify Spotify playlists. | Spotify OAuth |
| `spotify_albums` | Fetch Spotify album metadata or album tracks. | Spotify OAuth |
| `spotify_library` | List, save, or remove the user's saved Spotify tracks or albums. | Spotify OAuth |
## `hermes-yuanbao` toolset
Registered only on the `hermes-yuanbao` platform toolset. Yuanbao is Tencent's chat app; these tools drive its DM/group/sticker APIs.
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `yb_query_group_info` | Query basic info about a group (called "派/Pai" in the app): name, owner, member count. | Yuanbao credentials |
| `yb_query_group_members` | Query members of a group (for `@`-mentions, finding a user by name, listing bots). | Yuanbao credentials |
| `yb_send_dm` | Send a private/direct message to a user in a group, with optional media files. | Yuanbao credentials |
| `yb_search_sticker` | Search the built-in Yuanbao sticker (TIM face) catalogue by keyword. | Yuanbao credentials |
| `yb_send_sticker` | Send a built-in sticker to the current Yuanbao chat. | Yuanbao credentials |