hermes-bsd/tools
Siddharth Balyan c9a3f36f56
feat: add video_analyze tool for native video understanding (#19301)
* feat: add video_analyze tool for native video understanding

Adds a video_analyze tool that sends video files to multimodal LLMs
(e.g. Gemini) for analysis via the OpenRouter-compatible video_url
content type. Mirrors vision_analyze in structure, error handling,
and registration pattern.

Key design:
- Base64-encodes the entire video (no frame extraction, no ffmpeg dep)
- Uses 'video_url' content block type (OpenRouter standard)
- Supports mp4, webm, mov, avi, mkv, mpeg formats
- 50 MB hard cap, 20 MB warning threshold
- 180s minimum timeout (videos take longer than images)
- AUXILIARY_VIDEO_MODEL env override, falls back to AUXILIARY_VISION_MODEL
- Same SSRF protection, retry logic, and cleanup as vision_analyze
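
A minimal sketch of how the design points above could fit together. It is not
the code in vision_tools.py: every function and constant name here is
hypothetical, the extension-to-MIME mapping is an assumption, and the
'video_url' block shape is assumed to mirror the familiar 'image_url' block.
Only the limits, the formats, and the two environment variable names come
from the commit message.

```python
import base64
import os
from pathlib import Path

# Limits quoted in the commit message; the constant names are hypothetical.
MAX_VIDEO_BYTES = 50 * 1024 * 1024   # hard cap
WARN_VIDEO_BYTES = 20 * 1024 * 1024  # warning threshold
MIN_TIMEOUT_SECONDS = 180.0

# Formats listed in the commit; the MIME types are an assumed mapping.
VIDEO_MIME_TYPES = {
    ".mp4": "video/mp4",
    ".webm": "video/webm",
    ".mov": "video/quicktime",
    ".avi": "video/x-msvideo",
    ".mkv": "video/x-matroska",
    ".mpeg": "video/mpeg",
}


def resolve_video_model(env=None):
    """Prefer AUXILIARY_VIDEO_MODEL, then fall back to AUXILIARY_VISION_MODEL."""
    env = os.environ if env is None else env
    return env.get("AUXILIARY_VIDEO_MODEL") or env.get("AUXILIARY_VISION_MODEL")


def effective_timeout(requested):
    """Never go below the 180s floor, since videos take longer than images."""
    return max(float(requested), MIN_TIMEOUT_SECONDS)


def build_video_content(path, prompt):
    """Return (content_blocks, warnings) for one chat message."""
    p = Path(path)
    mime = VIDEO_MIME_TYPES.get(p.suffix.lower())
    if mime is None:
        raise ValueError(f"unsupported video format: {p.suffix!r}")

    size = p.stat().st_size
    if size > MAX_VIDEO_BYTES:
        raise ValueError(f"video is {size} bytes; the hard cap is {MAX_VIDEO_BYTES}")

    warnings = []
    if size > WARN_VIDEO_BYTES:
        warnings.append(f"video is {size} bytes; the upload may be slow")

    # Whole-file base64: no frame extraction and no ffmpeg dependency.
    encoded = base64.b64encode(p.read_bytes()).decode("ascii")
    content = [
        {"type": "text", "text": prompt},
        # Assumed shape, modeled on the 'image_url' content block.
        {"type": "video_url", "video_url": {"url": f"data:{mime};base64,{encoded}"}},
    ]
    return content, warnings
```

Checking the extension and size before reading the file keeps the failure
cheap: an oversized or unsupported file is rejected without being loaded
into memory.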

Disabled by default: registered in the 'video' toolset (not in _HERMES_CORE_TOOLS).
Users opt in via 'hermes tools enable video' or enabled_toolsets=['video'].

* feat(video): add models.dev capability pre-check + CONFIGURABLE_TOOLSETS entry

- Pre-checks model video capability via models.dev modalities.input
  before expensive base64 encoding. Fails early with helpful message
  suggesting video-capable alternatives (gemini, mimo-v2.5-pro).
- Passes optimistically if model unknown or lookup fails.
- Adds ModelInfo.supports_video_input() helper.
- Adds 'video' to CONFIGURABLE_TOOLSETS and _DEFAULT_OFF_TOOLSETS
  so 'hermes tools enable video' works from CLI.
- 8 new tests for the capability check (37 total).

* refactor(video): remove models.dev capability pre-check

Removes _check_video_model_capability and ModelInfo.supports_video_input.
The vision_analyze tool doesn't pre-check image capability either — both
tools rely on the same pattern: send request, handle API errors gracefully
with categorized user-facing messages. The pre-check was inconsistent
(only worked for some providers/models) so drop it for parity.
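
The "send request, handle API errors" pattern described above might look
roughly like the sketch below. The ApiError class, the category names, the
status-code choices, and the message wording are all hypothetical; the commit
only says that errors are categorized into user-facing messages.

```python
class ApiError(Exception):
    """Hypothetical stand-in for a provider client error with an HTTP status."""

    def __init__(self, message, status_code=None):
        super().__init__(message)
        self.status_code = status_code


def categorize_error(exc):
    """Map an exception to a (category, user-facing message) pair."""
    status = getattr(exc, "status_code", None)
    text = str(exc).lower()

    if isinstance(exc, TimeoutError) or "timed out" in text:
        return "timeout", "The model took too long to respond; try a shorter video."
    if status in (401, 403):
        return "auth", "The provider rejected the credentials."
    if status == 413 or "too large" in text:
        return "too_large", "The provider rejected the video as too large."
    if status == 429:
        return "rate_limit", "The provider is rate limiting requests; retry later."
    if status in (400, 404, 415) and any(
        hint in text for hint in ("video", "modalit", "unsupported")
    ):
        # No pre-check: an incapable model is discovered from the API response.
        return "unsupported_model", "The configured model does not accept video input."
    return "unknown", f"Video analysis failed: {exc}"
```

Because the capability check happens after the fact, it behaves the same for
every provider, including ones a metadata lookup would not know about.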

* cleanup: compress comments, fix fragile timeout coupling

- Replace _VISION_DOWNLOAD_TIMEOUT * 2 with hardcoded 60s (no silent
  breakage if vision timeout changes independently)
- Strip verbose comments and redundant log lines throughout
- No behavioral changes
2026-05-04 00:04:36 +05:30
browser_providers
environments fix(ci): stabilize current main test regressions 2026-04-30 06:36:50 -07:00
neutts_samples
__init__.py
ansi_strip.py
approval.py fix(approval): extend sensitive write target to cover shell RC and credential files 2026-05-03 08:49:13 -07:00
binary_extensions.py
browser_camofox.py refactor(config): add cfg_get() helper; migrate 20 nested-get call sites (#17304) 2026-04-28 23:17:39 -07:00
browser_camofox_state.py
browser_cdp_tool.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
browser_dialog_tool.py
browser_supervisor.py fix(browser_supervisor): verify thread and loop health before returning cached supervisor 2026-04-30 20:33:33 -07:00
browser_tool.py refactor(config): add cfg_get() helper; migrate 20 nested-get call sites (#17304) 2026-04-28 23:17:39 -07:00
budget_config.py
checkpoint_manager.py
clarify_tool.py
code_execution_tool.py fix(tools): serialize concurrent hermes_tools RPC calls from execute_code 2026-04-30 03:31:16 -07:00
credential_files.py refactor(config): add cfg_get() helper; migrate 20 nested-get call sites (#17304) 2026-04-28 23:17:39 -07:00
cronjob_tools.py fix(cron): accept list-form deliver values so deliver=['telegram'] works (#17456) 2026-04-29 06:35:34 -07:00
debug_helpers.py
delegate_tool.py fix(delegate): honor runtime default model during provider resolution 2026-04-30 19:58:55 -07:00
discord_tool.py fix(discord_tool): key capability cache by token instead of single global 2026-04-30 20:37:12 -07:00
env_passthrough.py refactor(config): add cfg_get() helper; migrate 20 nested-get call sites (#17304) 2026-04-28 23:17:39 -07:00
feishu_doc_tool.py
feishu_drive_tool.py
file_operations.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
file_state.py
file_tools.py fix(tools): write_file handler now rejects missing 'content'/'path' args instead of silently writing zero-byte files (#19096) 2026-05-03 08:52:41 -07:00
fuzzy_match.py
homeassistant_tool.py
image_generation_tool.py
interrupt.py
kanban_tools.py feat(kanban): durable multi-profile collaboration board (#17805) 2026-04-30 13:36:47 -07:00
managed_tool_gateway.py
mcp_oauth.py
mcp_oauth_manager.py
mcp_tool.py fix: clean up defensive shims and finish CI stabilization from #17660 (#17801) 2026-04-29 23:53:17 -07:00
memory_tool.py refactor: consolidate symlink-safe atomic replace into shared helper 2026-04-28 04:58:22 -07:00
mixture_of_agents_tool.py
neutts_synth.py
openrouter_client.py
osv_check.py
patch_parser.py
path_security.py
process_registry.py fix(process): reconcile session.exited against real child exit in poll/wait (#17430) 2026-04-29 04:59:21 -07:00
registry.py perf(tools): memoize get_tool_definitions + TTL-cache check_fn results (#17098) 2026-04-28 18:20:17 -07:00
rl_training_tool.py
schema_sanitizer.py refactor(schema): consolidate nullable-union stripping in schema_sanitizer 2026-04-28 04:58:03 -07:00
send_message_tool.py feat(gateway/signal): add support for multiple images sending 2026-04-30 04:28:08 -07:00
session_search_tool.py fix(session_search): order recent mode by last activity instead of start time 2026-04-30 20:17:15 -07:00
skill_manager_tool.py fix(curator): authoritative absorbed_into on delete + restore cron skill links on rollback (#18671) (#18731) 2026-05-02 01:29:57 -07:00
skill_usage.py fix: use skill activity in curator status 2026-04-30 10:31:47 -07:00
skills_guard.py
skills_hub.py
skills_sync.py refactor: consolidate symlink-safe atomic replace into shared helper 2026-04-28 04:58:22 -07:00
skills_tool.py fix(skills): also bump_use on skill_view tool invocation 2026-04-30 05:07:34 -07:00
slash_confirm.py feat(gateway,cli): confirm /reload-mcp to warn about prompt cache invalidation 2026-04-29 21:56:47 -07:00
terminal_tool.py fix(terminal): skip sudo prompt when local NOPASSWD sudo works 2026-04-30 20:38:09 -07:00
tirith_security.py
todo_tool.py
tool_backend_helpers.py
tool_output_limits.py
tool_result_storage.py
transcription_tools.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
tts_tool.py feat(tts): add Piper as a native local TTS provider (closes #8508) (#17885) 2026-04-30 02:53:20 -07:00
url_safety.py
vision_tools.py feat: add video_analyze tool for native video understanding (#19301) 2026-05-04 00:04:36 +05:30
voice_mode.py
web_tools.py perf(tools): memoize get_tool_definitions + TTL-cache check_fn results (#17098) 2026-04-28 18:20:17 -07:00
website_policy.py
xai_http.py
yuanbao_tools.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00