hermes-bsd/tools/image_generation_tool.py

1181 lines
43 KiB
Python
Raw Normal View History

2025-08-09 09:51:53 -07:00
#!/usr/bin/env python3
"""
Image Generation Tools Module
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
Provides image generation via FAL.ai. Multiple FAL models are supported and
selectable via ``hermes tools`` Image Generation; the active model is
persisted to ``image_gen.model`` in ``config.yaml``.
Architecture:
- ``FAL_MODELS`` is a catalog of supported models with per-model metadata
(size-style family, defaults, ``supports`` whitelist, upscaler flag).
- ``_build_fal_payload()`` translates the agent's unified inputs (prompt +
aspect_ratio) into the model-specific payload and filters to the
``supports`` whitelist so models never receive rejected keys.
- Upscaling via FAL's Clarity Upscaler is gated per-model via the ``upscale``
flag on for FLUX 2 Pro (backward-compat), off for all faster/newer models
where upscaling would either hurt latency or add marginal quality.
Pricing shown in UI strings is as-of the initial commit; we accept drift and
update when it's noticed.
2025-08-09 09:51:53 -07:00
"""
import json
import logging
2025-08-09 09:51:53 -07:00
import os
import datetime
import threading
import uuid
refactor(image_gen): port FAL backend to plugins/image_gen/fal Mirrors the architecture established by the web (#25182), browser (#25214), and video_gen (#25126) plugin migrations: * `tools/fal_common.py` — stateless atoms shared by both FAL-backed plugins (image_gen + video_gen). Holds the lazy `fal_client` import helper, `_ManagedFalSyncClient`, `_normalize_fal_queue_url_format`, `_extract_http_status`. Stateful pieces (`fal_client` module global, `_managed_fal_client*` cache, `_submit_fal_request`, `_resolve_managed_fal_gateway`, `_get_managed_fal_client`) intentionally stay on `tools.image_generation_tool` so the existing `monkeypatch.setattr(image_tool, ...)` patch sites keep working unchanged. * `plugins/video_gen/fal/__init__.py` — drops its inline `_load_fal_client` duplicate; consumes `tools.fal_common.import_fal_client`. * `plugins/image_gen/fal/{plugin.yaml,__init__.py}` — new plugin. `FalImageGenProvider` is a thin registration adapter that resolves the legacy module via `import tools.image_generation_tool as _it` and calls `_it.image_generate_tool` + `_it._resolve_fal_model` at call time. The 18-model catalog, `_build_fal_payload`, managed- gateway selection, and Clarity Upscaler chaining all remain in `tools.image_generation_tool` as the single source of truth — the plugin is a registration adapter, not a parallel implementation. * `tools/image_generation_tool.py::_dispatch_to_plugin_provider` — drops the `configured == "fal"` skip. Setting `image_gen.provider: fal` now routes through the registry like any other provider; the plugin re-enters this module's pipeline so behavior is identical. Unset `image_gen.provider` still falls through to the in-tree pipeline (preserves no-config-with-FAL_KEY UX from #15696). * `hermes_cli/tools_config.py` — drops the hardcoded "FAL.ai" row from `TOOL_CATEGORIES["image_gen"]["providers"]` (now injected by `_plugin_image_gen_providers` like every other backend) and the `getattr(provider, "name") == "fal"` skip that protected against duplication with the hardcoded row. The "Nous Subscription" row stays as a setup-flow entry — same shape browser kept "Nous Subscription (Browser Use cloud)" after #25214. * `tests/plugins/image_gen/test_fal_provider.py` — 14 cases covering the ABC surface, call-time indirection (verifying `monkeypatch.setattr(image_tool, "image_generate_tool", ...)` takes effect through the plugin), response-shape stamping, exception handling, and registry wiring. * `tests/plugins/image_gen/check_parity_vs_main.py` — subprocess harness mirroring `tests/plugins/browser/check_parity_vs_main.py`. Pins one path to origin/main, one to the worktree; runs six scenarios (unset, explicit-fal-no-creds, explicit-fal-with-creds, explicit-fal-with-model, typo provider, managed-gateway-only) and diffs the reduced shape `{dispatch_kind, provider_name, model}` per scenario. The only acceptable diff is "legacy_fal → plugin (fal)" for explicit-FAL paths — every other delta is flagged as a regression. * `tests/hermes_cli/test_image_gen_picker.py::test_fal_surfaced_alongside_other_plugins` — flips the previous `test_fal_skipped_to_avoid_duplicate` to match the new shape (FAL is a plugin now, no dedup needed). Verified: 195/195 tests across `tests/{tools/test_image_generation*,tools/test_managed_media_gateways,plugins/image_gen,plugins/video_gen,hermes_cli/test_image_gen_picker}.py` pass on this branch with no test patches modified outside the picker test that asserted the old skip behaviour. Fixes #26241
2026-05-18 17:42:02 +05:30
from typing import Any, Dict, Optional
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
perf(image_gen): defer fal_client import to first generation request (#22859) `tools/image_generation_tool.py` did `import fal_client` at module top, which pulled the entire fal_client + httpx + rich stack on every process that ran `discover_builtin_tools()` — every `hermes` cold start, even ones that never touch image generation. Make the import lazy: replace the eager import with a placeholder (`fal_client: Any = None`) and add an idempotent `_load_fal_client()` that rebinds the module global on first use. Call it from the two runtime entry points (`_ManagedFalSyncClient.__init__` and `_submit_fal_request`) and from the SDK-presence check in `check_image_generation_requirements`. The loader short-circuits if the global is already truthy, which preserves the test pattern of monkeypatching `fal_client` to install a mock — the `monkeypatch.setattr(image_tool, "fal_client", ...)` calls in test_image_generation.py keep working unchanged. Measured impact (15-run min times, 9950X3D): tools.image_generation_tool alone: 77 → 20 ms (-74%) 36 → 20 MB (-44%) import cli (full): 734 → 720 ms (-2%) import model_tools: 372 → 366 ms (-2%) The microbench is dramatic but the full-CLI win is small — fal_client shares its httpx + rich dependencies with the rest of the agent, so on a real cold start most of the 16 MB / 64 ms is already paid by other imports. The win matters mostly for processes that touch this tool without otherwise loading httpx (rare) and for architectural consistency with the previous lazy-load PRs (#22681 google_chat, #22831 teams). Tests: 55/55 `tests/tools/test_image_generation.py` pass, including the cases that monkeypatch the module global to install a mock fal_client. End-to-end verification confirms `import model_tools` no longer pulls `fal_client` into `sys.modules`.
2026-05-09 17:45:09 -07:00
# fal_client is imported lazily — see _load_fal_client(). Pulling it
# eagerly added ~64 ms to every CLI cold start because
# discover_builtin_tools() imports this module unconditionally during
# the registry walk, even when image generation is never used.
#
# Tests that monkeypatch this attribute (e.g.
# ``monkeypatch.setattr(image_tool, "fal_client", fake_fal_client)``)
# still work: _load_fal_client() short-circuits when the attribute is
# anything truthy, so a test-installed mock is not overwritten by a
# subsequent real import.
fal_client: Any = None
def _load_fal_client() -> Any:
"""Lazily import fal_client and rebind the module global on first use.
Idempotent. Returns the (now-loaded) ``fal_client`` module reference.
Skips the import if the global is already truthy this preserves the
test pattern of monkeypatching the module global to install a mock.
"""
global fal_client
if fal_client is not None:
return fal_client
refactor(image_gen): port FAL backend to plugins/image_gen/fal Mirrors the architecture established by the web (#25182), browser (#25214), and video_gen (#25126) plugin migrations: * `tools/fal_common.py` — stateless atoms shared by both FAL-backed plugins (image_gen + video_gen). Holds the lazy `fal_client` import helper, `_ManagedFalSyncClient`, `_normalize_fal_queue_url_format`, `_extract_http_status`. Stateful pieces (`fal_client` module global, `_managed_fal_client*` cache, `_submit_fal_request`, `_resolve_managed_fal_gateway`, `_get_managed_fal_client`) intentionally stay on `tools.image_generation_tool` so the existing `monkeypatch.setattr(image_tool, ...)` patch sites keep working unchanged. * `plugins/video_gen/fal/__init__.py` — drops its inline `_load_fal_client` duplicate; consumes `tools.fal_common.import_fal_client`. * `plugins/image_gen/fal/{plugin.yaml,__init__.py}` — new plugin. `FalImageGenProvider` is a thin registration adapter that resolves the legacy module via `import tools.image_generation_tool as _it` and calls `_it.image_generate_tool` + `_it._resolve_fal_model` at call time. The 18-model catalog, `_build_fal_payload`, managed- gateway selection, and Clarity Upscaler chaining all remain in `tools.image_generation_tool` as the single source of truth — the plugin is a registration adapter, not a parallel implementation. * `tools/image_generation_tool.py::_dispatch_to_plugin_provider` — drops the `configured == "fal"` skip. Setting `image_gen.provider: fal` now routes through the registry like any other provider; the plugin re-enters this module's pipeline so behavior is identical. Unset `image_gen.provider` still falls through to the in-tree pipeline (preserves no-config-with-FAL_KEY UX from #15696). * `hermes_cli/tools_config.py` — drops the hardcoded "FAL.ai" row from `TOOL_CATEGORIES["image_gen"]["providers"]` (now injected by `_plugin_image_gen_providers` like every other backend) and the `getattr(provider, "name") == "fal"` skip that protected against duplication with the hardcoded row. The "Nous Subscription" row stays as a setup-flow entry — same shape browser kept "Nous Subscription (Browser Use cloud)" after #25214. * `tests/plugins/image_gen/test_fal_provider.py` — 14 cases covering the ABC surface, call-time indirection (verifying `monkeypatch.setattr(image_tool, "image_generate_tool", ...)` takes effect through the plugin), response-shape stamping, exception handling, and registry wiring. * `tests/plugins/image_gen/check_parity_vs_main.py` — subprocess harness mirroring `tests/plugins/browser/check_parity_vs_main.py`. Pins one path to origin/main, one to the worktree; runs six scenarios (unset, explicit-fal-no-creds, explicit-fal-with-creds, explicit-fal-with-model, typo provider, managed-gateway-only) and diffs the reduced shape `{dispatch_kind, provider_name, model}` per scenario. The only acceptable diff is "legacy_fal → plugin (fal)" for explicit-FAL paths — every other delta is flagged as a regression. * `tests/hermes_cli/test_image_gen_picker.py::test_fal_surfaced_alongside_other_plugins` — flips the previous `test_fal_skipped_to_avoid_duplicate` to match the new shape (FAL is a plugin now, no dedup needed). Verified: 195/195 tests across `tests/{tools/test_image_generation*,tools/test_managed_media_gateways,plugins/image_gen,plugins/video_gen,hermes_cli/test_image_gen_picker}.py` pass on this branch with no test patches modified outside the picker test that asserted the old skip behaviour. Fixes #26241
2026-05-18 17:42:02 +05:30
from tools.fal_common import import_fal_client
fal_client = import_fal_client()
perf(image_gen): defer fal_client import to first generation request (#22859) `tools/image_generation_tool.py` did `import fal_client` at module top, which pulled the entire fal_client + httpx + rich stack on every process that ran `discover_builtin_tools()` — every `hermes` cold start, even ones that never touch image generation. Make the import lazy: replace the eager import with a placeholder (`fal_client: Any = None`) and add an idempotent `_load_fal_client()` that rebinds the module global on first use. Call it from the two runtime entry points (`_ManagedFalSyncClient.__init__` and `_submit_fal_request`) and from the SDK-presence check in `check_image_generation_requirements`. The loader short-circuits if the global is already truthy, which preserves the test pattern of monkeypatching `fal_client` to install a mock — the `monkeypatch.setattr(image_tool, "fal_client", ...)` calls in test_image_generation.py keep working unchanged. Measured impact (15-run min times, 9950X3D): tools.image_generation_tool alone: 77 → 20 ms (-74%) 36 → 20 MB (-44%) import cli (full): 734 → 720 ms (-2%) import model_tools: 372 → 366 ms (-2%) The microbench is dramatic but the full-CLI win is small — fal_client shares its httpx + rich dependencies with the rest of the agent, so on a real cold start most of the 16 MB / 64 ms is already paid by other imports. The win matters mostly for processes that touch this tool without otherwise loading httpx (rare) and for architectural consistency with the previous lazy-load PRs (#22681 google_chat, #22831 teams). Tests: 55/55 `tests/tools/test_image_generation.py` pass, including the cases that monkeypatch the module global to install a mock fal_client. End-to-end verification confirms `import model_tools` no longer pulls `fal_client` into `sys.modules`.
2026-05-09 17:45:09 -07:00
return fal_client
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
from tools.debug_helpers import DebugSession
refactor(image_gen): port FAL backend to plugins/image_gen/fal Mirrors the architecture established by the web (#25182), browser (#25214), and video_gen (#25126) plugin migrations: * `tools/fal_common.py` — stateless atoms shared by both FAL-backed plugins (image_gen + video_gen). Holds the lazy `fal_client` import helper, `_ManagedFalSyncClient`, `_normalize_fal_queue_url_format`, `_extract_http_status`. Stateful pieces (`fal_client` module global, `_managed_fal_client*` cache, `_submit_fal_request`, `_resolve_managed_fal_gateway`, `_get_managed_fal_client`) intentionally stay on `tools.image_generation_tool` so the existing `monkeypatch.setattr(image_tool, ...)` patch sites keep working unchanged. * `plugins/video_gen/fal/__init__.py` — drops its inline `_load_fal_client` duplicate; consumes `tools.fal_common.import_fal_client`. * `plugins/image_gen/fal/{plugin.yaml,__init__.py}` — new plugin. `FalImageGenProvider` is a thin registration adapter that resolves the legacy module via `import tools.image_generation_tool as _it` and calls `_it.image_generate_tool` + `_it._resolve_fal_model` at call time. The 18-model catalog, `_build_fal_payload`, managed- gateway selection, and Clarity Upscaler chaining all remain in `tools.image_generation_tool` as the single source of truth — the plugin is a registration adapter, not a parallel implementation. * `tools/image_generation_tool.py::_dispatch_to_plugin_provider` — drops the `configured == "fal"` skip. Setting `image_gen.provider: fal` now routes through the registry like any other provider; the plugin re-enters this module's pipeline so behavior is identical. Unset `image_gen.provider` still falls through to the in-tree pipeline (preserves no-config-with-FAL_KEY UX from #15696). * `hermes_cli/tools_config.py` — drops the hardcoded "FAL.ai" row from `TOOL_CATEGORIES["image_gen"]["providers"]` (now injected by `_plugin_image_gen_providers` like every other backend) and the `getattr(provider, "name") == "fal"` skip that protected against duplication with the hardcoded row. The "Nous Subscription" row stays as a setup-flow entry — same shape browser kept "Nous Subscription (Browser Use cloud)" after #25214. * `tests/plugins/image_gen/test_fal_provider.py` — 14 cases covering the ABC surface, call-time indirection (verifying `monkeypatch.setattr(image_tool, "image_generate_tool", ...)` takes effect through the plugin), response-shape stamping, exception handling, and registry wiring. * `tests/plugins/image_gen/check_parity_vs_main.py` — subprocess harness mirroring `tests/plugins/browser/check_parity_vs_main.py`. Pins one path to origin/main, one to the worktree; runs six scenarios (unset, explicit-fal-no-creds, explicit-fal-with-creds, explicit-fal-with-model, typo provider, managed-gateway-only) and diffs the reduced shape `{dispatch_kind, provider_name, model}` per scenario. The only acceptable diff is "legacy_fal → plugin (fal)" for explicit-FAL paths — every other delta is flagged as a regression. * `tests/hermes_cli/test_image_gen_picker.py::test_fal_surfaced_alongside_other_plugins` — flips the previous `test_fal_skipped_to_avoid_duplicate` to match the new shape (FAL is a plugin now, no dedup needed). Verified: 195/195 tests across `tests/{tools/test_image_generation*,tools/test_managed_media_gateways,plugins/image_gen,plugins/video_gen,hermes_cli/test_image_gen_picker}.py` pass on this branch with no test patches modified outside the picker test that asserted the old skip behaviour. Fixes #26241
2026-05-18 17:42:02 +05:30
from tools.fal_common import (
_ManagedFalSyncClient,
_extract_http_status,
_normalize_fal_queue_url_format, # noqa: F401 — re-exported for tests
)
from tools.managed_tool_gateway import resolve_managed_tool_gateway
from tools.tool_backend_helpers import (
fal_key_is_configured,
managed_nous_tools_enabled,
nous_tool_gateway_unavailable_message,
prefers_gateway,
)
2025-08-09 09:51:53 -07:00
logger = logging.getLogger(__name__)
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
# ---------------------------------------------------------------------------
# FAL model catalog
# ---------------------------------------------------------------------------
#
# Each entry declares how to translate our unified inputs into the model's
# native payload shape. Size specification falls into three families:
#
# "image_size_preset" — preset enum ("square_hd", "landscape_16_9", ...)
# used by the flux family, z-image, qwen, recraft,
# ideogram.
# "aspect_ratio" — aspect ratio enum ("16:9", "1:1", ...) used by
# nano-banana (Gemini).
# "gpt_literal" — literal dimension strings ("1024x1024", etc.)
# used by gpt-image-1.5.
#
# ``supports`` is a whitelist of keys allowed in the outgoing payload — any
# key outside this set is stripped before submission so models never receive
# rejected parameters (each FAL model rejects unknown keys differently).
#
# ``upscale`` controls whether to chain Clarity Upscaler after generation.
FAL_MODELS: Dict[str, Dict[str, Any]] = {
"fal-ai/flux-2/klein/9b": {
"display": "FLUX 2 Klein 9B",
"speed": "<1s",
"strengths": "Fast, crisp text",
"price": "$0.006/MP",
"size_style": "image_size_preset",
"sizes": {
"landscape": "landscape_16_9",
"square": "square_hd",
"portrait": "portrait_16_9",
},
"defaults": {
"num_inference_steps": 4,
"output_format": "png",
"enable_safety_checker": False,
},
"supports": {
"prompt", "image_size", "num_inference_steps", "seed",
"output_format", "enable_safety_checker",
},
"upscale": False,
},
"fal-ai/flux-2-pro": {
"display": "FLUX 2 Pro",
"speed": "~6s",
"strengths": "Studio photorealism",
"price": "$0.03/MP",
"size_style": "image_size_preset",
"sizes": {
"landscape": "landscape_16_9",
"square": "square_hd",
"portrait": "portrait_16_9",
},
"defaults": {
"num_inference_steps": 50,
"guidance_scale": 4.5,
"num_images": 1,
"output_format": "png",
"enable_safety_checker": False,
"safety_tolerance": "5",
"sync_mode": True,
},
"supports": {
"prompt", "image_size", "num_inference_steps", "guidance_scale",
"num_images", "output_format", "enable_safety_checker",
"safety_tolerance", "sync_mode", "seed",
},
"upscale": True, # Backward-compat: current default behavior.
},
"fal-ai/z-image/turbo": {
"display": "Z-Image Turbo",
"speed": "~2s",
"strengths": "Bilingual EN/CN, 6B",
"price": "$0.005/MP",
"size_style": "image_size_preset",
"sizes": {
"landscape": "landscape_16_9",
"square": "square_hd",
"portrait": "portrait_16_9",
},
"defaults": {
"num_inference_steps": 8,
"num_images": 1,
"output_format": "png",
"enable_safety_checker": False,
"enable_prompt_expansion": False, # avoid the extra per-request charge
},
"supports": {
"prompt", "image_size", "num_inference_steps", "num_images",
"seed", "output_format", "enable_safety_checker",
"enable_prompt_expansion",
},
"upscale": False,
},
feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro (#11406) * feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro Upstream asked for these two upgrades ASAP — the old entries show stale models when newer, higher-quality versions are available on FAL. Recraft V3 → Recraft V4 Pro ID: fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier) Schema: V4 dropped the required `style` enum entirely; defaults handle taste now. Added `colors` and `background_color` to supports for brand-palette control. `seed` is not supported by V4 per the API docs. Nano Banana → Nano Banana Pro ID: fal-ai/nano-banana → fal-ai/nano-banana-pro Price: $0.08/image → $0.15/image (1K); $0.30 at 4K Schema: Aspect ratio family unchanged. Added `resolution` (1K/2K/4K, default 1K for billing predictability), `enable_web_search` (real-time info grounding, +$0.015), and `limit_generations` (force exactly 1 image). Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality and reasoning depth improved; slower (~6s → ~8s). Migration: users who had the old IDs in `image_gen.model` will fall through the existing 'unknown model → default' warning path in `_resolve_fal_model()` and get the Klein 9B default on the next run. Re-run `hermes tools` → Image Generation to pick the new version. No silent cost-upgrade aliasing — the 2-6x price jump on these tiers warrants explicit user re-selection. Portal note: both new model IDs need to be allowlisted on the Nous fal-queue-gateway alongside the previous 7 additions, or users on Nous Subscription will see the 'managed gateway rejected model' error we added previously (which is clear and self-remediating, just noisy). * docs: wrap '<1s' in backticks to unblock MDX compilation Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and '<1s' fails because '1' isn't a valid tag-name start character. This was broken on main since PR #11265 (never noticed because docs-site-checks was failing on OTHER issues at the time and we admin-merged through it). Wrapping in backticks also gives the cell monospace styling which reads more cleanly alongside the inline-code model ID in the same row. The other '<1s' occurrence (line 52) is inside a fenced code block and is already safe — code fences bypass MDX parsing.
2026-04-16 22:05:41 -07:00
"fal-ai/nano-banana-pro": {
"display": "Nano Banana Pro (Gemini 3 Pro Image)",
"speed": "~8s",
"strengths": "Gemini 3 Pro, reasoning depth, text rendering",
"price": "$0.15/image (1K)",
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
"size_style": "aspect_ratio",
"sizes": {
"landscape": "16:9",
"square": "1:1",
"portrait": "9:16",
},
"defaults": {
"num_images": 1,
"output_format": "png",
"safety_tolerance": "5",
feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro (#11406) * feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro Upstream asked for these two upgrades ASAP — the old entries show stale models when newer, higher-quality versions are available on FAL. Recraft V3 → Recraft V4 Pro ID: fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier) Schema: V4 dropped the required `style` enum entirely; defaults handle taste now. Added `colors` and `background_color` to supports for brand-palette control. `seed` is not supported by V4 per the API docs. Nano Banana → Nano Banana Pro ID: fal-ai/nano-banana → fal-ai/nano-banana-pro Price: $0.08/image → $0.15/image (1K); $0.30 at 4K Schema: Aspect ratio family unchanged. Added `resolution` (1K/2K/4K, default 1K for billing predictability), `enable_web_search` (real-time info grounding, +$0.015), and `limit_generations` (force exactly 1 image). Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality and reasoning depth improved; slower (~6s → ~8s). Migration: users who had the old IDs in `image_gen.model` will fall through the existing 'unknown model → default' warning path in `_resolve_fal_model()` and get the Klein 9B default on the next run. Re-run `hermes tools` → Image Generation to pick the new version. No silent cost-upgrade aliasing — the 2-6x price jump on these tiers warrants explicit user re-selection. Portal note: both new model IDs need to be allowlisted on the Nous fal-queue-gateway alongside the previous 7 additions, or users on Nous Subscription will see the 'managed gateway rejected model' error we added previously (which is clear and self-remediating, just noisy). * docs: wrap '<1s' in backticks to unblock MDX compilation Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and '<1s' fails because '1' isn't a valid tag-name start character. This was broken on main since PR #11265 (never noticed because docs-site-checks was failing on OTHER issues at the time and we admin-merged through it). Wrapping in backticks also gives the cell monospace styling which reads more cleanly alongside the inline-code model ID in the same row. The other '<1s' occurrence (line 52) is inside a fenced code block and is already safe — code fences bypass MDX parsing.
2026-04-16 22:05:41 -07:00
# "1K" is the cheapest tier; 4K doubles the per-image cost.
# Users on Nous Subscription should stay at 1K for predictable billing.
"resolution": "1K",
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
},
"supports": {
"prompt", "aspect_ratio", "num_images", "output_format",
feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro (#11406) * feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro Upstream asked for these two upgrades ASAP — the old entries show stale models when newer, higher-quality versions are available on FAL. Recraft V3 → Recraft V4 Pro ID: fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier) Schema: V4 dropped the required `style` enum entirely; defaults handle taste now. Added `colors` and `background_color` to supports for brand-palette control. `seed` is not supported by V4 per the API docs. Nano Banana → Nano Banana Pro ID: fal-ai/nano-banana → fal-ai/nano-banana-pro Price: $0.08/image → $0.15/image (1K); $0.30 at 4K Schema: Aspect ratio family unchanged. Added `resolution` (1K/2K/4K, default 1K for billing predictability), `enable_web_search` (real-time info grounding, +$0.015), and `limit_generations` (force exactly 1 image). Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality and reasoning depth improved; slower (~6s → ~8s). Migration: users who had the old IDs in `image_gen.model` will fall through the existing 'unknown model → default' warning path in `_resolve_fal_model()` and get the Klein 9B default on the next run. Re-run `hermes tools` → Image Generation to pick the new version. No silent cost-upgrade aliasing — the 2-6x price jump on these tiers warrants explicit user re-selection. Portal note: both new model IDs need to be allowlisted on the Nous fal-queue-gateway alongside the previous 7 additions, or users on Nous Subscription will see the 'managed gateway rejected model' error we added previously (which is clear and self-remediating, just noisy). * docs: wrap '<1s' in backticks to unblock MDX compilation Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and '<1s' fails because '1' isn't a valid tag-name start character. This was broken on main since PR #11265 (never noticed because docs-site-checks was failing on OTHER issues at the time and we admin-merged through it). Wrapping in backticks also gives the cell monospace styling which reads more cleanly alongside the inline-code model ID in the same row. The other '<1s' occurrence (line 52) is inside a fenced code block and is already safe — code fences bypass MDX parsing.
2026-04-16 22:05:41 -07:00
"safety_tolerance", "seed", "sync_mode", "resolution",
"enable_web_search", "limit_generations",
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
},
"upscale": False,
},
"fal-ai/gpt-image-1.5": {
"display": "GPT Image 1.5",
"speed": "~15s",
"strengths": "Prompt adherence",
"price": "$0.034/image",
"size_style": "gpt_literal",
"sizes": {
"landscape": "1536x1024",
"square": "1024x1024",
"portrait": "1024x1536",
},
"defaults": {
# Quality is pinned to medium to keep portal billing predictable
# across all users (low is too rough, high is 4-6x more expensive).
"quality": "medium",
"num_images": 1,
"output_format": "png",
},
"supports": {
"prompt", "image_size", "quality", "num_images", "output_format",
"background", "sync_mode",
},
"upscale": False,
},
"fal-ai/gpt-image-2": {
"display": "GPT Image 2",
"speed": "~20s",
"strengths": "SOTA text rendering + CJK, world-aware photorealism",
"price": "$0.040.06/image",
# GPT Image 2 uses FAL's standard preset enum (unlike 1.5's literal
# dimensions). We map to the 4:3 variants — the 16:9 presets
# (1024x576) fall below GPT-Image-2's 655,360 min-pixel requirement
# and would be rejected. 4:3 keeps us above the minimum on all
# three aspect ratios.
"size_style": "image_size_preset",
"sizes": {
"landscape": "landscape_4_3", # 1024x768
"square": "square_hd", # 1024x1024
"portrait": "portrait_4_3", # 768x1024
},
"defaults": {
# Same quality pinning as gpt-image-1.5: medium keeps Nous
# Portal billing predictable. "high" is 3-4x the per-image
# cost at the same size; "low" is too rough for production use.
"quality": "medium",
"num_images": 1,
"output_format": "png",
},
"supports": {
"prompt", "image_size", "quality", "num_images", "output_format",
"sync_mode",
# openai_api_key (BYOK) intentionally omitted — all users go
# through the shared FAL billing path.
},
"upscale": False,
},
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
"fal-ai/ideogram/v3": {
"display": "Ideogram V3",
"speed": "~5s",
"strengths": "Best typography",
"price": "$0.03-0.09/image",
"size_style": "image_size_preset",
"sizes": {
"landscape": "landscape_16_9",
"square": "square_hd",
"portrait": "portrait_16_9",
},
"defaults": {
"rendering_speed": "BALANCED",
"expand_prompt": True,
"style": "AUTO",
},
"supports": {
"prompt", "image_size", "rendering_speed", "expand_prompt",
"style", "seed",
},
"upscale": False,
},
feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro (#11406) * feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro Upstream asked for these two upgrades ASAP — the old entries show stale models when newer, higher-quality versions are available on FAL. Recraft V3 → Recraft V4 Pro ID: fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier) Schema: V4 dropped the required `style` enum entirely; defaults handle taste now. Added `colors` and `background_color` to supports for brand-palette control. `seed` is not supported by V4 per the API docs. Nano Banana → Nano Banana Pro ID: fal-ai/nano-banana → fal-ai/nano-banana-pro Price: $0.08/image → $0.15/image (1K); $0.30 at 4K Schema: Aspect ratio family unchanged. Added `resolution` (1K/2K/4K, default 1K for billing predictability), `enable_web_search` (real-time info grounding, +$0.015), and `limit_generations` (force exactly 1 image). Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality and reasoning depth improved; slower (~6s → ~8s). Migration: users who had the old IDs in `image_gen.model` will fall through the existing 'unknown model → default' warning path in `_resolve_fal_model()` and get the Klein 9B default on the next run. Re-run `hermes tools` → Image Generation to pick the new version. No silent cost-upgrade aliasing — the 2-6x price jump on these tiers warrants explicit user re-selection. Portal note: both new model IDs need to be allowlisted on the Nous fal-queue-gateway alongside the previous 7 additions, or users on Nous Subscription will see the 'managed gateway rejected model' error we added previously (which is clear and self-remediating, just noisy). * docs: wrap '<1s' in backticks to unblock MDX compilation Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and '<1s' fails because '1' isn't a valid tag-name start character. This was broken on main since PR #11265 (never noticed because docs-site-checks was failing on OTHER issues at the time and we admin-merged through it). Wrapping in backticks also gives the cell monospace styling which reads more cleanly alongside the inline-code model ID in the same row. The other '<1s' occurrence (line 52) is inside a fenced code block and is already safe — code fences bypass MDX parsing.
2026-04-16 22:05:41 -07:00
"fal-ai/recraft/v4/pro/text-to-image": {
"display": "Recraft V4 Pro",
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
"speed": "~8s",
feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro (#11406) * feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro Upstream asked for these two upgrades ASAP — the old entries show stale models when newer, higher-quality versions are available on FAL. Recraft V3 → Recraft V4 Pro ID: fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier) Schema: V4 dropped the required `style` enum entirely; defaults handle taste now. Added `colors` and `background_color` to supports for brand-palette control. `seed` is not supported by V4 per the API docs. Nano Banana → Nano Banana Pro ID: fal-ai/nano-banana → fal-ai/nano-banana-pro Price: $0.08/image → $0.15/image (1K); $0.30 at 4K Schema: Aspect ratio family unchanged. Added `resolution` (1K/2K/4K, default 1K for billing predictability), `enable_web_search` (real-time info grounding, +$0.015), and `limit_generations` (force exactly 1 image). Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality and reasoning depth improved; slower (~6s → ~8s). Migration: users who had the old IDs in `image_gen.model` will fall through the existing 'unknown model → default' warning path in `_resolve_fal_model()` and get the Klein 9B default on the next run. Re-run `hermes tools` → Image Generation to pick the new version. No silent cost-upgrade aliasing — the 2-6x price jump on these tiers warrants explicit user re-selection. Portal note: both new model IDs need to be allowlisted on the Nous fal-queue-gateway alongside the previous 7 additions, or users on Nous Subscription will see the 'managed gateway rejected model' error we added previously (which is clear and self-remediating, just noisy). * docs: wrap '<1s' in backticks to unblock MDX compilation Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and '<1s' fails because '1' isn't a valid tag-name start character. This was broken on main since PR #11265 (never noticed because docs-site-checks was failing on OTHER issues at the time and we admin-merged through it). Wrapping in backticks also gives the cell monospace styling which reads more cleanly alongside the inline-code model ID in the same row. The other '<1s' occurrence (line 52) is inside a fenced code block and is already safe — code fences bypass MDX parsing.
2026-04-16 22:05:41 -07:00
"strengths": "Design, brand systems, production-ready",
"price": "$0.25/image",
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
"size_style": "image_size_preset",
"sizes": {
"landscape": "landscape_16_9",
"square": "square_hd",
"portrait": "portrait_16_9",
},
"defaults": {
feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro (#11406) * feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro Upstream asked for these two upgrades ASAP — the old entries show stale models when newer, higher-quality versions are available on FAL. Recraft V3 → Recraft V4 Pro ID: fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier) Schema: V4 dropped the required `style` enum entirely; defaults handle taste now. Added `colors` and `background_color` to supports for brand-palette control. `seed` is not supported by V4 per the API docs. Nano Banana → Nano Banana Pro ID: fal-ai/nano-banana → fal-ai/nano-banana-pro Price: $0.08/image → $0.15/image (1K); $0.30 at 4K Schema: Aspect ratio family unchanged. Added `resolution` (1K/2K/4K, default 1K for billing predictability), `enable_web_search` (real-time info grounding, +$0.015), and `limit_generations` (force exactly 1 image). Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality and reasoning depth improved; slower (~6s → ~8s). Migration: users who had the old IDs in `image_gen.model` will fall through the existing 'unknown model → default' warning path in `_resolve_fal_model()` and get the Klein 9B default on the next run. Re-run `hermes tools` → Image Generation to pick the new version. No silent cost-upgrade aliasing — the 2-6x price jump on these tiers warrants explicit user re-selection. Portal note: both new model IDs need to be allowlisted on the Nous fal-queue-gateway alongside the previous 7 additions, or users on Nous Subscription will see the 'managed gateway rejected model' error we added previously (which is clear and self-remediating, just noisy). * docs: wrap '<1s' in backticks to unblock MDX compilation Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and '<1s' fails because '1' isn't a valid tag-name start character. This was broken on main since PR #11265 (never noticed because docs-site-checks was failing on OTHER issues at the time and we admin-merged through it). Wrapping in backticks also gives the cell monospace styling which reads more cleanly alongside the inline-code model ID in the same row. The other '<1s' occurrence (line 52) is inside a fenced code block and is already safe — code fences bypass MDX parsing.
2026-04-16 22:05:41 -07:00
# V4 Pro dropped V3's required `style` enum — defaults handle taste now.
"enable_safety_checker": False,
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
},
"supports": {
feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro (#11406) * feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro Upstream asked for these two upgrades ASAP — the old entries show stale models when newer, higher-quality versions are available on FAL. Recraft V3 → Recraft V4 Pro ID: fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier) Schema: V4 dropped the required `style` enum entirely; defaults handle taste now. Added `colors` and `background_color` to supports for brand-palette control. `seed` is not supported by V4 per the API docs. Nano Banana → Nano Banana Pro ID: fal-ai/nano-banana → fal-ai/nano-banana-pro Price: $0.08/image → $0.15/image (1K); $0.30 at 4K Schema: Aspect ratio family unchanged. Added `resolution` (1K/2K/4K, default 1K for billing predictability), `enable_web_search` (real-time info grounding, +$0.015), and `limit_generations` (force exactly 1 image). Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality and reasoning depth improved; slower (~6s → ~8s). Migration: users who had the old IDs in `image_gen.model` will fall through the existing 'unknown model → default' warning path in `_resolve_fal_model()` and get the Klein 9B default on the next run. Re-run `hermes tools` → Image Generation to pick the new version. No silent cost-upgrade aliasing — the 2-6x price jump on these tiers warrants explicit user re-selection. Portal note: both new model IDs need to be allowlisted on the Nous fal-queue-gateway alongside the previous 7 additions, or users on Nous Subscription will see the 'managed gateway rejected model' error we added previously (which is clear and self-remediating, just noisy). * docs: wrap '<1s' in backticks to unblock MDX compilation Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and '<1s' fails because '1' isn't a valid tag-name start character. This was broken on main since PR #11265 (never noticed because docs-site-checks was failing on OTHER issues at the time and we admin-merged through it). Wrapping in backticks also gives the cell monospace styling which reads more cleanly alongside the inline-code model ID in the same row. The other '<1s' occurrence (line 52) is inside a fenced code block and is already safe — code fences bypass MDX parsing.
2026-04-16 22:05:41 -07:00
"prompt", "image_size", "enable_safety_checker",
"colors", "background_color",
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
},
"upscale": False,
},
"fal-ai/qwen-image": {
"display": "Qwen Image",
"speed": "~12s",
"strengths": "LLM-based, complex text",
"price": "$0.02/MP",
"size_style": "image_size_preset",
"sizes": {
"landscape": "landscape_16_9",
"square": "square_hd",
"portrait": "portrait_16_9",
},
"defaults": {
"num_inference_steps": 30,
"guidance_scale": 2.5,
"num_images": 1,
"output_format": "png",
"acceleration": "regular",
},
"supports": {
"prompt", "image_size", "num_inference_steps", "guidance_scale",
"num_images", "output_format", "acceleration", "seed", "sync_mode",
},
"upscale": False,
},
feat(image_gen/fal): add Krea 2 Medium + Large to FAL catalog (#33506) fal announced Krea 2 day-0 as an official API partner on 2026-05-27. Add both variants to the FAL_MODELS catalog so they appear in the 'hermes tools' model picker alongside flux-2, gpt-image, nano-banana, etc. Users who already bill through FAL or Nous Portal subscription can now use Krea without registering directly with Krea. Model IDs (as listed in fal's launch announcement): fal-ai/krea/v2/medium/text-to-image — $0.030 / image fal-ai/krea/v2/large/text-to-image — $0.060 / image Both share the same parameter schema: - aspect_ratio (1:1, 4:3, 3:2, 16:9, 2.35:1, 4:5, 2:3, 9:16) mapped from our 3 abstract ratios via size_style='aspect_ratio' - creativity (raw|low|medium|high; default medium) - seed (reproducibility) - image_style_references (up to 10 per Krea's API spec) No num_inference_steps / guidance_scale / num_images — Krea 2 does not expose those, and the supports-set filter strips them defensively if the agent ever passes them. This is the FAL-routed variant. The separate native-Krea-API plugin shipped in PR #33236 (plugins/image_gen/krea/) remains available for users who want to bill directly through Krea's API with their own key. Both routes converge on the same underlying model. Nous Portal managed-FAL gateway: this commit makes the model IDs known to the catalog and the picker. The Portal team will need to allowlist these two endpoint slugs on the fal-queue origin server-side for them to flow through the managed billing path.
2026-05-27 21:42:52 -07:00
# Krea 2 — Krea's first foundation image model, day-0 partner launch on
# fal (2026-05-27). Same model family as our direct ``plugins/image_gen/krea``
# backend, exposed here for users who prefer to bill through their
# existing FAL key / Nous Portal subscription rather than register
# directly with Krea. Both variants share the same parameter schema —
# only model id, price, and recommended use case differ.
"fal-ai/krea/v2/medium/text-to-image": {
"display": "Krea 2 Medium",
"speed": "~15-25s",
"strengths": "Illustration, anime, painting, expressive/artistic styles",
"price": "$0.030 (text) / $0.035 (style refs)",
"size_style": "aspect_ratio",
# Krea natively accepts 1:1, 4:3, 3:2, 16:9, 2.35:1, 4:5, 2:3, 9:16 —
# we map our 3 abstract ratios to the closest match.
"sizes": {
"landscape": "16:9",
"square": "1:1",
"portrait": "9:16",
},
"defaults": {
"creativity": "medium",
},
"supports": {
"prompt", "aspect_ratio", "creativity", "seed",
"image_style_references",
},
"upscale": False,
},
"fal-ai/krea/v2/large/text-to-image": {
"display": "Krea 2 Large",
"speed": "~25-60s",
"strengths": "Photorealism, raw textured looks (motion blur, grain, film)",
"price": "$0.060 (text) / $0.065 (style refs)",
"size_style": "aspect_ratio",
"sizes": {
"landscape": "16:9",
"square": "1:1",
"portrait": "9:16",
},
"defaults": {
"creativity": "medium",
},
"supports": {
"prompt", "aspect_ratio", "creativity", "seed",
"image_style_references",
},
"upscale": False,
},
}
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
# Default model is the fastest reasonable option. Kept cheap and sub-1s.
DEFAULT_MODEL = "fal-ai/flux-2/klein/9b"
DEFAULT_ASPECT_RATIO = "landscape"
VALID_ASPECT_RATIOS = ("landscape", "square", "portrait")
# ---------------------------------------------------------------------------
# Upscaler (Clarity Upscaler — unchanged from previous implementation)
# ---------------------------------------------------------------------------
2025-08-09 09:51:53 -07:00
UPSCALER_MODEL = "fal-ai/clarity-upscaler"
UPSCALER_FACTOR = 2
UPSCALER_SAFETY_CHECKER = False
UPSCALER_DEFAULT_PROMPT = "masterpiece, best quality, highres"
UPSCALER_NEGATIVE_PROMPT = "(worst quality, low quality, normal quality:2)"
UPSCALER_CREATIVITY = 0.35
UPSCALER_RESEMBLANCE = 0.6
UPSCALER_GUIDANCE_SCALE = 4
UPSCALER_NUM_INFERENCE_STEPS = 18
_debug = DebugSession("image_tools", env_var="IMAGE_TOOLS_DEBUG")
_managed_fal_client = None
_managed_fal_client_config = None
_managed_fal_client_lock = threading.Lock()
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
# ---------------------------------------------------------------------------
# Managed FAL gateway (Nous Subscription)
# ---------------------------------------------------------------------------
def _resolve_managed_fal_gateway():
feat: ungate Tool Gateway — subscription-based access with per-tool opt-in Replace the HERMES_ENABLE_NOUS_MANAGED_TOOLS env-var feature flag with subscription-based detection. The Tool Gateway is now available to any paid Nous subscriber without needing a hidden env var. Core changes: - managed_nous_tools_enabled() checks get_nous_auth_status() + check_nous_free_tier() instead of an env var - New use_gateway config flag per tool section (web, tts, browser, image_gen) records explicit user opt-in and overrides direct API keys at runtime - New prefers_gateway(section) shared helper in tool_backend_helpers.py used by all 4 tool runtimes (web, tts, image gen, browser) UX flow: - hermes model: after Nous login/model selection, shows a curses prompt listing all gateway-eligible tools with current status. User chooses to enable all, enable only unconfigured tools, or skip. Defaults to Enable for new users, Skip when direct keys exist. - hermes tools: provider selection now manages use_gateway flag — selecting Nous Subscription sets it, selecting any other provider clears it - hermes status: renamed section to Nous Tool Gateway, added free-tier upgrade nudge for logged-in free users - curses_radiolist: new description parameter for multi-line context that survives the screen clear Runtime behavior: - Each tool runtime (web_tools, tts_tool, image_generation_tool, browser_use) checks prefers_gateway() before falling back to direct env-var credentials - get_nous_subscription_features() respects use_gateway flags, suppressing direct credential detection when the user opted in Removed: - HERMES_ENABLE_NOUS_MANAGED_TOOLS env var and all references - apply_nous_provider_defaults() silent TTS auto-set - get_nous_subscription_explainer_lines() static text - Override env var warnings (use_gateway handles this properly now)
2026-04-16 01:59:51 -04:00
"""Return managed fal-queue gateway config when the user prefers the gateway
or direct FAL credentials are absent."""
if fal_key_is_configured() and not prefers_gateway("image_gen"):
return None
return resolve_managed_tool_gateway("fal-queue")
def _get_managed_fal_client(managed_gateway):
"""Reuse the managed FAL client so its internal httpx.Client is not leaked per call."""
global _managed_fal_client, _managed_fal_client_config
client_config = (
managed_gateway.gateway_origin.rstrip("/"),
managed_gateway.nous_user_token,
)
with _managed_fal_client_lock:
if _managed_fal_client is not None and _managed_fal_client_config == client_config:
return _managed_fal_client
refactor(image_gen): port FAL backend to plugins/image_gen/fal Mirrors the architecture established by the web (#25182), browser (#25214), and video_gen (#25126) plugin migrations: * `tools/fal_common.py` — stateless atoms shared by both FAL-backed plugins (image_gen + video_gen). Holds the lazy `fal_client` import helper, `_ManagedFalSyncClient`, `_normalize_fal_queue_url_format`, `_extract_http_status`. Stateful pieces (`fal_client` module global, `_managed_fal_client*` cache, `_submit_fal_request`, `_resolve_managed_fal_gateway`, `_get_managed_fal_client`) intentionally stay on `tools.image_generation_tool` so the existing `monkeypatch.setattr(image_tool, ...)` patch sites keep working unchanged. * `plugins/video_gen/fal/__init__.py` — drops its inline `_load_fal_client` duplicate; consumes `tools.fal_common.import_fal_client`. * `plugins/image_gen/fal/{plugin.yaml,__init__.py}` — new plugin. `FalImageGenProvider` is a thin registration adapter that resolves the legacy module via `import tools.image_generation_tool as _it` and calls `_it.image_generate_tool` + `_it._resolve_fal_model` at call time. The 18-model catalog, `_build_fal_payload`, managed- gateway selection, and Clarity Upscaler chaining all remain in `tools.image_generation_tool` as the single source of truth — the plugin is a registration adapter, not a parallel implementation. * `tools/image_generation_tool.py::_dispatch_to_plugin_provider` — drops the `configured == "fal"` skip. Setting `image_gen.provider: fal` now routes through the registry like any other provider; the plugin re-enters this module's pipeline so behavior is identical. Unset `image_gen.provider` still falls through to the in-tree pipeline (preserves no-config-with-FAL_KEY UX from #15696). * `hermes_cli/tools_config.py` — drops the hardcoded "FAL.ai" row from `TOOL_CATEGORIES["image_gen"]["providers"]` (now injected by `_plugin_image_gen_providers` like every other backend) and the `getattr(provider, "name") == "fal"` skip that protected against duplication with the hardcoded row. The "Nous Subscription" row stays as a setup-flow entry — same shape browser kept "Nous Subscription (Browser Use cloud)" after #25214. * `tests/plugins/image_gen/test_fal_provider.py` — 14 cases covering the ABC surface, call-time indirection (verifying `monkeypatch.setattr(image_tool, "image_generate_tool", ...)` takes effect through the plugin), response-shape stamping, exception handling, and registry wiring. * `tests/plugins/image_gen/check_parity_vs_main.py` — subprocess harness mirroring `tests/plugins/browser/check_parity_vs_main.py`. Pins one path to origin/main, one to the worktree; runs six scenarios (unset, explicit-fal-no-creds, explicit-fal-with-creds, explicit-fal-with-model, typo provider, managed-gateway-only) and diffs the reduced shape `{dispatch_kind, provider_name, model}` per scenario. The only acceptable diff is "legacy_fal → plugin (fal)" for explicit-FAL paths — every other delta is flagged as a regression. * `tests/hermes_cli/test_image_gen_picker.py::test_fal_surfaced_alongside_other_plugins` — flips the previous `test_fal_skipped_to_avoid_duplicate` to match the new shape (FAL is a plugin now, no dedup needed). Verified: 195/195 tests across `tests/{tools/test_image_generation*,tools/test_managed_media_gateways,plugins/image_gen,plugins/video_gen,hermes_cli/test_image_gen_picker}.py` pass on this branch with no test patches modified outside the picker test that asserted the old skip behaviour. Fixes #26241
2026-05-18 17:42:02 +05:30
# Resolve fal_client on the legacy module — preserves the test
# pattern of monkey-patching ``image_generation_tool.fal_client``.
_load_fal_client()
_managed_fal_client = _ManagedFalSyncClient(
refactor(image_gen): port FAL backend to plugins/image_gen/fal Mirrors the architecture established by the web (#25182), browser (#25214), and video_gen (#25126) plugin migrations: * `tools/fal_common.py` — stateless atoms shared by both FAL-backed plugins (image_gen + video_gen). Holds the lazy `fal_client` import helper, `_ManagedFalSyncClient`, `_normalize_fal_queue_url_format`, `_extract_http_status`. Stateful pieces (`fal_client` module global, `_managed_fal_client*` cache, `_submit_fal_request`, `_resolve_managed_fal_gateway`, `_get_managed_fal_client`) intentionally stay on `tools.image_generation_tool` so the existing `monkeypatch.setattr(image_tool, ...)` patch sites keep working unchanged. * `plugins/video_gen/fal/__init__.py` — drops its inline `_load_fal_client` duplicate; consumes `tools.fal_common.import_fal_client`. * `plugins/image_gen/fal/{plugin.yaml,__init__.py}` — new plugin. `FalImageGenProvider` is a thin registration adapter that resolves the legacy module via `import tools.image_generation_tool as _it` and calls `_it.image_generate_tool` + `_it._resolve_fal_model` at call time. The 18-model catalog, `_build_fal_payload`, managed- gateway selection, and Clarity Upscaler chaining all remain in `tools.image_generation_tool` as the single source of truth — the plugin is a registration adapter, not a parallel implementation. * `tools/image_generation_tool.py::_dispatch_to_plugin_provider` — drops the `configured == "fal"` skip. Setting `image_gen.provider: fal` now routes through the registry like any other provider; the plugin re-enters this module's pipeline so behavior is identical. Unset `image_gen.provider` still falls through to the in-tree pipeline (preserves no-config-with-FAL_KEY UX from #15696). * `hermes_cli/tools_config.py` — drops the hardcoded "FAL.ai" row from `TOOL_CATEGORIES["image_gen"]["providers"]` (now injected by `_plugin_image_gen_providers` like every other backend) and the `getattr(provider, "name") == "fal"` skip that protected against duplication with the hardcoded row. The "Nous Subscription" row stays as a setup-flow entry — same shape browser kept "Nous Subscription (Browser Use cloud)" after #25214. * `tests/plugins/image_gen/test_fal_provider.py` — 14 cases covering the ABC surface, call-time indirection (verifying `monkeypatch.setattr(image_tool, "image_generate_tool", ...)` takes effect through the plugin), response-shape stamping, exception handling, and registry wiring. * `tests/plugins/image_gen/check_parity_vs_main.py` — subprocess harness mirroring `tests/plugins/browser/check_parity_vs_main.py`. Pins one path to origin/main, one to the worktree; runs six scenarios (unset, explicit-fal-no-creds, explicit-fal-with-creds, explicit-fal-with-model, typo provider, managed-gateway-only) and diffs the reduced shape `{dispatch_kind, provider_name, model}` per scenario. The only acceptable diff is "legacy_fal → plugin (fal)" for explicit-FAL paths — every other delta is flagged as a regression. * `tests/hermes_cli/test_image_gen_picker.py::test_fal_surfaced_alongside_other_plugins` — flips the previous `test_fal_skipped_to_avoid_duplicate` to match the new shape (FAL is a plugin now, no dedup needed). Verified: 195/195 tests across `tests/{tools/test_image_generation*,tools/test_managed_media_gateways,plugins/image_gen,plugins/video_gen,hermes_cli/test_image_gen_picker}.py` pass on this branch with no test patches modified outside the picker test that asserted the old skip behaviour. Fixes #26241
2026-05-18 17:42:02 +05:30
fal_client,
key=managed_gateway.nous_user_token,
queue_run_origin=managed_gateway.gateway_origin,
)
_managed_fal_client_config = client_config
return _managed_fal_client
def _submit_fal_request(model: str, arguments: Dict[str, Any]):
"""Submit a FAL request using direct credentials or the managed queue gateway."""
perf(image_gen): defer fal_client import to first generation request (#22859) `tools/image_generation_tool.py` did `import fal_client` at module top, which pulled the entire fal_client + httpx + rich stack on every process that ran `discover_builtin_tools()` — every `hermes` cold start, even ones that never touch image generation. Make the import lazy: replace the eager import with a placeholder (`fal_client: Any = None`) and add an idempotent `_load_fal_client()` that rebinds the module global on first use. Call it from the two runtime entry points (`_ManagedFalSyncClient.__init__` and `_submit_fal_request`) and from the SDK-presence check in `check_image_generation_requirements`. The loader short-circuits if the global is already truthy, which preserves the test pattern of monkeypatching `fal_client` to install a mock — the `monkeypatch.setattr(image_tool, "fal_client", ...)` calls in test_image_generation.py keep working unchanged. Measured impact (15-run min times, 9950X3D): tools.image_generation_tool alone: 77 → 20 ms (-74%) 36 → 20 MB (-44%) import cli (full): 734 → 720 ms (-2%) import model_tools: 372 → 366 ms (-2%) The microbench is dramatic but the full-CLI win is small — fal_client shares its httpx + rich dependencies with the rest of the agent, so on a real cold start most of the 16 MB / 64 ms is already paid by other imports. The win matters mostly for processes that touch this tool without otherwise loading httpx (rare) and for architectural consistency with the previous lazy-load PRs (#22681 google_chat, #22831 teams). Tests: 55/55 `tests/tools/test_image_generation.py` pass, including the cases that monkeypatch the module global to install a mock fal_client. End-to-end verification confirms `import model_tools` no longer pulls `fal_client` into `sys.modules`.
2026-05-09 17:45:09 -07:00
# Trigger the lazy import on first call. Idempotent.
_load_fal_client()
request_headers = {"x-idempotency-key": str(uuid.uuid4())}
managed_gateway = _resolve_managed_fal_gateway()
if managed_gateway is None:
return fal_client.submit(model, arguments=arguments, headers=request_headers)
managed_client = _get_managed_fal_client(managed_gateway)
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
try:
return managed_client.submit(
model,
arguments=arguments,
headers=request_headers,
)
except Exception as exc:
# 4xx from the managed gateway typically means the portal doesn't
# currently proxy this model (allowlist miss, billing gate, etc.)
# — surface a clearer message with actionable remediation instead
# of a raw HTTP error from httpx.
status = _extract_http_status(exc)
if status is not None and 400 <= status < 500:
gateway_message = ""
if status in {401, 402, 403}:
gateway_message = (
"\n\n"
+ nous_tool_gateway_unavailable_message(
"managed FAL image generation",
force_fresh=True,
)
)
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
raise ValueError(
f"Nous Subscription gateway rejected model '{model}' "
f"(HTTP {status}). This model may not yet be enabled on "
f"the Nous Portal's FAL proxy. Either:\n"
f" • Set FAL_KEY in your environment to use FAL.ai directly, or\n"
f" • Pick a different model via `hermes tools` → Image Generation."
f"{gateway_message}"
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
) from exc
raise
# ---------------------------------------------------------------------------
# Model resolution + payload construction
# ---------------------------------------------------------------------------
def _resolve_fal_model() -> tuple:
"""Resolve the active FAL model from config.yaml (primary) or default.
Returns (model_id, metadata_dict). Falls back to DEFAULT_MODEL if the
configured model is unknown (logged as a warning).
2025-08-09 09:51:53 -07:00
"""
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
model_id = ""
try:
from hermes_cli.config import load_config
cfg = load_config()
img_cfg = cfg.get("image_gen") if isinstance(cfg, dict) else None
if isinstance(img_cfg, dict):
raw = img_cfg.get("model")
if isinstance(raw, str):
model_id = raw.strip()
except Exception as exc:
logger.debug("Could not load image_gen.model from config: %s", exc)
# Env var escape hatch (undocumented; backward-compat for tests/scripts).
if not model_id:
model_id = os.getenv("FAL_IMAGE_MODEL", "").strip()
if not model_id:
return DEFAULT_MODEL, FAL_MODELS[DEFAULT_MODEL]
if model_id not in FAL_MODELS:
logger.warning(
"Unknown FAL model '%s' in config; falling back to %s",
model_id, DEFAULT_MODEL,
)
return DEFAULT_MODEL, FAL_MODELS[DEFAULT_MODEL]
return model_id, FAL_MODELS[model_id]
def _build_fal_payload(
model_id: str,
prompt: str,
aspect_ratio: str = DEFAULT_ASPECT_RATIO,
seed: Optional[int] = None,
overrides: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
"""Build a FAL request payload for `model_id` from unified inputs.
Translates aspect_ratio into the model's native size spec (preset enum,
aspect-ratio enum, or GPT literal string), merges model defaults, applies
caller overrides, then filters to the model's ``supports`` whitelist.
2025-08-09 09:51:53 -07:00
"""
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
meta = FAL_MODELS[model_id]
size_style = meta["size_style"]
sizes = meta["sizes"]
aspect = (aspect_ratio or DEFAULT_ASPECT_RATIO).lower().strip()
if aspect not in sizes:
aspect = DEFAULT_ASPECT_RATIO
payload: Dict[str, Any] = dict(meta.get("defaults", {}))
payload["prompt"] = (prompt or "").strip()
if size_style in {"image_size_preset", "gpt_literal"}:
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
payload["image_size"] = sizes[aspect]
elif size_style == "aspect_ratio":
payload["aspect_ratio"] = sizes[aspect]
2025-08-09 09:51:53 -07:00
else:
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
raise ValueError(f"Unknown size_style: {size_style!r}")
if seed is not None and isinstance(seed, int):
payload["seed"] = seed
if overrides:
for k, v in overrides.items():
if v is not None:
payload[k] = v
supports = meta["supports"]
return {k: v for k, v in payload.items() if k in supports}
# ---------------------------------------------------------------------------
# Upscaler
# ---------------------------------------------------------------------------
def _upscale_image(image_url: str, original_prompt: str) -> Optional[Dict[str, Any]]:
"""Upscale an image using FAL.ai's Clarity Upscaler.
Returns upscaled image dict, or None on failure (caller falls back to
the original image).
2025-08-09 09:51:53 -07:00
"""
try:
logger.info("Upscaling image with Clarity Upscaler...")
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
2025-08-09 09:51:53 -07:00
upscaler_arguments = {
"image_url": image_url,
"prompt": f"{UPSCALER_DEFAULT_PROMPT}, {original_prompt}",
"upscale_factor": UPSCALER_FACTOR,
"negative_prompt": UPSCALER_NEGATIVE_PROMPT,
"creativity": UPSCALER_CREATIVITY,
"resemblance": UPSCALER_RESEMBLANCE,
"guidance_scale": UPSCALER_GUIDANCE_SCALE,
"num_inference_steps": UPSCALER_NUM_INFERENCE_STEPS,
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
"enable_safety_checker": UPSCALER_SAFETY_CHECKER,
2025-08-09 09:51:53 -07:00
}
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
handler = _submit_fal_request(UPSCALER_MODEL, arguments=upscaler_arguments)
result = handler.get()
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
2025-08-09 09:51:53 -07:00
if result and "image" in result:
upscaled_image = result["image"]
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
logger.info(
"Image upscaled successfully to %sx%s",
upscaled_image.get("width", "unknown"),
upscaled_image.get("height", "unknown"),
)
2025-08-09 09:51:53 -07:00
return {
"url": upscaled_image["url"],
"width": upscaled_image.get("width", 0),
"height": upscaled_image.get("height", 0),
"upscaled": True,
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
"upscale_factor": UPSCALER_FACTOR,
2025-08-09 09:51:53 -07:00
}
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
logger.error("Upscaler returned invalid response")
return None
2025-08-09 09:51:53 -07:00
except Exception as e:
logger.error("Error upscaling image: %s", e, exc_info=True)
2025-08-09 09:51:53 -07:00
return None
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
# ---------------------------------------------------------------------------
# Tool entry point
# ---------------------------------------------------------------------------
def _looks_like_absolute_file_path(value: str) -> bool:
if not value or not isinstance(value, str):
return False
lower = value.lower()
if lower.startswith(("http://", "https://", "data:")):
return False
if os.path.isabs(value):
return True
return len(value) >= 3 and value[1] == ":" and value[2] in {"/", "\\"}
def _active_terminal_env(task_id: str | None):
try:
from tools.terminal_tool import get_active_env
return get_active_env(task_id or "default")
except Exception as exc: # noqa: BLE001 - artifact hinting must not break generation
logger.debug("Could not inspect active terminal environment: %s", exc)
return None
def _agent_cache_base_for_env(env: Any) -> str | None:
if env is not None:
# Forward-looking optional override: an environment may expose its own
# agent-visible cache root via this callable. No backend defines it yet
# — it's an extension hook, not a typo. The getattr/callable guards make
# it a safe no-op until a producer exists.
explicit = getattr(env, "agent_visible_cache_base", None)
if callable(explicit):
try:
value = explicit()
if value:
return str(value).rstrip("/")
except Exception as exc: # noqa: BLE001
logger.debug("active env agent_visible_cache_base failed: %s", exc)
remote_home = getattr(env, "_remote_home", None)
if remote_home:
return f"{str(remote_home).rstrip('/')}/.hermes"
env_name = env.__class__.__name__
if env_name in {"DockerEnvironment", "SingularityEnvironment", "ModalEnvironment"}:
return "/root/.hermes"
# If no environment has been created yet, only backends with deterministic
# Hermes cache roots can be translated without side effects. SSH can still
# use a shell-visible tilde path; its first environment sync will upload
# the cache file before the first command runs.
backend = (os.getenv("TERMINAL_ENV") or "local").strip().lower()
if backend in {"docker", "singularity", "modal"}:
return "/root/.hermes"
if backend == "ssh":
return "~/.hermes"
return None
def _agent_visible_cache_path(host_path: str, env: Any) -> str | None:
if not _looks_like_absolute_file_path(host_path):
return None
cache_base = _agent_cache_base_for_env(env)
if not cache_base:
return None
try:
from tools.credential_files import map_cache_path_to_container
return map_cache_path_to_container(host_path, container_base=cache_base)
except Exception as exc: # noqa: BLE001
logger.debug("Could not translate image cache path for backend: %s", exc)
return None
def _force_artifact_sync(env: Any) -> None:
sync_manager = getattr(env, "_sync_manager", None)
if sync_manager is None:
return
try:
sync_manager.sync(force=True)
except Exception as exc: # noqa: BLE001 - keep generation success; log for operators
logger.warning("Could not force-sync generated image artifact: %s", exc)
def _postprocess_image_generate_result(raw: str, task_id: str | None = None) -> str:
"""Annotate successful local image results with backend-visible paths.
``image`` remains the host/gateway-deliverable path. When the active
terminal backend has a different filesystem, ``agent_visible_image`` gives
the path the agent can use with terminal/file tools.
"""
try:
payload = json.loads(raw) if isinstance(raw, str) else raw
except Exception:
return raw
if not isinstance(payload, dict) or not payload.get("success"):
return raw
image = payload.get("image")
if not isinstance(image, str) or not _looks_like_absolute_file_path(image):
return raw
env = _active_terminal_env(task_id)
agent_path = _agent_visible_cache_path(image, env)
if not agent_path or agent_path == image:
return raw
if env is not None:
_force_artifact_sync(env)
payload.setdefault("host_image", image)
payload.setdefault("agent_visible_image", agent_path)
return json.dumps(payload, ensure_ascii=False)
def image_generate_tool(
2025-08-09 09:51:53 -07:00
prompt: str,
aspect_ratio: str = DEFAULT_ASPECT_RATIO,
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
num_inference_steps: Optional[int] = None,
guidance_scale: Optional[float] = None,
num_images: Optional[int] = None,
output_format: Optional[str] = None,
seed: Optional[int] = None,
2025-08-09 09:51:53 -07:00
) -> str:
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
"""Generate an image from a text prompt using the configured FAL model.
The agent-facing schema exposes only ``prompt`` and ``aspect_ratio``; the
remaining kwargs are overrides for direct Python callers and are filtered
per-model via the ``supports`` whitelist (unsupported overrides are
silently dropped so legacy callers don't break when switching models).
Returns a JSON string with ``{"success": bool, "image": url | None,
"error": str, "error_type": str}``.
2025-08-09 09:51:53 -07:00
"""
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
model_id, meta = _resolve_fal_model()
2025-08-09 09:51:53 -07:00
debug_call_data = {
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
"model": model_id,
2025-08-09 09:51:53 -07:00
"parameters": {
"prompt": prompt,
"aspect_ratio": aspect_ratio,
2025-08-09 09:51:53 -07:00
"num_inference_steps": num_inference_steps,
"guidance_scale": guidance_scale,
"num_images": num_images,
"output_format": output_format,
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
"seed": seed,
2025-08-09 09:51:53 -07:00
},
"error": None,
"success": False,
"images_generated": 0,
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
"generation_time": 0,
2025-08-09 09:51:53 -07:00
}
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
2025-08-09 09:51:53 -07:00
start_time = datetime.datetime.now()
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
2025-08-09 09:51:53 -07:00
try:
if not prompt or not isinstance(prompt, str) or len(prompt.strip()) == 0:
raise ValueError("Prompt is required and must be a non-empty string")
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
if not (fal_key_is_configured() or _resolve_managed_fal_gateway()):
raise ValueError(_build_no_backend_setup_message())
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
aspect_lc = (aspect_ratio or DEFAULT_ASPECT_RATIO).lower().strip()
if aspect_lc not in VALID_ASPECT_RATIOS:
logger.warning(
"Invalid aspect_ratio '%s', defaulting to '%s'",
aspect_ratio, DEFAULT_ASPECT_RATIO,
)
aspect_lc = DEFAULT_ASPECT_RATIO
overrides: Dict[str, Any] = {}
if num_inference_steps is not None:
overrides["num_inference_steps"] = num_inference_steps
if guidance_scale is not None:
overrides["guidance_scale"] = guidance_scale
if num_images is not None:
overrides["num_images"] = num_images
if output_format is not None:
overrides["output_format"] = output_format
arguments = _build_fal_payload(
model_id, prompt, aspect_lc, seed=seed, overrides=overrides,
2025-08-09 09:51:53 -07:00
)
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
logger.info(
"Generating image with %s (%s) — prompt: %s",
meta.get("display", model_id), model_id, prompt[:80],
2025-08-09 09:51:53 -07:00
)
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
handler = _submit_fal_request(model_id, arguments=arguments)
result = handler.get()
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
2025-08-09 09:51:53 -07:00
generation_time = (datetime.datetime.now() - start_time).total_seconds()
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
2025-08-09 09:51:53 -07:00
if not result or "images" not in result:
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
raise ValueError("Invalid response from FAL.ai API — no images returned")
2025-08-09 09:51:53 -07:00
images = result.get("images", [])
if not images:
raise ValueError("No images were generated")
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
should_upscale = bool(meta.get("upscale", False))
2025-08-09 09:51:53 -07:00
formatted_images = []
for img in images:
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
if not (isinstance(img, dict) and "url" in img):
continue
original_image = {
"url": img["url"],
"width": img.get("width", 0),
"height": img.get("height", 0),
}
if should_upscale:
upscaled_image = _upscale_image(img["url"], prompt.strip())
2025-08-09 09:51:53 -07:00
if upscaled_image:
formatted_images.append(upscaled_image)
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
continue
logger.warning("Using original image as fallback (upscale failed)")
original_image["upscaled"] = False
formatted_images.append(original_image)
2025-08-09 09:51:53 -07:00
if not formatted_images:
raise ValueError("No valid image URLs returned from API")
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
upscaled_count = sum(1 for img in formatted_images if img.get("upscaled"))
logger.info(
"Generated %s image(s) in %.1fs (%s upscaled) via %s",
len(formatted_images), generation_time, upscaled_count, model_id,
)
2025-08-09 09:51:53 -07:00
response_data = {
"success": True,
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
"image": formatted_images[0]["url"] if formatted_images else None,
2025-08-09 09:51:53 -07:00
}
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
2025-08-09 09:51:53 -07:00
debug_call_data["success"] = True
debug_call_data["images_generated"] = len(formatted_images)
debug_call_data["generation_time"] = generation_time
_debug.log_call("image_generate_tool", debug_call_data)
_debug.save()
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
2025-11-05 03:47:17 +00:00
return json.dumps(response_data, indent=2, ensure_ascii=False)
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
2025-08-09 09:51:53 -07:00
except Exception as e:
generation_time = (datetime.datetime.now() - start_time).total_seconds()
error_msg = f"Error generating image: {str(e)}"
logger.error("%s", error_msg, exc_info=True)
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
2025-08-09 09:51:53 -07:00
response_data = {
"success": False,
"image": None,
"error": str(e),
"error_type": type(e).__name__,
2025-08-09 09:51:53 -07:00
}
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
2025-08-09 09:51:53 -07:00
debug_call_data["error"] = error_msg
debug_call_data["generation_time"] = generation_time
_debug.log_call("image_generate_tool", debug_call_data)
_debug.save()
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
2025-11-05 03:47:17 +00:00
return json.dumps(response_data, indent=2, ensure_ascii=False)
2025-08-09 09:51:53 -07:00
def check_fal_api_key() -> bool:
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
"""True if the FAL.ai API key (direct or managed gateway) is available."""
return bool(fal_key_is_configured() or _resolve_managed_fal_gateway())
2025-08-09 09:51:53 -07:00
def _build_no_backend_setup_message() -> str:
"""Build an actionable error string when no FAL backend is reachable.
Used by the in-tree FAL path. Mentions:
- FAL_KEY signup link
- managed-gateway status (if Nous tools are enabled)
- plugin alternative pointer (so users on a stale ``image_gen.provider``
know the registry exists and how to inspect it)
"""
lines = ["Image generation is unavailable in this environment.", ""]
lines.append("Missing requirements:")
if managed_nous_tools_enabled():
lines.append(
" - FAL_KEY is not set and the managed FAL gateway is unreachable"
)
else:
lines.append(" - FAL_KEY environment variable is not set")
gateway_message = nous_tool_gateway_unavailable_message(
"managed FAL image generation",
)
if gateway_message:
lines.append(f" - {gateway_message}")
lines.append("")
lines.append("To enable image generation, do one of:")
lines.append(
" 1. Get a free API key at https://fal.ai and set "
"FAL_KEY=<your-key> (then restart the session)"
)
if managed_nous_tools_enabled():
lines.append(
" 2. Sign in to a Nous account that has the managed FAL "
"gateway enabled (`hermes setup`)"
)
lines.append(
" 3. Configure a different image_gen provider via `hermes tools` "
"→ Image Generation (run `hermes plugins list` to see installed "
"backends)"
)
return "\n".join(lines)
2025-08-09 09:51:53 -07:00
def check_image_generation_requirements() -> bool:
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) * feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` | `backend` | `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().
2026-04-21 21:30:10 -07:00
"""True if any image gen backend is available.
Providers are considered in this order:
1. The in-tree FAL backend (FAL_KEY or managed gateway).
2. Any plugin-registered provider whose ``is_available()`` returns True.
Plugins win only when the in-tree FAL path is NOT ready, which matches
the historical behavior: shipping hermes with a FAL key configured
should still expose the tool. The active selection among ready
providers is resolved per-call by ``image_gen.provider``.
"""
2025-08-09 09:51:53 -07:00
try:
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) * feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` | `backend` | `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().
2026-04-21 21:30:10 -07:00
if check_fal_api_key():
perf(image_gen): defer fal_client import to first generation request (#22859) `tools/image_generation_tool.py` did `import fal_client` at module top, which pulled the entire fal_client + httpx + rich stack on every process that ran `discover_builtin_tools()` — every `hermes` cold start, even ones that never touch image generation. Make the import lazy: replace the eager import with a placeholder (`fal_client: Any = None`) and add an idempotent `_load_fal_client()` that rebinds the module global on first use. Call it from the two runtime entry points (`_ManagedFalSyncClient.__init__` and `_submit_fal_request`) and from the SDK-presence check in `check_image_generation_requirements`. The loader short-circuits if the global is already truthy, which preserves the test pattern of monkeypatching `fal_client` to install a mock — the `monkeypatch.setattr(image_tool, "fal_client", ...)` calls in test_image_generation.py keep working unchanged. Measured impact (15-run min times, 9950X3D): tools.image_generation_tool alone: 77 → 20 ms (-74%) 36 → 20 MB (-44%) import cli (full): 734 → 720 ms (-2%) import model_tools: 372 → 366 ms (-2%) The microbench is dramatic but the full-CLI win is small — fal_client shares its httpx + rich dependencies with the rest of the agent, so on a real cold start most of the 16 MB / 64 ms is already paid by other imports. The win matters mostly for processes that touch this tool without otherwise loading httpx (rare) and for architectural consistency with the previous lazy-load PRs (#22681 google_chat, #22831 teams). Tests: 55/55 `tests/tools/test_image_generation.py` pass, including the cases that monkeypatch the module global to install a mock fal_client. End-to-end verification confirms `import model_tools` no longer pulls `fal_client` into `sys.modules`.
2026-05-09 17:45:09 -07:00
# Trigger the lazy fal_client import here as the SDK presence
# check. Raises ImportError if the optional ``fal-client``
# package isn't installed; the caller's except ImportError
# below catches that and continues to plugin probing.
_load_fal_client()
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) * feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` | `backend` | `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().
2026-04-21 21:30:10 -07:00
return True
2025-08-09 09:51:53 -07:00
except ImportError:
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) * feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` | `backend` | `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().
2026-04-21 21:30:10 -07:00
pass
# Probe plugin providers. Discovery is idempotent and cheap.
try:
from agent.image_gen_registry import list_providers
from hermes_cli.plugins import _ensure_plugins_discovered
_ensure_plugins_discovered()
for provider in list_providers():
try:
if provider.is_available():
return True
except Exception:
continue
except Exception:
pass
return False
2025-08-09 09:51:53 -07:00
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
# ---------------------------------------------------------------------------
# Demo / CLI entry point
# ---------------------------------------------------------------------------
2025-08-09 09:51:53 -07:00
if __name__ == "__main__":
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
print("🎨 Image Generation Tools — FAL.ai multi-model support")
2025-08-09 09:51:53 -07:00
print("=" * 60)
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
if not check_fal_api_key():
2025-08-21 08:10:54 -07:00
print("❌ FAL_KEY environment variable not set")
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
print(" Set it via: export FAL_KEY='your-key-here'")
print(" Get a key: https://fal.ai/")
raise SystemExit(1)
print("✅ FAL.ai API key found")
2025-08-09 09:51:53 -07:00
try:
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
import fal_client # noqa: F401
2025-08-09 09:51:53 -07:00
print("✅ fal_client library available")
except ImportError:
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
print("❌ fal_client library not found — pip install fal-client")
raise SystemExit(1)
model_id, meta = _resolve_fal_model()
print(f"🤖 Active model: {meta.get('display', model_id)} ({model_id})")
print(f" Speed: {meta.get('speed', '?')} · Price: {meta.get('price', '?')}")
print(f" Upscaler: {'on' if meta.get('upscale') else 'off'}")
print("\nAvailable models:")
for mid, m in FAL_MODELS.items():
marker = " ← active" if mid == model_id else ""
print(f" {mid:<32} {m.get('speed', '?'):<6} {m.get('price', '?')}{marker}")
if _debug.active:
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
print(f"\n🐛 Debug mode enabled — session {_debug.session_id}")
2026-02-21 20:22:33 -08:00
# ---------------------------------------------------------------------------
# Registry
# ---------------------------------------------------------------------------
from tools.registry import registry, tool_error
2026-02-21 20:22:33 -08:00
IMAGE_GENERATE_SCHEMA = {
"name": "image_generate",
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
"description": (
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) * feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` | `backend` | `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().
2026-04-21 21:30:10 -07:00
"Generate high-quality images from text prompts. The underlying "
"backend (FAL, OpenAI, etc.) and model are user-configured and not "
"selectable by the agent. Returns either a URL or an absolute file "
"path in the `image` field; display it with markdown "
"![description](url-or-path) and the gateway will deliver it. When "
"the active terminal backend has a different filesystem, successful "
"local-file results may also include `agent_visible_image` for "
"follow-up terminal/file operations."
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
),
2026-02-21 20:22:33 -08:00
"parameters": {
"type": "object",
"properties": {
"prompt": {
"type": "string",
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
"description": "The text prompt describing the desired image. Be detailed and descriptive.",
2026-02-21 20:22:33 -08:00
},
"aspect_ratio": {
"type": "string",
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
"enum": list(VALID_ASPECT_RATIOS),
2026-02-21 20:22:33 -08:00
"description": "The aspect ratio of the generated image. 'landscape' is 16:9 wide, 'portrait' is 16:9 tall, 'square' is 1:1.",
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
"default": DEFAULT_ASPECT_RATIO,
},
2026-02-21 20:22:33 -08:00
},
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
"required": ["prompt"],
},
2026-02-21 20:22:33 -08:00
}
def _read_configured_image_model():
"""Return the value of ``image_gen.model`` from config.yaml, or None."""
try:
from hermes_cli.config import load_config
cfg = load_config()
section = cfg.get("image_gen") if isinstance(cfg, dict) else None
if isinstance(section, dict):
value = section.get("model")
if isinstance(value, str) and value.strip():
return value.strip()
except Exception as exc:
logger.debug("Could not read image_gen.model: %s", exc)
return None
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) * feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` | `backend` | `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().
2026-04-21 21:30:10 -07:00
def _read_configured_image_provider():
"""Return the value of ``image_gen.provider`` from config.yaml, or None.
We only consult the plugin registry when this is explicitly set an
refactor(image_gen): port FAL backend to plugins/image_gen/fal Mirrors the architecture established by the web (#25182), browser (#25214), and video_gen (#25126) plugin migrations: * `tools/fal_common.py` — stateless atoms shared by both FAL-backed plugins (image_gen + video_gen). Holds the lazy `fal_client` import helper, `_ManagedFalSyncClient`, `_normalize_fal_queue_url_format`, `_extract_http_status`. Stateful pieces (`fal_client` module global, `_managed_fal_client*` cache, `_submit_fal_request`, `_resolve_managed_fal_gateway`, `_get_managed_fal_client`) intentionally stay on `tools.image_generation_tool` so the existing `monkeypatch.setattr(image_tool, ...)` patch sites keep working unchanged. * `plugins/video_gen/fal/__init__.py` — drops its inline `_load_fal_client` duplicate; consumes `tools.fal_common.import_fal_client`. * `plugins/image_gen/fal/{plugin.yaml,__init__.py}` — new plugin. `FalImageGenProvider` is a thin registration adapter that resolves the legacy module via `import tools.image_generation_tool as _it` and calls `_it.image_generate_tool` + `_it._resolve_fal_model` at call time. The 18-model catalog, `_build_fal_payload`, managed- gateway selection, and Clarity Upscaler chaining all remain in `tools.image_generation_tool` as the single source of truth — the plugin is a registration adapter, not a parallel implementation. * `tools/image_generation_tool.py::_dispatch_to_plugin_provider` — drops the `configured == "fal"` skip. Setting `image_gen.provider: fal` now routes through the registry like any other provider; the plugin re-enters this module's pipeline so behavior is identical. Unset `image_gen.provider` still falls through to the in-tree pipeline (preserves no-config-with-FAL_KEY UX from #15696). * `hermes_cli/tools_config.py` — drops the hardcoded "FAL.ai" row from `TOOL_CATEGORIES["image_gen"]["providers"]` (now injected by `_plugin_image_gen_providers` like every other backend) and the `getattr(provider, "name") == "fal"` skip that protected against duplication with the hardcoded row. The "Nous Subscription" row stays as a setup-flow entry — same shape browser kept "Nous Subscription (Browser Use cloud)" after #25214. * `tests/plugins/image_gen/test_fal_provider.py` — 14 cases covering the ABC surface, call-time indirection (verifying `monkeypatch.setattr(image_tool, "image_generate_tool", ...)` takes effect through the plugin), response-shape stamping, exception handling, and registry wiring. * `tests/plugins/image_gen/check_parity_vs_main.py` — subprocess harness mirroring `tests/plugins/browser/check_parity_vs_main.py`. Pins one path to origin/main, one to the worktree; runs six scenarios (unset, explicit-fal-no-creds, explicit-fal-with-creds, explicit-fal-with-model, typo provider, managed-gateway-only) and diffs the reduced shape `{dispatch_kind, provider_name, model}` per scenario. The only acceptable diff is "legacy_fal → plugin (fal)" for explicit-FAL paths — every other delta is flagged as a regression. * `tests/hermes_cli/test_image_gen_picker.py::test_fal_surfaced_alongside_other_plugins` — flips the previous `test_fal_skipped_to_avoid_duplicate` to match the new shape (FAL is a plugin now, no dedup needed). Verified: 195/195 tests across `tests/{tools/test_image_generation*,tools/test_managed_media_gateways,plugins/image_gen,plugins/video_gen,hermes_cli/test_image_gen_picker}.py` pass on this branch with no test patches modified outside the picker test that asserted the old skip behaviour. Fixes #26241
2026-05-18 17:42:02 +05:30
unset value keeps users on the in-tree FAL fallback even when other
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) * feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` | `backend` | `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().
2026-04-21 21:30:10 -07:00
providers happen to be registered (e.g. a user has OPENAI_API_KEY set
refactor(image_gen): port FAL backend to plugins/image_gen/fal Mirrors the architecture established by the web (#25182), browser (#25214), and video_gen (#25126) plugin migrations: * `tools/fal_common.py` — stateless atoms shared by both FAL-backed plugins (image_gen + video_gen). Holds the lazy `fal_client` import helper, `_ManagedFalSyncClient`, `_normalize_fal_queue_url_format`, `_extract_http_status`. Stateful pieces (`fal_client` module global, `_managed_fal_client*` cache, `_submit_fal_request`, `_resolve_managed_fal_gateway`, `_get_managed_fal_client`) intentionally stay on `tools.image_generation_tool` so the existing `monkeypatch.setattr(image_tool, ...)` patch sites keep working unchanged. * `plugins/video_gen/fal/__init__.py` — drops its inline `_load_fal_client` duplicate; consumes `tools.fal_common.import_fal_client`. * `plugins/image_gen/fal/{plugin.yaml,__init__.py}` — new plugin. `FalImageGenProvider` is a thin registration adapter that resolves the legacy module via `import tools.image_generation_tool as _it` and calls `_it.image_generate_tool` + `_it._resolve_fal_model` at call time. The 18-model catalog, `_build_fal_payload`, managed- gateway selection, and Clarity Upscaler chaining all remain in `tools.image_generation_tool` as the single source of truth — the plugin is a registration adapter, not a parallel implementation. * `tools/image_generation_tool.py::_dispatch_to_plugin_provider` — drops the `configured == "fal"` skip. Setting `image_gen.provider: fal` now routes through the registry like any other provider; the plugin re-enters this module's pipeline so behavior is identical. Unset `image_gen.provider` still falls through to the in-tree pipeline (preserves no-config-with-FAL_KEY UX from #15696). * `hermes_cli/tools_config.py` — drops the hardcoded "FAL.ai" row from `TOOL_CATEGORIES["image_gen"]["providers"]` (now injected by `_plugin_image_gen_providers` like every other backend) and the `getattr(provider, "name") == "fal"` skip that protected against duplication with the hardcoded row. The "Nous Subscription" row stays as a setup-flow entry — same shape browser kept "Nous Subscription (Browser Use cloud)" after #25214. * `tests/plugins/image_gen/test_fal_provider.py` — 14 cases covering the ABC surface, call-time indirection (verifying `monkeypatch.setattr(image_tool, "image_generate_tool", ...)` takes effect through the plugin), response-shape stamping, exception handling, and registry wiring. * `tests/plugins/image_gen/check_parity_vs_main.py` — subprocess harness mirroring `tests/plugins/browser/check_parity_vs_main.py`. Pins one path to origin/main, one to the worktree; runs six scenarios (unset, explicit-fal-no-creds, explicit-fal-with-creds, explicit-fal-with-model, typo provider, managed-gateway-only) and diffs the reduced shape `{dispatch_kind, provider_name, model}` per scenario. The only acceptable diff is "legacy_fal → plugin (fal)" for explicit-FAL paths — every other delta is flagged as a regression. * `tests/hermes_cli/test_image_gen_picker.py::test_fal_surfaced_alongside_other_plugins` — flips the previous `test_fal_skipped_to_avoid_duplicate` to match the new shape (FAL is a plugin now, no dedup needed). Verified: 195/195 tests across `tests/{tools/test_image_generation*,tools/test_managed_media_gateways,plugins/image_gen,plugins/video_gen,hermes_cli/test_image_gen_picker}.py` pass on this branch with no test patches modified outside the picker test that asserted the old skip behaviour. Fixes #26241
2026-05-18 17:42:02 +05:30
for other features but never asked for OpenAI image gen). ``"fal"``
explicitly routes through ``plugins/image_gen/fal/`` (which delegates
back into this module's pipeline via call-time indirection — see
issue #26241).
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) * feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` | `backend` | `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().
2026-04-21 21:30:10 -07:00
"""
try:
from hermes_cli.config import load_config
cfg = load_config()
section = cfg.get("image_gen") if isinstance(cfg, dict) else None
if isinstance(section, dict):
value = section.get("provider")
if isinstance(value, str) and value.strip():
return value.strip()
except Exception as exc:
logger.debug("Could not read image_gen.provider: %s", exc)
return None
def _dispatch_to_plugin_provider(prompt: str, aspect_ratio: str):
"""Route the call to a plugin-registered provider when one is selected.
Returns a JSON string on dispatch, or ``None`` to fall through to the
refactor(image_gen): port FAL backend to plugins/image_gen/fal Mirrors the architecture established by the web (#25182), browser (#25214), and video_gen (#25126) plugin migrations: * `tools/fal_common.py` — stateless atoms shared by both FAL-backed plugins (image_gen + video_gen). Holds the lazy `fal_client` import helper, `_ManagedFalSyncClient`, `_normalize_fal_queue_url_format`, `_extract_http_status`. Stateful pieces (`fal_client` module global, `_managed_fal_client*` cache, `_submit_fal_request`, `_resolve_managed_fal_gateway`, `_get_managed_fal_client`) intentionally stay on `tools.image_generation_tool` so the existing `monkeypatch.setattr(image_tool, ...)` patch sites keep working unchanged. * `plugins/video_gen/fal/__init__.py` — drops its inline `_load_fal_client` duplicate; consumes `tools.fal_common.import_fal_client`. * `plugins/image_gen/fal/{plugin.yaml,__init__.py}` — new plugin. `FalImageGenProvider` is a thin registration adapter that resolves the legacy module via `import tools.image_generation_tool as _it` and calls `_it.image_generate_tool` + `_it._resolve_fal_model` at call time. The 18-model catalog, `_build_fal_payload`, managed- gateway selection, and Clarity Upscaler chaining all remain in `tools.image_generation_tool` as the single source of truth — the plugin is a registration adapter, not a parallel implementation. * `tools/image_generation_tool.py::_dispatch_to_plugin_provider` — drops the `configured == "fal"` skip. Setting `image_gen.provider: fal` now routes through the registry like any other provider; the plugin re-enters this module's pipeline so behavior is identical. Unset `image_gen.provider` still falls through to the in-tree pipeline (preserves no-config-with-FAL_KEY UX from #15696). * `hermes_cli/tools_config.py` — drops the hardcoded "FAL.ai" row from `TOOL_CATEGORIES["image_gen"]["providers"]` (now injected by `_plugin_image_gen_providers` like every other backend) and the `getattr(provider, "name") == "fal"` skip that protected against duplication with the hardcoded row. The "Nous Subscription" row stays as a setup-flow entry — same shape browser kept "Nous Subscription (Browser Use cloud)" after #25214. * `tests/plugins/image_gen/test_fal_provider.py` — 14 cases covering the ABC surface, call-time indirection (verifying `monkeypatch.setattr(image_tool, "image_generate_tool", ...)` takes effect through the plugin), response-shape stamping, exception handling, and registry wiring. * `tests/plugins/image_gen/check_parity_vs_main.py` — subprocess harness mirroring `tests/plugins/browser/check_parity_vs_main.py`. Pins one path to origin/main, one to the worktree; runs six scenarios (unset, explicit-fal-no-creds, explicit-fal-with-creds, explicit-fal-with-model, typo provider, managed-gateway-only) and diffs the reduced shape `{dispatch_kind, provider_name, model}` per scenario. The only acceptable diff is "legacy_fal → plugin (fal)" for explicit-FAL paths — every other delta is flagged as a regression. * `tests/hermes_cli/test_image_gen_picker.py::test_fal_surfaced_alongside_other_plugins` — flips the previous `test_fal_skipped_to_avoid_duplicate` to match the new shape (FAL is a plugin now, no dedup needed). Verified: 195/195 tests across `tests/{tools/test_image_generation*,tools/test_managed_media_gateways,plugins/image_gen,plugins/video_gen,hermes_cli/test_image_gen_picker}.py` pass on this branch with no test patches modified outside the picker test that asserted the old skip behaviour. Fixes #26241
2026-05-18 17:42:02 +05:30
in-tree FAL fallback in ``image_generate_tool``.
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) * feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` | `backend` | `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().
2026-04-21 21:30:10 -07:00
refactor(image_gen): port FAL backend to plugins/image_gen/fal Mirrors the architecture established by the web (#25182), browser (#25214), and video_gen (#25126) plugin migrations: * `tools/fal_common.py` — stateless atoms shared by both FAL-backed plugins (image_gen + video_gen). Holds the lazy `fal_client` import helper, `_ManagedFalSyncClient`, `_normalize_fal_queue_url_format`, `_extract_http_status`. Stateful pieces (`fal_client` module global, `_managed_fal_client*` cache, `_submit_fal_request`, `_resolve_managed_fal_gateway`, `_get_managed_fal_client`) intentionally stay on `tools.image_generation_tool` so the existing `monkeypatch.setattr(image_tool, ...)` patch sites keep working unchanged. * `plugins/video_gen/fal/__init__.py` — drops its inline `_load_fal_client` duplicate; consumes `tools.fal_common.import_fal_client`. * `plugins/image_gen/fal/{plugin.yaml,__init__.py}` — new plugin. `FalImageGenProvider` is a thin registration adapter that resolves the legacy module via `import tools.image_generation_tool as _it` and calls `_it.image_generate_tool` + `_it._resolve_fal_model` at call time. The 18-model catalog, `_build_fal_payload`, managed- gateway selection, and Clarity Upscaler chaining all remain in `tools.image_generation_tool` as the single source of truth — the plugin is a registration adapter, not a parallel implementation. * `tools/image_generation_tool.py::_dispatch_to_plugin_provider` — drops the `configured == "fal"` skip. Setting `image_gen.provider: fal` now routes through the registry like any other provider; the plugin re-enters this module's pipeline so behavior is identical. Unset `image_gen.provider` still falls through to the in-tree pipeline (preserves no-config-with-FAL_KEY UX from #15696). * `hermes_cli/tools_config.py` — drops the hardcoded "FAL.ai" row from `TOOL_CATEGORIES["image_gen"]["providers"]` (now injected by `_plugin_image_gen_providers` like every other backend) and the `getattr(provider, "name") == "fal"` skip that protected against duplication with the hardcoded row. The "Nous Subscription" row stays as a setup-flow entry — same shape browser kept "Nous Subscription (Browser Use cloud)" after #25214. * `tests/plugins/image_gen/test_fal_provider.py` — 14 cases covering the ABC surface, call-time indirection (verifying `monkeypatch.setattr(image_tool, "image_generate_tool", ...)` takes effect through the plugin), response-shape stamping, exception handling, and registry wiring. * `tests/plugins/image_gen/check_parity_vs_main.py` — subprocess harness mirroring `tests/plugins/browser/check_parity_vs_main.py`. Pins one path to origin/main, one to the worktree; runs six scenarios (unset, explicit-fal-no-creds, explicit-fal-with-creds, explicit-fal-with-model, typo provider, managed-gateway-only) and diffs the reduced shape `{dispatch_kind, provider_name, model}` per scenario. The only acceptable diff is "legacy_fal → plugin (fal)" for explicit-FAL paths — every other delta is flagged as a regression. * `tests/hermes_cli/test_image_gen_picker.py::test_fal_surfaced_alongside_other_plugins` — flips the previous `test_fal_skipped_to_avoid_duplicate` to match the new shape (FAL is a plugin now, no dedup needed). Verified: 195/195 tests across `tests/{tools/test_image_generation*,tools/test_managed_media_gateways,plugins/image_gen,plugins/video_gen,hermes_cli/test_image_gen_picker}.py` pass on this branch with no test patches modified outside the picker test that asserted the old skip behaviour. Fixes #26241
2026-05-18 17:42:02 +05:30
Dispatch fires when ``image_gen.provider`` is explicitly set including
``"fal"`` itself, which now resolves to the
``plugins/image_gen/fal/`` plugin (the plugin re-enters this module's
pipeline via ``_it`` indirection so behavior is identical to the
direct call, just routed through the registry).
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) * feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` | `backend` | `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().
2026-04-21 21:30:10 -07:00
"""
configured = _read_configured_image_provider()
refactor(image_gen): port FAL backend to plugins/image_gen/fal Mirrors the architecture established by the web (#25182), browser (#25214), and video_gen (#25126) plugin migrations: * `tools/fal_common.py` — stateless atoms shared by both FAL-backed plugins (image_gen + video_gen). Holds the lazy `fal_client` import helper, `_ManagedFalSyncClient`, `_normalize_fal_queue_url_format`, `_extract_http_status`. Stateful pieces (`fal_client` module global, `_managed_fal_client*` cache, `_submit_fal_request`, `_resolve_managed_fal_gateway`, `_get_managed_fal_client`) intentionally stay on `tools.image_generation_tool` so the existing `monkeypatch.setattr(image_tool, ...)` patch sites keep working unchanged. * `plugins/video_gen/fal/__init__.py` — drops its inline `_load_fal_client` duplicate; consumes `tools.fal_common.import_fal_client`. * `plugins/image_gen/fal/{plugin.yaml,__init__.py}` — new plugin. `FalImageGenProvider` is a thin registration adapter that resolves the legacy module via `import tools.image_generation_tool as _it` and calls `_it.image_generate_tool` + `_it._resolve_fal_model` at call time. The 18-model catalog, `_build_fal_payload`, managed- gateway selection, and Clarity Upscaler chaining all remain in `tools.image_generation_tool` as the single source of truth — the plugin is a registration adapter, not a parallel implementation. * `tools/image_generation_tool.py::_dispatch_to_plugin_provider` — drops the `configured == "fal"` skip. Setting `image_gen.provider: fal` now routes through the registry like any other provider; the plugin re-enters this module's pipeline so behavior is identical. Unset `image_gen.provider` still falls through to the in-tree pipeline (preserves no-config-with-FAL_KEY UX from #15696). * `hermes_cli/tools_config.py` — drops the hardcoded "FAL.ai" row from `TOOL_CATEGORIES["image_gen"]["providers"]` (now injected by `_plugin_image_gen_providers` like every other backend) and the `getattr(provider, "name") == "fal"` skip that protected against duplication with the hardcoded row. The "Nous Subscription" row stays as a setup-flow entry — same shape browser kept "Nous Subscription (Browser Use cloud)" after #25214. * `tests/plugins/image_gen/test_fal_provider.py` — 14 cases covering the ABC surface, call-time indirection (verifying `monkeypatch.setattr(image_tool, "image_generate_tool", ...)` takes effect through the plugin), response-shape stamping, exception handling, and registry wiring. * `tests/plugins/image_gen/check_parity_vs_main.py` — subprocess harness mirroring `tests/plugins/browser/check_parity_vs_main.py`. Pins one path to origin/main, one to the worktree; runs six scenarios (unset, explicit-fal-no-creds, explicit-fal-with-creds, explicit-fal-with-model, typo provider, managed-gateway-only) and diffs the reduced shape `{dispatch_kind, provider_name, model}` per scenario. The only acceptable diff is "legacy_fal → plugin (fal)" for explicit-FAL paths — every other delta is flagged as a regression. * `tests/hermes_cli/test_image_gen_picker.py::test_fal_surfaced_alongside_other_plugins` — flips the previous `test_fal_skipped_to_avoid_duplicate` to match the new shape (FAL is a plugin now, no dedup needed). Verified: 195/195 tests across `tests/{tools/test_image_generation*,tools/test_managed_media_gateways,plugins/image_gen,plugins/video_gen,hermes_cli/test_image_gen_picker}.py` pass on this branch with no test patches modified outside the picker test that asserted the old skip behaviour. Fixes #26241
2026-05-18 17:42:02 +05:30
if not configured:
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) * feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` | `backend` | `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().
2026-04-21 21:30:10 -07:00
return None
# Also read configured model so we can pass it to the plugin
configured_model = _read_configured_image_model()
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) * feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` | `backend` | `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().
2026-04-21 21:30:10 -07:00
try:
# Import locally so plugin discovery isn't triggered just by
# importing this module (tests rely on that).
from agent.image_gen_registry import get_provider
from hermes_cli.plugins import _ensure_plugins_discovered
_ensure_plugins_discovered()
provider = get_provider(configured)
except Exception as exc:
logger.debug("image_gen plugin dispatch skipped: %s", exc)
return None
if provider is None:
try:
# Long-lived sessions may have discovered plugins before a bundled
# backend was patched in or before config changed. Retry once with
# a forced refresh before surfacing a missing-provider error.
_ensure_plugins_discovered(force=True)
provider = get_provider(configured)
except Exception as exc:
logger.debug("image_gen plugin force-refresh skipped: %s", exc)
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) * feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` | `backend` | `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().
2026-04-21 21:30:10 -07:00
if provider is None:
return json.dumps({
"success": False,
"image": None,
"error": (
f"image_gen.provider='{configured}' is set but no plugin "
f"registered that name. Run `hermes plugins list` to see "
f"available image gen backends."
),
"error_type": "provider_not_registered",
})
try:
kwargs = {"prompt": prompt, "aspect_ratio": aspect_ratio}
if configured_model:
kwargs["model"] = configured_model
result = provider.generate(**kwargs)
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) * feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` | `backend` | `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().
2026-04-21 21:30:10 -07:00
except Exception as exc:
logger.warning(
"Image gen provider '%s' raised: %s",
getattr(provider, "name", "?"), exc,
)
return json.dumps({
"success": False,
"image": None,
"error": f"Provider '{getattr(provider, 'name', '?')}' error: {exc}",
"error_type": "provider_exception",
})
if not isinstance(result, dict):
return json.dumps({
"success": False,
"image": None,
"error": "Provider returned a non-dict result",
"error_type": "provider_contract",
})
return json.dumps(result)
2026-02-21 20:22:33 -08:00
def _handle_image_generate(args, **kw):
prompt = args.get("prompt", "")
if not prompt:
return tool_error("prompt is required for image generation")
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) * feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` | `backend` | `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().
2026-04-21 21:30:10 -07:00
aspect_ratio = args.get("aspect_ratio", DEFAULT_ASPECT_RATIO)
task_id = kw.get("task_id")
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) * feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` | `backend` | `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().
2026-04-21 21:30:10 -07:00
# Route to a plugin-registered provider if one is active (and it's
# not the in-tree FAL path).
dispatched = _dispatch_to_plugin_provider(prompt, aspect_ratio)
if dispatched is not None:
return _postprocess_image_generate_result(dispatched, task_id=task_id)
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) * feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` | `backend` | `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().
2026-04-21 21:30:10 -07:00
raw = image_generate_tool(
2026-02-21 20:22:33 -08:00
prompt=prompt,
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799) * feat(plugins): pluggable image_gen backends + OpenAI provider Adds a ImageGenProvider ABC so image generation backends register as bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner gains three primitives to make this work generically: - `kind:` manifest field (`standalone` | `backend` | `exclusive`). Bundled `kind: backend` plugins auto-load — no `plugins.enabled` incantation. User-installed backends stay opt-in. - Path-derived keys: `plugins/image_gen/openai/` gets key `image_gen/openai`, so a future `tts/openai` cannot collide. - Depth-2 recursion into category namespaces (parent dirs without a `plugin.yaml` of their own). Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5 default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64 responses save to `$HERMES_HOME/cache/images/`; URL responses pass through. FAL stays in-tree for this PR — a follow-up ports it into `plugins/image_gen/fal/` so the in-tree `image_generation_tool.py` slims down. The dispatch shim in `_handle_image_generate` only fires when `image_gen.provider` is explicitly set to a non-FAL value, so existing FAL setups are untouched. - 41 unit tests (scanner recursion, kind parsing, gate logic, registry, OpenAI payload shapes) - E2E smoke verified: bundled plugin autoloads, registers, and `_handle_image_generate` routes to OpenAI when configured * fix(image_gen/openai): don't send response_format to gpt-image-* The live API rejects it: 'Unknown parameter: response_format' (verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return b64_json unconditionally, so the parameter was both unnecessary and actively broken. * feat(image_gen/openai): gpt-image-2 only, drop legacy catalog gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21) and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward (dall-e-2 squares only). Trim the catalog down to a single model. Live-verified end-to-end: landscape 1536x1024 render of a Moog-style synth matches prompt exactly, 2.4MB PNG saved to cache. * feat(image_gen/openai): expose gpt-image-2 as three quality tiers Users pick speed/fidelity via the normal model picker instead of a hidden quality knob. All three tier IDs resolve to the single underlying gpt-image-2 API model with a different quality parameter: gpt-image-2-low ~15s fast iteration gpt-image-2-medium ~40s default gpt-image-2-high ~2min highest fidelity Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the same 1024x1024 prompt. Config: image_gen.openai.model: gpt-image-2-high # or image_gen.model: gpt-image-2-low # or env var for scripts/tests OPENAI_IMAGE_MODEL=gpt-image-2-medium Live-verified end-to-end with the low tier: 18.8s landscape render of a golden retriever in wildflowers, vision-confirmed exact match. * feat(tools_config): plugin image_gen providers inject themselves into picker 'hermes tools' → Image Generation now shows plugin-registered backends alongside Nous Subscription and FAL.ai without tools_config.py needing to know about them. OpenAI appears as a third option today; future backends appear automatically as they're added. Mechanism: - ImageGenProvider gains an optional get_setup_schema() hook (name, badge, tag, env_vars). Default derived from display_name. - tools_config._plugin_image_gen_providers() pulls the schemas from every registered non-FAL plugin provider. - _visible_providers() appends those rows when rendering the Image Generation category. - _configure_provider() handles the new image_gen_plugin_name marker: writes image_gen.provider and routes to the plugin's list_models() catalog for the model picker. - _toolset_needs_configuration_prompt('image_gen') stops demanding a FAL key when any plugin provider reports is_available(). FAL is skipped in the plugin path because it already has hardcoded TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up PR the hardcoded rows go away and it surfaces through the same path as OpenAI. Verified live: picker shows Nous Subscription / FAL.ai / OpenAI. Picking OpenAI prompts for OPENAI_API_KEY, then shows the gpt-image-2-low/medium/high model picker sourced from the plugin. 397 tests pass across plugins/, tools_config, registry, and picker. * fix(image_gen): close final gaps for plugin-backend parity with FAL Two small places that still hardcoded FAL: - hermes_cli/setup.py status line: an OpenAI-only setup showed 'Image Generation: missing FAL_KEY'. Now probes plugin providers and reports '(OpenAI)' when one is_available() — or falls back to 'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured. - image_generate tool schema description: said 'using FAL.ai, default FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are user-configured' — and notes the 'image' field can be a URL or an absolute path, which the gateway delivers either way via extract_local_files().
2026-04-21 21:30:10 -07:00
aspect_ratio=aspect_ratio,
2026-02-21 20:22:33 -08:00
)
return _postprocess_image_generate_result(raw, task_id=task_id)
2026-02-21 20:22:33 -08:00
registry.register(
name="image_generate",
toolset="image_gen",
schema=IMAGE_GENERATE_SCHEMA,
handler=_handle_image_generate,
check_fn=check_image_generation_requirements,
requires_env=[],
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265) * feat(image_gen): multi-model FAL support with picker in hermes tools Adds 8 FAL text-to-image models selectable via `hermes tools` → Image Generation → (FAL.ai | Nous Subscription) → model picker. Models supported: - fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP) - fal-ai/flux-2-pro (previous default, kept backward-compat upscaling) - fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN) - fal-ai/nano-banana (Gemini 2.5 Flash Image) - fal-ai/gpt-image-1.5 (with quality tier: low/medium/high) - fal-ai/ideogram/v3 (best typography) - fal-ai/recraft-v3 (vector, brand styles) - fal-ai/qwen-image (LLM-based) Architecture: - FAL_MODELS catalog declares per-model size family, defaults, supports whitelist, and upscale flag. Three size families handled uniformly: image_size_preset (flux family), aspect_ratio (nano-banana), and gpt_literal (gpt-image-1.5). - _build_fal_payload() translates unified inputs (prompt + aspect_ratio) into model-specific payloads, merges defaults, applies caller overrides, wires GPT quality_setting, then filters to the supports whitelist — so models never receive rejected keys. - IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen providers (Replicate, Stability, etc.); each provider entry tags itself with imagegen_backend: 'fal' to select the right catalog. - Upscaler (Clarity) defaults off for new models (preserves <1s value prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS. Config: image_gen.model = fal-ai/flux-2/klein/9b (new) image_gen.quality_setting = medium (new, GPT only) image_gen.use_gateway = bool (existing) Agent-facing schema unchanged (prompt + aspect_ratio only) — model choice is a user-level config decision, not an agent-level arg. Picker uses curses_radiolist (arrow keys, auto numbered-fallback on non-TTY). Column-aligned: Model / Speed / Strengths / Price. Docs: image-generation.md rewritten with the model table and picker walkthrough. tools-reference, tool-gateway, overview updated to drop the stale "FLUX 2 Pro" wording. Tests: 42 new in tests/tools/test_image_generation.py covering catalog integrity, all 3 size families, supports filter, default merging, GPT quality wiring, model resolution fallback. 8 new in tests/hermes_cli/test_tools_config.py for picker wiring (registry, config writes, GPT quality follow-up prompt, corrupt-config repair). * feat(image_gen): translate managed-gateway 4xx to actionable error When the Nous Subscription managed FAL proxy rejects a model with 4xx (likely portal-side allowlist miss or billing gate), surface a clear message explaining: 1. The rejected model ID + HTTP status 2. Two remediation paths: set FAL_KEY for direct access, or pick a different model via `hermes tools` 5xx, connection errors, and direct-FAL errors pass through unchanged (those have different root causes and reasonable native messages). Motivation: new FAL models added to this release (flux-2-klein-9b, z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3, qwen-image) are untested against the Nous Portal proxy. If the portal allowlists model IDs, users on Nous Subscription will hit cryptic 4xx errors without guidance on how to work around it. Tests: 8 new cases covering status extraction across httpx/fal error shapes and 4xx-vs-5xx-vs-ConnectionError translation policy. Docs: brief note in image-generation.md for Nous subscribers. Operator action (Nous Portal side): verify that fal-queue-gateway passes through these 7 new FAL model IDs. If the proxy has an allowlist, add them; otherwise Nous Subscription users will see the new translated error and fall back to direct FAL. * feat(image_gen): pin GPT-Image quality to medium (no user choice) Previously the tools picker asked a follow-up question for GPT-Image quality tier (low / medium / high) and persisted the answer to `image_gen.quality_setting`. This created two problems: 1. Nous Portal billing complexity — the 22x cost spread between tiers ($0.009 low / $0.20 high) forces the gateway to meter per-tier per user, which the portal team can't easily support at launch. 2. User footgun — anyone picking `high` by mistake burns through credit ~6x faster than `medium`. This commit pins quality at medium by baking it into FAL_MODELS defaults for gpt-image-1.5 and removes all user-facing override paths: - Removed `_resolve_gpt_quality()` runtime lookup - Removed `honors_quality_setting` flag on the model entry - Removed `_configure_gpt_quality_setting()` picker helper - Removed `_GPT_QUALITY_CHOICES` constant - Removed the follow-up prompt call in `_configure_imagegen_model()` - Even if a user manually edits `image_gen.quality_setting` in config.yaml, no code path reads it — always sends medium. Tests: - Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium (5 tests) — proves medium is baked in, config is ignored, flag is removed, helper is removed, non-gpt models never get quality. - Replaced test_picker_with_gpt_image_also_prompts_quality with test_picker_with_gpt_image_does_not_prompt_quality — proves only 1 picker call fires when gpt-image is selected (no quality follow-up). Docs updated: image-generation.md replaces the quality-tier table with a short note explaining the pinning decision. * docs(image_gen): drop stale 'wires GPT quality tier' line from internals section Caught in a cleanup sweep after pinning quality to medium. The "How It Works Internally" walkthrough still described the removed quality-wiring step.
2026-04-16 20:19:53 -07:00
is_async=False, # sync fal_client API to avoid "Event loop is closed" in gateway
emoji="🎨",
2026-02-21 20:22:33 -08:00
)