hermes-bsd

brooklyn! 3e74f75e41 feat(agent): coding-context posture across CLI/TUI/desktop/ACP (#43316)

* feat(agent): coding-context posture with per-model edit-format tuning

Hermes detects when it's running in a coding context, meaning an interactive surface (CLI, TUI, ACP, desktop) sitting in a code workspace (git repo or recognised project root), and shifts into a coding posture. Outside that (chat platforms, non-workspaces) nothing changes.

The posture is modelled as a frozen RuntimeMode selected from a small ContextProfile registry (coding/general). A profile is data: the toolset to collapse to, the operating brief to inject, and seams for model routing and memory. Every domain reads the same resolved object instead of re-probing git/config on its own:

- System prompt (RuntimeMode.system_blocks()): an operating brief (gather context before editing, edit through tools not chat, verify with terminal, cap retry loops) plus a live git/workspace snapshot, built once and baked into the stable prompt tier so per-conversation caching is preserved.
- Per-model edit-format tuning: the brief nudges each model family toward the patch mode it handles best: OpenAI/Codex toward mode='patch' (V4A multi-file diffs), Anthropic toward mode='replace' (string replacement). The model id rides on RuntimeMode; unknown families keep neutral wording.
- Skill index: non-coding skill categories are pruned from the prompt's skill index (discovery-only; skills_list/skill_view still reach the full catalog, with a disclosure note).
- Toolset: only under the opt-in 'focus' mode does the posture collapse to the coding toolset + enabled MCP servers; the default posture is prompt-only and never overrides configured toolsets.

Activation via agent.coding_context: auto (default), focus, on, off. Subagents inherit the posture for free via toolset inheritance + the shared prompt builder. Detection is not memoized so a long-lived gateway/TUI process can't pin a stale posture across working directories.
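
The frozen-mode-plus-registry design described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: only the names RuntimeMode, ContextProfile, system_blocks and the four activation values come from the message. The field names, the `model_family` and `resolve_mode` helpers, the substring matching on model ids, and the assumption that 'focus' shares detection with 'auto' while 'on' forces the coding posture are all hypothetical. The live git/workspace snapshot is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContextProfile:
    """A profile is plain data (field names here are hypothetical)."""
    name: str
    toolset: Optional[str]  # toolset to collapse to under 'focus'; None = never collapse
    brief: str              # operating brief injected into the system prompt


# Small registry mirroring the coding/general split named in the commit.
PROFILES = {
    "coding": ContextProfile(
        name="coding",
        toolset="coding",
        brief=("Gather context before editing; edit through tools, not chat; "
               "verify with the terminal; cap retry loops."),
    ),
    "general": ContextProfile(name="general", toolset=None, brief=""),
}

# Family -> preferred patch mode, as stated in the commit message.
EDIT_MODE_BY_FAMILY = {"openai": "patch", "anthropic": "replace"}


def model_family(model_id: str) -> Optional[str]:
    """Hypothetical family sniffing by substring; unknown ids return None."""
    lowered = model_id.lower()
    if "gpt" in lowered or "codex" in lowered:
        return "openai"
    if "claude" in lowered:
        return "anthropic"
    return None


@dataclass(frozen=True)
class RuntimeMode:
    """Resolved once; every consumer reads this object instead of re-probing."""
    profile: ContextProfile
    model_id: str
    focus: bool = False

    def system_blocks(self) -> list[str]:
        if not self.profile.brief:
            return []
        blocks = [self.profile.brief]
        mode = EDIT_MODE_BY_FAMILY.get(model_family(self.model_id))
        if mode is not None:  # unknown families keep neutral wording
            blocks.append(f"For edits to existing code, prefer patch mode='{mode}'.")
        return blocks

    def effective_toolset(self, configured: str) -> str:
        # Only 'focus' collapses the toolset; otherwise the posture is prompt-only.
        if self.focus and self.profile.toolset:
            return self.profile.toolset
        return configured


def resolve_mode(setting: str, in_workspace: bool, interactive: bool,
                 model_id: str) -> RuntimeMode:
    """Map agent.coding_context (auto/focus/on/off) to a RuntimeMode."""
    if setting == "off":
        coding = False
    elif setting == "on":
        coding = True  # assumption: 'on' forces the posture
    else:  # 'auto' and 'focus' are assumed to share the same detection
        coding = interactive and in_workspace
    profile = PROFILES["coding" if coding else "general"]
    return RuntimeMode(profile, model_id, focus=(setting == "focus" and coding))
```

Because the resolver is an ordinary function called per session rather than a cached value, a long-lived process that changes working directory would get a fresh RuntimeMode each time, which matches the "not memoized" note above.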

* feat(agent): cover new-file authoring in the coding edit-format nudge

The per-model edit-format guidance only addressed editing existing code (patch mode='patch' vs 'replace'), but authoring a brand-new file (write_file, not patch) is a large fraction of real coding work and the nudge was silent on it. Surfaced when building a single-file artifact where the dominant operation was write_file and the steering offered no guidance.

Both family lines now lead with "author new files with write_file; for edits to existing code prefer ...". Tests assert write_file appears in each family's brief; unknown families still get neutral wording.

* docs(agent): correct memoization docstring + clarify TUI config-load asymmetry

* feat(agent): sharpen the coding posture — verify-loop facts, wider edit steering, $HOME guard

Tuning pass on the coding posture from dogfooding it as a harness:

- Workspace snapshot now hands the model its verify loop up front: detected manifests + package manager (lockfile sniff), the exact verify commands (package.json scripts, Makefile targets, scripts/run_tests.sh, pytest config), and which context files (AGENTS.md / CLAUDE.md / .cursorrules) exist at the root. Marker-only (non-git) projects get the snapshot too instead of nothing. The "verify before claiming done" brief line was the highest-value piece in evals; this turns it from advice into an executable loop instead of making the model rediscover the test command every session. Still stat-cheap, size-guarded reads, built once at prompt time.
- Edit-format steering covers the families Hermes actually serves: Gemini and open-weight coding models (DeepSeek, Qwen, Kimi, GLM, Grok, Hermes, Llama, Mistral, Devstral, MiniMax) steer to mode='replace', since their RL scaffolds use str_replace-style editors. Previously only GPT/Codex and Claude families got steering; the models Hermes users disproportionately run all fell to neutral.
- Operating brief gains four behaviors elite harnesses encode: batch independent reads/searches in one turn; fix root causes and the bug class (sibling call paths), not the reported site; no drive-by refactors/renames/reformatting; never read, print, or commit secrets. Plus a patch-failure escalation ladder: after the same region fails twice, rewrite the enclosing function/file with write_file instead of a third patch attempt.
- $HOME dotfiles guard: a git repo rooted exactly at the home directory (or a marker sitting in it, e.g. a global ~/AGENTS.md) is user config, not a code workspace; without the guard, every session anywhere under a dotfiles-managed home silently flipped to the coding posture. Real projects under such a home still detect via their own markers/repos; 'on' mode bypasses the guard.

2026-06-10 23:06:44 -05:00
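
The $HOME dotfiles guard above might look something like the sketch below. It is an assumption-laden illustration rather than the repository's detection code: the function name `find_workspace_root`, the `MARKERS` tuple, and the `force` flag (standing in for 'on' mode bypassing the guard) are all hypothetical. The walk goes upward from the working directory and refuses to treat the home directory itself as a workspace, while a project nested under it is still found through its own marker or repo.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Hypothetical marker set; the real list of recognised project roots may differ.
MARKERS = ("pyproject.toml", "package.json", "Cargo.toml", "go.mod", "AGENTS.md")


def find_workspace_root(cwd: Path, home: Path, force: bool = False) -> Optional[Path]:
    """Return the nearest ancestor that looks like a code workspace, or None.

    A git repo or marker sitting exactly at `home` is treated as user config
    (dotfiles), not a workspace, unless `force` is set.
    """
    cwd = cwd.resolve()
    home = home.resolve()
    for candidate in (cwd, *cwd.parents):
        has_signal = (candidate / ".git").exists() or any(
            (candidate / marker).exists() for marker in MARKERS
        )
        if not has_signal:
            continue
        if candidate == home and not force:
            # Dotfiles-managed home: stop here instead of flipping every
            # session under $HOME into the coding posture.
            return None
        return candidate
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        fake_home = Path(tmp)
        (fake_home / ".git").mkdir()             # dotfiles repo rooted at home
        project = fake_home / "src" / "app"
        project.mkdir(parents=True)
        print(find_workspace_root(project, fake_home))   # guarded: no workspace
        (project / "pyproject.toml").touch()             # real project marker
        print(find_workspace_root(project, fake_home))   # the project directory
```

The function only performs cheap existence checks and holds no cache, so calling it again after a directory change reflects the new location.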
..
lsp	fix(lsp): detect Windows wrapper binaries in installer probes	2026-05-30 02:08:36 -07:00
transports	fix(codex): record app-server token usage in session accounting	2026-06-09 02:46:04 -07:00
__init__.py
test_anthropic_adapter.py	fix(anthropic): default new Claude models to the modern thinking contract (#42991)	2026-06-09 23:37:23 +05:30
test_anthropic_keychain.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_anthropic_kwargs_sanitize.py	fix(anthropic): strip Responses-only kwargs before Messages SDK call (#31673) (#42155)	2026-06-08 09:36:38 -07:00
test_anthropic_mcp_prefix_strip.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_anthropic_oauth_pkce.py	fix(auth): don't launch a text-mode browser inside the terminal for OAuth (#34479)	2026-05-29 01:23:06 -07:00
test_anthropic_output_field_leak.py	fix(anthropic): strip output-only SDK fields from replayed content blocks	2026-06-10 20:45:16 -07:00
test_anthropic_thinking_block_order.py	refactor: keep anthropic_content_blocks in-memory only (no state.db column)	2026-06-10 20:45:16 -07:00
test_arcee_trinity_overrides.py	feat(compression): raise compaction trigger to 85% for gpt-5.5 on Codex OAuth (#40957)	2026-06-07 01:40:50 -07:00
test_async_utils.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_auxiliary_client.py	fix(params): send max_completion_tokens for newer OpenAI families on custom endpoints	2026-06-09 23:22:10 -07:00
test_auxiliary_client_anthropic_custom.py
test_auxiliary_client_azure_foundry.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_auxiliary_client_xai_oauth_recovery.py	fix(auxiliary): detect xAI OAuth 403 bad-credentials as auth error	2026-05-29 00:28:02 -07:00
test_auxiliary_config_bridge.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_auxiliary_main_first.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_auxiliary_named_custom_providers.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_auxiliary_transport_autodetect.py
test_auxiliary_user_default_headers.py	fix(aux): honor model.default_headers on auxiliary client too (#40033)	2026-06-07 02:02:40 -07:00
test_azure_identity_adapter.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_bedrock_1m_context.py
test_bedrock_adapter.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_bedrock_integration.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_cascading_interrupt_6600.py	fix(agent): don't retry interrupt-induced transport errors (cascading-interrupt hang)	2026-06-08 02:19:13 -07:00
test_codex_cloudflare_headers.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_codex_responses_adapter.py	feat(prompt): universal task-completion guidance + local Python toolchain probe (#34340)	2026-05-28 22:26:09 -07:00
test_codex_ttfb_watchdog.py	fix(codex): relax no-byte TTFB watchdog default from 12s to 120s	2026-05-29 02:02:25 -07:00
test_coding_context.py	feat(agent): coding-context posture across CLI/TUI/desktop/ACP (#43316)	2026-06-10 23:06:44 -05:00
test_compress_focus.py
test_compression_concurrent_fork.py	fix(compression): disable compression on background-review fork to prevent cross-turn stale-parent fork (#41708)	2026-06-07 22:06:48 -07:00
test_compression_logging_session_context.py	fix(agent): sync logging session context on compaction id rotation	2026-06-07 22:30:02 -07:00
test_compressor_historical_media.py
test_compressor_image_tokens.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_context_compressor.py	fix(compression): don't overwrite the -1 post-compression sentinel in preflight seed (#36718)	2026-06-07 01:56:51 -07:00
test_context_compressor_cross_session_guard.py	fix(compression): guard against cross-session stale _previous_summary contamination	2026-06-07 22:09:45 -07:00
test_context_compressor_summary_continuity.py	test: cover ci-unblocker production regressions	2026-05-27 22:14:53 -07:00
test_context_compressor_temporal_anchoring.py	feat(compression): temporal anchoring in compaction summaries (#41102)	2026-06-07 08:36:45 -07:00
test_context_engine.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_context_engine_host_contract.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_context_references.py	fix(agent): make a binary @file: reference actionable instead of a dead end	2026-06-09 19:16:46 -05:00
test_copilot_acp_client.py
test_copilot_acp_deprecation.py
test_credential_pool.py	fix(auth): address Nous JWT fallback review	2026-05-29 02:24:48 -07:00
test_credential_pool_routing.py
test_credits_cold_start.py	Suppress "Credit access paused" notice on free models (#43669)	2026-06-10 23:55:06 +05:30
test_credits_fixture_snapshot.py	feat(credits): usage-aware credits — in-session notices, /usage view, dev readout (#40011)	2026-06-06 13:18:18 +05:30
test_credits_policy.py	Suppress "Credit access paused" notice on free models (#43669)	2026-06-10 23:55:06 +05:30
test_credits_tracker.py	feat(credits): usage-aware credits — in-session notices, /usage view, dev readout (#40011)	2026-06-06 13:18:18 +05:30
test_crossloop_client_cache.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_curator.py	fix(curator): use shared atomic state writer	2026-06-10 03:04:54 -07:00
test_curator_activity.py
test_curator_backup.py
test_curator_classification.py
test_curator_reports.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_custom_provider_extra_body.py
test_custom_providers_vision.py	fix(vision): honor custom_providers per-model supports_vision (#41036)	2026-06-07 21:50:57 -07:00
test_deepseek_anthropic_thinking.py
test_direct_provider_url_detection.py
test_display.py	feat(web): Parallel-backed web search & extract — free Search MCP when keyless, v1 REST when keyed	2026-06-10 19:54:38 -07:00
test_display_emoji.py
test_display_todo_progress.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_display_tool_failure.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_error_classifier.py	fix(agent): route 'thinking blocks cannot be modified' 400 to recovery	2026-06-10 12:39:44 -07:00
test_external_skills.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_external_skills_dirs_cache.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_file_safety.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_file_safety_container_mirror.py	fix(file-safety): extend sandbox-mirror guard to cover inner-container path (#32049) (#32407)	2026-06-02 14:03:37 +10:00
test_file_safety_credentials.py
test_file_safety_cross_profile.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_file_safety_sandbox_mirror.py	fix(file-safety): add sandbox-mirror soft guard for writes to per-task .hermes mirrors (#32213)	2026-06-02 11:29:24 +10:00
test_gemini_cloudcode.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_gemini_fast_fallback.py
test_gemini_free_tier_gate.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_gemini_native_adapter.py	fix(gemini): default native maxOutputTokens + strip OpenAI extra_body on Gemini endpoints (#39730)	2026-06-05 03:53:59 -07:00
test_gemini_schema.py
test_i18n.py	fix(packaging): ship locales/ i18n catalogs in wheel, sdist, and Nix (#38383)	2026-06-03 12:00:27 -07:00
test_image_gen_registry.py
test_image_routing.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_insights.py	refactor(insights): drop dead pricing/duration wrappers, call usage_pricing directly (#40618)	2026-06-07 18:33:20 -07:00
test_jiter_preload.py	fix(agent): preload jiter native parser	2026-05-28 00:20:11 -07:00
test_kimi_coding_anthropic_thinking.py
test_last_total_tokens.py
test_local_stream_timeout.py	fix(local): recognize unqualified hostnames as local endpoints (#9248)	2026-06-05 10:18:10 +10:00
test_markdown_tables.py
test_memory_async_sync.py	fix(memory): run end-of-turn sync off the turn thread (#41945)	2026-06-08 02:18:59 -07:00
test_memory_provider.py	fix(memory): run end-of-turn sync off the turn thread (#41945)	2026-06-08 02:18:59 -07:00
test_memory_session_switch.py	fix(memory): run end-of-turn sync off the turn thread (#41945)	2026-06-08 02:18:59 -07:00
test_memory_user_id.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_minimax_auxiliary_url.py
test_minimax_provider.py	polish(minimax): address Copilot review comments on M3 default-aux fix	2026-06-04 05:53:35 -07:00
test_model_metadata.py	fix(models): read OpenRouter live context_length before hardcoded catch-all (#42986)	2026-06-09 10:49:32 -07:00
test_model_metadata_local_ctx.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_model_metadata_ssl.py
test_models_dev.py	test: remove low-value model-catalog mirror tests	2026-05-29 23:45:05 -07:00
test_moonshot_schema.py	Add Hermes desktop app (#20059)	2026-05-31 17:46:56 -05:00
test_non_stream_stale_timeout.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_nous_credits_gauge.py	feat(credits): usage-aware credits — in-session notices, /usage view, dev readout (#40011)	2026-06-06 13:18:18 +05:30
test_nous_credits_snapshot.py	feat(credits): usage-aware credits — in-session notices, /usage view, dev readout (#40011)	2026-06-06 13:18:18 +05:30
test_nous_oauth_401_guidance.py	feat(cli): make `hermes portal` the human-readable Portal onboarding alias	2026-06-04 01:19:28 +05:30
test_nous_rate_guard.py
test_onboarding.py	feat(onboarding): opt-in structured profile-build path on first contact (#41114)	2026-06-07 08:36:48 -07:00
test_openrouter_response_cache.py
test_plugin_llm.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_portal_tags.py
test_prompt_builder.py	feat(agent): coding-context posture across CLI/TUI/desktop/ACP (#43316)	2026-06-10 23:06:44 -05:00
test_prompt_caching.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_proxy_and_url_validation.py
test_rate_limit_tracker.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_redact.py	test(redact): assert Discord mentions pass through unchanged	2026-05-30 20:48:41 -07:00
test_resume_stale_active_task.py	fix(compressor): strip stale handoff prefix on resume; reconcile #26290+#32787 (#35344)	2026-05-30 07:29:21 -07:00
test_runtime_cwd.py	fix(desktop): stabilize project folder sessions (#37586)	2026-06-02 20:23:09 +00:00
test_save_url_image.py
test_set_runtime_main_custom_provider.py	test(auxiliary): e2e routing assertions for custom-provider aux resolution	2026-05-30 02:38:59 -07:00
test_shell_hooks.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_shell_hooks_consent.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_skill_bundles.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_skill_commands.py
test_skill_commands_reload.py
test_skill_utils.py
test_stream_read_timeout_floor.py	fix(streaming): stop socket read timeout from preempting stale-stream detector (#43570)	2026-06-10 20:21:38 -05:00
test_streaming_context_scrubber.py
test_subagent_progress.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_subagent_stop_hook.py
test_subdirectory_hints.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_summary_prefix_semantics.py	fix(compression): drop conflicting 'resume Active Task' directive in summary prefix	2026-05-30 07:29:21 -07:00
test_system_prompt.py	feat(agent): coding-context posture across CLI/TUI/desktop/ACP (#43316)	2026-06-10 23:06:44 -05:00
test_system_prompt_restore.py
test_think_scrubber.py
test_title_generator.py	chore: prune unused imports and duplicate import redefinitions	2026-05-28 22:26:25 -07:00
test_tool_dispatch_helpers.py	feat(security): promptware defense — shared threat patterns + memory load-time scan + tool-result delimiters (#32269)	2026-05-25 14:52:24 -07:00
test_tool_guardrails.py
test_tool_result_classification.py
test_transcription_registry.py
test_tts_registry.py
test_turn_context.py	refactor(agent): extract run_conversation prologue into agent/turn_context.py	2026-06-07 22:17:35 -07:00
test_turn_retry_state.py	refactor(agent): consolidate inner-retry-loop recovery flags into TurnRetryState (god-file Phase 1b)	2026-06-07 22:42:05 -07:00
test_unsupported_parameter_retry.py
test_unsupported_temperature_retry.py	fix(auxiliary): stop capping output with max_tokens by default (#34530) (#34845)	2026-05-29 17:24:30 -07:00
test_usage_pricing.py	fix(model): require confirmation for expensive model selections	2026-06-10 00:24:06 -07:00
test_video_gen_registry.py
test_vision_resolved_args.py
test_vision_routing_31179.py