Follow-up to the winget stale-registration fix. Update-ProcessPathForPackages
rebuilt $env:Path wholesale from the persisted User+Machine hives (plus winget's
Links dir), discarding any process-only PATH entries added earlier in the
installer run. Since the helper now runs after every package manager, that
wholesale replace is more likely to clobber a process-local entry than the
original winget-branch-only version was.
Merge instead: seed from the current process PATH, then append hive and
winget-Links entries not already present, with a case-insensitive,
order-preserving dedupe. Behaviour on a clean box is unchanged (the hive entries
are simply appended); the difference is that pre-existing process-only entries
now survive the refresh.
When ripgrep/ffmpeg is missing, `winget install <id>` on a package winget
already has registered is treated as an upgrade: it finds no newer version and
exits 0x8A15002B (-1978335189, APPINSTALLER_CLI_ERROR_UPDATE_NOT_APPLICABLE)
without ensuring the binary is actually present. The installer only logged that
code and judged success by `Get-Command rg`, so a stale registration (files
removed outside winget, or a missing alias shim) became a permanent dead-end —
winget kept reporting "already installed" and the user could never reinstall.
Detect that exit code and retry once with `--force` to repair the registration
so the shim reappears.
Also refresh the process PATH after the choco and scoop fallbacks (not just
winget) via a shared helper, so a successful fallback install — or any install
on a box without winget — is no longer misreported as "not installed".
* docs(windows): correct native data dir to %LOCALAPPDATA%\hermes
The Windows-native guide claimed a deliberate split where config, auth,
skills, and sessions live under %USERPROFILE%\.hermes. That is not what
the installer does: scripts/install.ps1 sets HERMES_HOME=%LOCALAPPDATA%\hermes,
so data actually lives in %LOCALAPPDATA%\hermes alongside the disposable
install (the hermes-agent\, git\, node\, bin\ subdirectories) — `hermes
config` confirms config.yaml/.env resolve there, not under %USERPROFILE%.
Update the data-layout table, the "split is deliberate" note, the env-var
and uninstall sections to describe the real layout: data and install share
the %LOCALAPPDATA%\hermes root, reinstall only replaces hermes-agent\, and
a full wipe targets %LOCALAPPDATA%\hermes (with %USERPROFILE%\.hermes kept
only as a legacy/WSL cleanup). Mention HERMES_HOME as the override knob.
* docs(windows): fix PATH + bin layout to match installer
The installer adds hermes-agent\venv\Scripts (where hermes.exe lives) to
User PATH and sets HERMES_HOME — not %LOCALAPPDATA%\hermes\bin. The \bin
dir holds Hermes's managed uv.exe, not a hermes.cmd shim. Correct the
install-step list and the data-layout table accordingly.
* fix(install): show real HERMES_HOME path in setup messages
The native Windows installer wrote config/env/skills under $HermesHome
(%LOCALAPPDATA%\hermes) but its success messages claimed ~/.hermes,
which doesn't exist on native Windows. Print the actual paths so a new
user can find their config, .env, and skills.
* fix(install): self-heal a stuck Electron download on the desktop build
The desktop build downloads Electron (~114MB) from GitHub. A corrupt cached
zip, or a blocked/throttled GitHub release host (the repeating "retrying" log),
hard-failed the install — and install.sh had no recovery at all while
install.ps1 / `hermes desktop` only purged the cache.
All three build paths now escalate on a failed `npm run pack`:
GitHub → purge corrupt electron-*.zip + stale *-unpacked and retry → one retry
via a public Electron mirror (npmmirror.com). @electron/get SHASUM-verifies the
download, and a user-pinned ELECTRON_MIRROR is always respected (never
overridden). Adds a bash clear_electron_build_cache()/_desktop_pack() to mirror
the existing PowerShell/Python helpers.
* test(install): cover the Electron mirror fallback
Verify `hermes desktop` falls back to a mirror when the cache purge finds
nothing, and that a user-pinned ELECTRON_MIRROR is respected (no extra attempt,
not overridden).
* docs(desktop): troubleshoot a stuck Electron download
Document the automatic cache-purge + mirror fallback, how to pin your own
ELECTRON_MIRROR, and how to clear a corrupt cached zip by hand.
* docs(install): correct the Electron mirror trust framing
The mirror-fallback comments and the desktop troubleshooting doc implied
`@electron/get`'s SHASUM check makes the npmmirror.com download safe against
tampering. It doesn't: the SHASUMS256.txt is fetched from the same mirror, so
the check guards against a corrupt/partial download, not a compromised mirror.
Reframe all four surfaces (install.sh, install.ps1, `hermes desktop`, and the
docs) to state the trust trade-off honestly — npmmirror.com is the de-facto
Electron community mirror, we only fall back to it after the canonical GitHub
download fails, and a user-pinned ELECTRON_MIRROR is never overridden. No
behavior change.
---------
Co-authored-by: xxxigm <tuancanhnguyen706@gmail.com>
A bare `git fetch origin` (and `git fetch upstream`) pulls every ref. The
repo carries thousands of auto-generated branches, so on any
non-single-branch checkout the installer's update path and `hermes update`
spend minutes downloading the full branch list — long enough to stall the
desktop installer or trip the follow-up `git pull --ff-only`.
Scope every update-path fetch to the branch we actually compare/merge
against:
- scripts/install.sh: collapse the remote to single-branch and fetch only
$BRANCH on the "existing install, updating" path.
- hermes_cli/main.py: fetch the resolved branch in the apply path, the
--check path (upstream + origin), and the fork upstream-sync.
Tracking-ref updates still happen via git's opportunistic refspec, so the
later origin/<branch> rev-parse/rev-list checks are unaffected.
Tests assert the apply-path fetch is branch-scoped and never bare.
Review feedback (#40998): `rm -rf` / `Remove-Item -Recurse -Force` on the
install dir is destructive -- a user might still want whatever is there.
Rename the broken checkout to a timestamped `<dir>.broken-<ts>` backup and
re-clone fresh, so nothing is ever deleted. Transient cleanup of a clone
attempt that fails within the same run is left as-is.
An interrupted previous clone leaves the install dir's .git present but with
no initial commit. rev-parse --is-inside-work-tree and git status both still
succeed there, so the installer entered the update path and ran `git stash`,
which aborts with "You do not have the initial commit yet" and failed the
desktop install at the "Cloning Hermes repository" stage.
- install.ps1: add a `git rev-parse --verify HEAD` probe to the repo-validity
check so a commit-less checkout is treated as broken and re-cloned fresh.
- install.sh: mirror it at the top of clone_repo() — drop a partial checkout
with no resolvable HEAD so the fresh-clone path handles it (POSIX parity).
Fixes#40998
* fix: respect disabled auto-compaction on context overflow
Port from anomalyco/opencode#30749.
When compression.enabled is false, NO automatic compaction trigger may
fire. The proactive token-threshold paths (preflight + post-response
should_compress gate) already honoured the setting, but the three
provider-overflow recovery paths in the agent loop — long-context-tier
429, 413 payload-too-large, and context-overflow — called
_compress_context() unconditionally, silently compressing and rotating
the session against the user's explicit choice.
Add a single guard at the top of the overflow-recovery dispatch: when
compression is disabled and the error is one of those three overflow
classes, surface a terminal error (compaction_disabled: True) telling the
user to /compress manually, /new, switch to a larger-context model, or
reduce attachments. Manual /compress (force=True) is unaffected — it never
enters this loop.
Tests: new TestOverflowWithCompactionDisabled (413 + 400 overflow don't
compress when disabled; control case still compresses when enabled).
Existing overflow-recovery tests updated to enable compaction explicitly
(they verify the recovery fires); fixture defaults flipped to True to
match production (compression.enabled defaults to True).
* fix(install): detect TLS cert-trust failures during npm install on Windows
Corporate MITM proxies and missing root CAs surface as 'unable to get
local issuer certificate' while npm (most often Electron's install.js
postinstall) downloads over HTTPS. The installer surfaced this as an
opaque 'desktop workspace npm install failed (exit 1)', so users
misread it as a permissions/admin-rights problem (issue #38016).
Add a shared Show-NpmCertHint detector and route all three npm-install
failure paths (agent-browser global install, browser-tools workspace,
desktop workspace) through it. On a cert error it prints actionable
NODE_EXTRA_CA_CERTS / strict-ssl remediation; on any other failure it
stays silent.
* Revert "fix(update): require managed marker before destructive clean"
This reverts commit c8e80cd0bf.
* Revert "fix(update): stop stash/restore from clobbering desktop source on managed clones (#38542)"
This reverts commit 8a19884bf3.
* chore(install): keep npm ci desktop-build fix after stash revert
The destructive-clean reverts (#38542/#39568) pulled the desktop
workspace install back to bare `npm install`. The npm ci -> npm install
fallback is orthogonal build-correctness (avoids the Windows
workspace-hoisting flake where install reports up-to-date against a
stale marker while node_modules is empty, breaking tsc -b). Preserve it.
* feat(update): settable stash-or-discard for non-interactive local changes
Adds updates.non_interactive_local_changes (stash | discard, default
stash). Governs ONLY non-interactive updates (desktop/chat app, gateway,
--yes) — interactive terminal updates always stash-and-ask, unchanged.
- config.py: new key under existing updates section; _config_version 26->27.
- main.py: _cmd_update_impl detects non-interactive (gateway/--yes/no-TTY),
reads the setting; new _discard_stashed_changes() drops the stash
(stash-and-drop, never reset --hard/clean -fd, so ignored paths survive).
Post-pull restore site branches on it; the bail-out and up-to-date
restores always preserve work.
- web_server.py + apps/desktop settings: exposes it as a stash/discard
select (Advanced section, In-App Update Local Changes).
- docs + tests (discard drops, stash restores, interactive ignores setting,
missing section defaults to stash).
* fix(install.ps1): stash/restore instead of reset --hard on Windows update
The PR reverted the destructive update path to stash/restore everywhere
except scripts/install.ps1, whose managed-clone update path still ran
`git reset --hard HEAD` before checkout — silently destroying agent-edited
tracked source on Windows (the same #38542 data-loss class the PR fixes).
- Replace `git reset --hard HEAD` with stash-before-checkout +
restore-after-checkout, mirroring install.sh. Untracked files are
included so agent-created dirs (e.g. tinker-atropos/) survive.
- Keep `core.autocrlf false` (it prevents the phantom CRLF dirt that made
the stash necessary; it's also load-bearing for a clean restore).
- Wrap all three checkout modes (Commit/Tag/Branch); Branch case now uses
`git pull --ff-only` so local commits are never clobbered.
- Only prompt to restore when a real console is attached (UserInteractive
+ non-redirected stdin/stdout + ConsoleHost); the desktop Update button
and bootstrap have no usable console, so they default to restore and
never hang on Read-Host.
- On restore conflict or a failed update, the stash is preserved with
recovery instructions — work is never silently dropped.
Validated on Windows (PowerShell 5.1, git 2.54): AST parse clean;
E2E non-conflicting restore applies+drops cleanly with ignored paths
(node_modules) untouched; conflicting restore preserves the stash.
---------
Co-authored-by: alt-glitch <balyan.sid@gmail.com>
Windows counterpart of #39127: scripts/install.ps1 `Install-Desktop` runs
`npm run pack` once and throws on the opaque ENOENT a corrupt cached Electron
download produces, with no recovery. Add `Clear-ElectronBuildCache` plus a
purge-and-retry-once on pack failure, mirroring the install.sh fix: remove the
cached electron-*.zip (%LOCALAPPDATA%\electron\Cache + ELECTRON_CACHE /
electron_config_cache overrides) and stale *-unpacked output, then retry so
@electron/get re-downloads with its own SHASUM verification.
Refs #37544.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
uv selects the project Python from requires-python and from the UV_PYTHON
env var, both of which override an already-created venv on the next
'uv sync'. With no upper bound on requires-python, an inherited
UV_PYTHON=3.14 (or a fresh distro whose newest interpreter uv auto-picks)
silently recreated the installer's 3.11 venv at 3.14, where Rust-backed
transitives (pydantic-core) have no cp314 wheel and fall back to a maturin
source build that fails. This bit a Windows/WSL user with UV_PYTHON set in
their shell and a fresh WSL-arch box where uv auto-picked 3.14.
Two layers:
- pyproject: requires-python '>=3.11' -> '>=3.11,<3.14' (+ uv lock regen).
uv now refuses a 3.14 interpreter with a clear error instead of attempting
the maturin build. Backstop independent of the installer.
- install.sh / install.ps1: pin UV_PYTHON to the venv interpreter after
creating it (in both the venv step and the deps step, since bootstrap runs
those stages as separate processes). An inherited UV_PYTHON can no longer
hijack the sync/pip tiers, so the install just works regardless of shell env.
Verified E2E: hostile UV_PYTHON=3.14 + uv venv --python 3.11 + uv sync now
installs into 3.11 with pydantic-core's 3.11 wheel; without the re-pin the
capped requires-python produces a legible incompatibility error rather than a
cryptic build failure.
The stash/restore cycle in the update path was observed to clobber
freshly-pulled source files (apps/desktop/ deletion -> Vite
'[UNRESOLVED_ENTRY] Cannot resolve entry module index.html'). On a
managed clone the user never edits the source tree, so any 'dirty' state
is pure git artifact (CRLF renormalization, npm lockfile churn, files
left behind when a directory was deleted upstream such as
apps/bootstrap-installer/). Stashing that and re-applying it after a pull
is fragile and unnecessary.
- hermes update (hermes_cli/main.py): on a non-fork (managed) clone,
discard working-tree dirt via reset --hard HEAD + clean -fd instead of
stash/apply. Forks keep the stash machinery so intentional edits
survive. Also pin core.autocrlf=false on Windows so the dirt is never
created (mirrors install.ps1 #38239).
- install.sh: replace the update-path stash/restore dance with a hard
reset to origin/<branch>; the installer is a managed-only entry point.
- install.sh + install.ps1 desktop stage: prefer 'npm ci' (wipes and
reinstalls node_modules from the lockfile) over bare 'npm install',
which can report 'up to date' against a stale marker while node_modules
is empty -- leaving tsc unresolved so 'npm run pack' fails.
Tests: managed clone cleans instead of stashing; fork still stashes;
existing stash tests force the stash path explicitly.
The "Build desktop app" install step failed with an opaque "exit code 1"
on machines with an old Node, and nothing in the logs explained it.
Reproduced: on Node 20.5.1, `npm run pack`'s `vite build` crashes with
You are using Node.js 20.5.1. Vite requires Node.js version 20.19+ or 22.12+.
SyntaxError: The requested module 'node:util' does not provide an
export named 'styleText'
Vite 8 (rolldown) imports node:util.styleText, which doesn't exist before
Node 20.12, so the build dies before producing the app. The installer's
check_node / Test-Node accepted ANY pre-existing Node with no version
floor, so a too-old system Node was used for the build instead of the
bundled Node 22.
Add a version floor (^20.19 || >=22.12) to check_node (install.sh) and
Test-Node (install.ps1): a too-old system Node is replaced with the
Hermes-managed Node 22 LTS, and the desktop stage re-resolves Node so the
build always runs on a satisfying version. Declare the same range in
apps/desktop/package.json engines.
Verified: build succeeds on Node 22, fails on 20.5.1 with the error above;
the floor logic matches Vite's range across boundary versions (20.18/20.19,
21.x, 22.11/22.12).
Git for Windows defaults to core.autocrlf=true, which renormalizes the
repo's LF-only text files to CRLF in the working tree. On a managed,
never-user-edited clone this makes tracked files (.envrc, AGENTS.md,
agent/*.py, workflows) show as locally modified, so the update path's
bare git checkout aborts with 'Your local changes would be overwritten
by checkout' and the desktop bootstrap fails at stage=repository.
The bash installer already autostashes before checkout; the PowerShell
path had no dirty-tree handling at all and never pinned autocrlf.
Fix: (1) git reset --hard HEAD before fetch/checkout in the update path
to discard any pre-existing dirt, and (2) pin core.autocrlf=false on both
the update and fresh-clone paths so the dirt is never created again.
Replace the multi-path UV resolution chain (PATH probing, conda guards,
5-location trust ordering, temp-dir fallback installs) with a single
managed uv binary at $HERMES_HOME/bin/uv. Every code path that needs
uv resolves it from that one location; if missing, ensure_uv()
bootstraps it via the official standalone installer.
Key changes:
- New hermes_cli/managed_uv.py: managed_uv_path(), resolve_uv(),
ensure_uv() (returns (path, freshly_bootstrapped) tuple),
update_managed_uv(), rebuild_venv(), installer internals.
- hermes_cli/main.py: replace all shutil.which('uv') with ensure_uv(),
add venv rebuild on first-time managed uv bootstrap, update_managed_uv
before dep install on all 3 update paths.
- scripts/install.sh: install_uv() always installs to
$HERMES_HOME/bin/uv; delete ensure_fts5, _python_has_fts5,
_reinstall_python_with_fts5, _warn_no_fts5 (61 lines).
Managed uv always installs current Python with FTS5.
- scripts/install.ps1: Install-Uv always installs to
$HermesHome\bin\uv.exe; Resolve-UvCmd checks managed location first.
- hermes_state.py: simplified FTS5 warning now suggests 'hermes update'
as the fix instead of blaming install method.
- tests: 15 tests in test_managed_uv.py, autouse _patch_managed_uv
fixture in test_cmd_update.py.
Closes#37605, Closes#37622
* feat: better composer etc
* docs: add desktop and dashboard run instructions
* fix(desktop): address security scan findings
* fix(dashboard): resolve @nous-research/ui path under npm workspaces
The sync-assets prebuild step shelled out to 'cp -r
node_modules/@nous-research/ui/dist/fonts ...' with a path relative
to apps/dashboard/. That works only when the dep is installed
locally in the dashboard workspace, but 'npm install' at the repo
root (the documented setup — see apps/desktop/README.md) hoists
shared deps to the root node_modules under npm workspaces. The
relative cp then fails with 'No such file or directory', sync-assets
exits 1, the Vite build aborts, and 'hermes dashboard' surfaces a
generic 'Web UI build failed' message.
Replace the shell one-liner with scripts/sync-assets.cjs, which
walks up from the dashboard directory looking for node_modules/
@nous-research/ui — working in both the hoisted (workspaces) and
co-located (standalone) layouts. Also guards against a missing
dist/fonts or dist/assets with a clearer error pointing at a
rebuild of the UI package rather than silently copying nothing.
* feat(desktop): support connecting to a remote Hermes backend
Add HERMES_DESKTOP_REMOTE_URL and HERMES_DESKTOP_REMOTE_TOKEN env
vars that, when set, short-circuit the local-child spawn in
startHermes() and connect the Electron renderer to an already-
running 'hermes dashboard' server reachable over the network.
Motivating use case: WSL2 users who want to run the Hermes core
(agent loop, tools, filesystem access) inside their WSL
distribution while rendering the Electron GUI on native Windows.
Before this change, the desktop app always spawned a local Python
child on the same host as the renderer, which doesn't cross the
WSL/Windows boundary.
The remote path reuses waitForHermes() as a liveness probe
(/api/status is in the backend's public endpoint allowlist), so
the connection is only returned once the backend is actually
ready. WebSocket URL derivation picks ws:// or wss:// based on
the input scheme. URL validation rejects non-http(s) schemes and
requires both env vars together to avoid a half-configured
connection that would silently fall through to the spawn path.
No behaviour change when the env vars are unset — the default
local-spawn flow is untouched.
Typical usage:
# in WSL2
hermes dashboard --tui --no-open --host 0.0.0.0 --port 9119 --insecure
# on Windows
set HERMES_DESKTOP_REMOTE_URL=http://localhost:9119
set HERMES_DESKTOP_REMOTE_TOKEN=<session token>
set HERMES_DESKTOP_IGNORE_EXISTING=1
(launch Hermes desktop)
* ci(desktop): automate desktop releases
Add GitHub Actions release channels for signed desktop installers and document the stable/nightly download paths.
* feat: file tabs
* refactor(desktop): tighten right-rail tab close API
Promote closeRightRailTab/closeActiveRightRailTab as the single
public entry point. Drops the activeTabRef + handleCloseDocument
indirection in ChatPreviewRail, the unused $rightRailHasContent
atom, and the legacy dismissFilePreviewTarget alias. -70 LOC.
* feat(desktop): polish composer pill toward reference look
Solid foreground-on-background send/voice-conversation circle (black-on-white
in light, white-on-black in dark) anchors the right edge as the primary CTA
instead of the orange theme primary. Bumps the primary control to 2.125rem so
it visually outranks the ghost mic/plus controls. Opens up the surface padding
(0.625rem x / 0.5rem y) so the input row breathes around its controls, and
nudges the corner radius from 20 to 24px for a slightly pill-ier silhouette.
LiquidGlass distortion is preserved.
* feat(desktop): add startup and onboarding flow
Add phase-based desktop boot progress, fresh-install sandbox testing, and first-run provider credential onboarding so packaged installs can start cleanly without manual settings detours.
* fix(desktop): gate prompts on provider setup
Show the desktop provider onboarding flow before prompt submission when no inference provider is configured, preventing fresh installs from falling through to backend credential errors.
* fix(desktop): surface provider onboarding from session warnings
Propagate credential warnings through session runtime info and open desktop onboarding whenever a session reports no usable provider, so unconfigured installs cannot fall through to prompt errors.
* fix(desktop): route gateway provider errors to onboarding
The "No inference provider configured" auth error reaches the renderer through gateway error events, not the prompt.submit promise; the previous patch only caught the latter, so the error toast still surfaced and onboarding never opened.
Also strip credential-shaped env vars from the test:desktop:fresh sandbox so the packaged backend can't see provider keys leaking from the launching shell.
* fix(desktop): use strict runtime check to drive onboarding
setup.status returned True whenever any provider auth state was discoverable, including indirect fallbacks like a gh-CLI Copilot token. That made desktop think the user was set up while the agent's actual resolve_runtime_provider call still raised AuthError, leaving the user with a useless toast and no onboarding.
Add a setup.runtime_check gateway method that runs the same resolver the agent uses on session creation, and switch the desktop onboarding overlay and prompt precheck to use it.
* feat(desktop): OAuth-first onboarding using existing dashboard provider API
Replace the engineer-flavored API key form with a Sign-in-first onboarding overlay that uses the dashboard's existing /api/providers/oauth catalog and PKCE/device-code endpoints (Anthropic, Nous, OpenAI Codex, etc.). API key entry is now a fallback tab with friendly provider names instead of env var prefixes, and the loud raw resolver error is gone in favor of a one-line welcome message.
* fix(desktop): polish onboarding provider list
Reorder OAuth providers so Nous Portal is first, give the segmented Sign in / API key control equal column widths, and replace the engineer-flavored backend names like "Anthropic (Claude API)" / "MiniMax (OAuth)" with friendlier in-app titles. External-CLI providers now show a softer subtitle and an external-link icon instead of a chevron.
* refactor(desktop): split onboarding overlay into store + view
Move the OAuth state machine, runtime check, copy-to-clipboard, and api-key save into store/onboarding.ts (matching the boot.ts pattern), leaving the overlay as a presentation layer that subscribes via useStore. Tabs are now table-driven, child panels read flow from the store instead of prop-drilling, and the polling/PKCE/error/success branches share a small Status atom.
* fix(desktop): external CLI providers + center mode tabs
External-CLI providers (Claude Code, Qwen Code) now open an in-overlay panel with the CLI command, copy button, and an "I've signed in" recheck instead of firing an invisible toast. Center the Sign in / API key tab control so it sits under the heading instead of hugging the left edge.
* fix(desktop): drop onboarding tabs for an inline link, group device-code waiting state
Replace the Sign in / API key tab pair with an "I have an API key" footer link under the OAuth provider list, with a "Back to sign in" affordance inside the API key form. Group the device-code "Waiting for you to authorize..." status next to the Cancel button so the alignment matches the action.
* refactor(desktop): tighten onboarding store + overlay
Drop the dead isOnboardingBusy/BUSY set, factor the catch-fallback dance into safeReq, and share a single reloadAndConnect helper between PKCE submit, device-code success, external recheck, and api-key save.
In the overlay, extract Step / CodeBlock / FlowFooter / CancelBtn / DocsLink atoms so the four sign-in panels share the same chrome instead of repeating it inline. Net effect: fewer literal divs, one place to touch the spacing, and the code-block + footer rows are reusable across future flows.
* fix(desktop): mount onboarding from frame 1 to kill the FOUT
Default onboarding.configured to null (unknown until the runtime check resolves) and have the onboarding overlay render whenever it's not yet confirmed true. The boot overlay now yields to it, so the very first paint is the Welcome card with a "While we get you set up..." progress strip instead of a flash of the chat shell between boot dismiss and onboarding mount.
The picker swaps in cleanly once the gateway opens and the runtime check confirms the user is not configured. Already-configured users see the same prep card briefly while their existing runtime warms up, then the overlay dismisses without touching the chat shell.
* fix(desktop): top-align empty sessions placeholder
The "Start a chat to build your history." empty state used a min-h-35 grid place-items-center container, which floated the text in a tall dead zone. Render it as a flat paragraph that sits right under the section header like the empty pinned state does.
* refactor(desktop): drop dead boot overlay
Onboarding overlay subsumes the boot card now that it mounts from frame 1 and renders boot progress inline. The standalone DesktopBootOverlay is unreachable in every flow (yields whenever onboarding has not confirmed configured, dismisses once it has).
* fix(desktop): hide pinned/recents sections until first session
A fresh sidebar showed the Pinned and Recent chats headers with floating empty-state copy underneath. Drop both sections (and the now-orphan SidebarEmptySessionState) when there are no sessions yet — they reappear after the first chat. Skeletons during initial load are unchanged.
* feat(gui): route embedded TUI through dashboard gateway (#21979)
Inject HERMES_TUI_GATEWAY_URL into dashboard PTY sessions so embedded ui-tui instances attach to the in-process websocket gateway, with coverage for the new env wiring.
* Add desktop remote gateway settings
Make the desktop gateway connection configurable from settings so local remains the default while remote backends can be saved, tested, and applied without environment variables.
* feat(gui): first-class Messaging page + gateway menu redesign
- Add Messaging page to the desktop app with per-platform setup,
status, and inline guidance. Catalog derives from gateway.config
Platform enum + plugin registry, so every messaging adapter the CLI
supports (Telegram, Discord, Slack, Mattermost, Matrix, WhatsApp,
Signal, BlueBubbles, Home Assistant, Email, SMS, DingTalk, Feishu,
WeCom, Weixin, QQ, Yuanbao, API server, Webhooks, plugins) shows up
without per-platform code.
- New REST endpoints: GET /api/messaging/platforms, PUT and POST
/test on the same path. Secrets go through the existing .env
pipeline; enable/disable writes config.yaml.
- Replace gateway statusbar dropdown with a richer panel: status row,
icon-only restart + system-panel actions, recent activity (with
timestamps trimmed in display, full text on hover), platform list.
- Auto-poll the messaging page every 6s (paused when hidden) so
status updates without a manual check.
- Drop Settings / Command Center from the sidebar nav (still
reachable via shortcuts and the titlebar cog).
- Flatten top corners on Messaging/Skills/Artifacts/Chat panes.
- Share new StatusDot component across messaging + gateway menu.
- Fix gateway/config.py so an explicit platforms.<name>.enabled=false
in config.yaml is honored when env tokens are present.
- pb-9 on the chat content area for breathing room above the composer.
* Potential fix for pull request finding 'CodeQL / Clear-text logging of sensitive information'
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* pin electron version
* hide application menu on non-mac systems
* interpret compactPreview for non-string vlaues as JSON or an empty string
* fix(desktop): keep composer contenteditable mounted across stacked toggle
The composer rendered {input} inside two different parent fragments
depending on `stacked`. When auto-expand flipped `stacked` (e.g. the
moment typed text wrapped past two lines), React reconciled the two
branches as different positions and unmounted/remounted the
contenteditable. The fresh mount started empty, so any in-flight
characters — most reliably reproduced by holding a key — were lost.
Replace the conditional with a single CSS Grid whose template-areas
swap on `stacked`. The three children (menu, input, controls) keep
stable identities across the toggle; only their grid placement
changes, which the browser handles without React tearing down the
editor.
* refactor(desktop): align install layout with install.ps1 / install.sh
Make the desktop app's runtime layout match what scripts/install.ps1 and
scripts/install.sh produce, so a desktop-only user and a CLI-only user end
up with the same files in the same places and can share one install.
Layout
- ACTIVE_HERMES_ROOT = HERMES_HOME/hermes-agent (was: process.resourcesPath/hermes-agent, read-only)
- VENV_ROOT = HERMES_HOME/hermes-agent/venv (was: userData/hermes-runtime)
- desktop.log = HERMES_HOME/logs/desktop.log (was: userData/desktop.log)
- HERMES_HOME default: %LOCALAPPDATA%\hermes on Windows, ~/.hermes elsewhere
The packaged .app/.exe still ships a read-only payload at
process.resourcesPath/hermes-agent (FACTORY_HERMES_ROOT). On first launch
or after an installer-driven upgrade we sync factory -> active, then
provision the venv and run pip install -e . against the active root.
Key behaviors
- Pin HERMES_HOME in the spawned Python's env so get_hermes_home() resolves
to the same path resolveHermesHome() picked. Without this, Python falls
back to ~/.hermes on every platform - fine on mac/linux, a split-state
bug on Windows where our default is %LOCALAPPDATA%\hermes.
- Detect developer installs by .git presence at ACTIVE; never overwrite
a user's checkout via factory sync.
- Marker at ACTIVE/.hermes-desktop-runtime.json (schema v4) tracks
pyproject hash + factory version + runtime schema version. depsFresh
fast-paths when nothing changed.
- Dev (npm run dev) prefers SOURCE_REPO_ROOT over ACTIVE so devs run
their local edits, not whatever's under HERMES_HOME.
- Better error messages distinguish "no payload" from "no Python".
- Preserve a legacy ~/.hermes on Windows when no %LOCALAPPDATA%\hermes
exists, so users with prior pip/manual installs aren't orphaned.
pyproject.toml
- Promote fastapi, uvicorn[standard], ptyprocess (non-Windows), and
pywinpty (Windows) to main dependencies. The dashboard backend
(hermes dashboard) needs them at runtime; the previous lazy-import
fallback was a footgun for fresh installs.
- Empty the [pty] optional-extra; kept as a no-op back-compat alias for
any existing pip install hermes-agent[pty] invocations.
Drops the hardcoded BUNDLED_RUNTIME_REQUIREMENTS list in main.cjs - the
desktop now installs whatever pyproject.toml says, single source of truth.
Files
- apps/desktop/electron/main.cjs: runtime layout, HERMES_HOME pin,
factory->active sync, marker v4
- apps/desktop/scripts/test-desktop.mjs: track new venv location
- apps/desktop/README.md: new Setup, Runtime Bootstrap, and
Debugging sections
- pyproject.toml: fastapi/uvicorn/pty backends in main
dependencies; [pty] extra emptied
Tested locally on Windows: npm run dev boots cleanly, sessions land at
the new location, type-check + lint + test:desktop:platforms all pass.
Verified end-to-end on a fresh Win11 VM via dist:win installer.
Known gaps (filed as follow-ups, not in this PR):
- Skills not seeded on packaged installs (sync_skills only runs in
cmd_chat, not cmd_dashboard). Need to move to shared pre-dispatch.
- Git Bash not bundled or detected; agent's terminal tool errors out
with a useful message but desktop bootstrapper should pre-flight it.
- install.ps1 / install.sh should be decomposed into composable phase
libraries so the desktop bootstrapper can reuse them as a single
source of truth across all install surfaces.
* feat(desktop): theme polish, prose chat typography, composer chrome
- DS tokens/midground, Backdrop, scoped scrollbars, typography plugin + prose
- Composer liquid/radius utilities, thread font parity, tool/thinking cues
- File tree label scale, preview flex, thread retry loading + streaming tests
* feat(desktop): NSIS prereq detection page + auto-install via winget
The packaged Windows installer now detects Python 3.11+ and Git for Windows
at install time and offers to install missing prereqs via winget. Mirrors
the prereq logic scripts/install.ps1 already runs for CLI installs, so
desktop installer users get the same out-of-the-box experience as
install.ps1 users.
Why
- Hermes' terminal tool calls bash.exe directly (tools/environments/
local.py); on Windows that's Git Bash from Git for Windows. Without it,
the agent fails on the first terminal() call.
- Hermes' Python runtime needs 3.11+. Without it, the desktop bootstrapper
errors out at venv creation.
- Both gaps surfaced on a fresh Windows 11 VM smoke test: VM had Python
pre-installed but no Git, so the agent's first terminal call failed
with "Git Bash isn't installed."
- install.ps1 has had Install-Git + Install-Uv functions for ages. The
desktop installer was the asymmetric outlier.
How — NSIS prereq page
- New file: apps/desktop/installer/prereq-check.nsh (plugged into
electron-builder via build.nsis.include)
- Real Wizard page using nsDialogs, inserted via customPageAfterChangeDir
hook (between the Directory page and InstFiles).
- Group boxes for Python and Git, each showing detection status.
- Pre-checked install checkboxes when winget is available.
- Auto-skips silently if both prereqs are already installed.
- Falls back to manual download URLs when winget itself is missing.
- Detection:
- Python: probes `py -3.11`/`-3.12`/`-3.13`/`-3.14` via the Python
launcher. Microsoft Store "Python stub" (no py.exe) is correctly
classified as not-installed.
- Git: `where git`.
- winget: `where winget` (Win10 1809+ / Win11 with App Installer).
- Install execution (in customInstall macro):
- Python: nsExec::ExecToLog with `--scope user --silent`. Per-user
install, no UAC prompt, output streams to install log.
- Git: ExecShellWait via Windows ShellExecute. Critical because Git
always installs per-machine and triggers UAC; ShellExecute preserves
the foreground focus chain across non-elevated → elevated process
spawns, so UAC actually comes to the foreground. nsExec::ExecToLog
breaks the chain because winget runs hidden.
- Both pass `--disable-interactivity --accept-package-agreements
--accept-source-agreements` to suppress winget's own dialogs.
- Verification: probes Git's standard install locations via FileExists
rather than `where git`. NSIS's process inherits PATH at startup, so
a freshly-installed Git won't be visible to `where` until restart.
- Silent installs (/S) skip the prompts; managed deploys handle prereqs
out-of-band via Group Policy / Intune.
How — Electron-side safety net
- New findGitBash() in main.cjs, parallel to findSystemPython(). Probes
the same locations as tools/environments/local.py:_find_bash() so a
positive result here means the agent's terminal tool will work.
- ensureRuntime now throws a clear, actionable error on Windows when Git
Bash isn't found, matching the existing "Python 3.11+ is required"
error path.
- Catches users the NSIS page doesn't: .msi installer users (NSIS prereq
page doesn't run for MSI), `npm run dev` users, manual installers,
anyone who unchecked the install boxes on the NSIS prereq page.
- All gated on `IS_WINDOWS`; macOS / Linux unaffected.
NSIS build issue (resolved)
- electron-builder defaults to `-WX` (warnings as errors). NSIS optimizer
emits "warning 6010: function not referenced" for our page functions
because Page custom directives don't count as references in its
static-analysis pass. The functions ARE called at runtime when NSIS
invokes the page; the optimizer just can't see it statically.
- Set `build.nsis.warningsAsErrors=false` in package.json so this
spurious warning doesn't fail the build. (Documented option from
electron-builder's nsisOptions.)
Out of scope (filed for future work)
- MSI prereq detection: Windows Installer custom actions are a different
mechanism. Enterprise deploys typically handle prereqs via GP/Intune.
- Bundle PortableGit + python-build-standalone in extraResources for
zero-network installs. ~80MB increase.
- Mac / Linux GUI prereq flows (different installer formats; Xcode CLT
covers most macOS prereqs already; Linux is per-distro hard).
Files
- apps/desktop/installer/prereq-check.nsh (new, ~290 lines NSIS)
- apps/desktop/package.json (build.nsis.include +
warningsAsErrors)
- apps/desktop/electron/main.cjs (findGitBash + preflight)
- apps/desktop/README.md (Runtime prerequisites
section)
Cross-platform impact
- macOS / Linux builds (dist:mac, dist:mac:dmg, dist:mac:zip): nsis
config is ignored entirely; .nsh is dormant.
- npm run dev: .nsh dormant; main.cjs preflight gated on IS_WINDOWS.
- scripts/install.ps1, scripts/install.sh: no reference to any new
files; CLI install paths untouched.
- Hermes CLI / dashboard / gateway: no reference; runtime untouched.
- All checks: node --check on main.cjs and test-desktop.mjs pass;
npm run test:desktop:platforms 4/4 passing; node --test green.
Tested
- npm run dist:win produces signed .exe and .msi without errors.
- Fresh Win11 VM (Python pre-installed, no Git): prereq page renders,
Python check shows detected, Git checkbox pre-checked. Click Next →
Git installs via winget with UAC prompt in foreground.
- After install completes, Hermes launches and the agent's terminal
tool can run bash commands. Verified Git Bash is detected at
`C:\Program Files\Git\bin\bash.exe` by ensureRuntime's preflight.
* feat: theme changes, composer tweaks, in app update ux, finesse
* fix(cli): seed bundled skills on dashboard + gateway entrypoints
`sync_skills(quiet=True)` was only being called from inside `cmd_chat`,
which meant `hermes dashboard` (the desktop GUI's backend) and `hermes
gateway` (Telegram/Discord/Slack/etc daemons) never seeded the bundled
skill library into ~/.hermes/skills/.
This surfaced as "No skills found" in the desktop GUI's skills panel on
fresh installs, despite the agent having access to the full bundled
library when invoked via `hermes chat`. scripts/install.ps1 worked
around it by running skills_sync.py as part of Copy-ConfigTemplates,
but that's not part of the desktop installer's bootstrap chain.
Fix
- Extract the skills-sync block from cmd_chat into a module-level
`_sync_bundled_skills_quietly()` helper.
- Call the helper from cmd_chat (preserving existing behavior),
cmd_dashboard (after the --status/--stop early-return paths and
fastapi import check, so we don't run skills_sync on management
commands or when deps aren't installed), and cmd_gateway.
Why these three entrypoints
- cmd_chat: the user's primary CLI entrypoint
- cmd_dashboard: the desktop GUI's backend; this is what `hermes
dashboard --tui` invokes when the desktop bootstrapper spawns Hermes
- cmd_gateway: long-running daemons where the user expects the agent
to have full skill access
Other entrypoints (cmd_config, cmd_doctor, cmd_login, cmd_status,
etc.) are management commands that don't need skill discovery and were
never running skills_sync in the first place — leaving them alone.
Idempotence
- tools/skills_sync.py is manifest-based: skipped skills cost
milliseconds. Calling it from multiple entrypoints adds no real
cost, and users running `hermes chat` then `hermes dashboard` get
two fast no-ops on the second call.
Failure handling
- Helper wraps skills_sync in try/except. Skills are an enhancement,
not a hard dependency — Hermes runs fine with an empty skills/ dir.
Files
- hermes_cli/main.py:
+ new helper `_sync_bundled_skills_quietly()` at module level
+ cmd_chat: replace inline block with helper call
+ cmd_dashboard: add helper call after fastapi import succeeds
+ cmd_gateway: add helper call before delegating to gateway_command
* feat(desktop): hoisted todo widget, JSON tool summaries, history grouping & timer fixes
- Hoist todo to first-class widget (shadcn checkboxes, brand colors, no
tool-accordion). Header derives label from active task; non-active rows fade.
- Replace raw JSON dumps with structured key/value summaries via
formatToolResultSummary; nested error extraction for clearer failures.
- Fix loaded-session grouping: stitch interleaved assistant/tool iterations
into one bubble instead of orphaned synthetic messages.
- Stable tool/thinking timers via keyed registry so unmount/scroll doesn't
reset elapsed counts; gate "running" on real live thread state.
- Reorganize chat-only assistant-ui components under components/chat/.
* fix(desktop): address CodeQL alerts on PR #20059
- settings/helpers.ts: harden setNested against prototype pollution.
POLLUTING_PATH_PARTS check is now applied at every assignment site
(loop + leaf) and uses Object.defineProperty so CodeQL can see the
guard inline rather than via a helper function call.
- lib/markdown-preprocess.ts: rebuild the dangling-fence close regex
from a fence-char + length instead of marker.replace(...). The marker
is captured by `(`{3,}|~{3,})` so it can only be backticks or tildes,
but CodeQL was tracing tainted input text into the RegExp source and
flagging hostname dots from input as part of the pattern (false
positive js/incomplete-hostname-regexp on the test fixture URLs).
Reconstructing from a literal char breaks the dataflow.
- scripts/notarize-artifact.cjs: drop args from the run() rejection
message. Args carry --key-id / --issuer / key file path; the existing
outer catch already squashes errors to a generic line, but CodeQL was
flagging the args.join(' ') as clear-text logging of APPLE_API_KEY_ID.
Composer DOM-text-as-HTML alerts (composer/index.tsx:379, :547) are
already addressed in 4dd9732a9 — innerHTML assignment was replaced with
renderComposerContents which builds DOM via replaceChildren / append
text nodes (no HTML interpretation).
* fix(desktop): inline prototype-pollution guard so CodeQL sees it
CodeQL's dataflow doesn't follow the helper-function guard inside
`safeSet`, so it kept flagging Object.defineProperty as prototype-
polluting. Inline the literal `__proto__`/`constructor`/`prototype`
check at the assignment site to break the dataflow.
Behavior unchanged — same set of disallowed keys, same throw.
* feat(ui-tui): resolve links to readable page titles
Mirror desktop pretty-link behavior in the TUI by resolving HTTP links to page titles with shared caching and safe fetch filters, plus slug-based fallbacks so chat links stay readable even when title fetch fails.
* fix(desktop): drop RegExp from dangling-fence close detection
Previous attempt tried to break the dataflow by reconstructing the
close-fence regex from a literal char + marker.length, but CodeQL still
traced marker.length back to input and kept flagging the test-fixture
URLs as hostname-regex sources (js/incomplete-hostname-regexp).
Replace `new RegExp(...)` + `closeRe.test(body)` with a string-only
hasCloseFenceLine() helper that splits on '\n' and uses ===. No regex
on this path now, so input data can no longer reach a RegExp source.
Behavior preserved: matches lines that are (whitespace + marker +
whitespace), which is what the original `\n[ \t]*${marker}[ \t]*(?=\n|$)`
matched. All 12 markdown-text tests still pass.
* fix(process-registry): suppress windows-footgun false positive on guarded killpg
Keep the existing POSIX-only process-group teardown path, but make the
signal selection explicit via getattr and add an inline windows-footgun
suppression marker on the guarded os.killpg line so the Windows footgun
check no longer blocks CI on this intentionally platform-gated code.
* feat(desktop): reconcile live tool events, polish thread chrome, harden boot
- chat-messages: match tool rows by overlapping query/context/preview values
so preview-first `tool.progress` rows reliably adopt later stable-id
`tool.start` payloads instead of spawning ghost rows or mis-merging
parallel same-name calls; preserve prior args/result across phases.
- tui_gateway: emit full args + parsed result on `tool.start` / `tool.complete`,
drop redundant `tool.started` re-emit from `tool.progress`.
- electron/main: prefer SOURCE_REPO_ROOT before PATH `hermes` in dev so
local backend edits actually run; split hardening helpers into
`electron/hardening.cjs` with tests.
- thread/tool UI: one-shot enter animation keyed by stable ids, braille
spinner for running rows, Cursor-like disclosure rows, drill-down +
duration/count formatting via new tool-fallback-model.
- composer: extract `text-utils`, drop liquid-glass overrides.
- right-rail: split preview-pane into preview-console / preview-file.
- runtime: incremental external-store runtime + runtime-readiness gate;
onboarding store + tests; route-resume hook test.
- regression tests for live tool reconciliation (parallel tools, id-less
progress, preview-first rows, structured args/results).
* feat(desktop): add ripgrep to NSIS prereq page + polish layout
Add ripgrep as a third (recommended) prereq alongside Python and Git in
the NSIS prereq detection page, and clean up the page layout based on
on-VM testing.
Why ripgrep
- Hermes' search_files tool calls `rg` directly for content + filename
search (tools/file_operations.py:1382). Falls back to grep/find from
Git Bash when missing — works but slower and noisier (no .gitignore
awareness).
- ~5MB winget install via `BurntSushi.ripgrep.MSVC --scope user` — no
UAC prompt, parallel to how Python installs.
- scripts/install.ps1 already installs ripgrep as part of
Install-SystemPackages; this brings the desktop installer to parity.
Why "recommended" not "required"
- Python and Git are hard requirements: without them the agent runtime
or terminal tool refuses to start. The bootstrapper preflight throws.
- ripgrep is a performance enhancement: missing it just means slower
searches. Page wording reflects this; failure to install is logged
but doesn't show a MessageBox or block.
Layout polish (response to on-VM screenshot review)
- Wizard header now correctly reads "System Requirements" instead of
the leftover "Choose Install Location" from the previous page. Set
via `GetDlgItem $HWNDPARENT 1037/1038` + WM_SETTEXT — the standard
NSIS pattern for overriding the page header on a custom Page.
- Removed redundant in-body title + verbose intro paragraph; the
wizard header IS the title now. Body has one short intro line.
- Group boxes tightened to 26u with content positioned just below the
groupbox title (not top-anchored status + bottom-anchored checkbox
with empty space in the middle). All three panels + footer fit
comfortably in 126u, well under the 140u page limit.
- Checkbox labels simplified: dropped "(per-user, no admin prompt)"
and "(administrator approval required)" suffixes. The footer note
still calls out UAC for Git when relevant.
- Footer text trimmed to fit cleanly without clipping.
Install order (in customInstall macro)
- Python → ripgrep → Git
- Python and ripgrep are silent and run first; Git's UAC prompt comes
last so the user's approval interaction isn't interrupted by silent
activity afterwards.
Skip behavior unchanged
- All three detected → page auto-skips via Abort
- Silent install (/S) → customInstall winget block skips
- User unchecks all → page advances without running winget
Files
- apps/desktop/installer/prereq-check.nsh: ripgrep detection block,
ripgrep page panel + checkbox, ripgrep customInstall block,
GetDlgItem header override, layout reflow
- apps/desktop/README.md: Runtime prerequisites section updated to
list ripgrep as recommended, with manual winget command
* feat(desktop): add model-confirmation step to onboarding
After OAuth/API-key login completes, onboarding now shows a confirmation
card with the curated default model and a Change button before dropping
the user into chat. Closes the gap where the desktop's `model.default`
was empty after first launch and the agent had to fall back to whatever
heuristic happened to fire — leaving users wondering "why am I getting
sonnet-4 when I logged into Nous Portal?"
Why
- Desktop onboarding only persisted credentials, never `model.default`.
The CLI's `hermes model` command pairs provider + model selection,
but the desktop's onboarding skipped the model step entirely.
- Result: users saw whichever model the agent's auto-fallback picked,
unpredictably and undocumented.
- For the BUILD demo we want users to land on the model they expect
for their provider, with a clear "this is what you're getting" UI
and a one-click path to change it before chatting.
How
- New `confirming_model` flow status carries the just-authenticated
provider slug, current default model, label, and a saving flag.
- `completeWithModelConfirm()` runs after credentials succeed: reloads
env, verifies runtime, fetches /api/model/options to find the curated
first-model for the provider, persists it via /api/model/set, then
transitions into `confirming_model`.
- If anything fails (no providers returned, network error), falls
through to the previous behaviour — onboarding completes without
the confirm step. Polish, not a hard requirement.
- All four credential paths (device_code OAuth, PKCE OAuth, external
CLI flow, API key) now use completeWithModelConfirm instead of
reloadAndConnect.
UI
- `ConfirmingModelPanel` shows: green "<provider> connected" banner,
card with "Default model: <name>" + Change button, and a "Start
chatting" CTA that finalises onboarding.
- Reuses the existing `ModelPickerDialog` (the same picker available
from the chat shell) for the change-model UX. Search, filtering,
multi-provider listing — all already built.
- Stacking: ModelPickerDialog defaults to z-130, which renders UNDER
the onboarding overlay (z-1300) and breaks pointer events. Added
optional `contentClassName` prop to ModelPickerDialog so callers
can override; onboarding passes `z-[1310]`.
Provider-slug matching
- For OAuth flows: pass `provider.id` directly as the preferred slug.
- For API-key flows: `OPENROUTER_API_KEY` → "openrouter" via env-key
prefix strip. Also includes the user-visible label as a fallback
candidate.
- fetchProviderDefaultModel falls back to the first authenticated
provider in the response if no preferred slug matches — so even a
miss still surfaces a reasonable default.
Files
- apps/desktop/src/store/onboarding.ts:
+ new `confirming_model` flow variant
+ fetchProviderDefaultModel + completeWithModelConfirm helpers
+ setOnboardingModel (optimistic update + revert on failure)
+ confirmOnboardingModel (finalises onboarding from the card)
- reloadAndConnect (replaced; the four call sites now go through
completeWithModelConfirm)
- apps/desktop/src/components/desktop-onboarding-overlay.tsx:
+ ConfirmingModelPanel component
+ new branch in FlowPanel for status `confirming_model`
+ ModelPickerDialog usage with z-[1310] content class
- apps/desktop/src/components/model-picker.tsx:
+ optional `contentClassName` prop on ModelPickerDialog so the
dialog can be stacked on top of other fixed overlays
Tested
- `npm run type-check` passes
- `npx eslint` clean on touched files
- Live test in `npm run dev`: cleared onboarding cache, walked
through Nous device-code flow, saw confirm card with curated
default, clicked Change → ModelPickerDialog rendered above the
onboarding overlay with working pointer events, picked a different
model, "Start chatting" persisted to ~/.hermes/config.yaml.
* fix(desktop): suppress generic provider warning in onboarding
Hide the red setup notice when the message is the generic missing-provider guidance, since onboarding already presents provider auth actions. Centralize provider-setup matching across desktop hooks and add coverage for the matcher.
* fix(desktop): add 2u clearance below prereq checkboxes
Group box bottom border was clipping the checkboxes by 1-2px.
Bumped each box height 26u→30u; checkboxes now sit 2u above the bottom border.
* fix(nix): refresh dashboard lockfile hash
Update the web npm deps hash in nix/web.nix to match the committed apps/dashboard/package-lock.json so bb/gui passes the nix lockfile check.
* fix(desktop): install TUI deps in release workflow
Ensure desktop release builds install the standalone ui-tui package before bundling the TUI payload.
* fix(desktop): run release builder from app package
Invoke the desktop builder through the package script so electron-builder uses apps/desktop/package.json.
* fix(desktop): expand release artifact names safely
Build desktop artifact names from workflow version/channel while preserving electron-builder platform macros.
* fix(desktop): use package artifact naming in release workflow
Let electron-builder's desktop package config provide platform-specific artifact extensions while the workflow injects the release version/channel metadata.
* fix(nix): fetch dashboard npm deps from package root
Point the dashboard npm dependency fetch at apps/dashboard so Nix can find the package lockfile after the dashboard move.
* fix(nix): build dashboard from package directory
Set the web package source root to apps/dashboard so npm patch/build phases run beside the dashboard lockfile while keeping apps/shared available as a sibling.
* feat(desktop): render LaTeX math via KaTeX after streaming completes
Add @streamdown/math plugin to the chat markdown renderer.
Inline ($x^2$) and block ($$...$$) math both supported with
singleDollarTextMath enabled. Plugin is gated to non-streaming state
to match the existing pattern for syntax highlighting — math renders
when the message completes, avoiding KaTeX re-render churn during
streaming. KaTeX CSS is imported in styles.css; ~30KB CSS + ~430KB
JS added to the bundle. Smoothness improvements during streaming
deferred to a follow-up.
* perf(desktop): memoize KaTeX renders so math streams without re-rendering
Wrap rehype-katex with a per-equation LRU cache (keyed by
displayMode + source text) and re-enable math during streaming.
Stock @streamdown/math runs rehype-katex on every markdown commit,
so each new token re-katexes every equation in the message. For
math-heavy responses (an equation derived step-by-step) that's
hundreds of ms of wasted work per token and the streaming UI
chokes. With memoization, each equation pays katex.renderToString
exactly once; subsequent tokens re-walk the tree but hit cache for
unchanged equations.
The wrapper mirrors rehype-katex's semantics exactly: same class
detection (language-math, math-inline, math-display), same
<pre>-walk-up for fenced math blocks, same parent.children.splice
replacement, same SKIP traversal, same strict-then-lenient render
strategy with VFile message reporting.
Cached children are structuredCloned on each splice so downstream
rehype plugins or toJsxRuntime can't mutate the cache.
* fix(desktop): declare katex-memo deps directly + drop per-app lockfile
katex-memo.ts (added in 112cad59b) imports hast-util-from-html-isomorphic,
hast-util-to-text, remark-math, katex, and unist-util-visit-parents but
those were never added to apps/desktop/package.json. They were silently
resolving via @streamdown/math at the workspace root, which broke the
moment `npm i --prefix apps/desktop` ran with the per-workspace lockfile
because that install only consults apps/desktop/package.json. Add them
as direct deps, plus unified/vfile/@types/hast for the type imports.
Also delete apps/desktop/package-lock.json — root package.json declares
workspaces: ["apps/*"], so npm manages all lockfile state at the root.
The stale per-app lockfile is what made `npm i --prefix apps/desktop`
diverge from the workspace install in the first place and left an empty
apps/desktop/node_modules/@assistant-ui/ stub that Vite's dep optimizer
then tried (and failed) to open at @assistant-ui/core/dist/internal.js.
* feat(desktop): disable Backdrop noise overlay by default
The noise overlay defaulted to on, which adds a busy speckle layer over
the whole window for every new user. Flip the Leva default to off; the
toggle stays in Backdrop / Noise for anyone who wants it back.
* fix(desktop): polish LaTeX rendering — currency, code blocks, brackets
Five distinct bugs surfaced from a math-heavy stress test:
1. Adjacent code fences glued together. scrubBacktickNoise's
second-pass regex /``\s*``/g matched the LAST 2 backticks of
one fence + whitespace + FIRST 2 backticks of the next, collapsing
two blocks into one. Fixed with lookbehind/lookahead so we only
match exactly 2 backticks not part of a longer run.
2. Whitespace eaten between fences and following content.
stripPreviewTargets internally calls .trim() which strips leading/
trailing whitespace from each split-segment. For segments between
two fences this collapsed \n\n to '', gluing fence close to next
block. Fixed by capturing leading/trailing whitespace at the call
site and restoring it after the transform.
3. Currency dollar signs eaten as math. With singleDollarTextMath:true
remark-math greedy-matched any pair of $, so '$5 ... $10' became
one inline math span. Added escapeCurrencyDollars to escape $<digit>
patterns to \$<digit> in prose segments (not in code). Trade-off:
math expressions starting with a digit (rare — '$5x = 10$') get
escaped too. Mirrors the convention in ChatGPT/Claude's UIs.
4. \(...\) and \[...\] LaTeX brackets unsupported. Models often
emit these instead of $...$ / $$...$$. Added
rewriteLatexBracketDelimiters preprocessor pass.
5. ```latex / ```tex blocks were being routed to KaTeX via a
rewrite to ```math. Aligns with GitHub markdown convention:
```math = render as math; ```latex / ```tex = LaTeX/TeX
source code (syntax highlighted, not rendered). Conflating them
broke teaching/showing-source use cases. MATH_FENCE_LANGUAGES
pruned to {'math'} only.
Also flipped parseIncompleteMarkdown to true (was !isStreaming) so
the math parser can't see $ inside streaming-but-not-yet-closed code
fences. Shiki was already deferred via defer={isStreaming} so this
doesn't introduce new tokenization cost.
Test: 18/18 existing tests still pass; one test updated to expect
escaped \$ in currency-prose-with-URL case.
* fix(desktop): detect Python via registry/filesystem; pin to 3.11–3.13
Two related fixes for Python detection on Windows:
1. py.exe (Python launcher) is missing from per-user installs that
didn't check the launcher option, so 'py -3.X --version' alone
misses real Python installs. User-reported case: clean Win11 +
official Python.org 3.14 install -> 'where py' returned nothing,
our installer offered to install Python again. Both NSIS prereq
page and main.cjs now probe in this order:
1. py.exe launcher (when present)
2. PEP 514 registry: HKLM/HKCU\SOFTWARE\Python\PythonCore\<v>\InstallPath
3. Filesystem: %ProgramFiles%\Python<v>, %LocalAppData%\Programs\Python\Python<v>
Crucially, we never fall back to running 'python.exe' from PATH
on Windows — the WindowsApps stub at %LOCALAPPDATA%\Microsoft\
WindowsApps\python.exe is a redirector that opens the Microsoft
Store window if no Store Python is installed. Triggering that
during boot would be terrible UX. Registry/filesystem probes
never execute the binary.
2. Drop 3.14 from the supported version set. Several Hermes deps
(notably pywinpty, which carries Rust crates like
windows_x86_64_msvc) don't yet publish 3.14 wheels. With wheels
missing, 'pip install -e .' falls back to building from sdist,
which needs a Rust toolchain — users see 'could not compile
windows_x86_64_msvc build script' on first run. install.ps1
sidesteps this by pinning to 3.11 via uv; the desktop installer
doesn't yet have the same uv-managed-Python pathway, so for now
we accept 3.11/3.12/3.13 and tell winget to install 3.11 if
none of those are present. Revisit when the wheel ecosystem
catches up to 3.14 (~early 2026).
* feat(desktop): Cron, Profiles, usage analytics, and titlebar fixes
- Add Cron and Profiles sidebar routes with full CRUD-style flows and API wiring.
- Extend Command Center with auxiliary task overrides and a Usage panel (7d/30d/90d).
- Fix titlebar geometry for WSL/Windows (native overlay width, tool spacing).
- Remove stray merge conflict markers from pyproject.toml optional deps.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(title-bar): position sidebar toggle button
* feat(desktop): composer queue — queue many, edit/delete/cancel-edit, Cursor-style
Press Enter while busy with a draft to queue it; with no draft to interrupt
and send the next queued turn. Auto-drains one queued turn each time the
session settles, same as Cursor. Queue persists across reloads so an
interrupted-and-queued turn isn't lost on refresh.
Each queued row supports edit-in-composer (with explicit Save/Cancel),
send-now (↑), and delete. Drain skips only the entry currently being
edited so the rest of the queue keeps flowing.
Queue dequeue is transactional — an entry only leaves the queue after
`prompt.submit` is accepted, so a rejected submit doesn't drop the turn.
Also shrinks the `[interrupted]` marker to a muted one-liner and drops
its assistant footer so it stops looking like a real reply.
* fix(desktop): handle empty usage analytics totals
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(desktop): address PR review titlebar and usage races
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(desktop): add MCP settings and live subagent tree
Surface configured MCP servers in Settings with JSON edit/save and a gateway-backed reload action so users can manage tool servers without falling back to slash commands.
Track live subagent gateway events in a desktop store, show active subagent counts in the Agents statusbar item, and replace the Agents overlay stub with a live spawn tree for the active session.
* fix(desktop): move power-user views out of sidebar
Keep Cron and Profiles available through lower-prominence chrome entry points so the workspace sidebar stays focused on core chat navigation.
Co-authored-by: Cursor <cursoragent@cursor.com>
* refactor(desktop): subagent overlay reads like a live transcript, not a dashboard
Strip the card chrome and rewire /agents to feel like peeking into the
child agent's stream:
- subagents store: single `stream` of typed entries (thinking/tool/progress/
summary) replaces the parallel notes/thinking/tools arrays. Drop unused
fields (toolsets, depth, apiCalls, reasoningTokens, sessionId).
- agents view: no OverlayCards, no boxed stream, no per-row borders. Goal +
status pill + indented stream lines, full row width.
- Group root spawns into "Delegation N" sections when batch shape + spawn
time match — hides task-index interleaving and makes hierarchy obvious.
- Sort tree by spawn time, then task_index. Step indicator is one colored
pill (primary while running, emerald when done) inside the row, not a
trailing pill that wrapped under the chevron.
- Tree picks up `subagent.start` (not only `spawn_requested`) and prunes
delegate-tool fallback rows once native subagent events land for the
session — fixes duplicate "Delegated task" rows alongside the real ones.
* feat(desktop): Esc closes every OverlayView-based overlay
Lift the keyboard handler into the shared OverlayView so Agents, Settings,
Command Center — and anything we build on top of it later — all dismiss on
Esc by default. Nested Radix dialogs stop propagation themselves, so a
modal opened inside an overlay (e.g. model picker inside Settings) still
closes the modal first, not the overlay underneath.
Drop the now-redundant Esc handlers in Settings (kept Cmd/Ctrl+P) and
Command Center.
* fix(desktop): drop numbered step pill on subagent rows
The pill was getting clipped at the overlay edge anyway. Just use the
status glyph (●/✓/✗/■/○) — the delegation header already conveys
"3 workers, 3 active", and order in the list implies which step you're
looking at.
* fix(desktop): drop noisy "returned N items / empty object" stub strings
When a tool returns nothing useful, the row should be silent — the title
("Search Files", etc.) already tells the user what happened. Counting the
fields in an opaque payload is engineer-noise.
`formatToolResultSummary` and `minimalValueSummary` now return '' for
empty arrays / records / unrecognized values; tool-fallback already hides
the detail section when its body is empty.
* refactor(desktop): subagent rows borrow chat tool patterns (fade-in, lucide glyphs, shimmer)
Pull the agents view closer to how chat tool blocks render:
- statusGlyph() returns the same lucide BrailleSpinner / CheckCircle2 /
AlertCircle vocabulary as tool-fallback's statusGlyph
- Stream lines fade-in via useEnterAnimation (one-shot WAAPI), keyed per
entry so streamed deltas settle in instead of popping
- Subagent rows fade in too, and pick up the existing data-slot=tool-block
spacing rules between blocks
- Active stream line trails a BrailleSpinner instead of a hand-rolled
pulsing rectangle
- Goal text drops FadeText (which forces nowrap); keep FadeText only for
the single-line meta subtitle
- Running rows shimmer the title — same affordance the chat thinking row
uses
* refactor(desktop): make /agents subagent-only, drop sidebar + dead sections
Activity rail and History stub were both noise. Strip the split layout,
sidebar, route enum, and the rail/stub helpers — the overlay is now just
the spawn tree, centered in a max-w-3xl column so it stops claiming the
whole screen for one section's worth of content.
* feat: update cron modals
* Add dedicated GUI log stream for dashboard debugging.
Capture dashboard and PTY websocket lifecycle failures in gui.log and expose it via hermes logs.
* Improve desktop runtime UX by surfacing inference readiness in gateway status and hardening WSL link opening.
This also stabilizes markdown code/table block spacing and adds root-install guards so desktop dev runs use a healthy workspace dependency tree.
* Log detailed GUI websocket failure metadata.
Capture richer reject/disconnect/send/parse context for dashboard gateway websocket flows so GUI connection failures are diagnosable from logs.
* Default dashboard startup logging to GUI mode.
Detect the dashboard subcommand during early CLI bootstrap so gui.log is attached from process start and GUI startup failures are always captured.
* Clean up gateway status conditionals and logging bootstrap mode detection.
Simplify nested dashboard gateway status branches for readability and use a concise first-subcommand check when selecting early GUI logging mode.
* add logging to nsis installer
* feat: glass ui pass
* fix(desktop): persist inline assistant errors across hydrate/resume
- Detect provider failure text arriving via message.complete
(HTTP 4xx, "API call failed after N retries", Provider/Gateway
error: ...) and persist as an inline assistant error instead of
regular completion text, blocking the hydrate that was wiping it.
- preserveLocalAssistantErrors: merge by id so same-id hydrated
messages keep their local error, and preserve the optimistic
user+error pair as a unit (with tail-user dedupe).
- Hook all hydrate/resume writers (use-session-actions resume +
fallback, hydrateFromStoredSession, syncSessionStateToView) into
the merge so stale snapshots can't clobber a failed turn.
- Add error to chatMessagesEquivalent so the resume diff actually
sees error-only changes and paints them.
- editMessage on a failed turn now submits a plain resend (no
truncate_before_user_ordinal) and retries plainly on the
"no longer in session history" race.
Style polish on touched files:
- Inline error: text-only treatment (no card).
- User stop / edit-composer send: shared Tabler IconPlayerStopFilled
glyph + shared icon-button class slot for parity.
* feat(desktop): theme xterm with active light/dark mode
The right-sidebar terminal hardcoded a light palette, which read poorly
on the dark glass surface. Subscribe to `useTheme().resolvedMode` and
hot-swap `term.options.theme` so Shift+X (and any other mode change)
updates the terminal in place without tearing down the PTY session.
Dark mode uses xterm's built-in defaults (white fg/cursor + vivid ANSI
16) with just a transparent background so the glass shows through;
light mode keeps the existing hand-tuned overrides for legibility on a
bright surface.
* feat(sidebar): right-click + drag-reorder sessions and workspaces
- Wire right-click on session rows to open the same actions menu;
suppresses the OS-native context menu so Windows stops looking awful.
- Share dropdown + context menu items via useSessionActions() driving
a single declarative ItemSpec[]; render polymorphic over MenuItem.
- New shadcn ContextMenu primitive mirroring DropdownMenu styling.
- Restore drag-and-drop reordering for Agents (lost during the cwd
cleanup) and add reordering of workspace groups via a right-side
grab handle. Pinned reorder unchanged.
- Generic orderByIds<T> replaces the duplicated session/group orderers;
useSortableBindings() hook collapses the two Sortable wrappers.
- cursor-pointer on every actionable element; cursor-grab on handles.
- KISS pass: baseName() helper, AGE_TICKS table, single WORKSPACE_PAGE
constant, flatter SidebarSessionsSection render.
* feat(desktop): solarize the xterm palette in both light & dark
xterm's default ANSI 16 is tuned for dark and reads candy-bright on the
light glass surface (vivid cyans/greens). Ship the canonical Solarized
palette (Schoonover) for both modes — same 16 accents either way, only
fg/cursor swap between `base00/01` (light) and `base0/1` (dark), so a
prompt's colors look uniform across a Shift+X toggle.
Background stays transparent in both modes — Solarized's cream/slate
backgrounds would fight the glass.
* feat(desktop): virtualize chat thread + sidebar via TanStack Virtual
Replaces `use-stick-to-bottom` and per-row session rendering with
`@tanstack/react-virtual`, matching what Cursor uses.
Chat thread (`thread-virtualizer.tsx`):
- Natural-flow virtualization (padding spacers, not absolute items) so
`position: sticky` on the human bubble still resolves cleanly against
the scroller.
- Custom at-bottom anchor: pins when armed, disarms on user-driven
upward scroll, re-arms at bottom, jumps on session switch +
`thread.runStart`.
- Loading indicator and `--thread-last-message-clearance` move to a
real `[data-slot=aui_composer-clearance]` node; drops the brittle
`:nth-last-child(1 of …)` rule that can't fire reliably under
virtualization.
Sidebar (`virtual-session-list.tsx`):
- Flat agents list virtualizes at >=25 rows; pinned and
workspace-grouped paths stay direct-render.
- `SortableContext` keeps all IDs; only the window mounts; dnd-kit's
`setNodeRef` is merged with `virtualizer.measureElement` so rows
participate in both DnD hit-testing and TanStack measurement.
Drops `use-stick-to-bottom`. Streaming test gets a global
`offsetWidth/offsetHeight` stub so the virtualizer's viewport sizing
works in jsdom; the scroll-up-doesn't-pull-back invariant still passes.
* feat: more ui qa
* fix(desktop): trim sidebar terminal startup spacer
Drop zsh's initial spacer row before writing the first terminal prompt so new sidebar terminal sessions do not open with a selectable blank line.
* chore: uptick
* feat(desktop): thin installer + first-launch install.ps1 bootstrap
Converges the Windows packaged desktop installer onto a single canonical
install topology: drop the Electron shell only (~80MB instead of ~500MB),
clone Hermes Agent at a build-time-pinned commit on first launch via
install.ps1's stage protocol, and treat the resulting git checkout at
%LOCALAPPDATA%\hermes\hermes-agent\ as the canonical install location
(same path the CLI installer uses). Future updates flow through the
existing applyUpdates() git-pull path.
Replaces the previous fat-installer architecture where the .exe bundled
a pre-staged hermes-agent source tree under resources/hermes-agent/ that
was then sync'd into ACTIVE_HERMES_ROOT at launch -- a complicated
factory-vs-active dance with several footguns (FACTORY_HERMES_ROOT
mismatch on path resolve, isGitCheckout guard regressions, pyproject
hash drift detection inside the sync loop).
Architecture overview
---------------------
Build time
apps/desktop/scripts/write-build-stamp.cjs writes
apps/desktop/build/install-stamp.json with {commit, branch, builtAt,
dirty}. Honours $GITHUB_SHA / $GITHUB_REF_NAME in CI, falls back to
`git rev-parse HEAD` locally.
apps/desktop/scripts/stage-native-deps.cjs copies the runtime subset
of @homebridge/node-pty-prebuilt-multiarch from the workspace-root
node_modules into apps/desktop/build/native-deps/. Workspace dedup
hoists this dep to the root, out of reach of electron-builder's
`files:`-restricted collector; staging gives us a deterministic
path to extraResources.
electron-builder ships both into resources/install-stamp.json and
resources/native-deps/ respectively.
Boot resolver (electron/main.cjs)
Resolver order:
1. HERMES_DESKTOP_HERMES_ROOT override
2. SOURCE_REPO_ROOT (dev mode)
3. ACTIVE_HERMES_ROOT git checkout WITH .hermes-bootstrap-complete
marker -- the post-install fast path
4. `hermes` on PATH (CLI-installed user adding the desktop)
5. pip-installed hermes_cli via system Python
6. bootstrap-needed sentinel -> hand off to runBootstrap
Deletes the entire FACTORY_HERMES_ROOT / RUNTIME_MARKER /
syncTreeExcludingVenv machinery (-200 lines). The isGitCheckout
guard that bit us in the install.ps1 PR is gone.
First-launch bootstrap (electron/bootstrap-runner.cjs)
1. Resolve install.ps1: prefer SOURCE_REPO_ROOT/scripts (dev), else
download from GitHub raw at INSTALL_STAMP.commit (cached at
HERMES_HOME\bootstrap-cache\install-<sha>.ps1).
2. Fetch the stage manifest via install.ps1 -Manifest -Commit X
-Branch Y.
3. Iterate stages: install.ps1 -Stage <name> -NonInteractive -Json
-Commit X -Branch Y per stage.
4. On all stages green: write the .hermes-bootstrap-complete
marker with {schemaVersion, pinnedCommit, pinnedBranch,
completedAt, desktopVersion}.
Per-run log to HERMES_HOME\logs\bootstrap-<ts>.log. Cancellation
via AbortSignal. Manifest cache so retries don't re-download.
Install overlay (src/components/desktop-install-overlay.tsx)
Mounted alongside the existing onboarding overlay; flexbox card
with header (static) + middle (scrollable) + footer (failure-only,
static). Subscribes to hermes:bootstrap:event IPC + resyncs from
hermes:bootstrap:get on mount/reload. Renders:
- 14-stage checklist with per-stage state icons
- Overall progress bar + current-stage spotlight
- Auto-expanded installer-output panel on failure
- "Copy output" button (full ring buffer + error to clipboard)
- "Reload and retry" wired through hermes:bootstrap:reset to
clear main.cjs's latched failure
Synthetic empty-manifest event from main.cjs flips the overlay to
'active' immediately so the slow install.ps1 download doesn't
leave the user staring at the generic Preparing splash.
Failure latching (main.cjs)
bootstrapFailure module-scope variable holds the rejection after
install.ps1 fails. startHermes() throws the latched error
immediately when set, bypassing the entire ensureRuntime +
runBootstrap chain. Without this, the renderer's ensureGatewayOpen
retries would re-run install.ps1 in a 5-10 min hot loop while the
user was still reading the failure overlay. Cleared via
hermes:bootstrap:reset on user-driven retry.
Unsupported-platform overlay (1F)
macOS / Linux packaged builds (no install.sh stage protocol yet)
emit an unsupported-platform event with a copy-pasteable install
command + docs URL. Dedicated overlay branch with "Copy command"
+ "I've run it -- retry" buttons.
install.ps1 additions (Phase 1F.3 + 1F.5)
-----------------------------------------
New -Commit and -Tag string params. Precedence Commit > Tag >
Branch. Honoured by all three code paths (update / fresh clone /
ZIP fallback), with archive URL selection that handles each
ref-type variant. Detached-HEAD checkouts intentionally -- they're
pins, not branches the user pulls into.
EAP=Continue wrap around the new pin-step git invocations. `git
fetch origin <commit>` writes the routine 'From <url>' info line to
stderr; under the script's global EAP=Stop that terminates the
script even though fetch+checkout succeed. Matches the established
pattern in Install-Uv, Test-Python, _Run-NpmInstall.
Backend fix (hermes_cli/web_server.py)
--------------------------------------
CORS allow_origin_regex now accepts Origin: 'null'. Packaged
Electron loads index.html via file://; Chromium sets the WebSocket
upgrade Origin header to the opaque origin 'null', which the old
regex rejected with HTTP 403 before gateway_ws() ever ran. This
failure mode was masked in the older FACTORY_HERMES_ROOT
architecture because the resolver often found an existing hermes
on PATH with different binding behavior.
Security maintained: localhost-only bind keeps cross-machine pages
out; per-process session token still gates every authenticated
/api/ endpoint regardless of Origin.
Desktop QoL
-----------
DevTools is now enabled in packaged builds (F12 / Cmd+Opt+I).
Field-debugging trade-off: tiny attack surface increase versus
a much better support story when CSP / WS / theme issues surface.
NSIS prereq-check page deleted (-767 lines). The standard
Welcome -> License -> Directory -> InstallFiles -> Finish wizard
now installs without custom Python/Git/ripgrep detection -- those
prereqs are install.ps1's job at first launch.
Test infrastructure (Phase 1G)
------------------------------
apps/desktop/scripts/test-desktop.mjs rewritten as a cross-platform
bundle validator (was darwin-only and asserted on dead factory-
payload paths):
NEGATIVE: hermes_cli/main.py is NOT shipped (regression guard)
POSITIVE: install-stamp.json carries a real commit + branch
POSITIVE: node-pty native deps shipped under resources/native-deps
POSITIVE: renderer dist/index.html reachable (asar or unpacked)
New nsis mode and npm run test:desktop:nsis script.
Validated end-to-end on clean Win10 VM
--------------------------------------
Confirmed: NSIS installer drops Electron shell, app launches,
install overlay shows progress, install.ps1 clones the pinned
commit, 14 stages run to completion, marker written, backend
spawns, WebSocket connects, onboarding overlay asks for API key,
main UI loads, integrated terminal works.
Failures handled: bootstrap stays failed (no hot-loop retry),
"Copy output" gives actionable transcript, "Reload and retry"
explicitly re-runs install.ps1.
What's deferred
---------------
- MSIX wrapping (Phase 2): same Electron .exe under MSIX manifest
with runFullTrust, signed and submitted to Microsoft Store.
- install.sh stage protocol parity (Phase 2): once shipped, the
unsupported-platform overlay becomes drive-it-yourself and
macOS/Linux packaged installers gain feature parity with Windows.
* feat(desktop): persistent terminal pane + fullscreen takeover
Adds a VSCode-style "focus terminal" toggle to the right sidebar's Terminal
tab that takes over the chat pane area without unmounting the shell. The
xterm host is mounted once at the layout root and CSS-overlayed onto
whichever <TerminalSlot /> is currently active, so the PTY session,
scrollback, selection, focus, and WebGL renderer survive every toggle.
Also:
- WebGL renderer (matching dashboard ChatPage) so Hermes' TUI skins paint
faithfully instead of muting through xterm's default DOM renderer
- File drag/drop from the project tree or OS into xterm — paths are
shell-quoted (zsh/bash/pwsh/cmd) and written straight into the PTY
- Solarized dark canvas with brights promoted to real accent variants
(Schoonover's UI-gray brights washed out every TUI accent)
- Strip NO_COLOR/FORCE_COLOR/COLORFGBG/TERM=dumb leaking from non-tty
parents (CI runners, Cursor's agent shell) so the embedded shell gets
truecolor regardless of how Electron was launched
- rAF-debounced ResizeObserver — running fit.fit() synchronously during
sibling pane transitions crashed the WebGL texture-atlas rebuild
* fix(install.ps1): strip UTF-8 BOM regression that broke 'irm | iex'
The canonical install flow
irm https://raw.githubusercontent.com/.../scripts/install.ps1 | iex
fails on PowerShell 5.1 with a cascade of 'The assignment expression
is not valid' errors at every param() default value:
[string]$Branch = 'main',
~~~~~~
The assignment expression is not valid. The input to an assignment
operator must be an object that is able to accept assignments...
Root cause: scripts/install.ps1 carries a UTF-8 BOM (0xEF 0xBB 0xBF)
as its first three bytes. 'irm' returns the response body as a string;
on PS 5.1 the BOM survives into that string as a leading \ufeff
character. 'iex' then evaluates the string and PS's parser chokes
on the invisible character before param() -- error recovery proceeds
into the body but every assignment is reported as broken.
This was the exact failure mode the install.ps1 hardening pass (PR
#27224) deliberately fixed by stripping the BOM and ensuring the
file body is pure ASCII. Commit 4279da4db ('fix(windows): make
PowerShell installer parse in 5.1') re-introduced the BOM later,
unintentionally undoing the irm|iex compatibility fix; the merge
that brought it into bb/gui carried it forward.
Fix: strip the three BOM bytes. File body is verified pure ASCII
(any-byte > 127 returns false), so PS 5.1 with no BOM falls back to
Windows-1252 decoding which is identical to ASCII for our content.
Both install paths now work:
- 'irm ... | iex' (canonical CLI)
- 'powershell -File install.ps1' (programmatic / desktop bootstrap)
* install.ps1: detect ARM64 Windows reliably for Node and Git stages
Add a Get-WindowsArch helper that reads Win32_Processor.Architecture
via CIM (invariant to PowerShell host bitness) with PROCESSOR_ARCHITEW6432
fallback. Use it in:
- Install-Git: previously only triggered the arm64 PortableGit asset
when invoked from a native-ARM64 PowerShell host. WoW64 / emulated
x64 hosts (the default powershell.exe on Windows-on-ARM) saw
PROCESSOR_ARCHITECTURE=AMD64 and fell through to the x64 PortableGit
build, leaving ARM64 users on emulated Git for Windows.
- Test-Node: previously hardcoded the Node download to win-x64 on any
64-bit OS, so ARM64 users always got x64 Node under Prism emulation
even though Node ships an arm64 build for Windows. The winget
fallback now also passes --architecture arm64 on ARM64.
Python remains x86_64 by design: uv intentionally prefers
windows-x86_64 cpython on ARM64 hosts for ecosystem (wheel)
compatibility (see astral-sh/uv#19015).
* install.ps1: harden Install-SystemPackages against winget msstore failures
The previous winget invocation discarded stdout/stderr and trusted no
signal at all -- not the exit code (winget exits 0 even when it bails
"please specify --source"), not output (sent to Out-Null), not the
catch handler (winget returning 0 means no exception fires). The only
trust signal was a post-install Get-Command rg / Get-Command ffmpeg
check, which would also miss the package because %LOCALAPPDATA%\
Microsoft\WinGet\Links (where winget puts command aliases) is added to
PATH by AppExecutionAlias machinery only in fresh shells. End result on
machines where the msstore source has a cert problem (0x8a15005e --
common on Windows-on-ARM and some corporate networks): silent failure,
no log, no breadcrumb, and the user is told the install succeeded.
Specifically:
- Pin --source winget on every winget install call. Defeats the broken-
msstore-source path. We ship nothing from msstore so this is safe and
forward-compatible.
- Add --exact --id for a tighter package match.
- Capture each winget invocation's combined stdout/stderr + exit code to
%TEMP%\hermes-winget-<pkg>-<n>.log instead of Out-Null. On the happy
path the log is deleted after the post-install check confirms the
binary is on PATH; on failure the log is kept and its path is named in
a Write-Warn so the user has something to grep.
- Refresh PATH to include %LOCALAPPDATA%\Microsoft\WinGet\Links in
addition to the User/Machine env-var hives, so Get-Command sees newly-
installed winget aliases in the same process.
- No behavior change on the happy path. Same Write-Info/Success/Warn
cadence, same fallback order (winget -> choco -> scoop -> manual),
same $script:HasRipgrep / $script:HasFfmpeg outputs.
Verified end-to-end on a real Snapdragon ARM64 Windows host: ripgrep
uninstalled, stage re-run, [OK] ripgrep installed in 1.4s, ok:true.
* desktop: swap node-pty fork for upstream microsoft/node-pty 1.1.0
The previous dependency, @homebridge/node-pty-prebuilt-multiarch@0.13.1,
publishes no win32-arm64 prebuilds on its v0.13.x line, and its v0.14.x
betas (which do add an arm64 Windows build) ship no electron-vXXX-win32-
arm64 prebuilds at all -- so packaged Electron 40 builds (NMV 143) would
fail at runtime even on a successful npm install. Net effect: the
desktop's integrated terminal was unbuildable on Windows-on-ARM, in
both dev (npm install fails: 404 fetching the node-vXXX-win32-arm64
prebuilt) and packaged builds (no Electron-ABI prebuilt exists).
The homebridge fork was originally created because upstream node-pty
shipped no prebuilds at all. That hasn't been true since node-pty@1.0
(April 2024), which:
- bundles prebuilts for mac (arm64+x64) and Windows (arm64+x64) directly
inside the npm tarball -- no GitHub-Releases fetch, no missing-binary
failure mode
- uses N-API (node-addon-api) for ABI stability across Node and Electron
major versions, so the same pty.node binary loads under Node 22 (dev)
and Electron 40+ (packaged) without per-ABI rebuilds
- is what VS Code, Hyper, and Theia actually ship
API surface is identical (spawn / onData / onExit / write / resize /
kill) -- no call-site changes needed.
Specifically:
- apps/desktop/package.json: replace the @homebridge fork with
node-pty@1.1.0 (exact pin). Widen `asarUnpack` from `["**/*.node"]`
to also unpack `**/prebuilds/**`, because node-pty ships runtime-
execed helpers alongside its .node files (darwin spawn-helper has no
extension and would not be matched by `**/*.node`; conpty.dll,
OpenConsole.exe, winpty.dll, winpty-agent.exe on Windows are also
exec'd at runtime and cannot live inside asar).
- apps/desktop/electron/main.cjs: update both require() strings to
match the new package name and the new staged path under
resources/native-deps/node-pty/.
- apps/desktop/scripts/stage-native-deps.cjs: point at node_modules/
node-pty. node-pty's prebuilts live under prebuilds/<plat>-<arch>/
(not build/Release/), so update the include glob to copy that dir.
Per-arch staging keeps the resource bundle small (target arch comes
from npm_config_arch when electron-builder cross-builds, else
process.arch). Explicitly enumerate file types in the prebuilds glob
so the ~25 MB of .pdb debug symbols that prebuild-install bundles
for Windows crash analysis don't bloat the installer (29 MB -> 2.6 MB
staged on win32-arm64). Re-assert +x on the darwin spawn-helper
defensively, since a stripped mode bit would manifest as a silent
ENOENT at first pty.spawn().
- apps/desktop/scripts/test-desktop.mjs: update expectedNativeDepPaths()
and its assertion site to look at prebuilds/<plat>-<arch>/ instead of
build/Release/. Add an explicit spawn-helper-exists check on darwin
so a regression in the asarUnpack glob would fail loudly in CI rather
than at first PTY spawn.
Trade-off: Linux end-users lose prebuilts and fall back to building
node-pty from source on `npm install`. Acceptable because Hermes
ships no Linux desktop builds (desktop-release.yml matrix is mac + win
only, package.json declares no `linux` target), and Linux developers
hacking on the desktop already need a C++ toolchain for the rest of
the stack.
Verified on Windows 11 ARM64 (Snapdragon):
npm install -> exit 0
node -e "require('node-pty').spawn(...)" round-trip -> OK
stage-native-deps -> 27 files, 2.6 MB
load from staged tree (simulates packaged fallback) -> ConPTY
round-trip OK
* desktop+gateway: harden Slack socket recovery and Windows restart dedupe (#28873)
* desktop+gateway: harden Slack socket recovery and Windows restart dedupe
Fix Slack Socket Mode reliability by adding a watchdog/reconnect path so silent socket task drops no longer leave the adapter stuck. Harden Windows gateway lifecycle by avoiding desktop-binary path collisions, making gateway PID scans case/extension tolerant, and reusing in-flight restart actions to prevent duplicate gateway spawns.
* test(slack): add Socket Mode watchdog/reconnect behavioural coverage
Drive the new Slack Socket Mode self-healing logic through a fake AsyncSocketModeHandler so we can simulate the P0 silent-hang failure mode (task exit, transport disconnected, intentional shutdown, concurrent reconnect attempts) without touching real Slack.
* fix(slack,desktop): address Copilot review on watchdog races and path normalization
- connect(): explicitly cancel + await the prior socket watchdog before flipping _running, so an old monitor cannot exit between teardown and respawn (Copilot #1)
- _socket_watchdog_loop: wrap the body in try/except + add a done-callback that respawns on unexpected crash, so a transient bug cannot permanently disable self-healing (Copilot #2)
- normalizeExecutablePathForCompare: use the resolved path for realpathSync so non-string inputs cannot leak through (Copilot #3)
- Add tests for crash-recovery and atomic watchdog replacement across reconnects
* fix(slack): tighten connect() error path and clarify watchdog test intent
Address Copilot review round 2.
- connect(): wrap _start_socket_mode_handler/_ensure_socket_watchdog in a focused try/except so any failure rolls back partially-started handler/task state and leaves _running=False, ensuring the platform lock is always released by the outer finally
- Defer _running=True until after the handler is actually started so the watchdog observes a live socket task immediately and never spins against a half-built adapter
- Rename test_watchdog_self_restarts_after_unexpected_crash to test_watchdog_cancellation_does_not_respawn (matches what it actually asserts) and add test_watchdog_unexpected_exit_respawns_via_done_callback that drives a real RuntimeError through _on_socket_watchdog_done and verifies a fresh task replaces the crashed one
* fix(web_server): serialize action spawn check+store under a threading lock
Address Copilot review round 3.
FastAPI runs sync handlers on its threadpool, so two near-simultaneous /api/gateway/restart (or /api/hermes/update) requests could both observe "no live process" in _spawn_hermes_action's poll-based dedupe and double-spawn. Add a module-level _ACTION_SPAWN_LOCK around the entire check + Popen + _ACTION_PROCS store sequence so the dedupe is atomic across threads.
* fix: address Copilot review round 4
- slack.disconnect(): mirror connect()'s defensive cleanup — catch the broad Exception path on watchdog await so handler shutdown and lock release still run if the watchdog raised before cancellation took effect
- web_server._spawn_hermes_action: wrap subprocess.Popen in try/except so a missing executable / permission error closes the log file handle, writes a failure marker, and re-raises instead of leaking a file descriptor
- gateway._scan_gateway_pids: drop the over-broad "hermes.exe --profile" / "hermes.exe -p" patterns that would match any Hermes CLI subcommand using a profile flag (e.g. `hermes.exe --profile foo dashboard`); rely on the "hermes.exe gateway" + "hermes-gateway.exe" tokens instead
- tests: tighten _fake_create_task to assert coroutine input and return a real asyncio.Task that stays pending until pytest teardown, and update the three callsites whose mocked AsyncSocketModeHandler.start_async returned a non-coroutine value
* fix(slack): reset multi-workspace state on reconnect
Address Copilot review round 5.
connect() is reentrant (gateway restart, in-process reconnect), but it was leaving _bot_user_id / _team_clients / _team_bot_user_ids populated from the previous session. A reconnect that rotated the primary token or dropped a workspace would silently keep the stale bot user id and stale workspace client maps, leading to dispatch against gone workspaces.
Clear these three pieces of state right after _stop_socket_mode_handler() and before the auth_test loop, then let the loop repopulate from the current tokens. Add test_reconnect_refreshes_multi_workspace_state to lock it in.
* nix: package apps/desktop as .#desktop (#28964)
Adds nix/desktop.nix building the Electron renderer with buildNpmPackage
and wrapping nixpkgs' electron binary. Reuses .#default by setting
HERMES_DESKTOP_HERMES to its hermes binary, so the desktop's resolver
picks up the fully-wired nix hermes (venv, bundled skills/plugins,
runtime PATH) without reimplementing agent resolution.
- nix/desktop.nix: renderer + electron wrapper
- nix/hermes-agent.nix: finalAttrs form, exposes hermesDesktop in passthru
- nix/packages.nix: exposes .#desktop + adds to fix-lockfiles
- apps/desktop/package-lock.json: standalone hermetic lockfile
nix build .#desktop && nix run .#desktop both clean.
* fix(desktop): probe steps 4 & 5 of resolveHermesBackend before trusting
A user-reported failure on Windows-on-ARM: a pre-installed Python 3.13
on PATH makes findSystemPython() succeed, so resolveHermesBackend
returns a backend pointing at it -- but hermes_cli isn't in that
interpreter's site-packages. The spawn dies with ModuleNotFoundError
and the user sees a dead GUI instead of the first-launch installer.
Same shape can hit step 4 (existing `hermes` on PATH) when a stale
shim survives a partial uninstall.
Add cheap exit-code probes -- `python -c "import hermes_cli"` for
step 5, `<hermes> --version` for step 4 -- and fall through to step 6
(bootstrap-needed) on failure. install.ps1 then runs as if on a clean
box and the venv gets built.
Probes live in a standalone electron/backend-probes.cjs module so they
can be unit-tested with node --test, same pattern as bootstrap-platform.cjs
and hardening.cjs. New test file wired into test:desktop:platforms.
* test(desktop): allow `node-pty` bare-require in packaged entrypoints
Pre-existing failure on bb/gui since c858484b4 swapped the node-pty
fork for upstream microsoft/node-pty 1.1.0. main.cjs intentionally
bare-requires node-pty (it's hoisted by workspace dedup in dev, and
staged to resources/native-deps via scripts/stage-native-deps.cjs +
extraResources for packaged builds, with a try/catch fallback at
line ~38). The allowlist hadn't been updated to match -- same shape
as `electron`, which was already allowed.
* chore(deps): refresh root lockfile for dashboard @nous-research/ui 0.14.0
apps/dashboard/package.json was bumped to @nous-research/ui 0.14.0 (+
flag-icons ^7.5.0, motion ^12.38.0) but the root package-lock.json was
never refreshed. Running `npm install` from the repo root now
materialises 0.14.0's transitive closure (launder, bumps for
@nanostores/react, nanostores, sanitize-html, tailwind-merge).
No code changes; purely a lockfile catch-up so fresh checkouts on bb/gui
get a working dashboard install.
* chore(desktop): bump version to 0.0.1
First non-placeholder version so electron-builder's artifactName template
produces `Hermes-0.0.1-win-x64.exe` instead of the obviously-unreleased
`Hermes-0.0.0-...`. No release process yet; this just stops the artifact
filename from telling users "you got a debug build."
Bumped in three slots that all carry the desktop app's version:
- apps/desktop/package.json (source of truth)
- apps/desktop/package-lock.json (per-app lockfile, kept for CI parity)
- root package-lock.json's apps/desktop workspace entry
Identity-of-build for first-launch bootstrap continues to come from
build/install-stamp.json (commit SHA + builtAt), unchanged.
* fix: fs icon color
* perf(desktop): cut per-keystroke layout + listener churn in chat composer
Empirical work via CDP harnesses under apps/desktop/scripts/ (see
profile-typing-lag.md):
jsListeners growth (per round of 200 chars + GC):
before: +35 (verified leak — listeners stuck after 1st trigger popover use)
after: +0
Four narrow edits in src/app/chat/composer/index.tsx:
1. Drop the per-keystroke `editorRef.current.scrollHeight` read used to
decide composer expansion. Replace with `draft.length > 60` heuristic;
the existing ResizeObserver still catches edge cases. `scrollHeight`
is a forced-layout call and was firing on every char until the first
wrap.
2. Bucket measured composer height to 8px before writing
`--composer-measured-height` / `--composer-surface-measured-height`
on `documentElement`. Without this, the editor grows ~1px per char,
setProperty fires every keystroke, computed style is invalidated tree-
wide.
3. Remove the dead `$composerDraft` two-way sync. Nothing outside the
composer subscribed to that atom (verified via grep). Two useEffects
on `[draft]` were pushing draft→atom and atom→aui per keystroke for
no consumer. Also drop the per-keystroke
`reconcileComposerTerminalSelections` call; it was pruning stale
labels for `terminalContextBlocksFromDraft`, but that helper already
ignores labels not in the current submitted text, so pruning per
keystroke was just bookkeeping.
4. `refreshTrigger` fast-bails when the draft contains neither `@` nor
`/`. Previously `textBeforeCaret(editor)` ran on every input/keyup
regardless; `range.toString()` inside is O(n) over draft length.
Synthetic typing latency p50/p90/p99 is similar before vs after on a
freshly-loaded session (Blink can already handle ~30cps typing into a
contentEditable on its own); the real win is the listener leak being
gone and the global computed-style invalidations dropping ~8× when the
composer is sitting at a fixed height row.
The `Enter → stall` follow-up (see profile-typing-lag.md §"Submit /
TTFT stall") is unmeasured here — needs a throwaway session because
the harness fires a real prompt. Not blocking this commit.
* perf(desktop): cut FadeText forced layouts during streaming
The slowest user-felt path is typing into the composer while the
assistant is streaming. Profile (scripts/profile-under-stream.mjs):
FadeText measureOverflow self time: 35.8 ms → 18.1 ms (-50%)
total active CPU during 7s window: ~150 ms → ~50 ms
Two changes in src/components/ui/fade-text.tsx:
1. Drop the `useEffect([children])` that re-ran `measureOverflow`
(reads scrollWidth + clientWidth — forced layout) on every parent
re-render. `useResizeObserver` already fires the same callback on
mount and whenever the host span's box size changes; that covers
the only case where overflow state can legitimately change. The
previous explicit useEffect was a forced-layout flush on every
parent render, which during streaming meant every token tick.
2. Wrap the component in `memo` with a custom comparator that
short-circuits the entire render when scalar string `children` and
the className/fadeWidth/style props are unchanged. The hot path
was tool-fallback's title chips being re-rendered by parent
streaming updates even though their text was stable; memo+
comparator skips that.
Also adds two harness scripts under apps/desktop/scripts/:
- latency-under-stream.mjs (key→paint latency while a turn streams)
- profile-under-stream.mjs (CPU profile while a turn streams)
Updates profile-typing-lag.md with the streaming numbers and confirms
the Enter→paint submit path is already fast (≤320ms on the populated
session; the 2s "stall after Enter" the user noticed once was a
one-time cold-start, not reproducible at the UI layer).
I'd guess the felt jank in real use is fast-burst typing during a
long-form streaming reply (code blocks + markdown lists multiply the
per-token render cost). The CPU savings here scale linearly with
token volume.
* chore(desktop): drop diag scratch scripts no longer needed
* docs(desktop): correct leak-typing numbers on a real session
Re-ran the leak harness on a populated session (Phaser thread) for both
unpatched and patched builds. The original 'listener leak' was transient
warm-up cost, not a steady-state leak — both versions show 0 listener
growth/round in steady state.
The load-bearing number is forced layouts per character:
unpatched (HEAD~2): 7.02 layouts/char
patched (HEAD): 2.35 layouts/char (3× fewer)
The patches reduce per-char forced-layout work to Blink's natural floor.
Document node count and heap are flat in both builds.
* perf(desktop): fix "Enter jumps up" on long threads
User reported: after pressing Enter on a long thread, the view jumps up
— the just-submitted message disappears below the fold. Confirmed via
apps/desktop/scripts/measure-jump.mjs:
before: distFromBottom 0 → 49.5px, sticks there permanently
after: distFromBottom 0 → ~0 (worst case 4px for one frame)
Root cause in useThreadScrollAnchor (thread-virtualizer.tsx):
1. The sticky-bottom logic disarmed on any scroll event where
`scrollTop < lastTopRef.current`. That check can't distinguish a
user scrolling up from a programmatic `pinToBottom` write that
the browser clamped short of bottom (because content also grew in
the same frame, so `scrollTop = scrollHeight` lands at
`scrollHeight - clientHeight` for the OLD scrollHeight, which is
now below the NEW scrollHeight). Result: sticky-bottom disarmed
permanently on the user's first submit.
2. There was no synchronous pin tied to React's commit phase. By the
time the ResizeObserver fired and re-pinned, the user had already
seen ~50ms of "message below the fold" — visually that reads as the
view jumping up.
Fix:
- `programmaticScrollPendingRef` counter tracks scroll events we
expect to be ours (one per `pinToBottom` write). The scroll handler
skips the disarm check when consuming a pending tick, keeps the
arm bit true, and re-pins synchronously if the browser clamped us
short of bottom. A depth cap (8) breaks runaway loops in
pathological streaming-burst layouts.
- `useLayoutEffect` on `groupCount` increase pins BEFORE the browser
paints, eliminating the visible ~50ms window between optimistic
user-message insert and the RO/scroll-event chain firing.
Verified on the long Cloud Shadows thread (7-8 turns, ~11k px tall):
all three repro runs now hold within 0–4 px of bottom across the
post-Enter transition. Submit latency unchanged (paint 77–107 ms),
streaming-typing latency unchanged.
Also adds three debug harnesses:
- measure-jump.mjs — sample thread scroll across Enter
- probe-thread.mjs — dump current thread / scroll state
- diag-jump.mjs — intercept scrollTop + RO + mutations across Enter
* perf(desktop): rate-limit thread auto-pin during streaming
Follow-up to the Enter-jump fix. The first version did a synchronous
re-pin loop inside the on-scroll handler when the browser clamped our
`scrollTop = scrollHeight` write short of the new bottom; that gave a
tight 4 px visible jump on Enter, but during streaming the
ResizeObserver fires many times per second as content grows, and each
RO callback re-entered the pin loop. CPU profile showed
`Virtualizer.getMaxScrollOffset` climbing to 22 ms self over a typing-
during-streaming window — the sync re-pin path was paying tanstack-
virtual's recompute cost ~3× per token.
Re-architect:
- RO callback coalesces to one pin per animation frame. Streaming-rate
RO bursts now cost the same as a single per-frame pin.
- The on-scroll programmatic-counter guard remains (it's what prevents
the false-disarm bug when the browser clamps a write). It no longer
does sync re-pins; the next RO/rAF will catch up.
- The useLayoutEffect on groupCount (the path that fires on user
submit / new turn arrival) ALSO schedules one rAF pin in addition to
the synchronous pin. This catches the case where React mounts the
new message in a second commit (after our layout effect ran), which
grows scrollHeight again. Two pins instead of a tight loop, paid only
once per turn change.
Net effect on the Cloud Shadows long thread:
enter-jump transient: 12–20 px for 1 frame (was 49 px permanent)
CPU during stream+type: `getMaxScrollOffset` dropped out of top-5
self-time list
typing-during-stream: p50 ~10 ms paint, p99 ~20 ms (1 frame),
occasional 40 ms+ outliers during burst
token arrivals
Also adds scripts/profile-long-stream.mjs: 20-second streaming profile
with per-500ms FPS histogram + content-length tracking, so we can see
whether streaming render cost grows with message length (it doesn't —
sustained 60 fps).
* perf(desktop): use textContent for trigger precondition
Replace composerPlainText() call inside refreshTrigger's no-trigger
fast-bail with a textContent check. textContent is a browser-native
flat traversal; composerPlainText walks recursively with chip-aware
logic. We only need to know if @ or / appears; either way the trigger
char will be in textContent because chips contain @ in their refText.
Profile shows composerPlainText was ~18ms self over a 12s typing-during-
stream window, called from refreshTrigger on every keystroke. Most of
that was the precondition check (the trigger detection path is the
slow path but only runs when a trigger char is present).
* Revert "perf(desktop): use textContent for trigger precondition"
This reverts commit a6a78ff08a31129a3a47fa55aca260d93af913a5.
* Revert "perf(desktop): cut FadeText forced layouts during streaming"
This reverts commit 88e7d7537cdab87200405edf298e38cb37e0a950.
* Revert "perf(desktop): cut per-keystroke layout + listener churn in chat composer"
This reverts commit bff1b3261d18a2427ac6c345c99f8312728346dd.
* Revert "Revert "perf(desktop): cut per-keystroke layout + listener churn in chat composer""
This reverts commit b7b378e3a43f94b9f4a1a34155707c6301c0fd87.
* Revert "Revert "perf(desktop): use textContent for trigger precondition""
This reverts commit 0739588f4896902f7f0d4ded8b5eaeb92bfdf042.
* chore(desktop): synthetic-stream perf harness + scripts
Drops the React `<Profiler>` approach (no-op because Vite is currently
serving the production React build) in favor of an externally-observable
measurement stack: rAF frame intervals, `PerformanceObserver({entryTypes:
['longtask']})`, and a `MutationObserver` on the live streaming message.
Adds a synthetic stream driver — `window.__PERF_DRIVE__.stream({...})` —
that pushes tokens through the live `$messages` atom at a controlled rate,
so the assistant-ui runtime, incremental repository, and Streamdown
markdown pipeline see the same workload they'd see during a real LLM
stream, without the LLM cost.
The driver lives in `src/app/chat/perf-probe.tsx`; `main.tsx` side-imports
it under `import.meta.env.MODE !== 'production'` so it tree-shakes out of
prod builds. (Using `MODE` rather than `DEV` because our Vite setup
currently reports `DEV=false` even under `vite dev` — see the dev-build
note in `profile-typing-lag.md`.)
Scripts:
- measure-synthetic-stream.mjs drive synthetic + record frame/longtask/mutation
- profile-synth-stream.mjs CPU profile + top self-time during synthetic
- measure-real-stream.mjs same harness, real LLM stream
- profile-real-stream.mjs CPU profile bracketing the real stream window
- eval.mjs / reload.mjs small CDP helpers
A real-LLM measurement on Cloud Shadows (gpt-4o-mini, 39 s window) showed
12 longtasks in the same 75-127 ms range the synthetic predicted, so the
synthetic is a faithful proxy.
* perf(desktop): memo FadeText so it skips re-renders when text unchanged
FadeText is used 110+ times inside `tool-fallback.tsx` on a tool-heavy
thread. During streaming each parent re-render previously triggered the
component's `useEffect([children])`, which forced a `scrollWidth` layout
read even when the title text was unchanged. The `useResizeObserver` was
already covering the genuine resize case, so that effect was strictly
redundant work.
Drops the effect and wraps the component in `React.memo` with a custom
comparator that field-compares `className`, `fadeWidth`, and `style`,
plus identity-compares `children` (scalar fast-path; correct for JSX
nodes too since a new node should force a re-render).
Verified via temporary render counter on the 34 MB
`session_20260514_215353_fe0ac8` thread (110 FadeText instances): a
2 s synthetic stream went from ~11k FadeText render calls to 122 —
roughly one render per truly-new instance instead of one per parent
commit per instance.
Doesn't move the longtask needle on its own (Streamdown's markdown
re-parse dwarfs it) but eliminates a steady CPU floor and a class of
forced layouts during streaming. Profile-typing-lag.md documents the
full investigation, including the remaining Streamdown cost as the
real source of the perceived "5 fps moment" hitches.
* perf(desktop): memoize MarkdownText plugins to stop churning Streamdown
The inline `plugins={{ math: mathPlugin, ...(isStreaming ? {} : { code }) }}`
on `<StreamdownTextPrimitive>` constructed a new object literal on every
parent render. That broke `<Streamdown>`'s outer memo and forced its
internal `rehypePlugins` / `remarkPlugins` array useMemos to rebuild,
which propagates a new identity into every `<Block>` and defeats Block's
memoization for stable historical blocks.
After memoizing on `[isStreaming]` (the only real dimension of variance),
CPU profile during a 5 s synthetic stream on the 34 MB session shows
`parser` self-time dropping out of the top 10, `compile` cut roughly in
half, and `bn$1` / `m$1` (micromark internals) leaving the top entries.
Doesn't move the visible longtask count on its own — Streamdown's
per-Block parse cost still dominates whenever the last block's content
changes — but it removes a class of unnecessary re-parses for historical
blocks during streaming. See `scripts/profile-typing-lag.md` for the
full investigation.
* perf(desktop): floor assistant-text flush gap to 33ms for predictable batching
`scheduleDeltaFlush` previously coalesced via `requestAnimationFrame`
only. The "at most one flush per frame" guarantee that gives you is fine
for fast streams (>~80 tok/sec) where multiple tokens arrive within a
single frame, but breaks down at typical LLM token rates (30-80 tok/sec)
where each token arrives slower than the rAF cadence and triggers its
own React commit + Streamdown markdown re-parse.
Track `lastFlushAt` and require at least 33 ms between two flushes.
React 18+ auto-batching probabilistically already collapsed some of
these, but the floor makes it deterministic.
A/B on the 34 MB session, 300 tokens at 50 tok/sec (markdown chunks):
| | avgFps | p99 frame | LTs / 5 s | max LT |
|---|---|---|---|---|
| no floor (current rAF) | 54.0 | 38 ms | 2.0 | 145 ms |
| 33 ms floor (this PR) | 54.3 | 41 ms | 1.7 | 110 ms |
`inter-mutation` p50 also tightens from 22-28 ms to a clean 33 ms,
which is the expected signature of a deterministic floor. Doesn't fully
solve the user's perceived hitches — Streamdown's per-Block parse cost
when the last block grows past ~2 k chars is still the elephant — but
it consistently shaves the worst-case longtask and makes the streaming
cadence visibly steadier.
Also threads a matching `flushMinMs` option through the synthetic
stream driver in `perf-probe.tsx` + `scripts/measure-synthetic-stream.mjs`
so the harness can A/B both regimes without spending LLM credits.
See `scripts/profile-typing-lag.md` for the full investigation.
* perf(desktop): useDeferredValue for streaming markdown so parses don't block input
Streamdown's per-Block parse cost grows with the live tail's length and
is unavoidable inside the block-memo pattern (industry standard, see
findings doc). The fix is to stop having that work block the main thread.
`<DeferStreamingText>` is a 12-line wrapper that reads message-part state
via `useMessagePartText`, runs it through `useDeferredValue`, and
re-publishes via assistant-ui's `<TextMessagePartProvider>`. The inner
`<StreamdownTextPrimitive>` reads the deferred value through the normal
`useMessagePartText` hook — no fork, no internal-path imports, fully on
assistant-ui's public API. React's concurrent scheduler then:
- abandons in-flight deferred renders when a newer token arrives, so
intermediate states get skipped under fast streams
- deprioritises the markdown render when the main thread has urgent
work (typing, scroll), so input stays responsive even while a
100ms parse is queued
Streamdown already uses `useTransition` for its block-array setState;
this lifts the deferral up to the consumer boundary so it covers the
whole pipeline (preprocess → split → repair → parse → render).
A/B on the 34 MB session, 300 tokens at 50 tok/sec, markdown chunks
(four trials each, with the 33ms flush throttle on for both):
| | avgFps | p99 frame | LTs/5s | max LT | typing-while-stream p95 |
|---|---|---|---|---|---|
| pre | 54.3 | 41 ms | 1.7 | 110 ms | ~17 ms |
| post | 58.5 | 31 ms | 2.0 | 117 ms | 14-18 ms |
Longtask count + max LT unchanged — useDeferredValue doesn't reduce
CPU, only its priority. The avgFps lift and p99 frame drop are the
proof that the existing CPU is no longer blocking 60 fps cadence. One
clean run logged MUTATIONS=0 — React skipped every intermediate text
state and only committed the final one (textbook deferred-value
behaviour).
The actually-reduce-CPU path is replacing the parser with a state
machine like Flowdown — left for a future PR; see
`apps/desktop/scripts/profile-typing-lag.md` for the full investigation.
* feat(desktop): add hermes gui launcher
* feat(desktop): launch packaged gui builds by default
* bump gui version to 0.0.2
* fix(dashboard): allow file:// origin on loopback WS + diagnostic logging
Upstream commit 2e66eefbc ("fix(dashboard): validate WebSocket Host
and Origin") added a WebSocket Host/Origin guard to block DNS
rebinding against the dashboard. The guard rejects any Origin whose
scheme is not http/https or whose netloc is empty — which includes
Electron's renderer Origin: file:// when the desktop app loads its
bundle from disk in production mode.
That makes the bb/gui Electron desktop unable to open the gateway
WebSocket against the embedded backend on Windows / macOS prod
builds. The renderer reports "Desktop boot failed" and the backend
logs:
WARNING hermes_cli.web_server: gateway-ws reject
peer=127.0.0.1:NNNN reason=non_loopback_or_bad_origin
bound_host=127.0.0.1 close_code=4403
DNS-rebinding requires a DNS-resolvable hostname; file:// has no
host component and therefore cannot be the attack vector this guard
exists to block. When bound to a loopback interface (127.0.0.1 /
::1 / localhost), accept file:// origins so desktop wrappers can
attach. Non-loopback binds (operator opted into network exposure)
keep rejecting file:// — the loose policy doesn't apply.
Also adds per-reason diagnostic logging in
_ws_host_origin_is_allowed, so future ws-guard rejections name the
specific clause that fired (bad_host / bad_origin_scheme /
origin_host_mismatch) instead of the opaque
"non_loopback_or_bad_origin" surfaced at the call site.
Verified against tests/hermes_cli/test_web_server_host_header.py
(all 11 upstream tests still pass) and hand-tested by opening the
bb/gui Electron desktop dev build against the patched backend.
* fix(tui_gateway): restore _content_display_text helper
Bb/gui had dropped the helper but the orchestrator code merged from main
still calls it (_inflight_text, _message_preview). Re-add the definition
verbatim from main so session.create / _start_inflight_turn don't crash
with NameError on first prompt submit.
* fix(tui-gateway): restore _content_display_text helper lost in main merge
The May 27 merge of origin/main into bb/gui re-introduced two callers of
_content_display_text (in _inflight_text and _history_to_messages) but
dropped the helper definition itself, leaving an unresolved reference.
NameError fires on every user message via _start_inflight_turn ->
_inflight_text, taking down both the TUI and the desktop (which share
this gateway backend) the moment input is dispatched.
Restores the helper verbatim from main (commit 36c99af37) -- pure
structured-content text extractor, no other dependencies.
* fix(telegram): import Set for _dm_topic_chat_ids annotation
self._dm_topic_chat_ids: Set[str] = {...} at line 460 references Set
but only Dict, List, Optional, Any are imported from typing. The file
has no 'from __future__ import annotations', so the annotation is
evaluated at runtime and raises NameError on TelegramAdapter
construction.
* fix(setup): drop shadowing inner importlib.util re-imports
_print_setup_summary and _setup_tts_provider each had 'import
importlib.util' inside a try: block nested deeper in the function
body. Python flips importlib to function-local for the whole scope,
so earlier references in the same function (the neutts branches at
lines 493 / 1109) hit UnboundLocalError before the late import can
run.
The top-of-module 'import importlib.util' at line 14 already covers
both call sites, so dropping the redundant inner imports restores
the intended behavior.
* feat(install.ps1): add -IncludeDesktop switch + Stage-Desktop
The new Hermes-Setup.exe (Tauri bootstrap installer) passes -IncludeDesktop
so users who install via the GUI end up with a launchable Hermes.exe at
apps/desktop/release/<os>-unpacked/. Existing flows are unchanged:
* The 'irm install.ps1 | iex' CLI one-liner omits the flag — terminal
users don't need a prebuilt desktop binary; 'hermes desktop' builds
on demand.
* The Electron desktop's bootstrap-runner.cjs also omits the flag —
rebuilding apps/desktop from inside a running Hermes.exe would try
to overwrite the live binary on disk and fail.
Stage-Desktop runs after Stage-NodeDeps so workspace npm is already
installed when electron-builder fires. It does:
1. 'npm install' at repo root so apps/* workspaces resolve their deps
(Electron itself arrives via npm here, ~150MB)
2. 'npm run pack' in apps/desktop (tsc + vite + electron-builder --dir)
3. Probes apps/desktop/release/{win-unpacked,win-arm64-unpacked}/Hermes.exe
The --dir mode produces an unpacked launchable binary without an NSIS/MSI
installer artifact — we don't need one because Hermes-Setup.exe spawns the
unpacked binary directly via launch_hermes_desktop.
* feat(installer): Tauri bootstrap installer for first-time onboarding
Hermes-Setup.exe is a small signed Rust+Tauri binary that drives
scripts/install.ps1 stage-by-stage with a native UI matching the
desktop's design language. Replaces the chicken-and-egg pattern of
shipping a 200MB Electron app whose first launch existed only to
run install.ps1.
The architecture:
Rust backend (src-tauri/):
bootstrap.rs orchestrator -- Tauri commands, stage iteration
install_script.rs resolve install.ps1 (dev checkout, cache, GitHub raw)
powershell.rs spawn powershell, line-stream stdout/stderr, parse JSON
events.rs BootstrapEvent types -- mirror bootstrap-runner.cjs
paths.rs HERMES_HOME resolution + tracing log setup
build.rs bakes BUILD_PIN_COMMIT / BUILD_PIN_BRANCH from
'git rev-parse HEAD' at compile time
React frontend (src/):
Tauri webview rendering 4 screens (welcome / progress / success /
failure), driven by nanostores subscribing to the Rust event stream.
Visual layer reuses the desktop's styles.css wholesale via @import
so the installer and desktop never drift visually.
Distribution:
targets = ['app', 'dmg', 'appimage'] -- no NSIS/MSI wrapper. The
raw target/release/Hermes-Setup.exe IS the artifact on Windows;
.dmg + .app on macOS; AppImage on Linux. One file, double-click,
no installer-installing-an-installer pattern.
Compile-time pinning:
build.rs reads 'git rev-parse HEAD' and emits
cargo:rustc-env=BUILD_PIN_COMMIT=<sha> + BUILD_PIN_BRANCH=<branch>.
bootstrap.rs's option_env!() picks these up so the binary fetches
install.ps1 from the exact SHA it was tested against. CI / release
builds can override via HERMES_BUILD_PIN_COMMIT env var.
Windows manifest:
hermes-setup.manifest declares level='asInvoker' so the
productName 'Hermes Setup' doesn't trip Windows's installer-
detection heuristic and refuse to launch without elevation.
Also declares PerMonitorV2 DPI + UTF-8 active code page + Common
Controls v6.
Limitations of this initial version:
* No code signing -- Windows SmartScreen will warn once on Hermes-Setup.exe
('More info -> Run anyway'). The downstream binaries it produces
(Hermes.exe in win-unpacked/, the hermes CLI) are locally-built and
therefore don't carry MOTW, so they launch without SmartScreen
intervention. Cert procurement tracked separately.
* macOS and Linux build paths defined but untested -- Windows-only V1.
* fix(installer): pass -IncludeDesktop to manifest, surface launch errors, alias hermes desktop
Three bugs found in the first VM end-to-end test:
1. install.ps1 -Manifest was called WITHOUT -IncludeDesktop, so the
manifest came back with the 14-stage list (no desktop stage), the
UI showed '14 steps' and Stage-Desktop never ran. Pass the flag to
both the manifest fetch and the per-stage runs — install.ps1 gates
the desktop stage's inclusion on the flag.
2. The Success screen's Launch button silently swallowed the Tauri
error when no Hermes.exe existed (e.g. Stage-Desktop was skipped).
Wire the error through to inline UI with an alert callout, so the
user gets actionable text ('Hermes.exe missing, run hermes desktop
from a terminal') instead of an unresponsive button.
3. The Success screen tells users to run 'hermes desktop' from a
terminal but the CLI only accepted 'hermes gui' — invalid choice
for 'desktop'. Rename the subcommand canonically to 'desktop' with
'gui' as a backwards-compatible alias. Update the _SUBCOMMANDS sets
used by session-flag arg parsing + logging-mode probe so both names
route to the same logic.
* fix(install.ps1): pre-warm electron-builder winCodeSign cache + fix Stage-Desktop $HasNode false-skip
Two bugs caught in the second VM end-to-end run:
1. electron-builder's winCodeSign extraction fails on grandma-class
Windows boxes because the .7z archive contains macOS symlinks
(darwin/10.12/lib/libcrypto.dylib and libssl.dylib pointing at
versioned siblings). Creating symlinks on Windows requires
SeCreateSymbolicLinkPrivilege, a per-user right that non-admin
accounts don't have on stock Windows. Result: every fresh install
on a non-admin user fails Stage-Desktop with a 7-Zip 'cannot create
symbolic link' error, retried four times, then bails.
Fix: Initialize-ElectronBuilderCache pre-extracts winCodeSign-2.6.0.7z
ourselves with -snl (don't preserve symlinks, store as resolved file
content) AND -x!darwin (skip the entire macOS subtree — irrelevant
on Windows). Writes to electron-builder's expected cache dir before
electron-builder gets a chance to try its own broken extraction.
Idempotent — fast-paths via signtool.exe sentinel check.
2. Install-Desktop's first guard was 'if (-not $HasNode) skip'.
$HasNode is set by Stage-Node into $script:HasNode, but in
cross-process driver mode (each -Stage NAME is a fresh powershell.exe
spawned by Hermes-Setup.exe), that script-scope variable from the
PREVIOUS process is invisible — so the guard always fired and
Install-Desktop returned in 900ms with a misleading
'Node.js not available' reason. The real npm probe below it never
got to run. Fix: re-probe npm directly via Get-Command when $HasNode
is empty/false, since by that point Stage-Node has already verified
Node is installed and the only question is whether *this* process
can see it on PATH (it can — installer-wide PATH update from Stage-Node).
* fix(install.ps1): tell electron-builder we're NOT signing instead of pre-extracting winCodeSign
The previous commit (c7e46f9f3) worked around the winCodeSign-symlinks-
on-Windows extraction crash by pre-extracting the archive ourselves with
-snl + -x!darwin. That fix was correct but addressed the wrong layer.
The deeper question: why was electron-builder fetching winCodeSign at all
when we have no signing cert configured? Answer: electron-builder
unconditionally pre-warms the toolchain assuming any build MIGHT sign.
The cert auto-discovery never finds anything (we never set CSC_LINK
or anything else), so the signing never happens — but the 100MB fetch
of winCodeSign and its broken-on-Windows symlink extraction does.
Set CSC_IDENTITY_AUTO_DISCOVERY=false (with WIN_CSC_LINK and
WIN_CSC_KEY_PASSWORD also explicitly cleared as belt-and-suspenders)
before invoking npm run pack, and electron-builder skips the entire
winCodeSign apparatus. No download, no extraction, no privilege check.
Env vars are saved/restored around the invocation so we don't leak
the override into Stage-PlatformSdks etc.
Net: removes the 100-line Initialize-ElectronBuilderCache helper that
manually downloaded + extracted winCodeSign-2.6.0.7z. Replaced with
3 env-var assignments. The produced Hermes.exe is functionally
identical — just no longer carries a code-signing-machinery dependency
we never used.
* fix(installer): bump bootstrap-installer.log to capture stage transitions + every install.ps1 line
Diagnosing the second VM failure was impossible because bootstrap-installer.log
contained only the 'starting' banner. Two causes:
1. emit_log() inside run_bootstrap() was tracing::debug! — dropped on the
floor under the default INFO env-filter.
2. The per-stage sink callbacks (on_stdout_line / on_stderr_line) only
emitted Tauri events to the frontend; they never tee'd to the log file
at all. When the failure route mounts, the Tauri event stream is the
only place the script output lived, and it gets discarded.
3. The Failed / Stage / Manifest / Complete lifecycle frames in emit_event()
were also Tauri-only — so even the 'which stage failed' frame never
reached the log.
Fixes:
* emit_log() → tracing::info!
* Sink callbacks tee stdout to info!, stderr to warn!, with stage label
as a structured field for grep'ability
* emit_event() now matches on the variant and logs each lifecycle frame
at the right level: Failed → tracing::error!, others → info!
Result: a failing install leaves a complete forensic trail in
bootstrap-installer.log — manifest stage list, every install.ps1
stdout/stderr line tagged by stage, the stage transitions, and the
final error. Same path as before so nothing the user does changes.
* fix(install.ps1): Stage-NodeDeps cross-process $HasNode + stream npm install output to bootstrap log
VM run 3 diagnosis: node-deps stage skipped on the VM (logged
'Skipping Node.js dependencies (Node not installed)') and then
desktop's npm install failed with exit 1 and zero diagnostic detail.
Two root causes:
1. $HasNode false-skip in Stage-NodeDeps — same cross-process bug
pattern we fixed for Stage-Desktop in c7e46f9f3. Stage-Node ran
in process A and set $script:HasNode = $true, then exited. Stage-
NodeDeps ran in fresh process B (Hermes-Setup.exe -Stage NAME
spawns each stage independently), where that variable doesn't
exist. Re-probe via Get-Command npm instead of trusting the
stale script-scope global. The previous stage already verified
Node so the re-probe succeeds.
2. npm install --silent + Tee to TEMP file hid the real error.
When the workspace install failed on the VM, the actual reason
was buffered in $env:TEMP\hermes-npm-desktop-install-*.log and
the user saw only 'exit 1'. Drop --silent so npm streams its
full output, drop the TEMP-file dance — the Tauri installer's
streaming sink already tees every stdout/stderr line to the
rolling bootstrap-installer.log, so a side log file is dead
weight that hides the very error we need.
After this, the bootstrap log on a failure will contain npm's full
output (deprecation warnings, ETARGET, native-module compile errors,
whatever) tagged with stage=desktop, making the actual cause
diagnosable instead of an opaque exit code.
* fix(install.ps1): restore Initialize-ElectronBuilderCache (CSC env vars alone aren't enough)
VM run 4 diagnosis: even with CSC_IDENTITY_AUTO_DISCOVERY=false set,
electron-builder still fetches winCodeSign and signs bundled binaries.
The log shows the signing happens BEFORE the cache extraction:
• signing with signtool.exe ...\winpty-agent.exe
• signing with signtool.exe ...\OpenConsole.exe
• downloading winCodeSign-2.6.0.7z
• <symlink privilege error>
Cause: node-pty's bundled prebuilds are listed in apps/desktop's
asarUnpack ['**/*.node', '**/prebuilds/**']. electron-builder
re-signs anything unpacked from asar, regardless of whether OUR
binary gets signed. The signtool invocation needs winCodeSign on
disk, which needs the .7z extracted, which hits the macOS-symlink
crash on non-admin Windows.
The CSC env vars I added in d5fe46727 only kill IDENTITY DISCOVERY
(so OUR Hermes.exe stays unsigned, which is fine — we have no cert).
They don't prevent the toolchain fetch for the bundled-prebuild
re-sign. I removed the pre-extract in d5fe46727 thinking the env
vars subsumed it; that was wrong. Both are needed.
Restoring Initialize-ElectronBuilderCache verbatim from c7e46f9f3
and keeping the CSC env vars. Wrote a clearer doc-comment at the
call site explaining the two-knob interaction so future maintainers
don't drop one half again.
* fix(desktop): disable signtool via signtoolOptions.sign=null, drop dead winCodeSign pre-extract
VM run 5 diagnosis: the pre-extract from 3b29e65c1 ran (extracted 83
files, 24MB) but produced ZERO files at the expected sentinel path
'/winCodeSign-2.6.0/windows-10/x64/signtool.exe'.
Cause: the .7z archive's root entries are 'windows-10/', 'darwin/',
'linux/', etc. — not 'winCodeSign-2.6.0/<arch>'. Extracting with
'-o$cacheRoot' put files at $cacheRoot/windows-10/..., NOT at
$cacheRoot/winCodeSign-2.6.0/windows-10/.... I had the directory
nesting wrong from the start.
And then we observed: electron-builder downloads winCodeSign-2.6.0.7z
under a random numeric filename ('384387955.7z') regardless of what's
already extracted in the parent dir. The cache key isn't the dirname;
it's content-addressed. So the pre-extract approach was doomed even
if the path nesting had been right.
Actual fix: signtoolOptions.sign=null in apps/desktop/package.json's
win build config. electron-builder honors this and skips the bundled-
prebuild signing entirely — no signtool invocation, no winCodeSign
fetch, no symlink-privilege crash. The previous failures all stemmed
from electron-builder pre-signing node-pty's bundled .exes
(winpty-agent.exe, OpenConsole.exe) which are already author-signed
upstream; re-signing with our nonexistent cert was overwriting good
sigs with nothing useful anyway.
Cost: when we DO get a real cert later, we'll add it back with the
sign function pointing at the cert chain. Until then, all-null is
the correct config and unblocks every non-admin Windows user.
Removed Initialize-ElectronBuilderCache (the dead pre-extract).
Removed the call site. Kept the CSC_IDENTITY_AUTO_DISCOVERY env
vars as belt-and-suspenders against a future electron-builder
change that might revive cert auto-discovery.
* fix(desktop): use no-op sign function instead of sign=null
VM run 6 still hit the symlink crash even with signtoolOptions.sign=null.
electron-builder 26.8.1 treats null as 'use the default signtool path'
rather than 'skip signing', so the winCodeSign fetch + extraction still
fired for the bundled prebuild re-sign.
The Electron docs (electronjs.org/docs/latest/tutorial/code-signing)
make it clear signing is OPTIONAL and unsigned apps work fine — users
just see SmartScreen on first launch. The electron-builder mechanism
for 'don't actually sign anything' is to supply a custom sign function
(via signtoolOptions.sign: '<path-to-cjs-module>') that resolves
without invoking signtool.
build-noop-sign.cjs is that module — a 5-line async function that
returns undefined. electron-builder calls it for every binary it would
have signed, gets back a resolved promise, and considers each binary
'signed.' No signtool spawn, no winCodeSign fetch, no symlink crash.
When Nous's cert arrives, replace this file with a real signing hook
(@electron/windows-sign-based or a direct signtool invocation). The
architecture's signing-ready and the cutover is a one-file edit.
* fix(desktop): signAndEditExecutable=false to skip signtool path entirely
After reading app-builder-lib/winPackager.js line 216 + 231 directly:
signAndEditExecutable is the ACTUAL hardcoded gate that short-circuits
both signApp() (which signs Hermes.exe + every shouldSignFile match
including bundled prebuilds) AND createTransformerForExtraFiles().
None of signtoolOptions.sign / sign:null / sign:<custom-fn> gate the
winCodeSign download — that happens before they're consulted.
What we lose: rcedit also runs through signAndEditResources, so
disabling this drops PE metadata (file properties showing 'Hermes' /
'Nous Research' / file description). Cost is real but bounded:
* Hermes.exe filename, icon, asar contents, app identity intact
* Task Manager shows 'Hermes.exe' (the filename) not 'Hermes' (PE
description) — minor downgrade
* Start menu, taskbar, window title all work normally
* SmartScreen will warn once (unsigned, same as before)
When the cert lands, flip signAndEditExecutable back to default true,
both signing AND rcedit return, PE metadata is restored.
Removes the no-op sign function (build-noop-sign.cjs) since
signAndEditExecutable=false prevents signtool from being invoked at
all — the custom hook never gets called either.
* feat(install.ps1): write .hermes-bootstrap-complete marker at end of install
The desktop app's main.cjs resolver ladder has a 'bootstrap-needed' rung
that fires when .hermes-bootstrap-complete is missing from
ACTIVE_HERMES_ROOT. Pre-Hermes-Setup, this marker was written by the
packaged-desktop's own bootstrap-runner.cjs at the end of its install
flow. Now that Hermes-Setup.exe runs install.ps1 directly, install.ps1
needs to own the marker — otherwise the desktop sees no marker on first
launch and triggers its legacy first-launch bootstrap (re-running
install.ps1 from inside Electron, the exact recursion Hermes-Setup.exe
was supposed to obviate).
Implementation:
* New Stage-BootstrapMarker (worker) → Write-BootstrapMarker (helper)
* Slotted in the manifest right after platform-sdks, before the
interactive configure/gateway stages, so it runs unconditionally
when the install reaches the finalize phase
* Schema mirrors apps/desktop/electron/main.cjs writeBootstrapMarker /
isBootstrapComplete EXACTLY: {schemaVersion: 1, pinnedCommit,
pinnedBranch, completedAt}. Schema version stays at 1 so old
desktops that read marker files written by future install.ps1s
can still parse them.
* pinnedCommit comes from -Commit flag (Hermes-Setup.exe passes it)
or falls back to 'git rev-parse HEAD' in InstallDir
* pinnedBranch from -Branch flag, defaults to 'main' matching
install.ps1's own param default
Two PS-5.1 gotchas baked into comments:
* The ?. null-conditional operator doesn't exist pre-PS7; use
explicit if-checks on Get-Command results
* Set-Content -Encoding UTF8 emits a BOM in 5.1 and Node's plain
JSON.parse rejects BOM — write via .NET's UTF8Encoding(false)
to produce BOM-less JSON the desktop's readJson() can parse
* feat(installer): drive in-app updates through the Tauri installer
Converge update on the same principle as bootstrap: one driver owns all
repo mutation. The desktop becomes a pure consumer that hands off to
Hermes-Setup.exe --update instead of re-implementing git/pip in Electron.
- hermes desktop --build-only: build without launching, so the installer
owns the post-update launch (CLI keeps build logic single-sourced).
- Installer AppMode {Install,Update} from argv; get_mode exposed to the UI.
- Installer self-copies to HERMES_HOME/hermes-setup.exe on install success
(no-op guard during --update re-invocation to avoid the locked-exe copy).
- Installer --update flow (update.rs): wait for the desktop to release the
venv shim, run 'hermes update --yes --gateway' (branch on exit 0/2/other),
then 'hermes desktop --build-only', then launch the rebuilt desktop. Reuses
the bootstrap event channel + progress UI via a synthetic two-stage manifest.
- Desktop applyUpdates() gutted (~105 lines of git/stash/pull/pyproject/pip
removed) -> thin handoff: spawn updater, app.quit() to free the shim.
Detection (checkUpdates, commit changelog, behind-count) kept intact.
- install.ps1 creates Start Menu + Desktop shortcuts to the packed Hermes.exe
(never bare 'hermes desktop', which would rebuild every launch).
* test update
* fix(installer): pass --branch to hermes update in the --update flow
The install is a detached-HEAD checkout of a pinned commit. Without
--branch, 'hermes update' fell back to its default (main) and switched
the checkout to main — a divergent branch that lacks the desktop CLI
command — so the update targeted the wrong branch and the rebuild stage
failed with 'invalid choice: desktop'.
Thread BUILD_PIN_BRANCH (the branch this installer was built against,
and the same branch the desktop detected the update on) into
'hermes update --branch <b>' so update + rebuild stay on-branch.
* test update
* fix(installer): stamp Hermes icon onto Hermes.exe via rcedit (no winCodeSign)
The unpacked Hermes.exe showed the stock Electron icon + name in the
taskbar because build.win.signAndEditExecutable=false disables BOTH
electron-builder's signing AND its rcedit metadata/icon stamping. That
flag is load-bearing: enabling it re-triggers signtool -> winCodeSign,
whose macOS symlinks crash 7-Zip on non-admin Windows (unfixable dead end).
Decouple identity-stamping from signing entirely: after npm run pack,
run rcedit ourselves on the produced exe.
- Add rcedit as a direct devDependency of apps/desktop (the transitive
electron-winstaller copy is fragile).
- apps/desktop/scripts/set-exe-identity.cjs: Node helper that calls
rcedit's named export to set icon + ProductName/FileDescription/
CompanyName. Node builds argv natively — avoids the PowerShell->exe
->JSON double-escaping that broke the app-builder rcedit path.
- install.ps1 Set-DesktopExeIdentity invokes the script after the build,
before shortcuts. Best-effort: failure keeps the stock icon, never
fails the install. rcedit is a pure PE editor — no signtool, no
winCodeSign, no symlinks.
Verified locally: stamping a copy of the built Hermes.exe embeds the
32x32 icon and sets ProductName=Hermes.
Also fix update-path success-screen flash: in update mode the installer
hands off + exits in ~600ms, so don't route to the 'launch Hermes'
success view (it flashed before the window closed).
* update test
* fix(desktop): show 'hermes update' guidance for CLI installs instead of dead-end error
A user who installed via the CLI (irm|iex / install.sh) then ran
`hermes desktop` has no staged hermes-setup.exe, so clicking Update
in-app hit resolveUpdaterBinary()=null and showed a misleading error
('re-run the Hermes installer') with a Try-again button that could
never succeed — a dead loop for a perfectly valid install.
Treat the no-updater case as an intentional outcome, not a failure:
- main.cjs applyUpdates returns { ok:true, manual:true, command:'hermes update' }
(no throw, no 'error' stage) when no updater binary exists.
- New 'manual' update stage + apply-state.command thread the command to the UI.
- updates-overlay ManualView: a polished terminal-native card with the
exact command and a copy button, framed as the correct path for a CLI
user rather than an error.
GUI-installer users are unaffected — hermes-setup.exe present => seamless
auto-update runs as before. Zero new process orchestration; can't fail
the update demo.
* update test
* fix(gui): pin /api/hermes/update to the current branch
The desktop command-center 'update' action hits POST /api/hermes/update,
which spawned bare `hermes update` with no --branch. cmd_update then
falls back to its default (main) and checks the working tree OUT of the
tracked branch — a bb/gui install silently jumped to main and lost the
desktop CLI.
Resolve the checkout's current branch and pass --branch <current> from
this endpoint only. The engine default (main) is DELIBERATELY unchanged:
bare `hermes update` from a terminal, the gateway /update bot command,
and the CLI/TUI relaunch path all keep their long-standing 'update against
main' contract for the existing user base. Only the GUI button is scoped
to update-the-branch-you're-on. Detached HEAD / git failure falls back to
the bare default.
* update test
* fix(desktop): branch-pin the CLI manual-update command card
The 'Update from your terminal' card (shown to CLI installs with no staged
updater) hardcoded bare `hermes update` — which defaults to main and would
switch a bb/gui (or any non-main) checkout off-branch. Same bug we fixed for
the GUI button, leaked into the card's copy text.
Resolve the checkout's current branch and show `hermes update --branch
<current>` for non-main checkouts; keep it bare for main so the card stays
clean. Best-effort: bare fallback if branch detection fails. Matches the
GUI button + installer --update contract; bare terminal/bot/TUI update
paths still default to main, unchanged.
* docs: phragg was here
* feat(desktop): lead onboarding with Nous Portal + fix fresh-install detection (#34970)
- Feature Nous Portal as the primary onboarding card (Recommended tag,
app logo, single pitch line); collapse other OAuth providers behind an
"Other providers" disclosure whose open/closed state persists.
- Surface OpenRouter as a one-click API-key option inside the disclosure;
move "I have an API key" to a quiet bottom-right link.
- Treat "no provider configured" as a normal onboarding state, not a red
error banner (provider-setup-errors copy match).
- Fix setup.runtime_check: it reported ready when the resolved runtime had
an empty credential or only implicit Bedrock/IAM, so fresh installs never
saw onboarding. Now requires a usable credential.
- Auto-wire Windows fonts for WSL2 users so the renderer renders real
Segoe UI instead of the DejaVu fallback; make WSL detection env-independent
via the /proc kernel marker.
* feat(desktop): live elapsed timer on install bootstrap steps
The first-launch install overlay showed a static "Installing" with no
motion, so long steps (notably the repo clone) looked frozen. Stamp each
stage's start time on the running transition and tick once a second so the
active step shows live elapsed (e.g. "Installing · 1:23"), plus elapsed on
the overall current-step line. Completed steps keep their final duration.
* fix(desktop): resolve PortableGit for update checks + reserve titlebar tools space
- runGit() hardcoded spawn('git'), which ENOENTs on fresh installer-driven
Windows installs (git is PortableGit under %LOCALAPPDATA%\hermes\git, never
on PATH) — so "Check for updates" failed with "Couldn't check for updates".
Add resolveGitBinary() mirroring findGitBash (PortableGit → Git-for-Windows
→ PATH) and use it in runGit.
- PageSearchShell rendered a full-width search input in the titlebar row, so
on Windows its right edge slid under the fixed top-right tools + native
window controls. Reserve that footprint via --titlebar-tools-* vars.
* fix(desktop): stop streaming caret from shifting layout on completion
The streaming caret (::after on the running message's last child) was an
in-flow inline-block adding ~0.78em of inline width, which could wrap the
last line mid-stream; when the caret is removed on completion the line
un-wraps and reflows — the visible post-response layout shift. Net-zero its
inline advance with a compensating negative margin so it paints at the text
end without consuming layout width.
* fix(desktop): stop completed-message layout shift while streaming
The assistant message action bar used `hideWhenRunning`, which unmounts it
whenever the thread is streaming. Since the bar reserves vertical space in
each completed assistant message's footer (it's invisible-until-hover via
opacity, not via mount), unmounting it collapsed every prior turn by the
bar's height — then remounting on resolve grew them back, shifting the whole
conversation (visible as "padding appears above the last user message").
Drop hideWhenRunning so the footer height is constant; the bar stays
invisible during streaming via its existing opacity/pointer-events gating.
* fix(merge): keep windows-footgun suppressions inline
* fix(merge): keep remaining gateway footgun suppressions inline
* fix(merge): restore contracts caught by main-target CI
* fix(dashboard): honor injected HERMES_DASHBOARD_SESSION_TOKEN
The desktop shell mints a session token and signs its /api + /api/ws
calls with it via HERMES_DASHBOARD_SESSION_TOKEN, but the main-merge
restored a web_server.py that ignored the env var and minted its own
random _SESSION_TOKEN -- so every desktop request 401'd and the UI
reported "gateway offline". Read the injected token (fall back to a
fresh random one) so loopback HTTP + WS auth line up.
Adds a regression test so a future merge can't silently drop the read.
* fix(desktop): align fresh-install home so upgraders don't brick
Two related first-launch bugs on machines with a legacy ~/.hermes:
- install.ps1 hardcoded $HermesHome/$InstallDir to %LOCALAPPDATA%\hermes
and ignored the HERMES_HOME the desktop passes through. The desktop
freezes HERMES_HOME at module load and prefers a legacy ~/.hermes when
%LOCALAPPDATA%\hermes is absent, so the installer wrote to a different
home than the shell read -> "Could not connect to Hermes gateway". Honor
$env:HERMES_HOME in the param defaults.
- isBootstrapComplete() trusted the marker + checkout without verifying a
runnable venv, so an interrupted/split install spawned a dead backend
instead of re-bootstrapping. Also require the venv python to exist.
* fix(dashboard): allow packaged desktop file:// origin on loopback WS
The packaged Electron desktop loads its renderer over file://, so its
/api/ws handshake carries Origin: file:// (or null). The DNS-rebinding
WebSocket Origin guard only accepted http(s) origins matching the bound
host, so it rejected the desktop's own renderer with 4403 -> "Could not
connect to Hermes gateway" on macOS.
A browser DNS-rebinding attacker can only ever present an http(s) origin
(the site hosting the malicious page); it cannot forge file://, null, or
a custom app scheme AND hold the loopback session token. So on loopback
binds we now trust non-web origins -- the token in _ws_auth_ok remains
the real authenticator. Public/gated binds still reject them, and
cross-site http(s) origins are still rejected everywhere.
* fix(desktop): resolve renderer assets relative to BASE_URL
Absolute public asset paths (/apple-touch-icon.png, /ds-assets/...) work
under the dev server but break in the packaged app, where the renderer is
loaded from file://.../index.html and a leading slash resolves to the
filesystem root -> broken onboarding provider icon and backdrop image on
macOS. Prefix these with import.meta.env.BASE_URL so they resolve next to
the bundled index.html in both dev and packaged builds.
* feat(desktop): automate first-launch bootstrap on macOS/Linux
Previously a packaged macOS/Linux app with no Hermes install hit a
dead-end ("first-launch install is not yet automated -- run install.sh
manually") because install.sh lacked the staged protocol install.ps1
exposes. Now both platforms bootstrap on first launch with the same
structured, per-step progress UI as Windows.
- install.sh: add --manifest / --stage / --json / --non-interactive plus
a stage dispatcher (prerequisites, repository, venv, python-deps,
node-deps, path, config, setup, gateway, complete). User-input stages
(setup, gateway) are skipped under --non-interactive; the in-app
onboarding overlay owns API keys/model, matching the Windows flow.
Each stage runs inside the install dir (its own process) and a new
--commit flag pins the checkout to the build-stamp SHA.
- bootstrap-runner.cjs: drive the staged manifest/stage/JSON protocol for
both install.ps1 (PowerShell) and install.sh (bash), selected by
installer kind; removed the single-blob POSIX shim.
- main.cjs: drop the macOS/Linux unsupported-platform dead-end so the
bootstrap-needed path runs the installer on every platform.
* fix(dashboard): return 404 JSON for unmatched /api paths instead of SPA HTML
The SPA catch-all (serve_spa) served index.html for any unmatched GET,
including unregistered /api/* endpoints. A missing API route therefore
came back as <!doctype html> with status 200, and JSON clients (the
desktop app's fetchJson) crashed with an opaque
'SyntaxError: Unexpected token <' instead of a clear error.
- web_server.py: unmatched /api or /api/... now returns 404 JSON
('No such API endpoint'); non-api paths still serve the SPA for
client-side routing.
- main.cjs fetchJson: detect an HTML body / text/html content-type on a
2xx response and reject with a clear message naming the URL, rather
than a raw JSON.parse SyntaxError. Empty bodies resolve to null;
malformed JSON reports the URL plus a snippet.
* say 'OS appearance' instead of 'macOS appearance'
* feat(install): add --include-desktop stage + PowerShell-style flags to install.sh
Brings install.sh to parity with install.ps1's bootstrap surface so the
shared Rust/Tauri bootstrapper (apps/bootstrap-installer) can drive a
macOS/Linux install the same way it drives Windows.
- Accept the PowerShell-style aliases the bootstrapper emits to both
installers: -Commit / -Branch (alongside existing -Manifest / -Stage /
-Json / -NonInteractive).
- Add --include-desktop / -IncludeDesktop. When set, the manifest gains a
'desktop' stage (immediately before 'complete'), and a new install_desktop
runs a root workspace `npm install` + `npm run pack` (electron-builder
--dir, signing auto-discovery disabled) to produce release/mac*/Hermes.app
-- mirroring install.ps1's Install-Desktop / Stage-Desktop.
- The flag is opt-in, exactly like Windows: the signed bootstrap installer
passes it; the Electron app's own first-launch bootstrap and the CLI
one-liner omit it (building the desktop from inside the running app would
clobber it).
* fix: tts endpoints
* macOS desktop: install + in-app self-update (#35607)
* fix(installer): align macOS HERMES_HOME with the rest of the stack
paths.rs computed the macOS Hermes home as ~/Library/Application Support/
hermes, but nothing else does: hermes_constants.get_hermes_home() (Python),
scripts/install.sh, and the Electron desktop's resolveHermesHome() all use
~/.hermes on macOS. The drift meant the Tauri installer wrote the install to
one directory and the desktop looked for it in another, so a fresh GUI
install never found its backend (the file's own comment warned this exact
drift would break things). Use ~/.hermes on macOS to match.
* fix(install.sh): always emit a stage result frame on failure
Stage helpers (clone_repo, install_deps, check_python, …) were written for
the monolithic flow and call `exit 1` on failure. Under `--stage`, that
terminated the process before the JSON result frame was printed, so the
installer's parse_stage_result saw "no frame" instead of a clean
{ok:false,...} contract response. Run the stage body in a subshell so an
`exit` only unwinds the subshell and the parent still emits the frame.
* feat(install.sh): auto-provision git on macOS/Linux (parity with install.ps1)
install.ps1 downloads PortableGit on Windows, but install.sh just printed a
"please install git" hint and exited — so a fresh Mac with no developer tools
(no Xcode CLT → no git) couldn't get past the clone step. check_git now tries
to install git before bailing:
- macOS: Homebrew if present (headless), else `xcode-select --install`
(the CLT prompt also provides the compiler some wheels need), polling for
git to appear.
- Linux: apt/dnf/pacman via sudo when available.
Falls back to the manual instructions only if auto-provision fails.
* feat(desktop): in-app GUI+backend self-update on macOS/Linux
On Windows the staged Hermes-Setup binary drives updates (quit → hermes
update → hermes desktop --build-only → relaunch). The mac drag-install has no
such binary, so "Update now" previously just printed `hermes update`.
Since there's no venv-shim file lock on POSIX, the desktop can drive the whole
update itself. applyUpdates now, when no staged updater exists on mac/linux:
1. runs `hermes update --yes [--branch <current>]` (backend git pull + deps),
2. runs `hermes desktop --build-only` (OS-aware GUI rebuild) with the
Hermes-managed Node + venv on PATH,
3. spawns a detached swapper that waits for this process to exit, dittos the
freshly built Hermes.app over the running bundle, clears quarantine, and
relaunches.
Degrades to "backend updated — restart to load the new GUI" if the rebuild
fails or there's no .app bundle to swap (dev run, Linux AppImage).
* chore: uptick
* chore: uptick
* chore: linux build
* fix(install): detect xcode-select git stub on fresh macOS
* chore: bump
* fix(desktop): repair voice dictation on Windows
Voice dictation was broken on Windows in two ways:
1. Mic access was denied. The Electron permission request handler only
granted 'media' requests whose details.mediaTypes included 'audio',
but Chromium on Windows frequently fires the mic request with an empty
mediaTypes array, so getUserMedia threw NotAllowedError. The handler
now grants audio-capture when mediaTypes includes 'audio' OR is
empty/absent, handles the 'audioCapture' permission name, and adds a
setPermissionCheckHandler (the synchronous path Chromium also consults
for getUserMedia on Windows). Video is still denied.
2. Transcripts went nowhere. The composer's insertText handler (used by
dictation and other inserts) only updated the assistant-ui composer
store via setText, never the contentEditable editor DOM. The
draft->editor sync effect only re-renders the editor when it is NOT
focused, and dictation runs while the editor has/regains focus, so the
transcript was stored but never shown and could not be sent. insertText
now renders into the editor DOM and places the caret, mirroring
appendExternalText.
Also hardens fetchJson: a 2xx response with an HTML body (or text/html
content-type) now rejects with a clear message naming the URL instead of
an opaque JSON.parse 'Unexpected token <' error.
* feat(desktop): route Nous subscribers onto the Tool Gateway from the GUI
When the GUI sets the main provider to Nous via POST /api/model/set, call
the same apply_nous_managed_defaults the CLI uses after model selection, so
GUI/onboarding users land on the Nous Tool Gateway the same way CLI users do
— no separate prompt, no duplicated logic.
Purely additive: apply_nous_managed_defaults skips any tool where the user
has a direct key (FIRECRAWL_API_KEY, FAL_KEY, etc.) or explicit config, so it
never overwrites a user's own setup. Only unconfigured tools get routed.
- web_server.py: in set_model_assignment (scope=main, provider=nous), resolve
enabled toolsets and apply managed defaults; guarded so a Portal hiccup never
blocks saving the model. Returns routed tools as gateway_tools.
- onboarding.ts: surface a 'Tool Gateway enabled' toast listing routed tools.
- types/hermes.ts: add gateway_tools to ModelAssignmentResponse.
- tests: cover nous-applies, non-nous-skips, and failure-doesnt-block-save.
* feat(desktop): mirror hermes model free/paid curation in GUI onboarding
GUI onboarding picked models[0] from /api/model/options, which ignores the
Nous free/paid tier — a free user could land on a paid default (e.g.
anthropic/claude-opus-4). Now the recommended default mirrors what `hermes
model` does.
- web_server.py: new GET /api/model/recommended-default?provider=<slug>. For
Nous it runs the same curation as the CLI (get_curated_nous_model_ids +
pricing + check_nous_free_tier + union_with_portal_{free,paid}_recommendations
+ partition_nous_models_by_tier) so free users get a free model and paid users
get the curated default. Other providers fall back to the first curated model.
Never 500s — returns empty model on error so onboarding degrades gracefully.
- hermes.ts: getRecommendedDefaultModel client + RecommendedDefaultModel type.
- onboarding.ts: fetchProviderDefaultModel prefers the recommended endpoint,
falls back to models[0] when unavailable.
- tests: free-tier picks free model, paid-tier picks curated default, failure
returns empty without 500.
* feat(desktop): show model pricing + free/paid tier gating in GUI picker
The CLI `hermes model` picker shows per-model $/Mtok pricing and gates paid
models on free Nous accounts. The GUI picker showed bare model names. Bring it
to parity across both the model-picker dialog and onboarding confirm card.
Backend:
- inventory.build_models_payload gains a pricing=True flag → _apply_pricing
enriches each provider row with formatted per-model pricing
({input,output,cache,free}) via the same _format_price_per_mtok the CLI uses,
and for Nous adds free_tier + unavailable_models (paid models a free user
can't select) via check_nous_free_tier + partition_nous_models_by_tier.
Best-effort: any pricing/tier failure is swallowed and fails open (no gating).
- /api/model/options and TUI model.options now pass pricing=True so the
global picker and in-session picker both carry pricing.
Frontend:
- ModelOptionProvider gains pricing/free_tier/unavailable_models; new
ModelPricing type.
- model-picker dialog renders In/Out $/Mtok (or a Free pill) per model, a
Free tier/Pro badge on the Nous heading, and disables + grays unavailable
paid models for free users with a 'Pro models need a paid subscription' note.
- onboarding confirm card shows the chosen model's price + tier badge.
Tests: test_inventory_pricing covers price formatting, free-tier gating,
paid no-gating, providers without pricing, and swallowed failures.
* fix(desktop): GUI model picker shows curated Nous list in curated order
Two bugs made the GUI Nous model list diverge from the `hermes model` CLI picker:
1. Backend (model_switch.py): the Nous row in list_authenticated_providers
fell through to cached_provider_model_ids("nous"), dumping the full live
/v1/models catalog (~50 vendor-prefixed models, alphabetical). Now it uses
the curated list AND applies the Portal free/paid recommendation union —
exactly like _model_flow_nous in main.py — so newly-launched models such as
stepfun/step-3.7-flash:free surface in curated order. Best-effort: falls
back to the curated list alone if the Portal fetch fails.
2. Frontend (model-picker.tsx): cmdk's Command had shouldFilter on (default),
which re-sorts items by fuzzy-match score (≈alphabetical) and ignores array
order. Set shouldFilter={false} + own the search term and do an
order-preserving substring filter, so the backend's curated order is shown
verbatim.
* feat(desktop): add/switch providers from the model picker via onboarding reuse
The model picker could only select models from already-authenticated
providers. Switching to a new provider had no in-app path. Rather than
duplicate provider UI, reuse the existing onboarding provider selector
(featured Nous + other providers + API-key form + device-code/PKCE flow +
model-confirm with pricing/tier).
- onboarding store: add a 'manual' flag with startManualOnboarding() /
closeManualOnboarding(). Manual mode forces the onboarding overlay to show
even when configured===true and refreshOnboarding no longer auto-dismisses
on runtime-ready (the app is already working — the user is just adding or
switching a provider).
- onboarding overlay: render when manual even if configured; show a Close
button (the first-run flow has none since the app can't run yet).
- model picker: 'Add provider' footer button opens the onboarding selector;
ModelResults lists only configured (model-bearing) providers.
* feat(desktop): add PUT /api/tools/toolsets/{name} enable/disable endpoint
* feat(desktop): add toggleToolset RPC binding
* feat(desktop): toolset enable/disable switch in Tools settings
* feat(desktop): tool configuration parity in GUI Tools settings
Bring the desktop GUI Tools settings to parity with the CLI `hermes tools`
for provider selection and API-key configuration.
Backend (hermes_cli/web_server.py):
- GET /api/tools/toolsets/{name}/config - provider matrix + key status
- PUT /api/tools/toolsets/{name}/provider - persist provider selection
Shared core (hermes_cli/tools_config.py):
- Extract apply_provider_selection / _write_provider_config from the
interactive _configure_provider so the CLI and GUI write identical
config keys (web.backend, tts.provider, browser.cloud_provider, plugin
image/video providers, use_gateway flags) through one code path.
Desktop UI:
- ToolsetConfigPanel: provider list with select, per-provider API-key
entry (set/replace/clear/reveal via the shared env RPCs), Ready/Needs
keys state, guidance for Nous-auth and post-setup providers.
- Wire the Configured/Needs keys pill to expand the panel inline; refresh
the toolset list after key changes so the pill updates live.
- Add getToolsetConfig / selectToolsetProvider RPC bindings + types.
Post-setup (OAuth/install) flows still defer to the CLI; see
docs spike findings for the planned /api/tools/setup/* endpoint family.
Tests: backend round-trip + 400 cases for the new endpoints and
apply_provider_selection; desktop vitest coverage for the config panel
(provider render, select, key save). No change-detector tests.
Also removes three stale completed plan docs.
* fix(desktop): show real Hermes version + sync package.json on release
The desktop app version was disconnected from the Hermes version: the
release script bumped pyproject.toml + hermes_cli/__init__.py but never
touched apps/desktop/package.json, which sat stale at 0.0.2 (lockfile at
0.0.1).
- main.cjs: hermes:version IPC now resolves __version__ from
hermes_cli/__init__.py (the canonical source release.py bumps) via a new
resolveHermesVersion() helper, falling back to app.getVersion() when the
source tree isn't readable. The About panel now always shows the live
Hermes version and can't drift.
- release.py: update_version_files() also bumps apps/desktop/package.json
in lockstep with pyproject (top-level version only; dep specs untouched).
- One-time catch-up: package.json 0.0.2 -> 0.15.1 and the lockfile root
mirrors 0.0.1 -> 0.15.1.
* fix(desktop): stamp exe identity in afterPack hook so updates stay branded
The packed Hermes.exe reverted to the stock Electron icon + "Electron" name
after an in-app update. The icon/identity stamp (rcedit) lived only in
install.ps1, but the installer's --update path rebuilds the desktop via
`hermes desktop --build-only` -> `npm run pack`, which never ran install.ps1
and so never stamped the rebuilt exe.
Move the stamp into an electron-builder afterPack hook so it runs for EVERY
packed build regardless of caller (first install, hermes desktop, the update
rebuild, or a manual npm run pack):
- set-exe-identity.cjs: refactor to export stampExeIdentity(exe, desktopRoot);
still runnable as a standalone CLI.
- after-pack.cjs (new): afterPack hook calling stampExeIdentity. Windows-only
guard; best-effort (logs + resolves on failure, never fails the build).
- package.json: register build.afterPack.
- install.ps1: remove the now-redundant Set-DesktopExeIdentity function + call;
the hook handles it during npm run pack.
electron-builder's own rcedit step stays disabled (signAndEditExecutable=false)
to avoid the signtool -> winCodeSign -> 7-Zip macOS-symlink crash on non-admin
Windows; the hook runs rcedit directly (pure PE resource edit, no signing).
* fix(desktop): export afterPack hook as exports.default so electron-builder runs it
The afterPack hook used `module.exports = fn`, which electron-builder's hook
loader doesn't pick up — it expects the function as the module's default
export (the same shape afterSign/notarize.cjs uses). The hook silently never
ran, so even first install shipped the stock "Electron" exe.
Switch to `exports.default = async function afterPack(...)`. Verified with a
real `npm run pack`: electron-builder now invokes the hook and the produced
release/win-unpacked/Hermes.exe carries ProductName/FileDescription=Hermes.
* chore(desktop): drop auto-build release CI in favor of manual build + upload
Remove desktop-release.yml (nightly-on-main + stable publish). Installers
are now built locally per platform and uploaded to a GitHub Release by hand;
the website points at them via NEXT_PUBLIC_HERMES_DL_* env. Update README +
docs and drop the dead desktop-nightly channel links.
* fix(desktop): stable shortcut icon + bust icon cache so updates repaint
Symptom on a freshly-installed laptop: Hermes.exe itself shows the correct
Hermes icon (Explorer reads the live exe's stamped PE resource), but the
desktop shortcut still draws the stock Electron icon.
Cause: New-DesktopShortcuts set IconLocation to "<exe>,0", so Windows cached
the icon it extracted from the exe at shortcut-creation time. On an update the
exe gets re-stamped, but the shortcut keeps rendering the stale cached bitmap.
- package.json: ship assets/icon.ico beside the exe via extraResources
(-> resources/icon.ico). Verified with a real npm run pack.
- install.ps1 New-DesktopShortcuts: point IconLocation at resources/icon.ico
(fallback to <exe>,0 if absent) — a dedicated .ico is cache-stable and skips
the per-exe extraction that goes stale. Then run `ie4uinit.exe -show` to bust
the shell icon cache so the shortcut repaints immediately instead of showing
the old Electron icon until reboot.
Both best-effort; never fail an otherwise-good install.
* dummy update
* feat(desktop): self-heal update branch + backend contract guard
Two fixes for the bb/gui→main transition:
- Self-update self-heals: if the tracked branch (e.g. bb/gui) no longer
exists on origin (merged + deleted), the desktop updater falls back to
main and persists it. Read-only ls-remote probe that only flips on a
definitive "ref absent" (exit 2), never on a transient network error, so
already-installed clients migrate themselves with no manual flip.
- Backend contract guard: tui_gateway reports DESKTOP_BACKEND_CONTRACT in
session runtime info; the desktop warns with a one-click "Update Hermes"
when the backend predates the GUI's required contract (e.g. a bb/gui app
pointed at a main checkout) instead of failing cryptically downstream.
* docs(desktop): rewrite README to match current install/update/build flow
The old README contradicted itself (claimed a bundled Python payload while
also saying it no longer bundles source) and predated cross-platform support.
Rewrite for accuracy: Linux is a first-class build target, install.sh/install.ps1
both drive the staged bootstrap, the real self-update handoff (Windows
Hermes-Setup vs in-app macOS/Linux), and the bb/gui→main self-heal + backend
contract guard.
* docs(desktop): rewrite README as a real product readme
Lead with what the app is and how to get it (download an installer, or
`hermes desktop` for existing CLI users) plus a plain-language feature list,
then keep contributor/build/internals as a clearly separated secondary section.
* docs(desktop): fix install framing — releases no longer auto-build installers
Lead with the install-with-Hermes path (`--include-desktop` / `hermes desktop`),
which always works, and describe prebuilt installers as manually published when
a release ships them rather than implying CI attaches them to every release.
* docs(desktop): match base repo README style
Adopt the root README's conventions: centered title + badge row, bold
one-liner intro, a feature <table> grid, --- section dividers, and a
Community / License footer.
* feat(desktop): recover from gateway boot failures + validate API keys on entry (#35864)
Fresh installs that hit a gateway boot failure had no recovery path: the
shell rendered dead ("gateway offline"), logs were undiscoverable, and a
mistyped API key was accepted because onboarding only checked credential
presence, not validity.
- Add BootFailureOverlay: a top-level recovery surface (Retry, Repair
install, Use local gateway, Open logs + inline recent logs) that mounts
on any hard boot failure, including post-install. Trims the now-redundant
recovery button from the onboarding Preparing panel.
- Add hermes:logs:reveal / :recent IPC (reveal desktop.log) and a
hermes:bootstrap:repair IPC that drops the bootstrap marker to force a
clean reinstall. Surface "Open logs" in Gateway settings too.
- Add POST /api/providers/validate: a live per-provider probe
(OpenRouter/OpenAI/xAI/Gemini key check, local endpoint connectivity)
wired into saveOnboardingApiKey so a rejected key blocks before it's
persisted, while an unreachable probe falls through (offline-safe).
* test(model-catalog): fix stale nous picker test after curated-list change
ac2e48907 made the GUI/picker Nous row use the curated list (curated["nous"]
= get_curated_nous_model_ids()) + Portal union, matching the `hermes model`
CLI — but test_picker_nous_row_uses_manifest still asserted the old 2-model
manifest snapshot, breaking the test shard.
Rewrite it as an invariant: stub the Portal union to passthrough and assert the
row equals get_curated_nous_model_ids() computed under the same conditions, so
it tracks the real contract instead of a hardcoded model list that rots on every
catalog update.
---------
Co-authored-by: emozilla <emozilla@nousresearch.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: Austin Pickett <pickett.austin@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: ethernet <arilotter@gmail.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
The Windows installer fetched the latest git-for-windows release via
api.github.com/repos/git-for-windows/git/releases/latest, which is
rate-limited to 60 requests/hour/IP for unauthenticated callers. Users
behind CGNAT, corporate NAT, dorm WiFi, or shared ISP routinely hit the
limit, and the installer aborts asking them to install Git manually.
Switch to a pinned release tag (v2.54.0.windows.1) and a static
github.com/.../releases/download/<tag>/<asset> URL. Static download
URLs are served by GitHub's blob storage and are not subject to the
API rate limit.
Trade-offs:
- We have to bump the pin when we want a newer Git for Windows. The
installer doesn't depend on Git features beyond 'works', so this is
a once-a-year maintenance cost at most.
- Loses the (cosmetic) MB size display, since we no longer have asset
metadata. Replaced with the version string in the 'Downloading ...'
line instead.
Three install.ps1 improvements pulled from the thin-installer work on
bb/gui (PR #27822) that benefit the canonical CLI install flow on main:
1. Strip UTF-8 BOM from scripts/install.ps1.
The canonical 'irm <raw URL> | iex' install flow has been broken
since commit 4279da4db re-introduced a UTF-8 BOM that PR #27224
had explicitly stripped. PowerShell 5.1's 'irm' returns the
response body as a string with the BOM surviving as a leading
\ufeff character; 'iex' then evaluates that string and the parser
chokes on the invisible character before param(), surfacing as a
cascade of 'The assignment expression is not valid' errors at
every param default value.
File body is verified pure ASCII (no character above byte 127),
so PS 5.1 with no BOM falls back to Windows-1252 decoding which
is identical to ASCII for our content. Both install paths work:
- 'irm ... | iex' (canonical one-liner)
- 'powershell -File install.ps1' (programmatic / desktop bootstrap)
2. New -Commit and -Tag string params for reproducible pinning.
Higher-precedence variants of -Branch. When set, the repository
stage clones $Branch (fast partial fetch) and then 'git checkout's
the exact ref. Precedence: Commit > Tag > Branch. Honoured by all
three code paths:
- Update path (existing valid checkout): fetch + checkout
--detach <commit|tag> instead of checkout + pull.
- Fresh clone: clone --branch $Branch, then post-clone
'git checkout --detach' to the requested ref.
- ZIP fallback: pick archive URL for the most-specific ref
(commit -> archive/<sha>.zip, tag -> archive/refs/tags/
<tag>.zip, else archive/refs/heads/<branch>.zip).
Used by the Hermes desktop's first-launch bootstrap to pin the
.exe to the exact commit it was built against, so the cloned
Hermes Agent tree always matches what the .exe was tested with.
Also enables release-bundle pinning (e.g. Microsoft Store builds
pinning to a release tag) and CI reproducibility.
3. EAP=Continue wrap around the new pin-step git invocations.
'git fetch origin <commit>' writes the routine 'From <url>' info
line to stderr. Under the script's global $ErrorActionPreference
= 'Stop' that stderr line is wrapped as an ErrorRecord and
terminates the script even though fetch+checkout actually succeed.
Same EAP=Stop + native-stderr footgun we hit during the install.ps1
hardening pass in Install-Uv, Test-Python, _Run-NpmInstall.
Wrap both the update-path fetch/checkout block AND the post-clone
pin block in $ErrorActionPreference = 'Continue' (restored in
finally). Real failures still caught by $LASTEXITCODE checks.
* feat(dep_ensure): complete Windows bootstrap — dep_ensure + install.ps1 + detection
dep_ensure.py gains Windows awareness: PowerShell invocation, platform-
specific browser detection, (path, shell) tuple returns.
install.ps1 gains -Ensure/-PostInstall modes using npm -g --prefix
(aligned with install.sh) and agent-browser install for Chromium.
browser_tool.py gains node/ in candidate dirs for Windows .cmd shims.
Both install scripts bundled in pip wheel.
Tracking: #27826
* fix(install.ps1): add --ignore-scripts to npm install for camofox
@askjo/camofox-browser has a dependency (impit) whose postinstall
script runs `npx only-allow pnpm`, which fails under npm. Adding
--ignore-scripts avoids the spurious failure without affecting
functionality.
Tracking: #27826
* fix: remove duplicate install scripts from git
CI already copies scripts/install.{sh,ps1} into hermes_cli/scripts/
during wheel build. No need to commit copies — .gitignore keeps them
out, _find_install_script() falls back to scripts/ for git-clone users.
Tracking: #27826
* fix: address review — remove env_extra, fix ps1 error handling
- Remove unused env_extra parameter from ensure_dependency()
- Invoke-EnsureMode node case now uses Test-Node consistently
- Install-AgentBrowser uses throw instead of exit 1
Two protocol-correctness gaps from review:
1. Stage-Node used [void](Test-Node) which discarded Test-Node's return
value, so the JSON frame always reported ok=true even when Node
install fully failed. A GUI driver consuming the manifest couldn't
tell 'node ready' from 'node missing'. Wire a soft-skip channel
($script:_StageSkippedReason) that workers can populate to surface
'ran, but the thing it was supposed to set up is not available' as
skipped=true with a reason in the JSON, without aborting the install
(Node is optional -- browser tools degrade gracefully, matches
Write-Completion's existing 'Note: Node.js could not be installed'
behavior). Reset before each stage so a prior reason can't leak.
2. The -Stage dispatch used 'if ($Stage)' which is falsy for empty
string, so 'install.ps1 -Stage ""' fell through to Main and silently
kicked off a full destructive install. Switch to
PSBoundParameters.ContainsKey('Stage') so an explicit empty value
surfaces as unknown-stage exit 2 with a structured JSON frame, the
way every other bad stage name does.
Address the two cosmetic items from review:
- Completion banner middle line was 62 chars vs 59-char top/bottom borders
(replacing the 1-char checkmark with [OK] added width that wasn't
reflected in the trailing whitespace). Drop 3 trailing spaces.
- Smoke test file had a single em-dash in a comment -- the only
non-ASCII byte across both files. Replace with -- for consistency
with install.ps1's pure-ASCII goal.
Three issues flagged by the Copilot review on this PR:
1. Double JSON emit on stage failure (Copilot #1, #2). When -Stage <name>
ran a worker that threw, Invoke-Stage's finally emitted a JSON result
frame AND the entry-point catch emitted a second error frame --
producing two concatenated JSON objects on stdout and breaking the
one-line-per-invocation contract that drivers parse against. Same
issue applied to -Json mode on a full install (every stage's finally
plus a final error frame missing duration_ms/skipped).
Fix: Invoke-Stage's finally now sets $script:_StageEmittedErrorFrame
when it emits a failure frame; the entry-point catch checks the flag
and skips its own emit, still exit 1.
2. $prevEAP uninitialized on early try-block throw (Copilot #3). In
Install-Uv, Test-Python, Test-Node's winget fallback,
_Run-NpmInstall, and the playwright block, '$prevEAP =
$ErrorActionPreference' lived as the first statement INSIDE the
try. If anything between 'try {' and that line threw (Write-Info on
an unusual host, the npx-finding loop, etc.), the catch's
'if ($prevEAP) { ... }' restore was a no-op and EAP could remain
relaxed.
Fix: hoist '$prevEAP = $ErrorActionPreference' to the line
immediately before 'try {' in all five sites. Catch's restore is
now always meaningful regardless of where in the try the throw
originated.
No change to Invoke-Stage's success path or to the four lint-clean EAP
sites (Test-Node was the only winget-related catch). All 19 metadata
smoke tests still pass.
Adds an opt-in stage protocol that lets programmatic drivers (the
desktop GUI's onboarding wizard, CI, future install.sh parity) drive
install.ps1 one step at a time with structured JSON results. Default
invocation (`irm | iex` one-liner) behaves unchanged.
Entry points:
install.ps1 Today's interactive install (unchanged)
install.ps1 -ProtocolVersion Emit protocol version integer
install.ps1 -Manifest Emit JSON manifest of available stages
install.ps1 -Stage <name> Run one stage, emit JSON result
install.ps1 -NonInteractive Suppress Read-Host prompts (skips the
setup wizard and gateway autostart)
install.ps1 -Json Machine-readable completion frame
Manifest exposes 14 stages across prereqs/install/finalize/post-install
categories, with 2 (configure, gateway) flagged needs_user_input=true
so GUI drivers can skip them and handle the equivalent UX themselves.
Along the way, clean-VM testing on stock Windows 10/11 surfaced a
series of latent install.ps1 bugs that were never exercised by
developer machines. Fixed in the same commit:
* Encoding: file is now pure ASCII with no BOM. Windows PowerShell
5.1 reads BOM-less files as Windows-1252 and chokes on em-dashes
(and other UTF-8 sequences), while iex chokes on a leading U+FEFF.
Pure-ASCII satisfies both invocation paths.
* EAP=Stop + native `2>&1` captures: PowerShell wraps stderr lines
from native commands as ErrorRecord objects under EAP=Stop and
throws even when the command exits 0. Relaxed to EAP=Continue
around the astral.sh uv installer, `uv python install`, `npm
install`, `npx playwright install`, the venv import probes, and
the Node winget fallback. Check $LASTEXITCODE for the real signal.
* Cross-process state: each `-Stage <name>` invocation spawns a
fresh powershell child. $script:UvCmd set by Stage-Uv was invisible
to Stage-Python; PATH updated by Stage-Git/Stage-Node was invisible
to subsequent stages spawned by the driver shell. Added Resolve-UvCmd
helper called at the top of every stage that needs uv, and a
Sync-EnvPath helper called at the top of Invoke-Stage to refresh
PATH from the registry.
* UAC avoidance: `winget install OpenJS.NodeJS.LTS` triggers a UAC
prompt that often appears minimized in the taskbar -- looks like a
hang. Switched Test-Node to prefer the official portable Node zip
dropped into %LOCALAPPDATA%\hermes\node\ (mirrors the PortableGit
pattern Install-Git already uses). winget kept as fallback.
* npx hangs on confirmation: `npx playwright install chromium` blocks
on stdin waiting for "Need to install playwright@X.Y.Z (y/N)" when
playwright isn't in local node_modules. Tee-Object pipelines
disconnect stdin from the user's TTY so the install hangs forever.
Pass `--yes` to auto-accept.
* Silent long-running installs: `*> $logPath` redirected every stream
to disk and left the user staring at a frozen "Installing..." line
for the 5-10 minutes Playwright Chromium takes to download. Switched
to `2>&1 | ForEach-Object { "$_" } | Tee-Object -FilePath $log` so
output streams live to the console AND captures to log for failure
diagnostics. ForEach-Object coercion strips PowerShell's red
NativeCommandError formatter from stderr items.
* Console encoding: forced [Console]::OutputEncoding to UTF-8 so
playwright/git/npm progress bars, box-drawing, and check marks render
correctly instead of as IBM437/Windows-1252 mojibake.
* Performance: set $ProgressPreference = "SilentlyContinue" so
Invoke-WebRequest doesn't paint its per-chunk progress bar. The
PS 5.1 progress UI throttles downloads by 10-100x (a 57MB PortableGit
grab takes 5 minutes with the bar on vs ~20 seconds with it off,
same network). Affects PortableGit, Node portable zip, and the
Hermes repo zip fallback.
Tests: scripts/tests/test-install-ps1-stage-protocol.ps1 provides 19
metadata-only assertions covering -ProtocolVersion, -Manifest schema,
and unknown -Stage error frame. No install side effects.
End-to-end validated on a clean Windows 10 VM via:
1. `irm <branch>/scripts/install.ps1 | iex` (canonical CLI path)
2. `powershell -File install.ps1 -Stage X` iterated through every
stage (GUI driver path, exercises cross-process fixes)
Fresh Windows installs were failing on first run with:
⚠ uv python install error: Downloading cpython-3.11.15-windows-x86_64-none (24.5MiB)
✗ Installation failed: Python was not found; run without arguments
to install from the Microsoft Store...
Two bugs compounding:
1) EAP=Stop swallows uv's stderr progress as an exception. uv writes
download progress ("Downloading cpython-3.11.15-windows-x86_64-none
(24.5MiB)") to stderr. With $ErrorActionPreference = "Stop" set at
the top of the script plus 2>&1 capture, PowerShell wraps each stderr
line as an ErrorRecord and throws on the first one — even though uv
exits 0 and Python was installed successfully. This was previously
fixed in commit ec1714e71 (May 8) but lost in the May 12 release
squash (413990c94). Reapply the EAP=Continue + verify-via
'uv python find' pattern.
2) System-python fallback invokes the Microsoft Store stub. When the uv
paths fall through, the legacy 'python --version' check invokes
%LOCALAPPDATA%\\Microsoft\\WindowsApps\\python.exe, a 0-byte
reparse-point stub that prints 'Python was not found...' to stdout
and exits non-zero. Get-Command matches it. The resulting error
message is what the user sees as the final installer crash. Detect
and skip the stub by checking for the \\WindowsApps\\ path
component or a 0-byte file size before invoking python.
Also save/restore EAP defensively in the catch blocks so a throw before
the assignment can't leave EAP in 'Continue'.
* fix(cli): allow rotating broken OpenRouter / AI Gateway key in `hermes model` flow
Before: when `OPENROUTER_API_KEY` (or `AI_GATEWAY_API_KEY`) was already
set in ~/.hermes/.env, `hermes model openrouter` / `hermes model
ai-gateway` skipped the API-key prompt entirely and jumped straight to
the model picker. Users with a broken / expired / wrong key had no way
to replace it without editing ~/.hermes/.env by hand or re-running
`hermes setup` from scratch.
Both flows now route through the existing `_prompt_api_key()` helper,
which surfaces [K]eep / [R]eplace / [C]lear when a key is already
configured — the same UX the generic API-key providers (z.ai, MiniMax,
Gemini, etc.) and the Daytona setup already use.
* fix(install.ps1): pin uv sync target to venv\, verify baseline imports
Two related Windows-installer bugs that produce a broken venv with
`ModuleNotFoundError: No module named 'dotenv'` on first `hermes` run.
## Bug 1: uv sync ignores VIRTUAL_ENV, syncs into .venv\ instead of venv\
`Install-Dependencies` creates the venv at `venv\` via `uv venv venv`,
sets `$env:VIRTUAL_ENV = "$InstallDir\venv"`, then runs
`uv sync --extra all --locked`. Modern uv (>=0.5) ignores `VIRTUAL_ENV`
for the `sync` subcommand and uses the project default `.venv\`
instead. Result: deps land in `$InstallDir\.venv\`, `venv\` stays
empty except for the python.exe stub from the earlier `uv venv` call,
`hermes.exe` ends up wired to the wrong site-packages.
The bash installer (`scripts/install.sh`) already worked around this in
`install_deps()` line 1127 by passing `UV_PROJECT_ENVIRONMENT` — that
flag tells uv exactly where to put the project env regardless of
`VIRTUAL_ENV`. Port the same fix to PowerShell.
## Bug 2: no post-install verification
If the sync still misdirects for any other reason (uv version drift,
filesystem quirk, user re-run scenarios), the installer reports success
and the user only finds out by running `hermes` and getting an
unhelpful traceback. Add a baseline-import probe that runs the venv's
own python against the four packages every `hermes` invocation needs
(`dotenv`, `openai`, `rich`, `prompt_toolkit`). On failure, throw
with a recovery command tailored to whether a sibling `.venv\` exists.
User report (Windows 11, Python 3.13.5, Hermes v0.13.0): manual repro
steps were exactly this — `uv sync` landed in `.venv\`, recovered by
junctioning `venv\` → `.venv\` to bridge the path mismatch.
* fix(install): use `--extra all` not `--all-extras`; drop lazy-covered extras from [all]
Two coupled fixes for the Windows install hang where uv sync built
python-olm from sdist and failed on missing make.
# Root cause: --all-extras vs --extra all (credit: ethernet)
`uv sync --all-extras` installs every key in [project.optional-
dependencies], bypassing the curated [all] extra entirely. So even
when [all] excluded [matrix], [rl], [yc-bench], etc., the installer
pulled them anyway because they were still defined as extras. On
Windows that meant python-olm (no wheel, needs make to build from
sdist) and the install died there.
The right flag is `--extra all` — install just the [all] extra's
contents, respecting curation. Empirically verified via dry-run:
--all-extras: pulls python-olm, mautrix, ctranslate2, onnxruntime,
atroposlib, tinker, wandb, modal, daytona, vercel,
python-telegram-bot, discord.py, slack-bolt,
dingtalk-stream, lark-oapi, anthropic, boto3,
edge-tts, elevenlabs, exa-py, fal-client, faster-
whisper, firecrawl-py, honcho-ai, parallel-web
--extra all: pulls none of those — just [all]'s curated set
Dockerfile already uses `--extra all` (with comment explaining the
gotcha) — knowledge existed; the gap was install.sh / install.ps1 /
setup-hermes.sh.
Sites fixed: scripts/install.sh L1118, scripts/install.ps1 L809,
setup-hermes.sh L245.
# Companion fix: drop lazy-covered extras from [all]
`tools/lazy_deps.py` already covers anthropic, bedrock, exa,
firecrawl, parallel-web, fal, edge-tts, elevenlabs, modal, daytona,
vercel, all messaging platforms (telegram/discord/slack/matrix/
dingtalk/feishu), honcho, and faster-whisper. They were ALSO in
[all], which defeats the whole point of lazy-install — fresh
installs eager-pulled them and inherited whatever was broken
upstream (the matrix → python-olm → no Windows wheel chain being
the proximate symptom).
[all] now contains only what genuinely can't be lazy-installed:
cron, cli, dev, pty, mcp, homeassistant, sms, acp, google, web,
youtube. Same trim applied to [termux-all]. New regression test
asserts the contract: every extra in LAZY_DEPS must NOT also appear
in [all].
# Companion fix: surface uv progress + errors
setup-hermes.sh's hash-verified path swallowed uv's stderr to a
tempfile, identical to the install.sh bug fixed in PR #24504. Same
fix applied: stream stderr through directly so users see live
progress instead of staring at a frozen prompt.
# Files
- pyproject.toml: trim [all] and [termux-all] to non-lazy extras only.
- scripts/install.sh: --all-extras → --extra all; trim _ALL_EXTRAS /
_PYPI_EXTRAS to match.
- scripts/install.ps1: --all-extras → --extra all; trim $allExtras /
$pypiExtras to match.
- setup-hermes.sh: --all-extras → --extra all; stream stderr.
- tests/test_project_metadata.py: invert matrix-in-[all] assertion;
add lazy-coverage contract test.
- uv.lock: regenerated.
# Validation
5/5 metadata tests pass. 37/37 in update_autostash + tool_token_
estimation. `uv lock --check` passes. Empirical dry-run confirms
`--extra all` excludes python-olm + RL chain on the new lockfile.
* fix(install): parse [all] from pyproject.toml instead of mirroring it
ethernet's review point: the previous patch left two hand-mirrored
copies of [all]'s contents (in install.sh's $_ALL_EXTRAS and
install.ps1's $allExtras). That guarantees future drift the next
time pyproject.toml's [all] changes.
Now both scripts parse pyproject.toml at install time using stdlib
tomllib (Python 3.11+, which the bootstrap step already requires).
Single source of truth. The only purpose of the parsed list is to
build the 'Tier 2: [all] minus broken extras' fallback spec — so we
parse, filter against $brokenExtras, and rebuild the .[a,b,c] spec.
Also: removed redundant fallback tiers.
Before: Tier 1 [all]
Tier 2 [all] minus broken
Tier 3 PyPI-only extras (no git deps)
Tier 4 [web,mcp,cron,cli,messaging,dev]
Tier 5 .
After: Tier 1 [all]
Tier 2 [all] minus broken
Tier 3 .
Tier 3 (PyPI-only) and Tier 4 (dashboard+core) used to dodge the [rl]
git+sdist deps and the [matrix] python-olm build. Both are no longer
in [all] post-2026-05-12 lazy-install migration, so the carve-out
tiers had no remaining content. Tier 4 also referenced [messaging],
which is now lazy-installed — the hardcoded fallback was actually
inconsistent with the new policy.
Defensive fallback: if tomllib parse fails (corrupted pyproject,
unexpected schema), Tier 2 collapses to '.[all]' (same as Tier 1) so
the broken-extras path becomes a no-op rather than crashing.
* fix(gateway): hide Matrix from setup picker on Windows
Matrix is the one messaging platform that has no working install path
on Windows: [matrix] -> mautrix[encryption] -> python-olm, which has
Linux-only wheels and needs make + libolm to build from sdist. The
[all] cleanup in this PR keeps mautrix out of fresh installs, but a
user who picked Matrix in 'hermes setup gateway' would still walk
into the same sdist build failure when the wizard tried to install
the extra.
Hide the option at the picker so users never get the chance to try.
The gate lives in _all_platforms() — single source of truth for the
setup wizard, the curses gateway-config menu, and any future picker.
Adapter loading at runtime is intentionally NOT gated: users who
already have MATRIX_* env vars set (e.g. config copied from a Linux
install) keep working if they somehow have python-olm available.
This is the lowest-friction fix — picker visibility only.
Tests cover linux/darwin/win32 and verify other platforms aren't
collateral damage.
* feat(security): supply-chain advisory checker + lazy-install framework + tiered install fallback
Three coordinated mitigations for the Mini Shai-Hulud worm hitting
mistralai 2.4.6 on PyPI (2026-05-12) and for the next single-package
compromise that follows.
# What this PR makes true
1. Users with the poisoned mistralai 2.4.6 in their venv get a loud
detection banner with copy-pasteable remediation steps the moment
they run hermes (and on every gateway startup).
2. One quarantined / yanked PyPI package can no longer silently demote
a fresh install to 'core only' — the installer keeps every other
extra and tells the user which tier landed.
3. Future opt-in backends (Mistral, ElevenLabs, Honcho, etc.) can
lazy-install on first use under a strict allowlist, instead of
eagerly pulling everything at install time.
# Detection: hermes_cli/security_advisories.py
- ADVISORIES catalog (one entry currently: shai-hulud-2026-05 for
mistralai==2.4.6). Adding the next one is a single dataclass.
- detect_compromised() uses importlib.metadata.version() — no pip
dependency, works in uv venvs that lack pip.
- Banner cache (~/.hermes/cache/advisory_banner_seen) rate-limits
the startup banner to once per 24h per advisory.
- Acks persisted to security.acked_advisories in config.yaml; never
re-banner after ack.
- Wired into:
* hermes doctor — runs first, prints full remediation block
* hermes doctor --ack <id> — dismisses an advisory
* cli.py interactive run() and single-query branches — short
stderr banner pointing at hermes doctor
* gateway/run.py startup — operator-visible warning in gateway.log
# Lazy-install framework: tools/lazy_deps.py
- LAZY_DEPS allowlist maps namespaced feature keys (tts.elevenlabs,
memory.honcho, provider.bedrock, etc.) to pip specs.
- ensure(feature) installs missing deps in the active venv via the
uv → pip → ensurepip ladder (matches tools_config._pip_install).
- Strict spec safety regex rejects URLs, file paths, shell metas,
pip flag injection, control chars — only PyPI-by-name accepted.
- Gated on security.allow_lazy_installs (default true) plus the
HERMES_DISABLE_LAZY_INSTALLS env var for restricted/audited envs.
- Migrated three backends as proof of pattern:
* tools/tts_tool.py — _import_elevenlabs() calls ensure first
* plugins/memory/honcho/client.py — get_honcho_client lazy-installs
* tts.mistral / stt.mistral entries pre-registered for when PyPI
restores mistralai
# Installer fallback tiers
scripts/install.sh, scripts/install.ps1, setup-hermes.sh:
- Centralised _BROKEN_EXTRAS list (currently: mistral). Edit one
array when a transitive breaks; users keep every other extra.
- New 'all minus known-broken' tier between [all] and the existing
PyPI-only-extras tier. Only kicks in when [all] fails resolve.
- All three tiers explicit: every fallback announces which tier
landed and prints a re-run hint when not on Tier 1.
- install.ps1 and install.sh both regenerate their tier specs from
the same _BROKEN_EXTRAS array so updates stay in sync.
Side effect: install.ps1 Tier 2 spec previously hardcoded 'mistral'
in its extra list — bug fixed by the refactor (mistral is filtered
out).
# Config
hermes_cli/config.py — DEFAULT_CONFIG.security gains:
- acked_advisories: [] (advisory IDs the user has dismissed)
- allow_lazy_installs: True (security gate for ensure())
No config version bump needed — both keys nest under existing
security: block, and load_config's deep-merge picks up DEFAULT_CONFIG
defaults for users with older configs.
# Tests
tests/hermes_cli/test_security_advisories.py — 23 tests covering:
- detect_compromised matches/non-matches, wildcard frozenset
- ack persistence, idempotence, blank rejection, config-failure path
- banner cache rate limiting + 24h re-banner + ack-stops-banner
- short_banner_lines / full_remediation_text / render_doctor_section /
gateway_log_message
- shipped catalog well-formedness invariant
tests/tools/test_lazy_deps.py — 40 tests covering:
- spec safety: 11 safe parametrized + 18 unsafe parametrized
- allowlist: unknown-feature rejection, namespace.name shape,
every shipped spec passes the safety regex
- security gating: config flag, env var, default, fail-open
- ensure() happy/sad paths: already-satisfied, install success,
pip stderr surfaced on failure, install-succeeds-but-still-missing
- is_available, feature_install_command
Combined: 63 new tests, all passing under scripts/run_tests.sh.
# Validation
- scripts/run_tests.sh tests/hermes_cli/test_security_advisories.py
tests/tools/test_lazy_deps.py → 63/63 passing
- scripts/run_tests.sh tests/hermes_cli/test_doctor.py
tests/hermes_cli/test_doctor_command_install.py
tests/tools/test_tts_mistral.py tests/tools/test_transcription_tools.py
tests/tools/test_transcription_dotenv_fallback.py → 165/165 passing
- scripts/run_tests.sh tests/hermes_cli/ tests/tools/ →
9191 passed, 8 pre-existing failures (verified on origin/main
before this change)
- bash -n on install.sh and setup-hermes.sh → OK
- py_compile on all modified .py files → OK
- End-to-end smoke test of detect_compromised + render_doctor_section
+ gateway_log_message with mocked installed version → produces
copy-pasteable remediation output
# Community
Full advisory + remediation steps:
website/docs/community/security-advisories/shai-hulud-mistralai-2026-05.md
Short-form post drafts (Discord, GitHub pinned issue, README banner):
scripts/community-announcement-shai-hulud.md
Refs: PR #24205 (mistral disabled), Socket Security advisory
<https://socket.dev/blog/mini-shai-hulud-worm-pypi>
* build(deps): pin every direct dep to ==X.Y.Z (no ranges)
Companion to the supply-chain advisory work: replace every >=/</~= range
in pyproject.toml's [project.dependencies] and [project.optional-dependencies]
with an exact ==X.Y.Z pin sourced from uv.lock.
Why: ranges allow PyPI to ship a fresh version of any direct dep at any
time without a code review on our side. With ranges, the malicious
mistralai 2.4.6 release would have been pulled by every fresh
'pip install -e .[all]' for the hours between upload and PyPI's
quarantine — exactly the install window we got hit on. Exact pins close
that window: the only way a new package version reaches a user is via
an intentional update on our end.
What the user-facing change is: nothing, behavior-wise. Every package
resolves to the same version it was already resolving to via uv.lock —
the pins just remove the resolver's freedom to pick a different one.
Cost: any user installing Hermes alongside another package that requires
a newer pin gets a resolver conflict. Acceptable for our isolated-venv
install path; documented in the new comment block.
Build-system requires line (setuptools>=61.0) is intentionally left
as a range — pinning the build backend would block fresh pip from
bootstrapping the build on architectures where that exact wheel isn't
available.
mistral extra (mistralai==2.3.0) is pinned but stays out of [all]
(per PR #24205). 'uv lock' regeneration will fail until PyPI restores
mistralai; lockfile regeneration is gated behind that, NOT on every PR.
LAZY_DEPS in tools/lazy_deps.py also moved to exact pins so the lazy-
install pathway can never resolve a different version than the one
declared in pyproject.toml.
Validation:
- Cross-checked all 77 pinned direct deps in pyproject.toml against
uv.lock — every pin matches the resolved version exactly.
- Cross-checked all LAZY_DEPS specs against uv.lock — same.
- 'uv pip install -e .[all] --dry-run' resolves 205 packages cleanly.
- tests/tools/test_lazy_deps.py + tests/hermes_cli/test_security_advisories.py
→ 63/63 passing (every shipped spec passes the safety regex).
- Doctor + TTS + transcription targeted suite → 146/146 passing.
* build(deps): hash-verify transitives via uv.lock; remove unresolvable [mistral] extra
You asked: 'what about the dependencies the dependencies rely on?' —
correctly noting that exact-pinning direct deps in pyproject.toml does
NOT cover the transitive graph. `pip install` and `uv pip install` both
re-resolve transitives fresh from PyPI at install time, so a compromised
transitive (e.g. `httpcore` if it got worm-poisoned tomorrow) would
still hit our users even with every direct dep exact-pinned.
# What this commit fixes
1. **Both real installer scripts now prefer `uv sync --locked` as Tier 0.**
uv.lock records SHA256 hashes for every transitive — a compromised
package with a different hash gets REJECTED. Falls through to the
existing `uv pip install` cascade if the lockfile is missing or
stale, with a loud warning that the fallback path does NOT
hash-verify transitives. Previously only `setup-hermes.sh` (the dev
path) used the lockfile; `scripts/install.sh` and `scripts/install.ps1`
(the paths fresh users actually run) skipped it.
2. **Removed the `[mistral]` extra entirely.** The `mistralai` PyPI
project is fully quarantined right now — every version returns 404,
so any pin we wrote was unresolvable, which broke `uv lock --check`
in CI. Restoration is documented in pyproject.toml as a 5-step
checklist (verify, re-add extra, re-enable in 4 modules, regenerate
lock, optionally re-add to [all]).
3. **Regenerated uv.lock.** 262 packages, mistralai/eval-type-backport/
jsonpath-python pruned. `uv lock --check` now passes.
# Defense-in-depth view
| Layer | Where | Protects against |
|----------------------------|-------------------|-------------------------------------------|
| Exact pins in pyproject | direct deps | new mistralai 2.4.6-style direct compromise |
| uv.lock + `--locked` install | transitive graph | transitive worm injection |
| Tier-0 hash-verified path | install.sh / .ps1 | actually USE the lockfile in fresh installs |
| `uv lock --check` CI gate | every PR | drift between pyproject and lockfile |
| `hermes_cli/security_advisories.py` | runtime | cleanup for users who already got hit |
The exact pinning + hash verification together close the supply-chain
gap. Without the lockfile path, exact pins alone are theater.
# Validation
- `uv lock --check` → passes (262 packages resolved, no drift).
- `bash -n` on install.sh + setup-hermes.sh → OK.
- 209/209 tests passing across new + adjacent test files
(test_lazy_deps.py, test_security_advisories.py, test_doctor.py,
test_tts_mistral.py, test_transcription_tools.py).
- TOML parse OK.
* chore: remove community announcement drafts (PR body covers it)
* build(deps): lazy-install every opt-in backend (anthropic, search, terminal, platforms, dashboard)
Extends the lazy-install framework to cover everything that's not used by
every hermes session. Base install drops from ~60 packages to 45.
Moved out of core dependencies = []:
- anthropic (only when provider=anthropic native, not via aggregators)
- exa-py, firecrawl-py, parallel-web (search backends; only when picked)
- fal-client (image gen; only when picked)
- edge-tts (default TTS but still optional)
New extras in pyproject.toml: [anthropic] [exa] [firecrawl] [parallel-web]
[fal] [edge-tts]. All added to [all].
New LAZY_DEPS entries: provider.anthropic, search.{exa,firecrawl,parallel},
tts.edge, image.fal, memory.hindsight, platform.{telegram,discord,matrix},
terminal.{modal,daytona,vercel}, tool.dashboard.
Each import site now calls ensure() before importing the SDK. Where the
module had a top-level try/except (telegram, discord, fastapi), the
graceful-fallback pattern was extended to lazy-install on first
check_*_requirements() call and re-bind module globals.
Updated test_windows_native_support.py tzdata check from snapshot
(>=2023.3 literal) to invariant (any version + win32 marker).
Validation:
- Base install: 45 packages (was ~60); 6 newly-extracted packages absent
- uv lock --check: passes (262 packages, no drift)
- 209/209 lazy_deps + advisory + doctor + tts/transcription tests passing
- py_compile clean on all 12 modified modules
Commit 3dfb35700 accidentally saved scripts/install.ps1 with a UTF-8 BOM
(EF BB BF) at byte 0. PowerShell's normal file-execution path (`& .\install.ps1`)
handles BOMs fine, but the curl-and-iex one-liner documented in the README
uses `[scriptblock]::Create((irm ...))` which does NOT strip BOMs — the
BOM lands inside the param() block and fails with 'The assignment
expression is not valid' on $Branch and $HermesHome.
teknium1 hit this trying to reinstall from the PR branch after Brooklyn's
commits landed. Every user trying the PR branch install-one-liner hit
it too until we notice.
Saved without BOM, verified via xxd: file now starts with '# =====' at
byte 0 instead of EF BB BF.
## Two residual Windows fixes that were hanging from earlier commits.
### 1. `hermes gateway status` reported 2 PIDs per gateway — TWO bugs compounded
Diagnosed with psutil parent/child walk against live gateway PIDs:
**Bug A (the real one): `_get_parent_pid` silently failed on Windows.**
The helper shelled out to `ps -o ppid= -p <pid>`, which doesn't exist
on Windows — `FileNotFoundError` → returns `None` → the ancestor walk
terminated at `os.getpid()` alone. Consequence: the PID table scan in
`_scan_gateway_pids` couldn't filter out `hermes gateway status`'s own
launcher stub (a venv `pythonw.exe`/`python.exe` that matches the same
`-m hermes_cli.main gateway` pattern as the gateway). Every status
call saw "itself" as a second gateway.
Fix: `_get_parent_pid` now calls `psutil.Process(pid).ppid()` first
(psutil is a core dependency since 3dfb35700) and falls back to `ps`
only when `shutil.which("ps")` succeeds — matching the Windows-footgun
checker's "always guard `ps` / `wmic` / etc. with `shutil.which`" rule.
Before: `Gateway process running (PID: 21952, 46880)` — 46880 changing
on every call (the status invocation's own launcher, which died by the
time the next status call looked).
After (5 consecutive calls):
```
✓ Gateway process running (PID: 21952)
✓ Gateway process running (PID: 21952)
✓ Gateway process running (PID: 21952)
✓ Gateway process running (PID: 21952)
✓ Gateway process running (PID: 21952)
```
Ancestor walk on the fix: 14 PIDs (full chain through bash/explorer)
instead of the broken 1-PID set.
**Bug B (the cosmetic one): venv-launcher dedup.** Standard Windows
CPython venv behaviour is that `<venv>/Scripts/pythonw.exe` is a ~5 MB
launcher stub that spawns the base Python (`C:\\Program Files\\Python311
\\pythonw.exe`) with the same command line and waits. Our process
scanner sees two PIDs for every gateway: launcher + interpreter, same
cmdline. Bug A masked this by accidentally counting the status call
AS one of them; with Bug A fixed, we see both the real launcher and
real interpreter for the gateway process itself.
Fix: `_filter_venv_launcher_stubs` at the tail of `_scan_gateway_pids`
walks each matched PID's ppid via psutil. Any PID that's the PARENT
of another matched PID is a launcher stub — drop it, keep the child.
Scoped to Windows (`is_windows() and len(pids) > 1`) and no-ops when
psutil isn't importable.
Net effect: `gateway status` now reports one PID per gateway — the
interpreter — matching POSIX behaviour and user expectations.
### 2. `install.ps1`: bootstrap pip + auto-install platform SDKs
New `Install-PlatformSdks` function wired between `Invoke-SetupWizard`
and `Start-GatewayIfConfigured`. Fixes two related issues on fresh
Windows installs:
1. The tiered `uv pip install` cascade (introduced in 87fca8342)
correctly falls through when tier 1 `.[all]` fails on the RL git
deps, but the fallback tiers can silently skip SDKs from `[messaging]`
when there's a partial-resolve. Result: user sets `DISCORD_BOT_TOKEN`
in `.env`, fires up gateway, hits "discord module not installed".
2. `uv` creates venvs WITHOUT pip by default, so the user's escape
hatch (`pip install discord.py` in the venv) doesn't exist either.
The new function:
- Skips if `-NoVenv` (nothing to bootstrap into).
- Scans `~/.hermes/.env` for messaging tokens (TELEGRAM_BOT_TOKEN,
DISCORD_BOT_TOKEN, SLACK_BOT_TOKEN, SLACK_APP_TOKEN, WHATSAPP_ENABLED),
filtering placeholder values.
- For each token that's set, runs `python -c "import <sdk>"` to verify.
- If any import fails: runs `python -m ensurepip --upgrade` to bootstrap
pip into the venv (idempotent — no-ops if pip is already present),
then `pip install <spec>` for each missing SDK with specs mirroring
pyproject.toml's `[messaging]` extra to avoid version drift.
The `$ErrorActionPreference = "SilentlyContinue"` spans are not
cosmetic — PowerShell wraps native-stderr from a non-zero-exit
subprocess as a `NativeCommandError` that prints even through
`*> $null` / `2>$null`. Save + restore EAP over the import-probe
and pip-install blocks keeps the output clean.
Verified on this Windows 10 box:
- Initial state: telegram+fastapi+psutil present, discord+slack_sdk
missing (tier 1 `.[all]` had failed — `.tirith-install-failed`
marker in `%LOCALAPPDATA%\\hermes`).
- First run with discord+slack tokens in .env: detects both missing,
ensurepip (skipped — pip was already bootstrapped earlier this
session for telegram), installs `discord.py[voice]==2.7.1` +
`PyNaCl` + `davey`, installs `slack-sdk==3.41.0`. All imports
succeed on verify.
- Second run: all three SDKs report OK, function no-ops.
Pip spec strings mirror pyproject.toml's `[messaging]` extra verbatim
so a bump to the extra picks up here automatically — no drift.
### Files
- `hermes_cli/gateway.py`: `_get_parent_pid` rewritten (psutil-first);
`_filter_venv_launcher_stubs` added; `_scan_gateway_pids` dedups
launchers on Windows when it finds >1 match.
- `scripts/install.ps1`: new `Install-PlatformSdks` function (~85
lines); wired into the main flow at line 1438.
### Verification
- `venv/Scripts/python.exe scripts/check-windows-footguns.py --all`
→ `✓ No Windows footguns found (380 file(s) scanned).`
- `ast.parse` passes on gateway.py.
- `[System.Management.Automation.Language.Parser]::ParseFile` passes
on install.ps1.
- Live gateway (PID 21952, running since 12:33 today) survived 5x
stress loop of `hermes gateway status` without dying.
install.ps1 had three related problems that compounded into `hermes dashboard`
failing to boot on Windows with 'No module named fastapi':
1. UTF-8 BOM missing. Windows PowerShell 5.1 (the default on Windows 10/11,
which is what `irm | iex` runs under) reads files without a BOM as
cp1252. install.ps1 has em-dashes, arrows, check marks, etc. — PS 5.1
mangled them and the file failed to parse. Added UTF-8 BOM so PS 5.1,
PS 7, and the in-memory `irm | iex` path all read the file identically.
2. `uv pip install -e .[all]` had a single-tier silent fallback to bare
`.` on any failure, with `2>&1 | Out-Null` swallowing the error. Any
transient extras install failure (network hiccup, wheel build issue,
etc.) would drop every optional extra including [web], and the installer
would still print 'Main package installed'. Replaced with a four-tier
fallback (.[all] -> PyPI-only extras -> dashboard+core -> bare) that
prints output at every step and a targeted [web] verify+repair at the
end so `hermes dashboard` specifically is never silently broken.
3. tinker-atropos was installed unconditionally after the main install.
tinker-atropos/pyproject.toml pulls atroposlib and tinker from
git+https://github.com/... which can fail on locked-down networks,
flaky DNS, or rate-limited github.com and would half-install the venv.
install.sh already skipped it by default with a one-liner for users
who actually do RL training — install.ps1 now matches that behavior.
Parse-checked clean under Windows PowerShell 5.1.26100.8115
(5318 tokens, 0 parse errors).
scripts/install.sh runs 'npx playwright install --with-deps chromium'
on every Linux distro after the npm-install step, which is why browser
tools Just Work on Linux. scripts/install.ps1 never did the equivalent
step, so on native Windows installs check_browser_requirements() in
tools/browser_tool.py would return False (no Chromium under
%LOCALAPPDATA%\ms-playwright) and every browser_* tool got silently
filtered out of the agent's tool schema — no error, no log entry, user
just wondered why the tools didn't exist.
Two-part fix:
1. scripts/install.ps1: after 'npm install' in InstallDir succeeds, run
'npx playwright install chromium'. Resolves npx via the same
execution-policy-aware logic already used for npm (prefer npx.cmd
next to npmExe, fall back to Get-Command). Surfaces a warning +
manual-recovery hint when the install fails, matching install.sh
behaviour for distros.
2. hermes_cli/doctor.py: after the agent-browser check, lazily import
tools.browser_tool and reuse the exact same _chromium_installed()
predicate check_browser_requirements() uses, so the doctor signal
cannot drift from the runtime gate. Skip the check when Camofox /
CDP override / a cloud provider / Lightpanda is configured (those
bypass local Chromium). On missing Chromium, the hint is
platform-correct: '--with-deps' on POSIX, plain 'install chromium'
on win32.
Verified on Windows 10:
- 'npx playwright install chromium' completes successfully, drops
Chrome Headless Shell under %LOCALAPPDATA%\ms-playwright
- check_browser_requirements() flips from False -> True
- 'hermes doctor' now prints either '✓ Playwright Chromium (browser
engine)' or '⚠ Playwright Chromium not installed' + fix command
- tests/hermes_cli/test_doctor.py: 38/38 pass
- tests/tools/test_browser_chromium_check.py: 16/16 pass
Two fixes from teknium1's next install run:
1. **npm install: "npm.ps1 cannot be loaded because running scripts is
disabled on this system."** Get-Command's default PATHEXT ordering
picked up ``npm.ps1`` (the PowerShell shim) ahead of ``npm.cmd`` (the
batch shim). Most Windows users have PowerShell's execution policy
set to Restricted or RemoteSigned, which blocks unsigned ``.ps1``
files. ``npm.cmd`` has no such restriction and works universally.
Install-NodeDeps now detects when Get-Command returned npm.ps1, looks
for a sibling npm.cmd in the same directory, and prefers it. Prints
an info line so the user sees why. Emits a warning + hint if only
npm.ps1 is available.
2. **"Launch hermes chat now? Y" crashes with "%1 is not a valid Win32
application" on Windows installs.** The setup wizard calls
``relaunch(["chat"])``; ``resolve_hermes_bin()`` returned
``sys.argv[0]`` which was ``...\\hermes_cli\\main.py`` (because hermes
was launched via ``python -m hermes_cli.main`` during setup).
On Windows, ``os.access(script.py, os.X_OK)`` returns True because
PATHEXT lists ``.py`` when the Python launcher is registered — but
``subprocess.run([script.py, ...])`` can't actually execute a ``.py``
directly. CreateProcessW needs a real PE file.
Fixed ``resolve_hermes_bin`` to reject ``.py``/``.pyc`` argv0 values
on Windows specifically. Falls through to ``shutil.which("hermes")``
(hermes.exe in the venv Scripts dir) or, as a final fallback, lets
build_relaunch_argv build ``[sys.executable, "-m", "hermes_cli.main"]``
which is bulletproof. POSIX behaviour unchanged — ``.py`` argv0 with
a shebang + chmod+x is still a valid exec target there.
3 new tests cover the Windows paths: .py argv0 + hermes.exe on PATH →
returns hermes.exe; .py argv0 + no PATH → returns None (caller uses
python -m); POSIX + executable .py → still accepted.
26 relaunch tests pass, no POSIX regressions.
Three bugs from teknium1's successful install + diagnostic chat on Windows:
1. **Start-Process -FilePath npm.cmd fails with "%1 is not a valid Win32
application".** Start-Process bypasses cmd.exe and PATHEXT to call
CreateProcessW directly, which refuses .cmd batch shims. Switched
Install-NodeDeps to use PowerShell's invocation operator (``& $npmExe
install --silent *> $log``) which DOES honour PATHEXT. Extracted a
``_Run-NpmInstall`` helper so the browser + TUI paths share the same
logic. Captures $LASTEXITCODE correctly, still surfaces the real
stderr on failure with a log-file pointer for the full output.
2. **patch tool returns false-negative on Windows due to CRLF round-trip.**
Root cause was upstream of patch: ``subprocess.Popen(..., text=True,
stdin=PIPE)`` on Windows translates ``\\n`` → ``\\r\\n`` when data flows
through the stdin pipe. ``_pipe_stdin()`` was writing the patch's
new_content string through a text-mode pipe, bash then wrote those
CRLF bytes to disk, and patch's post-write verify compared the
on-disk CRLF bytes against the original LF-only string — fail.
Fixed in two places for defense in depth:
- ``_pipe_stdin()`` now writes through ``proc.stdin.buffer`` with
explicit UTF-8 encoding, bypassing Python's newline translation on
every platform. No behaviour change on POSIX (bytes are identical)
but stops the CRLF injection on Windows.
- ``patch_replace``'s post-write verify normalizes CRLF→LF on both
sides before comparing, so even if some future backend still
translates newlines the patch tool won't report a bogus failure.
3. **SOUL.md gets a UTF-8 BOM on Windows PowerShell 5.1.** ``Set-Content
-Encoding UTF8`` on PS5.1 writes UTF-8 WITH a byte-order-mark (changed
in PS7 via ``utf8NoBOM``). Hermes's prompt-injection scanner sees
the BOM (U+FEFF invisible char) and refuses to load the file, so
SOUL.md's persona instructions never get applied.
Fixed by writing the file via ``[System.IO.File]::WriteAllText``
with an explicit ``UTF8Encoding($false)`` — BOM-free on every
PowerShell version.
All POSIX behaviour verified unchanged: 198 tests pass across
test_file_operations, test_local_env_cwd_recovery, test_code_execution,
test_windows_native_support, test_windows_compat.
User hit 'fatal: not in a git directory' on re-install because:
1. They ran Remove-Item -Force $env:LOCALAPPDATA\hermes -ErrorAction
SilentlyContinue WHILE cd'd inside the install dir. Windows
silently refuses to delete a directory any shell is currently cd'd
inside and leaves the skeleton intact, but the -ErrorAction
SilentlyContinue swallowed every partial-delete failure so they
thought the wipe succeeded.
2. The installer then walked into Install-Repository, saw $InstallDir
still exists with a partial .git stub, my repo-validity probe
returned success (the probe's git rev-parse may have exit-code-zeroed
in a way I didn't expect), and the real git fetch died with three
'fatal: not a git repository' errors.
Two fixes belt-and-braces:
- Main() now cds to $env:USERPROFILE at start if the current shell
is inside $InstallDir. Harmless when the user ran from elsewhere;
critical when they didn't. This alone fixes the user's case.
- Install-Repository's 'is this a valid repo' probe now runs BOTH
git rev-parse --is-inside-work-tree AND git status, resets
$LASTEXITCODE before each to avoid picking up a stale 0, and
requires BOTH to succeed. Also requires rev-parse's output to
match 'true' (not just exit 0) to rule out exit-0-with-empty-output
edge cases.
teknium1 hit "fatal: not in a git directory" on re-install when the previous
install left a $InstallDir\.git stub that Test-Path matched but git didn't
recognize (three "fatal: not a git repository" lines, then the script
exited before touching anything).
Two bugs:
1. Test-Path "$InstallDir\.git" was a weak gate — it matches .git
whether it's a directory, file, symlink, submodule gitfile, OR a
broken stub from a failed previous Remove-Item. Replaced with a
real repo probe: Push-Location + git rev-parse --is-inside-work-tree
+ $LASTEXITCODE check. If git itself can't see a repo, we treat
the directory as not-a-repo and fall through to fresh clone.
2. The original update path ignored $LASTEXITCODE. fetch/checkout/pull
all emitted fatals but the script kept going. Now each command
checks $LASTEXITCODE and throws with an explicit message.
Also: when the directory exists but isn't a valid repo, the new code
wipes it (Remove-Item -ErrorAction Stop) and falls through to fresh
clone, instead of dying with the old "Directory exists but is not a git
repository" error. If the wipe itself fails (file locked, hermes still
running), we throw with a user-readable "close any programs using files
in <dir>" hint.
Refactored the function to use a $didUpdate flag instead of my earlier
draft's early `return` — that was skipping the submodule init block at
the bottom of the function. Both the update and fresh-clone paths now
fall through to the submodule init step, which is correct (git pull
doesn't auto-update submodules).
PowerShell structural check: 21 functions defined, braces balanced.
Three real bugs from teknium1's first Windows install run:
1. **MinGit has no bash.exe.** MinGit is the minimal-automation Git for Windows
distribution — it ships git.exe but deliberately strips bash and the POSIX
coreutils. Installer logged "Could not locate bash.exe" and Hermes would
fail to run any shell command. Switched to PortableGit — the full Git for
Windows minus the installer UI. PortableGit ships bash.exe at
<root>\bin\bash.exe plus sh, awk, sed, grep, curl, ssh in usr\bin\. ARM64
variant is detected separately (PortableGit-*-arm64.7z.exe). 32-bit falls
back to MinGit-32-bit with a warning (PortableGit is 64-bit only).
PortableGit ships as a 7z self-extractor (56MB vs MinGit's 38MB). We
invoke it with `-o<target> -y` to extract silently — no 7z install needed,
it's self-contained.
Updated tools/environments/local.py::_find_bash candidate order to prefer
the PortableGit layout (<root>\bin\bash.exe) with the MinGit layout
(<root>\usr\bin\bash.exe) as a fallback so existing installs keep working.
2. **os.execvp "Exec format error" on Windows.** Setup wizard's "Launch
hermes chat now? Y" called `os.execvp(["hermes", "chat"])` which on
Windows can only swap to real Win32 .exe files — chokes with OSError(8)
on .cmd batch shims and Python console-script wrappers. Added a
win32 branch in hermes_cli/relaunch.py::relaunch() that uses
subprocess.run + sys.exit — functionally identical (user sees "hermes
exited, then new hermes started") with one extra PID in play. POSIX
path is UNCHANGED — still uses os.execvp for in-place replacement.
Catches OSError in the Windows branch and surfaces a "open a new
terminal so PATH picks up, then re-run hermes" hint instead of a
cryptic traceback.
3. **npm install failures silent on Windows.** The install.ps1 was invoking
`npm install --silent 2>&1 | Out-Null` inside a try/catch. PowerShell's
try/catch does NOT trigger on non-zero process exit codes — only on
unhandled .NET exceptions — so npm failing printed a generic "npm
install failed" with zero information about WHY. The silent pipe ate
the stderr.
Rewrote Install-NodeDeps to:
- Resolve npm.cmd via Get-Command (respects PATHEXT) instead of
relying on bare `npm` name resolution.
- Use Start-Process with -PassThru to capture the actual exit code.
- Redirect stderr to a temp log and surface the first ~800 chars of
the real npm error when install fails, plus the log path for the
full text.
- Fail loudly with the right exit code instead of a misleading success.
- Bail cleanly with a helpful message when npm isn't on PATH at all.
4. **"True" printing to console after Node check.** `Test-Node` returns $true;
installer called it as a bare statement (no assignment, no cast). PowerShell
prints bare return values. Wrapped the call in `[void](Test-Node)`.
## Tests
- Added 3 new tests in tests/hermes_cli/test_relaunch.py covering the
Windows branch: subprocess is called (not execvp), child exit code
propagates, OSError surfaces a helpful message. All 23 tests pass
(20 existing + 3 new).
- 77 Windows-compat tests still pass, POSIX behaviour unchanged.
User hit a real failure case: their system Git was in a half-installed state
(can neither uninstall nor reinstall) and winget refused to work around it.
We were one step away from shipping an installer that would have left users
with exactly the problem he already had.
What other agents do (reality check):
- Claude Code: requires pre-installed Git; breaks if user doesn't have it.
- OpenCode, Codex: don't need bash at all — PowerShell-first design.
- Cline: uses whatever shell VSCode is configured with; installs nothing.
None of them solve the "broken system Git" problem. We need to own our Git.
Changes:
- scripts/install.ps1::Install-Git: dropped winget path entirely. Now:
(1) use existing git if present; (2) download portable MinGit from the
official git-for-windows GitHub release to %LOCALAPPDATA%\hermes\git.
No winget, no admin, no Windows installer registry, no system impact.
- Added %LOCALAPPDATA%\hermes\git\{cmd,usr\bin} to User PATH so git + bash
+ POSIX coreutils (which, env, grep, …) resolve in fresh shells.
- tools/environments/local.py::_find_bash: reorder so Hermes' portable
MinGit install is checked BEFORE falling through to shutil.which("bash")
or system install locations. This way a broken system Git can't
hijack the bash lookup.
- README + installation docs reworded to reflect the new story: "portable
Git Bash, isolated from any system install, recoverable via rm -rf if it
ever breaks."
Recoverability: if Hermes' Git install ever breaks, ``Remove-Item %LOCALAPPDATA%\hermes\git``
and re-run the installer — no system impact, no uninstall drama, no winget
to fight with.
Remove eager npm install of @whiskeysockets/baileys during
install.sh, install.ps1, and Docker build. The bridge deps are
already installed on-demand by `hermes whatsapp` (Step 4 checks
for node_modules and runs npm install if missing), so there is no
need to pay the cost at initial install for users who never use
WhatsApp.
Complete cleanup after dropping the mini-swe-agent submodule (PR #2804):
- Remove MSWEA_SILENT_STARTUP and MSWEA_GLOBAL_CONFIG_DIR env var
settings from cli.py, run_agent.py, hermes_cli/main.py, doctor.py
- Remove mini-swe-agent health check from hermes doctor
- Remove 'minisweagent' from logger suppression lists
- Remove litellm/typer/platformdirs from requirements.txt
- Remove mini-swe-agent install steps from install.ps1 (Windows)
- Remove mini-swe-agent install steps from website docs
- Update all stale comments/docstrings referencing mini-swe-agent
in terminal_tool.py, tools/__init__.py, code_execution_tool.py,
environments/README.md, environments/agent_loop.py
- Remove mini_swe_runner from pyproject.toml py-modules
(still exists as standalone script for RL training use)
- Shrink test_minisweagent_path.py to empty stub
The orphaned mini-swe-agent/ directory on disk needs manual removal:
rm -rf mini-swe-agent/
Git for Windows can completely fail to write files during clone due to
antivirus software, Windows Defender Controlled Folder Access, or NTFS
filter drivers. Even with windows.appendAtomically=false, the checkout
phase fails with 'unable to create file: Invalid argument'.
New install strategy (3 attempts):
1. git clone with -c windows.appendAtomically=false (SSH then HTTPS)
2. If clone fails: download GitHub ZIP archive, extract with
Expand-Archive (Windows native, no git file I/O), then git init
the result for future updates
3. All git commands now use -c flag to inject the atomic write fix
Also passes -c flag on update path (fetch/checkout/pull) and makes
submodule init failure non-fatal with a warning.
Move Windows install location from ~\.hermes (user profile root) to
%LOCALAPPDATA%\hermes (C:\Users\<user>\AppData\Local\hermes).
The user profile directory is prone to issues from OneDrive sync,
Windows Defender Controlled Folder Access, and NTFS filter drivers
that break git's atomic file operations. %LOCALAPPDATA% is the
standard Windows location for per-user app data (used by VS Code,
Discord, etc.) and avoids these issues.
Changes:
- Default HermesHome to $env:LOCALAPPDATA\hermes
- Set HERMES_HOME user env var so Python code finds the new location
- Auto-migrate existing ~\.hermes installations on first run
- Update completion message to show actual paths