feat(kanban): durable multi-profile collaboration board (#17805)
Salvage of PR #16100 onto current main (after emozilla's #17514 fix
that unblocks plugin Pydantic body validation). History preserved on
the standing `feat/kanban-standing` branch; this squashes the 22
iterative commits into one clean landing.
What this lands:
- SQLite kernel (hermes_cli/kanban_db.py) — durable task board with
tasks, task_links, task_runs, task_comments, task_events,
kanban_notify_subs tables. WAL mode, atomic claim via CAS,
tenant-namespaced, skills JSON array per task, max-runtime timeouts,
worker heartbeats, idempotency keys, circuit breaker on repeated
spawn failures, crash detection via /proc/<pid>/status, run history
preserved across attempts.
- Dispatcher — runs inside the gateway by default
(`kanban.dispatch_in_gateway: true`). Ticks every 60s, reclaims
stale claims, promotes ready tasks, spawns `hermes -p <assignee>
chat -q "work kanban task <id>"` with HERMES_KANBAN_TASK +
HERMES_KANBAN_WORKSPACE env. Auto-loads `--skills kanban-worker`
plus any per-task skills. Health telemetry warns on stuck ready
queue.
- Structured tool surface (tools/kanban_tools.py) — 7 tools
(kanban_show, kanban_complete, kanban_block, kanban_heartbeat,
kanban_comment, kanban_create, kanban_link). Gated on
HERMES_KANBAN_TASK via check_fn so zero schema footprint in normal
sessions.
- System-prompt guidance (agent/prompt_builder.py KANBAN_GUIDANCE)
injected only when kanban tools are active.
- Dashboard plugin (plugins/kanban/dashboard/) — Linear-style board
UI: triage/todo/ready/running/blocked/done columns, drag-drop,
inline create, task drawer with markdown, comments, run history,
dependency editor, bulk ops, lanes-by-profile grouping, WS-driven
live refresh. Matches active dashboard theme via CSS variables.
- CLI — `hermes kanban init|create|list|show|assign|link|unlink|
claim|comment|complete|block|unblock|archive|tail|dispatch|context|
init|gc|watch|stats|notify|log|heartbeat|runs|assignees` +
`/kanban` slash in-session.
- Worker + orchestrator skills (skills/devops/kanban-worker +
kanban-orchestrator) — pattern library for good summary/metadata
shapes, retry diagnostics, block-reason examples, fan-out patterns.
- Per-task force-loaded skills — `--skill <name>` (repeatable),
stored as JSON, threaded through to dispatcher argv as one
`--skills X` pair per skill alongside the built-in kanban-worker.
Dashboard + CLI + tool parity.
- Deprecation of standalone `hermes kanban daemon` — stub exits 2
with migration guidance; `--force` escape hatch for headless hosts.
- Docs (website/docs/user-guide/features/kanban.md + kanban-tutorial.md)
with 11 dashboard screenshots walking through four user stories
(Solo Dev, Fleet Farming, Role Pipeline, Circuit Breaker).
- Tests (251 passing): kernel schema + migration + CAS atomicity,
dispatcher logic, circuit breaker, crash detection, max-runtime
timeouts, claim lifecycle, tenant isolation, idempotency keys, per-
task skills round-trip + validation + dispatcher argv, tool surface
(7 tools × round-trip + error paths), dashboard REST (CRUD + bulk
+ links + warnings), gateway-embedded dispatcher (config gate, env
override, graceful shutdown), CLI deprecation stub, migration from
legacy schemas.
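
The "atomic claim via CAS" item above can be illustrated with a minimal sketch. The table layout, the column names (`status`, `claim_lock`), and the `try_claim` helper are hypothetical and are not taken from `hermes_cli/kanban_db.py`; the sketch only shows how a conditional `UPDATE` inside a `BEGIN IMMEDIATE` transaction lets exactly one worker win a task.

```python
import sqlite3


def try_claim(conn: sqlite3.Connection, task_id: int, worker: str) -> bool:
    """Attempt to claim a ready task; return True only for the winner.

    Hypothetical schema: tasks(id, status, claim_lock).
    """
    conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
    try:
        cur = conn.execute(
            "UPDATE tasks SET status = 'running', claim_lock = ? "
            "WHERE id = ? AND status = 'ready' AND claim_lock IS NULL",
            (worker, task_id),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    # rowcount is 1 only if the compared state (ready, unclaimed) still held.
    return cur.rowcount == 1


# isolation_level=None puts sqlite3 in autocommit mode, so the explicit
# BEGIN/COMMIT statements above are the only transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE tasks (id INTEGER PRIMARY KEY, status TEXT, claim_lock TEXT)"
)
conn.execute("INSERT INTO tasks (id, status) VALUES (1, 'ready')")

first = try_claim(conn, 1, "worker-a")
second = try_claim(conn, 1, "worker-b")
```

The second caller sees zero affected rows because the `WHERE` clause no longer matches, so no separate read-then-write step is needed.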
Gateway integration:
- GatewayRunner._kanban_dispatcher_watcher — new asyncio background
task, symmetric with _kanban_notifier_watcher. Runs dispatch_once
via asyncio.to_thread so SQLite WAL never blocks the loop. Sleeps
in 1s slices for snappy shutdown. Respects HERMES_KANBAN_DISPATCH_IN_GATEWAY=0
env override for debugging.
- Config: new `kanban` section in DEFAULT_CONFIG with
`dispatch_in_gateway: true` (default) + `dispatch_interval_seconds: 60`.
Additive — no _config_version bump needed.
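
A rough sketch of the watcher pattern described above, assuming Python 3.9+ for `asyncio.to_thread`. The `dispatch_once` body, the `dispatcher_watcher` name, and the `stop` event are stand-ins for illustration and are not the actual `GatewayRunner` code; the sketch shows a blocking tick moved off the event loop and an interval sleep cut into short slices so shutdown is noticed quickly.

```python
import asyncio
import time


def dispatch_once() -> None:
    """Stand-in for the blocking dispatcher tick (hypothetical body)."""
    time.sleep(0.01)


async def dispatcher_watcher(
    stop: asyncio.Event, interval_s: float, slice_s: float = 1.0
) -> int:
    """Run dispatch_once in a thread every interval_s until stop is set."""
    ticks = 0
    while not stop.is_set():
        # Blocking SQLite work runs in a worker thread, not on the loop.
        await asyncio.to_thread(dispatch_once)
        ticks += 1
        # Sleep in short slices so a shutdown request is seen promptly.
        remaining = interval_s
        while remaining > 0 and not stop.is_set():
            step = min(slice_s, remaining)
            await asyncio.sleep(step)
            remaining -= step
    return ticks


async def demo() -> int:
    stop = asyncio.Event()
    task = asyncio.create_task(
        dispatcher_watcher(stop, interval_s=60.0, slice_s=0.05)
    )
    await asyncio.sleep(0.2)  # let the first tick run
    stop.set()                # request shutdown in the middle of the interval
    return await asyncio.wait_for(task, timeout=2.0)


ticks = asyncio.run(demo())
```

Even with a 60 second interval, the watcher returns within one slice of the stop request instead of waiting out the full sleep.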
Forward-compat:
- workflow_template_id / current_step_key columns on tasks (v1 writes
NULL; v2 will use them for routing).
- task_runs holds claim machinery (claim_lock, claim_expires,
worker_pid, last_heartbeat_at) so multi-attempt history is first-
class from day one.
Closes #16102.
Co-authored-by: emozilla <emozilla@nousresearch.com>
2026-04-30 13:36:47 -07:00
"""Kanban dashboard plugin — backend API routes.

Mounted at /api/plugins/kanban/ by the dashboard plugin system.

This layer is intentionally thin: every handler is a small wrapper around
``hermes_cli.kanban_db`` or a direct SQL query. Writes use the same code
paths the CLI and gateway ``/kanban`` command use, so the three surfaces
cannot drift.

Live updates arrive via the ``/events`` WebSocket, which tails the
append-only ``task_events`` table on a short poll interval (WAL mode lets
reads run alongside the dispatcher's IMMEDIATE write transactions).

Security note
-------------
The dashboard's HTTP auth middleware (``web_server.auth_middleware``)
explicitly skips ``/api/plugins/`` — plugin routes are unauthenticated by
design because the dashboard binds to localhost by default. For the
WebSocket we still require the session token as a ``?token=`` query
parameter (browsers cannot set the ``Authorization`` header on an upgrade
request), matching the established pattern used by the in-browser PTY
bridge in ``hermes_cli/web_server.py``. If you run the dashboard with
``--host 0.0.0.0``, every plugin route — kanban included — becomes
reachable from the network. Don't do that on a shared host.
"""

from __future__ import annotations

import asyncio
import hmac
import json
import logging
import sqlite3
import time
from dataclasses import asdict
from typing import Any, Optional

from fastapi import APIRouter, HTTPException, Query, WebSocket, WebSocketDisconnect, status as http_status
from pydantic import BaseModel, Field

from hermes_cli import kanban_db

log = logging.getLogger(__name__)

router = APIRouter()


# ---------------------------------------------------------------------------
# Auth helper — WebSocket only (HTTP routes live behind the dashboard's
# existing plugin-bypass; this is documented above).
# ---------------------------------------------------------------------------

def _check_ws_token(provided: Optional[str]) -> bool:
    """Constant-time compare against the dashboard session token.

    Imported lazily so the plugin still loads in test contexts where the
    dashboard web_server module isn't importable (e.g. the bare-FastAPI
    test harness).
    """
    if not provided:
        return False
    try:
        from hermes_cli import web_server as _ws
    except Exception:
        # No dashboard context (tests). Accept so the tail loop is still
        # testable; in production the dashboard module always imports
        # cleanly because it's the caller.
        return True
    expected = getattr(_ws, "_SESSION_TOKEN", None)
    if not expected:
        return True
    return hmac.compare_digest(str(provided), str(expected))

feat(kanban): multi-project boards — one install, many kanbans (#19653)
Adds first-class board support to kanban so users can separate unrelated
streams of work (projects, repos, domains) into isolated queues. Single-
project users stay on the 'default' board and see no UI change.
Isolation model
---------------
- Each board is a directory at `~/.hermes/kanban/boards/<slug>/` with
its own `kanban.db`, `workspaces/`, and `logs/`. The 'default' board
keeps its legacy path (`~/.hermes/kanban.db`) for back-compat — fresh
installs and pre-boards users get zero migration.
- Workers spawned by the dispatcher have `HERMES_KANBAN_BOARD` pinned in
their env alongside the existing `HERMES_KANBAN_DB` /
`HERMES_KANBAN_WORKSPACES_ROOT` pins, so workers physically cannot see
other boards' tasks.
- The gateway's single dispatcher loop now sweeps every board per tick;
per-tick cost is a few extra filesystem stats.
- CAS concurrency guarantees are preserved per-board (each board is its
own SQLite DB, same WAL+IMMEDIATE machinery as before).
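
The isolation model above can be sketched as a small path resolver. The helper names (`board_db_path`, `worker_env`) and the explicit `home` argument are assumptions made for this example, not the functions in `hermes_cli/kanban_db.py`; only the directory layout and the environment variable names come from the description above.

```python
from pathlib import Path

DEFAULT_BOARD = "default"


def board_db_path(slug: str, home: Path) -> Path:
    """Return the SQLite path for a board under a Hermes home directory."""
    if slug == DEFAULT_BOARD:
        # The default board keeps its legacy location for back-compat.
        return home / "kanban.db"
    return home / "kanban" / "boards" / slug / "kanban.db"


def worker_env(slug: str, home: Path) -> dict[str, str]:
    """Environment pins a spawned worker might receive for one board."""
    return {
        "HERMES_KANBAN_BOARD": slug,
        "HERMES_KANBAN_DB": str(board_db_path(slug, home)),
    }


home = Path("/home/user/.hermes")
default_db = board_db_path("default", home)
named_db = board_db_path("atm10-server", home)
env = worker_env("atm10-server", home)
```

Because each named board resolves to its own database file, a worker pinned to one board's `HERMES_KANBAN_DB` has no handle on another board's tasks.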
CLI
---
hermes kanban boards list|create|switch|show|rename|rm
hermes kanban --board <slug> <any-subcommand>
Board resolution order: `--board` flag → `HERMES_KANBAN_BOARD` env →
`~/.hermes/kanban/current` file → `default`. Slug validation is strict:
lowercase alphanumerics + hyphens + underscores, 1-64 chars, starts with
alphanumeric. Uppercase is auto-downcased; slashes / dots / `..` /
control chars are rejected so boards can't name their way out of the
boards/ directory.
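
A minimal sketch of the slug rules and the resolution order described above. The function names, the regular expression, and the `current_file` argument (standing in for the already-read contents of `~/.hermes/kanban/current`) are assumptions for illustration; the real validator in `kanban_db` may differ in details such as whitespace handling.

```python
import re
from typing import Mapping, Optional

# 1-64 chars, lowercase alphanumerics plus '-' and '_', alphanumeric first.
_SLUG_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,63}")


def normalize_slug(raw: str) -> str:
    """Downcase a slug and reject anything outside the allowed alphabet."""
    slug = raw.lower()
    if not _SLUG_RE.fullmatch(slug):
        # Slashes, dots, and control characters all fail the match, so a
        # slug can never point outside the boards/ directory.
        raise ValueError(f"invalid board slug: {raw!r}")
    return slug


def resolve_board(
    flag: Optional[str],
    env: Mapping[str, str],
    current_file: Optional[str],
) -> str:
    """Pick the active board: flag, then env var, then file, then default."""
    for candidate in (flag, env.get("HERMES_KANBAN_BOARD"), current_file):
        if candidate:
            return normalize_slug(candidate)
    return "default"


from_flag = resolve_board("Hermes-Agent", {"HERMES_KANBAN_BOARD": "foo"}, "bar")
from_env = resolve_board(None, {"HERMES_KANBAN_BOARD": "foo"}, "bar")
from_file = resolve_board(None, {}, "bar")
fallback = resolve_board(None, {}, None)
```

Using `fullmatch` rather than `match` with a `$` anchor matters here: a trailing newline in the input is rejected instead of being silently accepted.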
Passive discoverability: when more than one board exists, `hermes kanban
list` prints a one-line header ("Board: foo (2 other boards …)") so
users who stumble across multi-project never have to hunt for the
feature. Invisible for single-board installs.
Dashboard
---------
- New `BoardSwitcher` component at the top of the Kanban tab: dropdown
with all boards + task counts, `+ New board` button, `Archive`
button (non-default only). Hidden entirely when only `default` exists
and is empty — single-project users never see it.
- New `NewBoardDialog` modal: slug / display name / description / icon
+ "switch to this board after creating" checkbox.
- Selected board persists to `localStorage` so browser users don't
shift the CLI's active board out from under a terminal they left open.
- New `?board=<slug>` query param on every existing endpoint plus a
new `/boards` CRUD surface (`GET /boards`, `POST /boards`,
`PATCH /boards/<slug>`, `DELETE /boards/<slug>`,
`POST /boards/<slug>/switch`).
- Events WebSocket is pinned to a board at connection time; switching
opens a fresh WS against the new board.
Also fixes a pre-existing bug in the plugin's tenant / assignee
filters: the SDK's `Select` uses `onValueChange(value)`, not
native `onChange(event)`, so those filters silently didn't work.
New `selectChangeHandler` helper wires both signatures.
Tests
-----
49 new tests in `tests/hermes_cli/test_kanban_boards.py` covering:
slug validation (valid / invalid / auto-downcase), path resolution
(default = legacy path, named = `boards/<slug>/`, env var override),
current-board resolution chain (env > file > default), board CRUD +
archive / hard-delete, per-board connection isolation (tasks don't
leak), worker spawn env injection (`HERMES_KANBAN_BOARD`,
`HERMES_KANBAN_DB`, `HERMES_KANBAN_WORKSPACES_ROOT` all point at the
right board), and end-to-end CLI surface.
Regression surface: all 264 pre-existing kanban tests continue to pass.
Live-tested via the dashboard: created 3 boards (default,
hermes-agent, atm10-server), created tasks on each via both CLI
(`--board <slug> create`) and dashboard (inline create on the Ready
column), confirmed zero cross-board leakage, confirmed `BoardSwitcher`
+ `NewBoardDialog` work end-to-end in the browser.
2026-05-04 04:42:38 -07:00
def _resolve_board(board: Optional[str]) -> Optional[str]:
    """Validate and normalise a board slug from a query param.

    Raises :class:`HTTPException` 400 on malformed slugs so the browser
    sees a clean error instead of a 500. Returns the normalised slug,
    or ``None`` when the caller omitted the param (which then falls
    through to the active board inside ``kb.connect()``).
    """
    if board is None or board == "":
        return None
    try:
        normed = kanban_db._normalize_board_slug(board)
    except ValueError as exc:
        raise HTTPException(status_code=400, detail=str(exc))
    if normed and normed != kanban_db.DEFAULT_BOARD and not kanban_db.board_exists(normed):
        raise HTTPException(
            status_code=404,
            detail=f"board {normed!r} does not exist",
        )
    return normed


def _conn(board: Optional[str] = None):
    """Open a kanban_db connection, creating the schema on first use.

    Every handler that mutates the DB goes through this so the plugin
    self-heals on a fresh install (no user-visible "no such table"
    error if somebody hits POST /tasks before GET /board).
    ``init_db`` is idempotent.

    ``board`` is the query-param slug (already normalised by
    :func:`_resolve_board`). When ``None`` the active board is used
    via the resolution chain (env var → ``current`` file → ``default``).
    """
    try:
        kanban_db.init_db(board=board)
|
feat(kanban): durable multi-profile collaboration board (#17805)
Salvage of PR #16100 onto current main (after emozilla's #17514 fix
that unblocks plugin Pydantic body validation). History preserved on
the standing `feat/kanban-standing` branch; this squashes the 22
iterative commits into one clean landing.
What this lands:
- SQLite kernel (hermes_cli/kanban_db.py) — durable task board with
tasks, task_links, task_runs, task_comments, task_events,
kanban_notify_subs tables. WAL mode, atomic claim via CAS,
tenant-namespaced, skills JSON array per task, max-runtime timeouts,
worker heartbeats, idempotency keys, circuit breaker on repeated
spawn failures, crash detection via /proc/<pid>/status, run history
preserved across attempts.
- Dispatcher — runs inside the gateway by default
(`kanban.dispatch_in_gateway: true`). Ticks every 60s, reclaims
stale claims, promotes ready tasks, spawns `hermes -p <assignee>
chat -q "work kanban task <id>"` with HERMES_KANBAN_TASK +
HERMES_KANBAN_WORKSPACE env. Auto-loads `--skills kanban-worker`
plus any per-task skills. Health telemetry warns on stuck ready
queue.
- Structured tool surface (tools/kanban_tools.py) — 7 tools
(kanban_show, kanban_complete, kanban_block, kanban_heartbeat,
kanban_comment, kanban_create, kanban_link). Gated on
HERMES_KANBAN_TASK via check_fn so zero schema footprint in normal
sessions.
- System-prompt guidance (agent/prompt_builder.py KANBAN_GUIDANCE)
injected only when kanban tools are active.
- Dashboard plugin (plugins/kanban/dashboard/) — Linear-style board
UI: triage/todo/ready/running/blocked/done columns, drag-drop,
inline create, task drawer with markdown, comments, run history,
dependency editor, bulk ops, lanes-by-profile grouping, WS-driven
live refresh. Matches active dashboard theme via CSS variables.
- CLI — `hermes kanban init|create|list|show|assign|link|unlink|
claim|comment|complete|block|unblock|archive|tail|dispatch|context|
init|gc|watch|stats|notify|log|heartbeat|runs|assignees` +
`/kanban` slash in-session.
- Worker + orchestrator skills (skills/devops/kanban-worker +
kanban-orchestrator) — pattern library for good summary/metadata
shapes, retry diagnostics, block-reason examples, fan-out patterns.
- Per-task force-loaded skills — `--skill <name>` (repeatable),
stored as JSON, threaded through to dispatcher argv as one
`--skills X` pair per skill alongside the built-in kanban-worker.
Dashboard + CLI + tool parity.
- Deprecation of standalone `hermes kanban daemon` — stub exits 2
with migration guidance; `--force` escape hatch for headless hosts.
- Docs (website/docs/user-guide/features/kanban.md + kanban-tutorial.md)
with 11 dashboard screenshots walking through four user stories
(Solo Dev, Fleet Farming, Role Pipeline, Circuit Breaker).
- Tests (251 passing): kernel schema + migration + CAS atomicity,
dispatcher logic, circuit breaker, crash detection, max-runtime
timeouts, claim lifecycle, tenant isolation, idempotency keys, per-
task skills round-trip + validation + dispatcher argv, tool surface
(7 tools × round-trip + error paths), dashboard REST (CRUD + bulk
+ links + warnings), gateway-embedded dispatcher (config gate, env
override, graceful shutdown), CLI deprecation stub, migration from
legacy schemas.
Gateway integration:
- GatewayRunner._kanban_dispatcher_watcher — new asyncio background
task, symmetric with _kanban_notifier_watcher. Runs dispatch_once
via asyncio.to_thread so SQLite WAL never blocks the loop. Sleeps
in 1s slices for snappy shutdown. Respects HERMES_KANBAN_DISPATCH_IN_GATEWAY=0
env override for debugging.
- Config: new `kanban` section in DEFAULT_CONFIG with
`dispatch_in_gateway: true` (default) + `dispatch_interval_seconds: 60`.
Additive — no _config_version bump needed.
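A minimal sketch of the watcher loop described above, assuming a synchronous `dispatch_once` callable and an `asyncio.Event` for shutdown. The names `dispatcher_watcher` and `dispatch_enabled` are hypothetical; the real `GatewayRunner` method may be structured differently.

```python
import asyncio
from typing import Callable, Mapping


def dispatch_enabled(config: Mapping, env: Mapping[str, str]) -> bool:
    """Config gate with the env override taking precedence (assumed semantics)."""
    if env.get("HERMES_KANBAN_DISPATCH_IN_GATEWAY") == "0":
        return False
    return bool(config.get("kanban", {}).get("dispatch_in_gateway", True))


async def dispatcher_watcher(
    dispatch_once: Callable[[], None],
    stop: asyncio.Event,
    interval_seconds: int = 60,
) -> None:
    """Call dispatch_once every interval_seconds until stop is set."""
    while not stop.is_set():
        # Blocking SQLite work runs in a worker thread, off the event loop.
        await asyncio.to_thread(dispatch_once)
        # Sleep in 1s slices so a shutdown request is noticed within ~1s.
        for _ in range(interval_seconds):
            if stop.is_set():
                return
            await asyncio.sleep(1)
```

Slicing the sleep keeps shutdown latency near one second without needing task cancellation.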
Forward-compat:
- workflow_template_id / current_step_key columns on tasks (v1 writes
NULL; v2 will use them for routing).
- task_runs holds claim machinery (claim_lock, claim_expires,
worker_pid, last_heartbeat_at) so multi-attempt history is first-
class from day one.
Closes #16102.
Co-authored-by: emozilla <emozilla@nousresearch.com>
2026-04-30 13:36:47 -07:00
    except Exception as exc:
        log.warning("kanban init_db failed: %s", exc)
feat(kanban): multi-project boards — one install, many kanbans (#19653)
Adds first-class board support to kanban so users can separate unrelated
streams of work (projects, repos, domains) into isolated queues. Single-
project users stay on the 'default' board and see no UI change.
Isolation model
---------------
- Each board is a directory at `~/.hermes/kanban/boards/<slug>/` with
its own `kanban.db`, `workspaces/`, and `logs/`. The 'default' board
keeps its legacy path (`~/.hermes/kanban.db`) for back-compat — fresh
installs and pre-boards users get zero migration.
- Workers spawned by the dispatcher have `HERMES_KANBAN_BOARD` pinned in
their env alongside the existing `HERMES_KANBAN_DB` /
`HERMES_KANBAN_WORKSPACES_ROOT` pins, so workers physically cannot see
other boards' tasks.
- The gateway's single dispatcher loop now sweeps every board per tick;
per-tick cost is a few extra filesystem stats.
- CAS concurrency guarantees are preserved per-board (each board is its
own SQLite DB, same WAL+IMMEDIATE machinery as before).
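The path layout in the isolation model above can be sketched as follows. `board_db_path` is a hypothetical name and `hermes_home` stands in for `~/.hermes`; only the two path shapes are taken from this commit message.

```python
from pathlib import Path


def board_db_path(slug: str, hermes_home: Path) -> Path:
    """Return the SQLite file for a board (illustrative helper)."""
    if slug == "default":
        # The default board keeps its legacy location for back-compat.
        return hermes_home / "kanban.db"
    # Named boards each get their own directory under kanban/boards/.
    return hermes_home / "kanban" / "boards" / slug / "kanban.db"
```

Because each board resolves to a different file, a worker pinned to one path never opens another board's database.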
CLI
---
hermes kanban boards list|create|switch|show|rename|rm
hermes kanban --board <slug> <any-subcommand>
Board resolution order: `--board` flag → `HERMES_KANBAN_BOARD` env →
`~/.hermes/kanban/current` file → `default`. Slug validation is strict:
lowercase alphanumerics + hyphens + underscores, 1-64 chars, starts with
alphanumeric. Uppercase is auto-downcased; slashes / dots / `..` /
control chars are rejected so boards can't name their way out of the
boards/ directory.
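The slug rules and resolution chain above might look roughly like this. It is a sketch under assumptions: `normalize_slug` and `resolve_board` are hypothetical names, the regex is one reading of the stated rules, and whitespace stripping of the `current` file contents is a guess.

```python
import re
from typing import Mapping, Optional

# Starts with an alphanumeric, then lowercase alphanumerics / hyphens /
# underscores, 1-64 characters in total.
_SLUG_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,63}")


def normalize_slug(raw: str) -> str:
    """Downcase and validate a board slug; reject anything path-like."""
    slug = raw.lower()
    if not _SLUG_RE.fullmatch(slug):
        raise ValueError(f"invalid board slug: {raw!r}")
    return slug


def resolve_board(
    flag: Optional[str],
    env: Mapping[str, str],
    current_file_text: Optional[str],
) -> str:
    """Apply the order: --board flag, env var, current file, then 'default'."""
    for candidate in (flag, env.get("HERMES_KANBAN_BOARD"), current_file_text):
        if candidate and candidate.strip():
            return normalize_slug(candidate.strip())
    return "default"
```

Using `fullmatch` means slashes, dots, and control characters fail validation without needing an explicit deny-list.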
Passive discoverability: when more than one board exists, `hermes kanban
list` prints a one-line header ("Board: foo (2 other boards …)") so
users who stumble across multi-project never have to hunt for the
feature. Invisible for single-board installs.
Dashboard
---------
- New `BoardSwitcher` component at the top of the Kanban tab: dropdown
with all boards + task counts, `+ New board` button, `Archive`
button (non-default only). Hidden entirely when only `default` exists
and is empty — single-project users never see it.
- New `NewBoardDialog` modal: slug / display name / description / icon
+ "switch to this board after creating" checkbox.
- Selected board persists to `localStorage` so browser users don't
shift the CLI's active board out from under a terminal they left open.
- New `?board=<slug>` query param on every existing endpoint plus a
new `/boards` CRUD surface (`GET /boards`, `POST /boards`,
`PATCH /boards/<slug>`, `DELETE /boards/<slug>`,
`POST /boards/<slug>/switch`).
- Events WebSocket is pinned to a board at connection time; switching
opens a fresh WS against the new board.
Also fixes a pre-existing bug in the plugin's tenant / assignee
filters: the SDK's `Select` uses `onValueChange(value)`, not
native `onChange(event)`, so those filters silently didn't work.
New `selectChangeHandler` helper wires both signatures.
Tests
-----
49 new tests in `tests/hermes_cli/test_kanban_boards.py` covering:
slug validation (valid / invalid / auto-downcase), path resolution
(default = legacy path, named = `boards/<slug>/`, env var override),
current-board resolution chain (env > file > default), board CRUD +
archive / hard-delete, per-board connection isolation (tasks don't
leak), worker spawn env injection (`HERMES_KANBAN_BOARD`,
`HERMES_KANBAN_DB`, `HERMES_KANBAN_WORKSPACES_ROOT` all point at the
right board), and end-to-end CLI surface.
Regression surface: all 264 pre-existing kanban tests continue to pass.
Live-tested via the dashboard: created 3 boards (default,
hermes-agent, atm10-server), created tasks on each via both CLI
(`--board <slug> create`) and dashboard (inline create on the Ready
column), confirmed zero cross-board leakage, confirmed `BoardSwitcher`
+ `NewBoardDialog` work end-to-end in the browser.
2026-05-04 04:42:38 -07:00
    return kanban_db.connect(board=board)
# ---------------------------------------------------------------------------
# Serialization helpers
# ---------------------------------------------------------------------------

# Columns shown by the dashboard, in left-to-right order. "archived" is
# available via a filter toggle rather than a visible column.
BOARD_COLUMNS: list[str] = [
    "triage", "todo", "ready", "running", "blocked", "done",
]
feat(kanban): surface task_runs.summary on dashboard cards + ``kanban show``
The kanban-worker skill (built into the gateway dispatcher's spawn
prompt) instructs every worker to hand off via
``kanban_complete(summary=..., metadata=...)``. That writes the summary
onto the closing ``task_runs`` row, NOT onto ``tasks.result`` — the
latter is left NULL unless the caller passes ``result=`` explicitly.
Result: a glance at the dashboard or ``hermes kanban show <id>`` shows
a blank "Result:" section even when the worker did real work, which
on 2026-05-05 caused a Mac false-alarm ("Hermes did nothing") on a
task that had a 10-line completion summary on its run.
This patch surfaces the latest non-null run summary as
``latest_summary`` so the worker's actual handoff lands in front of
operators.
* New helpers ``kanban_db.latest_summary(conn, task_id)`` and
``kanban_db.latest_summaries(conn, task_ids)``. The batch variant
uses a single window-function SELECT so the dashboard board endpoint
doesn't pay an N+1 cost on multi-hundred-task boards.
* CLI ``hermes kanban show <id>`` prints a "Latest summary:" block
when ``tasks.result`` is empty but a run has produced a summary
(the existing "Result:" section still wins when populated, so the
back-compat path for hand-edited results is untouched). JSON output
gains a top-level ``latest_summary`` field.
* Dashboard ``/board`` and ``/tasks/{id}`` now include a
``latest_summary`` field on every task. Cards on /board carry a
200-character preview (cheap to render, plenty for "what did this
worker do?" at a glance); the drawer/detail endpoint returns the
full summary.
* Five new tests cover: empty-runs case, post-complete surface,
newest-of-multiple selection, empty-string skip, batch with
missing tasks + empty input.
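To illustrate the batch helper described above, here is a sketch of a single window-function query. The `task_runs(id, task_id, summary)` columns and the "highest id is newest" ordering are assumptions, not the actual schema, and `card_preview` is a hypothetical name for the 200-character truncation.

```python
import sqlite3
from typing import Iterable, Optional


def latest_summaries(conn: sqlite3.Connection, task_ids: Iterable[str]) -> dict[str, str]:
    """Map task_id -> newest non-empty run summary in one SELECT."""
    ids = list(task_ids)
    if not ids:
        return {}
    placeholders = ",".join("?" for _ in ids)
    sql = f"""
        SELECT task_id, summary FROM (
            SELECT task_id, summary,
                   ROW_NUMBER() OVER (
                       PARTITION BY task_id ORDER BY id DESC
                   ) AS rn
            FROM task_runs
            WHERE task_id IN ({placeholders})
              AND summary IS NOT NULL AND summary != ''
        ) AS ranked
        WHERE rn = 1
    """
    # Tasks with no qualifying run are simply absent from the result.
    return {task_id: summary for task_id, summary in conn.execute(sql, ids)}


def card_preview(summary: Optional[str], limit: int = 200) -> Optional[str]:
    """Truncate a summary for board cards; the drawer would get the full text."""
    return None if summary is None else summary[:limit]
```

Filtering empty summaries inside the subquery, before ranking, is what lets an older non-empty summary win over a newer blank one.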
Smoke-tested locally against the live profile DB on the three
acceptance-criterion targets (t_f08fef91 cron-hygiene-audit,
t_007b7f1c EMA-analysis, t_05746fa4 self-assessment) — all three now
return their populated summaries via both ``latest_summary`` and
``latest_summaries``.
Test plan: 255/255 kanban tests pass + 91/91 dashboard plugin tests
pass. No regression on tasks where ``tasks.result`` is explicitly
populated (the existing "Result:" branch is preserved).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 14:14:25 +00:00
_CARD_SUMMARY_PREVIEW_CHARS = 200


def _task_dict(
    task: kanban_db.Task,
    *,
    latest_summary: Optional[str] = None,
) -> dict[str, Any]:
    d = asdict(task)
    # Add derived age metrics so the UI can colour stale cards without
    # computing deltas client-side.
    d["age"] = kanban_db.task_age(task)
    # Surface the latest non-null run summary so dashboards don't show
    # blank cards/drawers for tasks where the worker handed off via
    # ``task_runs.summary`` (the kanban-worker pattern) instead of
    # ``tasks.result``. ``None`` when no run has produced a summary yet.
    d["latest_summary"] = latest_summary
    # Keep body short on list endpoints; full body comes from /tasks/:id.
    return d


def _event_dict(event: kanban_db.Event) -> dict[str, Any]:
    return {
        "id": event.id,
        "task_id": event.task_id,
        "kind": event.kind,
        "payload": event.payload,
        "created_at": event.created_at,
        "run_id": event.run_id,
    }


def _comment_dict(c: kanban_db.Comment) -> dict[str, Any]:
    return {
        "id": c.id,
        "task_id": c.task_id,
        "author": c.author,
        "body": c.body,
        "created_at": c.created_at,
    }


def _run_dict(r: kanban_db.Run) -> dict[str, Any]:
    """Serialise a Run for the drawer's Run history section."""
    return {
        "id": r.id,
        "task_id": r.task_id,
        "profile": r.profile,
        "step_key": r.step_key,
        "status": r.status,
        "claim_lock": r.claim_lock,
        "claim_expires": r.claim_expires,
        "worker_pid": r.worker_pid,
        "max_runtime_seconds": r.max_runtime_seconds,
        "last_heartbeat_at": r.last_heartbeat_at,
        "started_at": r.started_at,
        "ended_at": r.ended_at,
        "outcome": r.outcome,
        "summary": r.summary,
        "metadata": r.metadata,
        "error": r.error,
    }
feat(kanban): hallucination gate + recovery UX for worker-created-card claims (#20232)
Workers completing a kanban task can now claim the ids of cards they
created via an optional ``created_cards`` field on ``kanban_complete``.
The kernel verifies each id exists and was created by the completing
worker's profile; any phantom id blocks the completion with a
``HallucinatedCardsError`` and records a
``completion_blocked_hallucination`` event on the task so the rejected
attempt is auditable. Successful completions also get a non-blocking
prose-scan pass over their ``summary`` + ``result`` that emits a
``suspected_hallucinated_references`` event for any ``t_<hex>``
reference that doesn't resolve.
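The two checks described above can be sketched like this. Everything here is illustrative: the exception class is a stand-in that only borrows the name from this commit message, `creators` is a hypothetical mapping of card id to creating profile, and the `t_<hex>` regex is one plausible reading of the id format.

```python
import re
from typing import Iterable, Mapping

# Matches task references such as t_f08fef91 (assumed lowercase hex ids).
_TASK_REF_RE = re.compile(r"\bt_[0-9a-f]+\b")


class HallucinatedCardsError(Exception):
    """Raised when a worker claims cards it did not create (stand-in class)."""

    def __init__(self, phantom_ids: list[str]):
        super().__init__(f"unverified created_cards: {', '.join(phantom_ids)}")
        self.phantom_ids = phantom_ids


def verify_created_cards(
    created_cards: Iterable[str],
    creators: Mapping[str, str],
    worker_profile: str,
) -> None:
    """Blocking gate: every claimed id must exist and belong to this worker."""
    phantom = [cid for cid in created_cards if creators.get(cid) != worker_profile]
    if phantom:
        raise HallucinatedCardsError(phantom)


def unresolved_references(text: str, known_ids: Iterable[str]) -> list[str]:
    """Advisory scan: t_<hex> mentions in prose that match no known task."""
    known = set(known_ids)
    found: list[str] = []
    for ref in _TASK_REF_RE.findall(text):
        if ref not in known and ref not in found:
            found.append(ref)
    return found
```

The gate raises because a false `created_cards` claim is structured data the kernel can verify, while the prose scan only reports, since free text can legitimately mention ids in other contexts.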
Closes #20017.
Recovery UX (kernel + CLI + dashboard)
--------------------------------------
A structural gate alone isn't enough — operators also need to see and
act on stuck workers, especially when a profile's model is the root
cause. This PR ships the full loop:
* ``kanban_db.reclaim_task(task_id)`` — operator-driven reclaim that
releases an active worker claim immediately (unlike
``release_stale_claims`` which only acts after claim_expires has
passed). Emits a ``reclaimed`` event with ``manual: True`` payload.
* ``kanban_db.reassign_task(task_id, profile, reclaim_first=...)`` —
switch a task to a different profile, optionally reclaiming a stuck
running worker in the same call.
* ``hermes kanban reclaim <id> [--reason ...]`` and
``hermes kanban reassign <id> <profile> [--reclaim] [--reason ...]``
CLI subcommands wired through to the same helpers.
* ``POST /api/plugins/kanban/tasks/{id}/reclaim`` and
``POST /api/plugins/kanban/tasks/{id}/reassign`` endpoints on the
dashboard plugin.
Dashboard surfacing
-------------------
* ⚠ **warning badge** on cards with active hallucination events.
* **attention strip** at the top of the board listing all flagged
tasks; dismissible per session.
* **events callout** in the task drawer — hallucination events render
with a red left border, amber icon, and phantom ids as styled chips.
* **recovery section** in the task drawer with three actions: Reclaim,
Reassign (with profile picker + reclaim-first checkbox), and a
copy-to-clipboard hint for ``hermes -p <profile> model`` since
profile config lives on disk and can't be edited from the browser.
Auto-opens when the task has warnings, collapsed otherwise.
Keyed by task id so state doesn't leak between drawers.
Active-vs-stale rule: warnings clear when a clean ``completed`` or
``edited`` event supersedes the hallucination, so recovery is never
permanently stigmatising — the audit events persist for debugging but
the badge goes away once the worker succeeds.
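One way to read the active-vs-stale rule is as a single pass over a task's event kinds, oldest first. This sketch assumes a clearing event always appears later in the log than the warning it supersedes; how the kernel orders an advisory prose-scan event relative to the `completed` event of the same completion is not stated here. `active_warnings` is a hypothetical name.

```python
from typing import Sequence

WARNING_KINDS = (
    "completion_blocked_hallucination",
    "suspected_hallucinated_references",
)
CLEARING_KINDS = ("completed", "edited")


def active_warnings(event_kinds: Sequence[str]) -> list[str]:
    """Return warning kinds not yet superseded (events ordered oldest first)."""
    active: list[str] = []
    for kind in event_kinds:
        if kind in CLEARING_KINDS:
            # A clean completion or edit retires every earlier warning.
            active = []
        elif kind in WARNING_KINDS:
            active.append(kind)
    return active
```

The audit events themselves are never deleted in this model; only the derived "active" view changes.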
Skill updates
-------------
* ``skills/devops/kanban-worker/SKILL.md`` documents the
``created_cards`` contract with good/bad examples.
* ``skills/devops/kanban-orchestrator/SKILL.md`` gains a "Recovering
stuck workers" section with the three actions and when to use each.
Tests
-----
* Kernel gate: verified-cards manifest, phantom rejection + audit
event, cross-worker rejection, prose scan positive + negative.
* Recovery helpers: reclaim on running task, reclaim on non-running
returns False, reassign refuses running without reclaim_first,
reassign with reclaim_first succeeds on running.
* API endpoints: warnings field present on /board and /tasks/:id,
warnings cleared after clean completion, reclaim 200 + 409 paths,
reassign 200 + 409 + reclaim_first paths.
* CLI smoke: reclaim + reassign subcommands.
Live-verified end-to-end on a dashboard with seeded scenarios:
attention strip renders, badges land on the right cards, drawer
callout shows phantom chips, Reclaim on a running task flips status to
ready + emits manual reclaimed event + refreshes the drawer,
Reassign swaps the assignee and triggers board refresh.
359/359 kanban-suite tests pass
(test_kanban_{db,cli,boards,core_functionality} + dashboard + tools).
2026-05-05 08:06:55 -07:00
# Hallucination-warning event kinds — see complete_task() in kanban_db.py.
# completion_blocked_hallucination: kernel rejected created_cards with
#   phantom ids; task stays in prior state.
# suspected_hallucinated_references: prose scan found t_<hex> in summary
#   that doesn't resolve; completion succeeded, advisory only.
_WARNING_EVENT_KINDS = (
    "completion_blocked_hallucination",
    "suspected_hallucinated_references",
)
feat(kanban): generic diagnostics engine for task distress signals (#20332)
* feat(kanban): generic diagnostics engine for task distress signals
Replaces the hallucination-specific ``warnings`` / ``RecoverySection``
surface (shipped in PR #20232) with a reusable diagnostic-rule engine
that covers five distress kinds in v1 and can be extended without
touching UI code. The "something's wrong with this task" signal is
no longer limited to phantom card ids.
Closes the follow-up from #20232 discussion.
New module
----------
``hermes_cli/kanban_diagnostics.py`` — stateless, no-side-effect rule
engine. Each rule is a pure function of
``(task, events, runs, now, config) -> list[Diagnostic]``. Registry
is a simple list; adding a new distress kind is one function + one
import, no UI or API changes required.
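The rule-engine shape described above can be sketched with a registry list and one example rule. This is a rough illustration, not the module's code: the `Diagnostic` fields, the dict-shaped `task`, the `spawn_failure_threshold` config key, and `run_rules` are all assumptions. Only the rule signature, the threshold of 3, and the escalation at twice the threshold come from this commit message.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

SEVERITY_ORDER = {"critical": 0, "error": 1, "warning": 2}


@dataclass
class Diagnostic:
    kind: str
    severity: str
    title: str
    actions: list[str] = field(default_factory=list)


# A rule is a pure function of (task, events, runs, now, config).
Rule = Callable[[dict, list, list, float, dict], list[Diagnostic]]


def repeated_spawn_failures(task, events, runs, now, config) -> list[Diagnostic]:
    threshold = config.get("spawn_failure_threshold", 3)
    failures = task.get("spawn_failures", 0)
    if failures < threshold:
        return []
    # Escalate from error to critical at twice the threshold.
    severity = "critical" if failures >= 2 * threshold else "error"
    return [Diagnostic("repeated_spawn_failures", severity,
                       f"{failures} spawn failures",
                       ["cli_hint", "reclaim", "reassign"])]


RULES: list[Rule] = [repeated_spawn_failures]


def run_rules(task, events, runs, now, config,
              rules: Optional[list[Rule]] = None) -> list[Diagnostic]:
    found: list[Diagnostic] = []
    for rule in (RULES if rules is None else rules):
        try:
            found.extend(rule(task, events, runs, now, config))
        except Exception:
            # A broken rule must not hide diagnostics from the others.
            continue
    # Critical first, then error, then warning.
    return sorted(found, key=lambda d: SEVERITY_ORDER.get(d.severity, len(SEVERITY_ORDER)))
```

Because rules share one signature and live in a plain list, adding a distress kind is one function plus one list entry, which matches the extension story told above.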
v1 rule set
-----------
* ``hallucinated_cards`` (error) — folds the existing
``completion_blocked_hallucination`` event into the new surface.
* ``prose_phantom_refs`` (warning) — folds
``suspected_hallucinated_references``.
* ``repeated_spawn_failures`` (error → critical at 2x threshold) —
fires when ``tasks.spawn_failures >= 3``; suggests
``hermes -p <profile> doctor`` / ``auth``.
* ``repeated_crashes`` (error → critical) — fires after N consecutive
``crashed`` run outcomes with no successful completion between;
suggests ``hermes kanban log <id>``.
* ``stuck_in_blocked`` (warning) — fires after 24h in ``blocked``
state with no comments / unblock attempts; suggests commenting.
Every diagnostic carries structured ``actions`` (reclaim, reassign,
unblock, cli_hint, comment, open_docs) that render consistently in
both CLI and dashboard. Suggested actions are highlighted; generic
recovery actions (reclaim / reassign) are available on every kind as
fallbacks.
Diagnostics auto-clear when the underlying failure resolves — a
clean ``completed``/``edited`` event drops hallucination diagnostics,
a successful run drops crash diagnostics, a comment drops
stuck-blocked diagnostics. Audit events persist; the badge goes away.
API
---
``plugin_api.py``:
* ``/board`` now attaches ``diagnostics`` (full list) and
``warnings`` (compact summary with ``highest_severity``) per task.
* ``/tasks/{id}`` attaches diagnostics so the drawer's Diagnostics
section auto-opens on flagged tasks.
* NEW ``/diagnostics`` endpoint — fleet-wide listing, filterable by
severity, sorted critical-first.
CLI
---
* NEW ``hermes kanban diagnostics [--severity X] [--task id]
[--json]`` — fleet view or single-task view, matches dashboard rule
output so CLI users see the same picture.
* ``hermes kanban show <id>`` now renders a Diagnostics section near
the top with severity markers + suggested actions.
Dashboard
---------
* Card badge is severity-coloured (⚠ amber warning, !! orange error,
!!! red critical) using ``warnings.highest_severity``.
* Attention strip above the toolbar counts EVERY task with active
diagnostics (not just hallucinations), severity-coloured, lists
affected tasks with Open buttons when expanded.
* Drawer's old ``RecoverySection`` replaced with generic
``DiagnosticsSection`` rendering a card per active diagnostic:
title + detail + structured data (task-id chips when payload keys
look like id lists) + action buttons. Reassign profile picker is
inline per-diagnostic. Clipboard fallback uses ``.catch()`` for
environments where writeText rejects.
* Three-rung severity palette: amber for warning, orange for error,
red for critical. Uses CSS variables so theming is straightforward.
Tests
-----
* NEW ``tests/hermes_cli/test_kanban_diagnostics.py`` — 14 unit tests
covering each rule's positive/negative/threshold paths, severity
sorting, broken-rule isolation, and sqlite3.Row integration.
* Dashboard plugin tests extended: ``/diagnostics`` endpoint (empty,
populated, severity-filtered), ``/board`` exposes both diagnostic
list and compact summary with ``highest_severity``.
* Existing hallucination-specific test (``test_board_surfaces_
warnings_field_for_hallucinated_completions``) updated to reflect
the new contract: warning summary keys by diagnostic kind
(``hallucinated_cards``) not event kind.
379 kanban-suite tests pass (+16 net from this PR).
Live verification
-----------------
Seeded all 5 diagnostic kinds + one clean + one plain-running task
(7 total) into an isolated HERMES_HOME, spun up the dashboard, and
verified:
* Attention strip: shows ``!! 5 tasks need attention`` in the
error-severity orange; Show expands to a list of 5 rows ordered
critical > error > warning.
* Card badges: error tasks render ``!!`` orange, warning tasks
render ``⚠`` amber, clean and plain-running tasks render no badge.
* Each of the 5 rules opens a correctly-coloured, correctly-styled
diagnostic card in the drawer with its specific suggested action.
* Live reassign from a diagnostic card flipped
``broken-ml-worker → alice`` and the drawer refreshed with the
new assignee + the same diagnostic still firing (correct:
spawn_failures counter hasn't reset yet).
* CLI ``hermes kanban diagnostics`` prints all 5 in severity order;
``--severity error`` narrows to 3; ``kanban show <id>`` includes
the Diagnostics block at the top with suggested action hint.
Migration note
--------------
The old ``warnings`` shape (``{count, kinds, latest_at}``) is
preserved on the API but ``kinds`` now keys by diagnostic kind
(``hallucinated_cards``) instead of event kind
(``completion_blocked_hallucination``). ``highest_severity`` is a
new required field. The dashboard was the only consumer and has
been updated in the same commit; external API consumers of the
``warnings`` field will need to update their kind-match logic.
* feat(kanban/diagnostics): lead titles with the actual error text
The generic 'Worker crashed N runs in a row' / 'Worker failed to spawn
N times' titles buried the actual cause in the data section. Operators
had to open logs or expand the diagnostic to see WHY the worker is
stuck — rate-limit vs insufficient quota vs bad auth vs context
overflow vs network blip all looked identical at a glance.
New titles:
Agent crashed 3x: openai: 429 Too Many Requests - rate limit reached
Agent crashed 3x: anthropic: 402 insufficient_quota - credit balance
Agent crashed 3x: provider auth error: 401 Unauthorized
Agent spawn failed 4x: insufficient_quota: You exceeded your current
Detail keeps the full error snippet (capped at 500 chars + ellipsis
for tracebacks). Title takes the first line capped at 160 chars.
Fallback title if no error recorded stays honest ('no error recorded').
Tests: 4 new cases covering 429/billing/spawn/truncation. 383 total
pass (+4).
Live-verified on dashboard with 6 seeded scenarios
(rate-limit, billing, auth, context, network, spawn-billing) —
each card title leads with the actionable error text.
2026-05-05 13:32:42 -07:00
def _compute_task_diagnostics(
feat(kanban): hallucination gate + recovery UX for worker-created-card claims (#20232)
Workers completing a kanban task can now claim the ids of cards they
created via an optional ``created_cards`` field on ``kanban_complete``.
The kernel verifies each id exists and was created by the completing
worker's profile; any phantom id blocks the completion with a
``HallucinatedCardsError`` and records a
``completion_blocked_hallucination`` event on the task so the rejected
attempt is auditable. Successful completions also get a non-blocking
prose-scan pass over their ``summary`` + ``result`` that emits a
``suspected_hallucinated_references`` event for any ``t_<hex>``
reference that doesn't resolve.
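
The non-blocking prose scan described above can be sketched roughly as follows. This is an illustration only, not the kernel's code: the helper name ``find_phantom_refs``, the exact regex, and the assumption that ids are ``t_`` followed by lowercase hex digits are all hypothetical.

```python
import re

# Assumed id shape: "t_" followed by one or more lowercase hex digits.
TASK_REF_RE = re.compile(r"\bt_[0-9a-f]+\b")


def find_phantom_refs(text: str, known_ids: set[str]) -> list[str]:
    """Return id-looking references in ``text`` that do not resolve.

    ``known_ids`` stands in for a lookup against the tasks table.
    Each unresolved reference is reported once, in order of appearance.
    """
    seen: set[str] = set()
    phantoms: list[str] = []
    for ref in TASK_REF_RE.findall(text):
        if ref in known_ids or ref in seen:
            continue
        seen.add(ref)
        phantoms.append(ref)
    return phantoms
```

A non-empty result would correspond to emitting one ``suspected_hallucinated_references`` event; the completion itself still goes through.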
Closes #20017.
Recovery UX (kernel + CLI + dashboard)
--------------------------------------
A structural gate alone isn't enough — operators also need to see and
act on stuck workers, especially when a profile's model is the root
cause. This PR ships the full loop:
* ``kanban_db.reclaim_task(task_id)`` — operator-driven reclaim that
releases an active worker claim immediately (unlike
``release_stale_claims`` which only acts after claim_expires has
passed). Emits a ``reclaimed`` event with ``manual: True`` payload.
* ``kanban_db.reassign_task(task_id, profile, reclaim_first=...)`` —
switch a task to a different profile, optionally reclaiming a stuck
running worker in the same call.
* ``hermes kanban reclaim <id> [--reason ...]`` and
``hermes kanban reassign <id> <profile> [--reclaim] [--reason ...]``
CLI subcommands wired through to the same helpers.
* ``POST /api/plugins/kanban/tasks/{id}/reclaim`` and
``POST /api/plugins/kanban/tasks/{id}/reassign`` endpoints on the
dashboard plugin.
Dashboard surfacing
-------------------
* ⚠ **warning badge** on cards with active hallucination events.
* **attention strip** at the top of the board listing all flagged
tasks; dismissible per session.
* **events callout** in the task drawer — hallucination events render
with a red left border, amber icon, and phantom ids as styled chips.
* **recovery section** in the task drawer with three actions: Reclaim,
Reassign (with profile picker + reclaim-first checkbox), and a
copy-to-clipboard hint for ``hermes -p <profile> model`` since
profile config lives on disk and can't be edited from the browser.
Auto-opens when the task has warnings, collapsed otherwise.
Keyed by task id so state doesn't leak between drawers.
Active-vs-stale rule: warnings clear when a clean ``completed`` or
``edited`` event supersedes the hallucination, so recovery is never
permanently stigmatising — the audit events persist for debugging but
the badge goes away once the worker succeeds.
Skill updates
-------------
* ``skills/devops/kanban-worker/SKILL.md`` documents the
``created_cards`` contract with good/bad examples.
* ``skills/devops/kanban-orchestrator/SKILL.md`` gains a "Recovering
stuck workers" section with the three actions and when to use each.
Tests
-----
* Kernel gate: verified-cards manifest, phantom rejection + audit
event, cross-worker rejection, prose scan positive + negative.
* Recovery helpers: reclaim on running task, reclaim on non-running
returns False, reassign refuses running without reclaim_first,
reassign with reclaim_first succeeds on running.
* API endpoints: warnings field present on /board and /tasks/:id,
warnings cleared after clean completion, reclaim 200 + 409 paths,
reassign 200 + 409 + reclaim_first paths.
* CLI smoke: reclaim + reassign subcommands.
Live-verified end-to-end on a dashboard with seeded scenarios:
attention strip renders, badges land on the right cards, drawer
callout shows phantom chips, Reclaim on a running task flips status to
ready + emits manual reclaimed event + refreshes the drawer,
Reassign swaps the assignee and triggers board refresh.
359/359 kanban-suite tests pass
(test_kanban_{db,cli,boards,core_functionality} + dashboard + tools).
2026-05-05 08:06:55 -07:00
    conn: sqlite3.Connection,
    task_ids: Optional[list[str]] = None,
feat(kanban): generic diagnostics engine for task distress signals (#20332)
* feat(kanban): generic diagnostics engine for task distress signals
Replaces the hallucination-specific ``warnings`` / ``RecoverySection``
surface (shipped in PR #20232) with a reusable diagnostic-rule engine
that covers five distress kinds in v1 and can be extended without
touching UI code. The "something's wrong with this task" signal is
no longer limited to phantom card ids.
Closes the follow-up from #20232 discussion.
New module
----------
``hermes_cli/kanban_diagnostics.py`` — stateless, no-side-effect rule
engine. Each rule is a pure function of
``(task, events, runs, now, config) -> list[Diagnostic]``. Registry
is a simple list; adding a new distress kind is one function + one
import, no UI or API changes required.
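
To make the "pure function plus registry" shape concrete, here is a minimal sketch under stated assumptions. The ``Diagnostic`` fields, the ``spawn_failure_threshold`` config key, and ``run_rules`` are hypothetical names chosen for illustration; only the rule signature, the ``>= 3`` threshold, and the 2x escalation come from the description in this message.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Diagnostic:
    kind: str
    severity: str  # "warning" | "error" | "critical"
    title: str
    data: dict[str, Any] = field(default_factory=dict)


# A rule is a pure function of its inputs and returns zero or more diagnostics.
Rule = Callable[[dict, list, list, float, dict], list[Diagnostic]]


def repeated_spawn_failures(task, events, runs, now, config):
    threshold = config.get("spawn_failure_threshold", 3)  # hypothetical key
    failures = task.get("spawn_failures", 0)
    if failures < threshold:
        return []
    # Escalate from error to critical at twice the threshold.
    severity = "critical" if failures >= 2 * threshold else "error"
    return [Diagnostic("repeated_spawn_failures", severity,
                       f"Agent spawn failed {failures}x",
                       {"spawn_failures": failures})]


# The registry is a plain list; adding a kind means appending one function.
RULES: list[Rule] = [repeated_spawn_failures]


def run_rules(task, events, runs, now, config, rules=RULES):
    out: list[Diagnostic] = []
    for rule in rules:
        try:
            out.extend(rule(task, events, runs, now, config))
        except Exception:
            # Broken-rule isolation: one failing rule must not hide the others.
            continue
    return out
```

Because rules take everything they need as arguments and touch no state, each one can be unit-tested with plain dicts and lists.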
v1 rule set
-----------
* ``hallucinated_cards`` (error) — folds the existing
``completion_blocked_hallucination`` event into the new surface.
* ``prose_phantom_refs`` (warning) — folds
``suspected_hallucinated_references``.
* ``repeated_spawn_failures`` (error → critical at 2x threshold) —
fires when ``tasks.spawn_failures >= 3``; suggests
``hermes -p <profile> doctor`` / ``auth``.
* ``repeated_crashes`` (error → critical) — fires after N consecutive
``crashed`` run outcomes with no successful completion between;
suggests ``hermes kanban log <id>``.
* ``stuck_in_blocked`` (warning) — fires after 24h in ``blocked``
state with no comments / unblock attempts; suggests commenting.
Every diagnostic carries structured ``actions`` (reclaim, reassign,
unblock, cli_hint, comment, open_docs) that render consistently in
both CLI and dashboard. Suggested actions are highlighted; generic
recovery actions (reclaim / reassign) are available on every kind as
fallbacks.
Diagnostics auto-clear when the underlying failure resolves — a
clean ``completed``/``edited`` event drops hallucination diagnostics,
a successful run drops crash diagnostics, a comment drops
stuck-blocked diagnostics. Audit events persist; the badge goes away.
API
---
``plugin_api.py``:
* ``/board`` now attaches ``diagnostics`` (full list) and
``warnings`` (compact summary with ``highest_severity``) per task.
* ``/tasks/{id}`` attaches diagnostics so the drawer's Diagnostics
section auto-opens on flagged tasks.
* NEW ``/diagnostics`` endpoint — fleet-wide listing, filterable by
severity, sorted critical-first.
CLI
---
* NEW ``hermes kanban diagnostics [--severity X] [--task id]
[--json]`` — fleet view or single-task view, matches dashboard rule
output so CLI users see the same picture.
* ``hermes kanban show <id>`` now renders a Diagnostics section near
the top with severity markers + suggested actions.
Dashboard
---------
* Card badge is severity-coloured (⚠ amber warning, !! orange error,
!!! red critical) using ``warnings.highest_severity``.
* Attention strip above the toolbar counts EVERY task with active
diagnostics (not just hallucinations), severity-coloured, lists
affected tasks with Open buttons when expanded.
* Drawer's old ``RecoverySection`` replaced with generic
``DiagnosticsSection`` rendering a card per active diagnostic:
title + detail + structured data (task-id chips when payload keys
look like id lists) + action buttons. Reassign profile picker is
inline per-diagnostic. Clipboard fallback uses ``.catch()`` for
environments where writeText rejects.
* Three-rung severity palette: amber for warning, orange for error,
red for critical. Uses CSS variables so theming is straightforward.
Tests
-----
* NEW ``tests/hermes_cli/test_kanban_diagnostics.py`` — 14 unit tests
covering each rule's positive/negative/threshold paths, severity
sorting, broken-rule isolation, and sqlite3.Row integration.
* Dashboard plugin tests extended: ``/diagnostics`` endpoint (empty,
populated, severity-filtered), ``/board`` exposes both diagnostic
list and compact summary with ``highest_severity``.
* Existing hallucination-specific test (``test_board_surfaces_
warnings_field_for_hallucinated_completions``) updated to reflect
the new contract: warning summary keys by diagnostic kind
(``hallucinated_cards``) not event kind.
379 kanban-suite tests pass (+16 net from this PR).
Live verification
-----------------
Seeded all 5 diagnostic kinds + one clean + one plain-running task
(7 total) into an isolated HERMES_HOME, spun up the dashboard, and
verified:
* Attention strip: shows ``!! 5 tasks need attention`` in the
error-severity orange; Show expands to a list of 5 rows ordered
critical > error > warning.
* Card badges: error tasks render ``!!`` orange, warning tasks
render ``⚠`` amber, clean and plain-running tasks render no badge.
* Each of the 5 rules opens a correctly-coloured, correctly-styled
diagnostic card in the drawer with its specific suggested action.
* Live reassign from a diagnostic card flipped
``broken-ml-worker → alice`` and the drawer refreshed with the
new assignee + the same diagnostic still firing (correct:
spawn_failures counter hasn't reset yet).
* CLI ``hermes kanban diagnostics`` prints all 5 in severity order;
``--severity error`` narrows to 3; ``kanban show <id>`` includes
the Diagnostics block at the top with suggested action hint.
Migration note
--------------
The old ``warnings`` shape (``{count, kinds, latest_at}``) is
preserved on the API but ``kinds`` now keys by diagnostic kind
(``hallucinated_cards``) instead of event kind
(``completion_blocked_hallucination``). ``highest_severity`` is a
new required field. The dashboard was the only consumer and has
been updated in the same commit; external API consumers of the
``warnings`` field will need to update their kind-match logic.
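
For external consumers, a rough sketch of how the compact summary might be derived from a task's diagnostics. The field names ``count``, ``kinds``, ``latest_at`` and ``highest_severity`` come from the note above; everything else is an assumption, including ``kinds`` holding per-kind counts and each diagnostic dict carrying a timestamp under a hypothetical ``at`` key.

```python
from typing import Optional

SEVERITY_RANK = {"warning": 1, "error": 2, "critical": 3}


def summarize_warnings(diagnostics: list[dict]) -> Optional[dict]:
    """Collapse a task's diagnostics into a compact summary (assumed shape)."""
    if not diagnostics:
        return None
    kinds: dict[str, int] = {}
    for d in diagnostics:
        # Keyed by diagnostic kind (e.g. "hallucinated_cards"), not event kind.
        kinds[d["kind"]] = kinds.get(d["kind"], 0) + 1
    highest = max(diagnostics, key=lambda d: SEVERITY_RANK[d["severity"]])
    return {
        "count": len(diagnostics),
        "kinds": kinds,
        "latest_at": max(d["at"] for d in diagnostics),  # "at" is hypothetical
        "highest_severity": highest["severity"],
    }
```

Under this sketch, a consumer that previously matched on ``completion_blocked_hallucination`` would instead check for ``"hallucinated_cards"`` in ``summary["kinds"]``.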
* feat(kanban/diagnostics): lead titles with the actual error text
The generic 'Worker crashed N runs in a row' / 'Worker failed to spawn
N times' titles buried the actual cause in the data section. Operators
had to open logs or expand the diagnostic to see WHY the worker is
stuck — rate-limit vs insufficient quota vs bad auth vs context
overflow vs network blip all looked identical at a glance.
New titles:
Agent crashed 3x: openai: 429 Too Many Requests - rate limit reached
Agent crashed 3x: anthropic: 402 insufficient_quota - credit balance
Agent crashed 3x: provider auth error: 401 Unauthorized
Agent spawn failed 4x: insufficient_quota: You exceeded your current
Detail keeps the full error snippet (capped at 500 chars + ellipsis
for tracebacks). Title takes the first line capped at 160 chars.
Fallback title if no error recorded stays honest ('no error recorded').
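
The title/detail rules above could look roughly like this. It is a sketch, not the shipped implementation: ``build_title_and_detail`` is a hypothetical helper, the cap is assumed to apply to the error's first line rather than the whole title, and the ellipsis is assumed to be three ASCII dots.

```python
from typing import Optional

TITLE_MAX = 160   # cap on the first line of the error used in the title
DETAIL_MAX = 500  # cap on the error snippet kept in the detail


def build_title_and_detail(prefix: str, error: Optional[str]) -> tuple[str, str]:
    """Lead the title with the error's first line; keep a capped snippet as detail."""
    if not error or not error.strip():
        # Stay honest when nothing was captured.
        return f"{prefix}: no error recorded", ""
    text = error.strip()
    first_line = text.splitlines()[0][:TITLE_MAX]
    detail = text if len(text) <= DETAIL_MAX else text[:DETAIL_MAX] + "..."
    return f"{prefix}: {first_line}", detail
```

For example, a prefix of ``Agent crashed 3x`` and an error whose first line is ``openai: 429 Too Many Requests`` gives a title that leads with the actionable text, while a long traceback only ever reaches the detail field.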
Tests: 4 new cases covering 429/billing/spawn/truncation. 383 total
pass (+4).
Live-verified on dashboard with 6 seeded scenarios
(rate-limit, billing, auth, context, network, spawn-billing) —
each card title leads with the actionable error text.
2026-05-05 13:32:42 -07:00
) -> dict[str, list[dict]]:
    """Run the diagnostic rule engine against every task (or a subset)
    and return ``{task_id: [diagnostic_dict, ...]}``.

    Tasks with no active diagnostics are omitted from the result.
    Uses ``hermes_cli.kanban_diagnostics`` — see that module for the
    rule definitions.
    """
    from hermes_cli import kanban_diagnostics as kd

    # Build the candidate task list. We need each task's row + its
    # events + its runs. Doing N separate queries works but scales
    # poorly; do three aggregate queries instead.
    if task_ids is not None:
        if not task_ids:
            return {}
        placeholders = ",".join(["?"] * len(task_ids))
feat(kanban): generic diagnostics engine for task distress signals (#20332)
* feat(kanban): generic diagnostics engine for task distress signals
Replaces the hallucination-specific ``warnings`` / ``RecoverySection``
surface (shipped in PR #20232) with a reusable diagnostic-rule engine
that covers five distress kinds in v1 and can be extended without
touching UI code. The "something's wrong with this task" signal is
no longer limited to phantom card ids.
Closes the follow-up from #20232 discussion.
New module
----------
``hermes_cli/kanban_diagnostics.py`` — stateless, no-side-effect rule
engine. Each rule is a pure function of
``(task, events, runs, now, config) -> list[Diagnostic]``. Registry
is a simple list; adding a new distress kind is one function + one
import, no UI or API changes required.
v1 rule set
-----------
* ``hallucinated_cards`` (error) — folds the existing
``completion_blocked_hallucination`` event into the new surface.
* ``prose_phantom_refs`` (warning) — folds
``suspected_hallucinated_references``.
* ``repeated_spawn_failures`` (error → critical at 2x threshold) —
fires when ``tasks.spawn_failures >= 3``; suggests
``hermes -p <profile> doctor`` / ``auth``.
* ``repeated_crashes`` (error → critical) — fires after N consecutive
``crashed`` run outcomes with no successful completion between;
suggests ``hermes kanban log <id>``.
* ``stuck_in_blocked`` (warning) — fires after 24h in ``blocked``
state with no comments / unblock attempts; suggests commenting.
Every diagnostic carries structured ``actions`` (reclaim, reassign,
unblock, cli_hint, comment, open_docs) that render consistently in
both CLI and dashboard. Suggested actions are highlighted; generic
recovery actions (reclaim / reassign) are available on every kind as
fallbacks.
Diagnostics auto-clear when the underlying failure resolves — a
clean ``completed``/``edited`` event drops hallucination diagnostics,
a successful run drops crash diagnostics, a comment drops
stuck-blocked diagnostics. Audit events persist; the badge goes away.
API
---
``plugin_api.py``:
* ``/board`` now attaches ``diagnostics`` (full list) and
``warnings`` (compact summary with ``highest_severity``) per task.
* ``/tasks/{id}`` attaches diagnostics so the drawer's Diagnostics
section auto-opens on flagged tasks.
* NEW ``/diagnostics`` endpoint — fleet-wide listing, filterable by
severity, sorted critical-first.
CLI
---
* NEW ``hermes kanban diagnostics [--severity X] [--task id]
[--json]`` — fleet view or single-task view, matches dashboard rule
output so CLI users see the same picture.
* ``hermes kanban show <id>`` now renders a Diagnostics section near
the top with severity markers + suggested actions.
Dashboard
---------
* Card badge is severity-coloured (⚠ amber warning, !! orange error,
!!! red critical) using ``warnings.highest_severity``.
* Attention strip above the toolbar counts EVERY task with active
diagnostics (not just hallucinations), severity-coloured, lists
affected tasks with Open buttons when expanded.
* Drawer's old ``RecoverySection`` replaced with generic
``DiagnosticsSection`` rendering a card per active diagnostic:
title + detail + structured data (task-id chips when payload keys
look like id lists) + action buttons. Reassign profile picker is
inline per-diagnostic. Clipboard fallback uses ``.catch()`` for
environments where writeText rejects.
* Three-rung severity palette; amber for warning, orange for error,
red for critical. Uses CSS variables so theming is straightforward.
Tests
-----
* NEW ``tests/hermes_cli/test_kanban_diagnostics.py`` — 14 unit tests
covering each rule's positive/negative/threshold paths, severity
sorting, broken-rule isolation, and sqlite3.Row integration.
* Dashboard plugin tests extended: ``/diagnostics`` endpoint (empty,
populated, severity-filtered), ``/board`` exposes both diagnostic
list and compact summary with ``highest_severity``.
* Existing hallucination-specific test
(``test_board_surfaces_warnings_field_for_hallucinated_completions``)
updated to reflect the new contract: warning summary keys by
diagnostic kind (``hallucinated_cards``) not event kind.
379 kanban-suite tests pass (+16 net from this PR).
Live verification
-----------------
Seeded all 5 diagnostic kinds + one clean + one plain-running task
(7 total) into an isolated HERMES_HOME, spun up the dashboard, and
verified:
* Attention strip: shows ``!! 5 tasks need attention`` in the
error-severity orange; Show expands to a list of 5 rows ordered
critical > error > warning.
* Card badges: error tasks render ``!!`` orange, warning tasks
render ``⚠`` amber, clean and plain-running tasks render no badge.
* Each of the 5 rules opens a correctly-coloured, correctly-styled
diagnostic card in the drawer with its specific suggested action.
* Live reassign from a diagnostic card flipped
``broken-ml-worker → alice`` and the drawer refreshed with the
new assignee + the same diagnostic still firing (correct:
spawn_failures counter hasn't reset yet).
* CLI ``hermes kanban diagnostics`` prints all 5 in severity order;
``--severity error`` narrows to 3; ``kanban show <id>`` includes
the Diagnostics block at the top with suggested action hint.
Migration note
--------------
The old ``warnings`` shape (``{count, kinds, latest_at}``) is
preserved on the API but ``kinds`` now keys by diagnostic kind
(``hallucinated_cards``) instead of event kind
(``completion_blocked_hallucination``). ``highest_severity`` is a
new required field. The dashboard was the only consumer and has
been updated in the same commit; external API consumers of the
``warnings`` field will need to update their kind-match logic.
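
For external consumers, a rough sketch of the compact summary may help. The four keys come from the note above; everything else is an assumption made for illustration: that ``kinds`` maps each diagnostic kind to a count, that each diagnostic dict carries ``kind``, ``severity`` and an ``at`` timestamp, and the helper name ``summarize``.

```python
SEVERITY_RANK = {"warning": 1, "error": 2, "critical": 3}

def summarize(diagnostics):
    """Fold a task's diagnostics into {count, kinds, latest_at, highest_severity}."""
    if not diagnostics:
        return None
    kinds = {}
    for d in diagnostics:
        kinds[d["kind"]] = kinds.get(d["kind"], 0) + 1
    return {
        "count": len(diagnostics),
        "kinds": kinds,  # keyed by diagnostic kind, not event kind
        "latest_at": max(d["at"] for d in diagnostics),
        "highest_severity": max(
            (d["severity"] for d in diagnostics),
            key=SEVERITY_RANK.__getitem__,
        ),
    }

def has_blocked_hallucination(summary):
    # Old consumers matched "completion_blocked_hallucination" here.
    return bool(summary) and "hallucinated_cards" in summary["kinds"]
```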
* feat(kanban/diagnostics): lead titles with the actual error text
The generic 'Worker crashed N runs in a row' / 'Worker failed to spawn
N times' titles buried the actual cause in the data section. Operators
had to open logs or expand the diagnostic to see WHY the worker is
stuck — rate-limit vs insufficient quota vs bad auth vs context
overflow vs network blip all looked identical at a glance.
New titles:
Agent crashed 3x: openai: 429 Too Many Requests - rate limit reached
Agent crashed 3x: anthropic: 402 insufficient_quota - credit balance
Agent crashed 3x: provider auth error: 401 Unauthorized
Agent spawn failed 4x: insufficient_quota: You exceeded your current
Detail keeps the full error snippet (capped at 500 chars + ellipsis
for tracebacks). Title takes the first line capped at 160 chars.
Fallback title if no error recorded stays honest ('no error recorded').
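
The truncation rules can be sketched like this. The 160 and 500 character caps and the 'no error recorded' fallback are from the text above; the function names, the exact fallback wording around that phrase, the use of ``...`` as the ellipsis, and applying the 160 cap to the error's first line rather than to the whole title are assumptions.

```python
from typing import Optional

TITLE_MAX = 160   # cap on the first line of the error used in the title
DETAIL_MAX = 500  # cap on the error snippet kept in the detail

def crash_title(count: int, error: Optional[str]) -> str:
    """Lead the title with the first line of the recorded error."""
    if not error or not error.strip():
        return f"Agent crashed {count}x: no error recorded"
    first_line = error.strip().splitlines()[0]
    return f"Agent crashed {count}x: {first_line[:TITLE_MAX]}"

def crash_detail(error: Optional[str]) -> str:
    """Keep the full snippet, truncated with an ellipsis when too long."""
    text = (error or "").strip()
    if len(text) > DETAIL_MAX:
        return text[:DETAIL_MAX] + "..."
    return text
```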
Tests: 4 new cases covering 429/billing/spawn/truncation. 383 total
pass (+4).
Live-verified on dashboard with 6 seeded scenarios
(rate-limit, billing, auth, context, network, spawn-billing) —
each card title leads with the actionable error text.
2026-05-05 13:32:42 -07:00
        rows = conn.execute(
            f"SELECT * FROM tasks WHERE id IN ({placeholders})",
            tuple(task_ids),
        ).fetchall()
feat(kanban): hallucination gate + recovery UX for worker-created-card claims (#20232)
Workers completing a kanban task can now claim the ids of cards they
created via an optional ``created_cards`` field on ``kanban_complete``.
The kernel verifies each id exists and was created by the completing
worker's profile; any phantom id blocks the completion with a
``HallucinatedCardsError`` and records a
``completion_blocked_hallucination`` event on the task so the rejected
attempt is auditable. Successful completions also get a non-blocking
prose-scan pass over their ``summary`` + ``result`` that emits a
``suspected_hallucinated_references`` event for any ``t_<hex>``
reference that doesn't resolve.
Closes #20017.
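
The non-blocking prose scan could look roughly like the sketch below. It assumes task ids are ``t_`` followed by lowercase hex digits of any length and that the set of existing ids is passed in; the regex and the helper name ``unresolved_refs`` are illustrative, not the kernel's code.

```python
import re

# Assumed id shape: "t_" plus one or more lowercase hex digits.
TASK_REF = re.compile(r"\bt_[0-9a-f]+\b")

def unresolved_refs(summary, result, known_ids):
    """Return task-id-looking references that match no existing task."""
    text = f"{summary or ''}\n{result or ''}"
    missing = []
    for ref in TASK_REF.findall(text):
        if ref not in known_ids and ref not in missing:
            missing.append(ref)  # keep first-seen order, no duplicates
    return missing
```

A non-empty return value would correspond to emitting a ``suspected_hallucinated_references`` event; the completion itself still goes through.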
Recovery UX (kernel + CLI + dashboard)
--------------------------------------
A structural gate alone isn't enough — operators also need to see and
act on stuck workers, especially when a profile's model is the root
cause. This PR ships the full loop:
* ``kanban_db.reclaim_task(task_id)`` — operator-driven reclaim that
releases an active worker claim immediately (unlike
``release_stale_claims`` which only acts after claim_expires has
passed). Emits a ``reclaimed`` event with ``manual: True`` payload.
* ``kanban_db.reassign_task(task_id, profile, reclaim_first=...)`` —
switch a task to a different profile, optionally reclaiming a stuck
running worker in the same call.
* ``hermes kanban reclaim <id> [--reason ...]`` and
``hermes kanban reassign <id> <profile> [--reclaim] [--reason ...]``
CLI subcommands wired through to the same helpers.
* ``POST /api/plugins/kanban/tasks/{id}/reclaim`` and
``POST /api/plugins/kanban/tasks/{id}/reassign`` endpoints on the
dashboard plugin.
Dashboard surfacing
-------------------
* ⚠ **warning badge** on cards with active hallucination events.
* **attention strip** at the top of the board listing all flagged
tasks; dismissible per session.
* **events callout** in the task drawer — hallucination events render
with a red left border, amber icon, and phantom ids as styled chips.
* **recovery section** in the task drawer with three actions: Reclaim,
Reassign (with profile picker + reclaim-first checkbox), and a
copy-to-clipboard hint for ``hermes -p <profile> model`` since
profile config lives on disk and can't be edited from the browser.
Auto-opens when the task has warnings, collapsed otherwise.
Keyed by task id so state doesn't leak between drawers.
Active-vs-stale rule: warnings clear when a clean ``completed`` or
``edited`` event supersedes the hallucination, so recovery is never
permanently stigmatising — the audit events persist for debugging but
the badge goes away once the worker succeeds.
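
One way to read the active-vs-stale rule is as a single pass over the task's events, oldest first. The sketch below assumes events are mappings with a ``kind`` key and treats every ``completed`` or ``edited`` event as clean, which is a simplification; the helper name is hypothetical.

```python
HALLUCINATION_KINDS = {
    "completion_blocked_hallucination",
    "suspected_hallucinated_references",
}
CLEARING_KINDS = {"completed", "edited"}

def active_hallucination_events(events):
    """Return hallucination events not yet superseded by a clean event."""
    active = []
    for ev in events:  # events must be ordered oldest first
        kind = ev["kind"]
        if kind in CLEARING_KINDS:
            active = []  # earlier warnings are now stale
        elif kind in HALLUCINATION_KINDS:
            active.append(ev)
    return active
```

Nothing is deleted from the event log; only the derived list that drives the badge becomes empty.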
Skill updates
-------------
* ``skills/devops/kanban-worker/SKILL.md`` documents the
``created_cards`` contract with good/bad examples.
* ``skills/devops/kanban-orchestrator/SKILL.md`` gains a "Recovering
stuck workers" section with the three actions and when to use each.
Tests
-----
* Kernel gate: verified-cards manifest, phantom rejection + audit
event, cross-worker rejection, prose scan positive + negative.
* Recovery helpers: reclaim on running task, reclaim on non-running
returns False, reassign refuses running without reclaim_first,
reassign with reclaim_first succeeds on running.
* API endpoints: warnings field present on /board and /tasks/:id,
warnings cleared after clean completion, reclaim 200 + 409 paths,
reassign 200 + 409 + reclaim_first paths.
* CLI smoke: reclaim + reassign subcommands.
Live-verified end-to-end on a dashboard with seeded scenarios:
attention strip renders, badges land on the right cards, drawer
callout shows phantom chips, Reclaim on a running task flips status to
ready + emits manual reclaimed event + refreshes the drawer,
Reassign swaps the assignee and triggers board refresh.
359/359 kanban-suite tests pass
(test_kanban_{db,cli,boards,core_functionality} + dashboard + tools).
2026-05-05 08:06:55 -07:00
    else:
        rows = conn.execute(
            "SELECT * FROM tasks WHERE status != 'archived'",
        ).fetchall()

    if not rows:
        return {}

    # Index events + runs by task id. For very large boards this will
    # slurp a lot — acceptable on the dashboard's typical working set
    # (hundreds of tasks), but we can add pagination / filtering later
    # if profiling shows it's a hotspot.
    row_ids = [r["id"] for r in rows]
    placeholders = ",".join(["?"] * len(row_ids))
    events_by_task: dict[str, list] = {tid: [] for tid in row_ids}
    for ev_row in conn.execute(
        f"SELECT * FROM task_events WHERE task_id IN ({placeholders}) ORDER BY id",
        tuple(row_ids),
    ).fetchall():
        events_by_task.setdefault(ev_row["task_id"], []).append(ev_row)
    runs_by_task: dict[str, list] = {tid: [] for tid in row_ids}
    for run_row in conn.execute(
        f"SELECT * FROM task_runs WHERE task_id IN ({placeholders}) ORDER BY id",
        tuple(row_ids),
    ).fetchall():
        runs_by_task.setdefault(run_row["task_id"], []).append(run_row)

    out: dict[str, list[dict]] = {}
    for r in rows:
        tid = r["id"]
        diags = kd.compute_task_diagnostics(
            r,
            events_by_task.get(tid, []),
            runs_by_task.get(tid, []),
        )
        if diags:
            out[tid] = [d.to_dict() for d in diags]
    return out
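
The code comment above leaves pagination for later. One possible direction, sketched here as an assumption rather than anything in the codebase, is to chunk the ``IN (...)`` lookups: SQLite caps the number of bound parameters per statement (``SQLITE_MAX_VARIABLE_NUMBER``, which defaulted to 999 before SQLite 3.32.0), so a very large board could exceed it with a single query. The helper name ``fetch_by_task`` and the ``chunk_size`` default are hypothetical, and the connection is assumed to use ``sqlite3.Row`` as its row factory.

```python
import sqlite3

ALLOWED_TABLES = {"task_events", "task_runs"}

def fetch_by_task(conn, table, task_ids, chunk_size=500):
    """Group rows of `table` by task_id, querying in bounded chunks."""
    if table not in ALLOWED_TABLES:
        # The table name is interpolated into SQL, so restrict it.
        raise ValueError(f"unexpected table: {table}")
    grouped = {tid: [] for tid in task_ids}
    for start in range(0, len(task_ids), chunk_size):
        chunk = task_ids[start:start + chunk_size]
        placeholders = ",".join("?" * len(chunk))
        query = (
            f"SELECT * FROM {table} "
            f"WHERE task_id IN ({placeholders}) ORDER BY id"
        )
        # Each task id lands in exactly one chunk (given unique ids),
        # so the per-task ORDER BY id ordering is preserved.
        for row in conn.execute(query, chunk):
            grouped.setdefault(row["task_id"], []).append(row)
    return grouped
```

With a helper like this, ``events_by_task`` and ``runs_by_task`` would each become one call, at the cost of several round trips on large boards.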
feat(kanban): generic diagnostics engine for task distress signals (#20332)
* feat(kanban): generic diagnostics engine for task distress signals
Replaces the hallucination-specific ``warnings`` / ``RecoverySection``
surface (shipped in PR #20232) with a reusable diagnostic-rule engine
that covers five distress kinds in v1 and can be extended without
touching UI code. The "something's wrong with this task" signal is
no longer limited to phantom card ids.
Closes the follow-up from #20232 discussion.
New module
----------
``hermes_cli/kanban_diagnostics.py`` — stateless, no-side-effect rule
engine. Each rule is a pure function of
``(task, events, runs, now, config) -> list[Diagnostic]``. Registry
is a simple list; adding a new distress kind is one function + one
import, no UI or API changes required.
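
As an illustration of the rule shape described above, here is a minimal sketch of a registry of pure rule functions. The ``Diagnostic`` fields, the ``blocked_at`` task key, the ``blocked_seconds`` config key and the ``run_rules`` helper are assumptions made for this example; they are not taken from ``kanban_diagnostics.py``.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Diagnostic:
    # Hypothetical minimal shape; the real class carries more fields.
    kind: str
    severity: str  # "warning" | "error" | "critical"
    title: str
    data: dict[str, Any] = field(default_factory=dict)


# A rule is a pure function of (task, events, runs, now, config).
Rule = Callable[[dict, list, list, float, dict], list]


def stuck_in_blocked(task, events, runs, now, config):
    """Simplified rule: warn when a task has sat in ``blocked`` too long."""
    limit = config.get("blocked_seconds", 24 * 3600)  # assumed config key
    if task.get("status") != "blocked":
        return []
    if now - task.get("blocked_at", now) < limit:  # assumed task field
        return []
    return [Diagnostic("stuck_in_blocked", "warning", "Task has been blocked too long")]


# Adding a new distress kind means appending one function here.
RULES: list[Rule] = [stuck_in_blocked]


def run_rules(task, events, runs, now, config):
    """Run every rule; a rule that raises is skipped so it cannot hide the others."""
    found = []
    for rule in RULES:
        try:
            found.extend(rule(task, events, runs, now, config))
        except Exception:
            continue
    return found
```

Because the rules have no side effects, the same function can be called from the API handler and from the CLI and should produce the same list.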
v1 rule set
-----------
* ``hallucinated_cards`` (error) — folds the existing
``completion_blocked_hallucination`` event into the new surface.
* ``prose_phantom_refs`` (warning) — folds
``suspected_hallucinated_references``.
* ``repeated_spawn_failures`` (error → critical at 2x threshold) —
fires when ``tasks.spawn_failures >= 3``; suggests
``hermes -p <profile> doctor`` / ``auth``.
* ``repeated_crashes`` (error → critical) — fires after N consecutive
``crashed`` run outcomes with no successful completion between;
suggests ``hermes kanban log <id>``.
* ``stuck_in_blocked`` (warning) — fires after 24h in ``blocked``
state with no comments / unblock attempts; suggests commenting.
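
The "error → critical at 2x threshold" escalation can be sketched as a small helper. The function name and signature are hypothetical; only the threshold of 3 and the 2x escalation come from the rule description above.

```python
from typing import Optional


def spawn_failure_severity(spawn_failures: int, threshold: int = 3) -> Optional[str]:
    """Return the severity for a spawn-failure count, or None below threshold."""
    if spawn_failures >= 2 * threshold:
        return "critical"
    if spawn_failures >= threshold:
        return "error"
    return None
```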
Every diagnostic carries structured ``actions`` (reclaim, reassign,
unblock, cli_hint, comment, open_docs) that render consistently in
both CLI and dashboard. Suggested actions are highlighted; generic
recovery actions (reclaim / reassign) are available on every kind as
fallbacks.
Diagnostics auto-clear when the underlying failure resolves — a
clean ``completed``/``edited`` event drops hallucination diagnostics,
a successful run drops crash diagnostics, a comment drops
stuck-blocked diagnostics. Audit events persist; the badge goes away.
API
---
``plugin_api.py``:
* ``/board`` now attaches ``diagnostics`` (full list) and
``warnings`` (compact summary with ``highest_severity``) per task.
* ``/tasks/{id}`` attaches diagnostics so the drawer's Diagnostics
section auto-opens on flagged tasks.
* NEW ``/diagnostics`` endpoint — fleet-wide listing, filterable by
severity, sorted critical-first.
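
A rough sketch of the filter-and-sort step behind a fleet-wide listing follows. It assumes ``SEVERITY_ORDER`` is an ascending list and that the severity filter is an exact match; the ``list_diagnostics`` name and the dict shape are made up for illustration.

```python
# Assumed ascending order, so a higher index means a more severe diagnostic.
SEVERITY_ORDER = ["warning", "error", "critical"]


def list_diagnostics(diags, severity=None):
    """Optionally filter by exact severity, then sort critical-first."""
    if severity is not None:
        diags = [d for d in diags if d["severity"] == severity]
    # sorted() is stable, so diagnostics of equal severity keep their input order.
    return sorted(
        diags,
        key=lambda d: SEVERITY_ORDER.index(d["severity"]),
        reverse=True,
    )
```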
CLI
---
* NEW ``hermes kanban diagnostics [--severity X] [--task id]
[--json]`` — fleet view or single-task view, matches dashboard rule
output so CLI users see the same picture.
* ``hermes kanban show <id>`` now renders a Diagnostics section near
the top with severity markers + suggested actions.
Dashboard
---------
* Card badge is severity-coloured (⚠ amber warning, !! orange error,
!!! red critical) using ``warnings.highest_severity``.
* Attention strip above the toolbar counts EVERY task with active
diagnostics (not just hallucinations), severity-coloured, lists
affected tasks with Open buttons when expanded.
* Drawer's old ``RecoverySection`` replaced with generic
``DiagnosticsSection`` rendering a card per active diagnostic:
title + detail + structured data (task-id chips when payload keys
look like id lists) + action buttons. Reassign profile picker is
inline per-diagnostic. Clipboard fallback uses ``.catch()`` for
environments where writeText rejects.
* Three-rung severity palette; amber for warning, orange for error,
red for critical. Uses CSS variables so theming is straightforward.
Tests
-----
* NEW ``tests/hermes_cli/test_kanban_diagnostics.py`` — 14 unit tests
covering each rule's positive/negative/threshold paths, severity
sorting, broken-rule isolation, and sqlite3.Row integration.
* Dashboard plugin tests extended: ``/diagnostics`` endpoint (empty,
populated, severity-filtered), ``/board`` exposes both diagnostic
list and compact summary with ``highest_severity``.
* Existing hallucination-specific test (``test_board_surfaces_
warnings_field_for_hallucinated_completions``) updated to reflect
the new contract: warning summary keys by diagnostic kind
(``hallucinated_cards``) not event kind.
379 kanban-suite tests pass (+16 net from this PR).
Live verification
-----------------
Seeded all 5 diagnostic kinds + one clean + one plain-running task
(7 total) into an isolated HERMES_HOME, spun up the dashboard, and
verified:
* Attention strip: shows ``!! 5 tasks need attention`` in the
error-severity orange; Show expands to a list of 5 rows ordered
critical > error > warning.
* Card badges: error tasks render ``!!`` orange, warning tasks
render ``⚠`` amber, clean and plain-running tasks render no badge.
* Each of the 5 rules opens a correctly-coloured, correctly-styled
diagnostic card in the drawer with its specific suggested action.
* Live reassign from a diagnostic card flipped
``broken-ml-worker → alice`` and the drawer refreshed with the
new assignee + the same diagnostic still firing (correct:
spawn_failures counter hasn't reset yet).
* CLI ``hermes kanban diagnostics`` prints all 5 in severity order;
``--severity error`` narrows to 3; ``kanban show <id>`` includes
the Diagnostics block at the top with suggested action hint.
Migration note
--------------
The old ``warnings`` shape (``{count, kinds, latest_at}``) is
preserved on the API but ``kinds`` now keys by diagnostic kind
(``hallucinated_cards``) instead of event kind
(``completion_blocked_hallucination``). ``highest_severity`` is a
new required field. The dashboard was the only consumer and has
been updated in the same commit; external API consumers of the
``warnings`` field will need to update their kind-match logic.
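
For an external consumer, one possible way to tolerate both the old and the new ``kinds`` keys during the transition is sketched below. The mapping reflects the two folded event kinds named in this commit; the ``has_kind`` helper itself is hypothetical.

```python
# Event kind (old key) -> diagnostic kind (new key), per the v1 rule set.
EVENT_TO_DIAGNOSTIC_KIND = {
    "completion_blocked_hallucination": "hallucinated_cards",
    "suspected_hallucinated_references": "prose_phantom_refs",
}


def has_kind(warnings, event_kind):
    """True if ``warnings['kinds']`` mentions the event kind under either name."""
    kinds = (warnings or {}).get("kinds", {})
    return event_kind in kinds or EVENT_TO_DIAGNOSTIC_KIND.get(event_kind) in kinds
```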
* feat(kanban/diagnostics): lead titles with the actual error text
The generic 'Worker crashed N runs in a row' / 'Worker failed to spawn
N times' titles buried the actual cause in the data section. Operators
had to open logs or expand the diagnostic to see WHY the worker is
stuck — rate-limit vs insufficient quota vs bad auth vs context
overflow vs network blip all looked identical at a glance.
New titles:
Agent crashed 3x: openai: 429 Too Many Requests - rate limit reached
Agent crashed 3x: anthropic: 402 insufficient_quota - credit balance
Agent crashed 3x: provider auth error: 401 Unauthorized
Agent spawn failed 4x: insufficient_quota: You exceeded your current
Detail keeps the full error snippet (capped at 500 chars + ellipsis
for tracebacks). Title takes the first line capped at 160 chars.
Fallback title if no error recorded stays honest ('no error recorded').
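
The title and detail capping described above could look roughly like this. Both function names, the ``prefix`` argument and the three-dot ellipsis are assumptions for the sketch; only the 160 and 500 character caps and the 'no error recorded' fallback come from this commit.

```python
from typing import Optional


def diagnostic_title(prefix: str, error: Optional[str], max_len: int = 160) -> str:
    """Lead the title with the first line of the error, capped at ``max_len``."""
    if not error or not error.strip():
        return f"{prefix}: no error recorded"
    first_line = error.strip().splitlines()[0]
    return f"{prefix}: {first_line[:max_len]}"


def diagnostic_detail(error: str, max_len: int = 500) -> str:
    """Keep the full snippet unless it exceeds ``max_len``, then add an ellipsis."""
    if len(error) <= max_len:
        return error
    return error[:max_len] + "..."
```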
Tests: 4 new cases covering 429/billing/spawn/truncation. 383 total
pass (+4).
Live-verified on dashboard with 6 seeded scenarios
(rate-limit, billing, auth, context, network, spawn-billing) —
each card title leads with the actionable error text.
2026-05-05 13:32:42 -07:00
def _warnings_summary_from_diagnostics(
    diagnostics: list[dict],
) -> Optional[dict]:
    """Compact summary for cards: {count, highest_severity, kinds,
    latest_at}. Replaces the old hallucination-only ``warnings`` object;
    same shape, plus ``highest_severity`` so the UI can color
    badges per diagnostic severity.

    Returns None when ``diagnostics`` is empty.
    """
    if not diagnostics:
        return None
    from hermes_cli.kanban_diagnostics import SEVERITY_ORDER

    kinds: dict[str, int] = {}
    latest = 0
    highest_idx = -1
    highest_sev: Optional[str] = None
    count = 0
    for d in diagnostics:
        kinds[d["kind"]] = kinds.get(d["kind"], 0) + d.get("count", 1)
        count += d.get("count", 1)
        la = d.get("last_seen_at") or 0
        if la > latest:
            latest = la
        sev = d.get("severity")
        if sev in SEVERITY_ORDER:
            idx = SEVERITY_ORDER.index(sev)
            if idx > highest_idx:
                highest_idx = idx
                highest_sev = sev
    return {
        "count": count,
        "kinds": kinds,
        "latest_at": latest,
        "highest_severity": highest_sev,
    }

feat(kanban): durable multi-profile collaboration board (#17805)
Salvage of PR #16100 onto current main (after emozilla's #17514 fix
that unblocks plugin Pydantic body validation). History preserved on
the standing `feat/kanban-standing` branch; this squashes the 22
iterative commits into one clean landing.
What this lands:
- SQLite kernel (hermes_cli/kanban_db.py) — durable task board with
tasks, task_links, task_runs, task_comments, task_events,
kanban_notify_subs tables. WAL mode, atomic claim via CAS,
tenant-namespaced, skills JSON array per task, max-runtime timeouts,
worker heartbeats, idempotency keys, circuit breaker on repeated
spawn failures, crash detection via /proc/<pid>/status, run history
preserved across attempts.
- Dispatcher — runs inside the gateway by default
(`kanban.dispatch_in_gateway: true`). Ticks every 60s, reclaims
stale claims, promotes ready tasks, spawns `hermes -p <assignee>
chat -q "work kanban task <id>"` with HERMES_KANBAN_TASK +
HERMES_KANBAN_WORKSPACE env. Auto-loads `--skills kanban-worker`
plus any per-task skills. Health telemetry warns on stuck ready
queue.
- Structured tool surface (tools/kanban_tools.py) — 7 tools
(kanban_show, kanban_complete, kanban_block, kanban_heartbeat,
kanban_comment, kanban_create, kanban_link). Gated on
HERMES_KANBAN_TASK via check_fn so zero schema footprint in normal
sessions.
- System-prompt guidance (agent/prompt_builder.py KANBAN_GUIDANCE)
injected only when kanban tools are active.
- Dashboard plugin (plugins/kanban/dashboard/) — Linear-style board
UI: triage/todo/ready/running/blocked/done columns, drag-drop,
inline create, task drawer with markdown, comments, run history,
dependency editor, bulk ops, lanes-by-profile grouping, WS-driven
live refresh. Matches active dashboard theme via CSS variables.
- CLI: `hermes kanban init|create|list|show|assign|link|unlink|
claim|comment|complete|block|unblock|archive|tail|dispatch|context|
gc|watch|stats|notify|log|heartbeat|runs|assignees` +
`/kanban` slash in-session.
- Worker + orchestrator skills (skills/devops/kanban-worker +
kanban-orchestrator) — pattern library for good summary/metadata
shapes, retry diagnostics, block-reason examples, fan-out patterns.
- Per-task force-loaded skills — `--skill <name>` (repeatable),
stored as JSON, threaded through to dispatcher argv as one
`--skills X` pair per skill alongside the built-in kanban-worker.
Dashboard + CLI + tool parity.
- Deprecation of standalone `hermes kanban daemon` — stub exits 2
with migration guidance; `--force` escape hatch for headless hosts.
- Docs (website/docs/user-guide/features/kanban.md + kanban-tutorial.md)
with 11 dashboard screenshots walking through four user stories
(Solo Dev, Fleet Farming, Role Pipeline, Circuit Breaker).
- Tests (251 passing): kernel schema + migration + CAS atomicity,
dispatcher logic, circuit breaker, crash detection, max-runtime
timeouts, claim lifecycle, tenant isolation, idempotency keys, per-
task skills round-trip + validation + dispatcher argv, tool surface
(7 tools × round-trip + error paths), dashboard REST (CRUD + bulk
+ links + warnings), gateway-embedded dispatcher (config gate, env
override, graceful shutdown), CLI deprecation stub, migration from
legacy schemas.
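
To illustrate what "atomic claim via CAS" means for a SQLite-backed queue, here is a minimal sketch. The single-table schema and the ``claimed_by`` column are simplifications invented for the example; per the forward-compat notes, the real claim machinery lives on ``task_runs``.

```python
import sqlite3


def try_claim(conn: sqlite3.Connection, task_id: str, worker: str) -> bool:
    """Compare-and-swap: the UPDATE only matches while the row is still 'ready'."""
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "UPDATE tasks SET status = 'running', claimed_by = ? "
            "WHERE id = ? AND status = 'ready'",
            (worker, task_id),
        )
    # Exactly one row changed means this caller won the claim.
    return cur.rowcount == 1
```

Two workers racing for the same task both issue the UPDATE, but only one of them sees ``rowcount == 1``; the loser simply moves on.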
Gateway integration:
- GatewayRunner._kanban_dispatcher_watcher — new asyncio background
task, symmetric with _kanban_notifier_watcher. Runs dispatch_once
via asyncio.to_thread so SQLite WAL never blocks the loop. Sleeps
in 1s slices for snappy shutdown. Respects HERMES_KANBAN_DISPATCH_IN_GATEWAY=0
env override for debugging.
- Config: new `kanban` section in DEFAULT_CONFIG with
`dispatch_in_gateway: true` (default) + `dispatch_interval_seconds: 60`.
Additive: no _config_version bump needed.
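
The watcher pattern described above (blocking dispatch pushed to a thread, sleep in short slices so shutdown is prompt) might be sketched as follows. The function name, the ``stop`` event and the parameters are assumptions; this is not the actual ``GatewayRunner`` code.

```python
import asyncio


async def dispatcher_watcher(dispatch_once, stop, interval=60.0, slice_s=1.0):
    """Call ``dispatch_once`` every ``interval`` seconds until ``stop`` is set."""
    while not stop.is_set():
        # Run the blocking SQLite work off the event loop.
        await asyncio.to_thread(dispatch_once)
        waited = 0.0
        # Sleep in small slices so a shutdown request is noticed quickly.
        while waited < interval and not stop.is_set():
            await asyncio.sleep(min(slice_s, interval - waited))
            waited += slice_s
```

With a 60s interval and 1s slices, setting ``stop`` ends the loop within about one second instead of waiting out the full interval.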
Forward-compat:
- workflow_template_id / current_step_key columns on tasks (v1 writes
NULL; v2 will use them for routing).
- task_runs holds claim machinery (claim_lock, claim_expires,
worker_pid, last_heartbeat_at) so multi-attempt history is first-
class from day one.
Closes #16102.
Co-authored-by: emozilla <emozilla@nousresearch.com>
2026-04-30 13:36:47 -07:00
def _links_for(conn: sqlite3.Connection, task_id: str) -> dict[str, list[str]]:
    """Return {'parents': [...], 'children': [...]} for a task."""
    parents = [
        r["parent_id"]
        for r in conn.execute(
            "SELECT parent_id FROM task_links WHERE child_id = ? ORDER BY parent_id",
            (task_id,),
        )
    ]
    children = [
        r["child_id"]
        for r in conn.execute(
            "SELECT child_id FROM task_links WHERE parent_id = ? ORDER BY child_id",
            (task_id,),
        )
    ]
    return {"parents": parents, "children": children}


# ---------------------------------------------------------------------------
# GET /board
# ---------------------------------------------------------------------------

@router.get("/board")
def get_board(
    tenant: Optional[str] = Query(None, description="Filter to a single tenant"),
    include_archived: bool = Query(False),
feat(kanban): multi-project boards — one install, many kanbans (#19653)
Adds first-class board support to kanban so users can separate unrelated
streams of work (projects, repos, domains) into isolated queues. Single-
project users stay on the 'default' board and see no UI change.
Isolation model
---------------
- Each board is a directory at `~/.hermes/kanban/boards/<slug>/` with
its own `kanban.db`, `workspaces/`, and `logs/`. The 'default' board
keeps its legacy path (`~/.hermes/kanban.db`) for back-compat — fresh
installs and pre-boards users get zero migration.
- Workers spawned by the dispatcher have `HERMES_KANBAN_BOARD` pinned in
their env alongside the existing `HERMES_KANBAN_DB` /
`HERMES_KANBAN_WORKSPACES_ROOT` pins, so workers physically cannot see
other boards' tasks.
- The gateway's single dispatcher loop now sweeps every board per tick;
per-tick cost is a few extra filesystem stats.
- CAS concurrency guarantees are preserved per-board (each board is its
own SQLite DB, same WAL+IMMEDIATE machinery as before).
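
The path layout above can be summarised in a short sketch. ``board_db_path`` and its ``hermes_home`` argument are hypothetical names; the env-var override mentioned in the tests is deliberately left out.

```python
from pathlib import Path


def board_db_path(slug: str, hermes_home: Path) -> Path:
    """Map a board slug to its database file."""
    if slug == "default":
        # The default board keeps the legacy location for back-compat.
        return hermes_home / "kanban.db"
    return hermes_home / "kanban" / "boards" / slug / "kanban.db"
```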
CLI
---
hermes kanban boards list|create|switch|show|rename|rm
hermes kanban --board <slug> <any-subcommand>
Board resolution order: `--board` flag → `HERMES_KANBAN_BOARD` env →
`~/.hermes/kanban/current` file → `default`. Slug validation is strict:
lowercase alphanumerics + hyphens + underscores, 1-64 chars, starts with
alphanumeric. Uppercase is auto-downcased; slashes / dots / `..` /
control chars are rejected so boards can't name their way out of the
boards/ directory.
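
A minimal sketch of the slug rule and the resolution chain, assuming a regular expression is an acceptable way to express the validation. The helper names and the way the three sources are passed in are invented for this example.

```python
import re
from typing import Optional

# Starts with an alphanumeric, then up to 63 more of [a-z0-9_-]: 1-64 chars total.
_SLUG_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,63}")


def normalize_slug(raw: str) -> str:
    """Downcase the slug, then reject anything outside the allowed alphabet."""
    slug = raw.lower()
    if not _SLUG_RE.fullmatch(slug):
        raise ValueError(f"invalid board slug: {raw!r}")
    return slug


def resolve_board(
    flag: Optional[str], env: Optional[str], current_file: Optional[str]
) -> str:
    """Resolution order: --board flag, then env, then current file, then default."""
    for candidate in (flag, env, current_file):
        if candidate:
            return normalize_slug(candidate)
    return "default"
```

Since slashes and dots never match the pattern, a slug such as ``..`` or ``a/b`` is rejected before it can be joined onto the boards directory.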
Passive discoverability: when more than one board exists, `hermes kanban
list` prints a one-line header ("Board: foo (2 other boards …)") so
users who stumble across multi-project never have to hunt for the
feature. Invisible for single-board installs.
Dashboard
---------
- New `BoardSwitcher` component at the top of the Kanban tab: dropdown
with all boards + task counts, `+ New board` button, `Archive`
button (non-default only). Hidden entirely when only `default` exists
and is empty — single-project users never see it.
- New `NewBoardDialog` modal: slug / display name / description / icon
+ "switch to this board after creating" checkbox.
- Selected board persists to `localStorage` so browser users don't
shift the CLI's active board out from under a terminal they left open.
- New `?board=<slug>` query param on every existing endpoint plus a
new `/boards` CRUD surface (`GET /boards`, `POST /boards`,
`PATCH /boards/<slug>`, `DELETE /boards/<slug>`,
`POST /boards/<slug>/switch`).
- Events WebSocket is pinned to a board at connection time; switching
opens a fresh WS against the new board.
Also fixes a pre-existing bug in the plugin's tenant / assignee
filters: the SDK's `Select` uses `onValueChange(value)`, not
native `onChange(event)`, so those filters silently didn't work.
New `selectChangeHandler` helper wires both signatures.
Tests
-----
49 new tests in `tests/hermes_cli/test_kanban_boards.py` covering:
slug validation (valid / invalid / auto-downcase), path resolution
(default = legacy path, named = `boards/<slug>/`, env var override),
current-board resolution chain (env > file > default), board CRUD +
archive / hard-delete, per-board connection isolation (tasks don't
leak), worker spawn env injection (`HERMES_KANBAN_BOARD`,
`HERMES_KANBAN_DB`, `HERMES_KANBAN_WORKSPACES_ROOT` all point at the
right board), and end-to-end CLI surface.
Regression surface: all 264 pre-existing kanban tests continue to pass.
Live-tested via the dashboard: created 3 boards (default,
hermes-agent, atm10-server), created tasks on each via both CLI
(`--board <slug> create`) and dashboard (inline create on the Ready
column), confirmed zero cross-board leakage, confirmed `BoardSwitcher`
+ `NewBoardDialog` work end-to-end in the browser.
2026-05-04 04:42:38 -07:00
    board: Optional[str] = Query(None, description="Kanban board slug (omit for current)"),
):
    """Return the full board grouped by status column.

    ``_conn()`` auto-initializes ``kanban.db`` on first call so a fresh
    install doesn't surface a "failed to load" error on the plugin tab.

    ``board`` selects which board to read from. Omitting it falls
    through to the active board (``HERMES_KANBAN_BOARD`` env → on-disk
    ``current`` pointer → ``default``).
    """
    board = _resolve_board(board)
    conn = _conn(board=board)
feat(kanban): durable multi-profile collaboration board (#17805)
Salvage of PR #16100 onto current main (after emozilla's #17514 fix
that unblocks plugin Pydantic body validation). History preserved on
the standing `feat/kanban-standing` branch; this squashes the 22
iterative commits into one clean landing.
What this lands:
- SQLite kernel (hermes_cli/kanban_db.py) — durable task board with
tasks, task_links, task_runs, task_comments, task_events,
kanban_notify_subs tables. WAL mode, atomic claim via CAS,
tenant-namespaced, skills JSON array per task, max-runtime timeouts,
worker heartbeats, idempotency keys, circuit breaker on repeated
spawn failures, crash detection via /proc/<pid>/status, run history
preserved across attempts.
- Dispatcher — runs inside the gateway by default
(`kanban.dispatch_in_gateway: true`). Ticks every 60s, reclaims
stale claims, promotes ready tasks, spawns `hermes -p <assignee>
chat -q "work kanban task <id>"` with HERMES_KANBAN_TASK +
HERMES_KANBAN_WORKSPACE env. Auto-loads `--skills kanban-worker`
plus any per-task skills. Health telemetry warns on stuck ready
queue.
- Structured tool surface (tools/kanban_tools.py) — 7 tools
(kanban_show, kanban_complete, kanban_block, kanban_heartbeat,
kanban_comment, kanban_create, kanban_link). Gated on
HERMES_KANBAN_TASK via check_fn so zero schema footprint in normal
sessions.
- System-prompt guidance (agent/prompt_builder.py KANBAN_GUIDANCE)
injected only when kanban tools are active.
- Dashboard plugin (plugins/kanban/dashboard/) — Linear-style board
UI: triage/todo/ready/running/blocked/done columns, drag-drop,
inline create, task drawer with markdown, comments, run history,
dependency editor, bulk ops, lanes-by-profile grouping, WS-driven
live refresh. Matches active dashboard theme via CSS variables.
- CLI — `hermes kanban init|create|list|show|assign|link|unlink|
claim|comment|complete|block|unblock|archive|tail|dispatch|context|
init|gc|watch|stats|notify|log|heartbeat|runs|assignees` +
`/kanban` slash in-session.
- Worker + orchestrator skills (skills/devops/kanban-worker +
kanban-orchestrator) — pattern library for good summary/metadata
shapes, retry diagnostics, block-reason examples, fan-out patterns.
- Per-task force-loaded skills — `--skill <name>` (repeatable),
stored as JSON, threaded through to dispatcher argv as one
`--skills X` pair per skill alongside the built-in kanban-worker.
Dashboard + CLI + tool parity.
- Deprecation of standalone `hermes kanban daemon` — stub exits 2
with migration guidance; `--force` escape hatch for headless hosts.
- Docs (website/docs/user-guide/features/kanban.md + kanban-tutorial.md)
with 11 dashboard screenshots walking through four user stories
(Solo Dev, Fleet Farming, Role Pipeline, Circuit Breaker).
- Tests (251 passing): kernel schema + migration + CAS atomicity,
dispatcher logic, circuit breaker, crash detection, max-runtime
timeouts, claim lifecycle, tenant isolation, idempotency keys, per-
task skills round-trip + validation + dispatcher argv, tool surface
(7 tools × round-trip + error paths), dashboard REST (CRUD + bulk
+ links + warnings), gateway-embedded dispatcher (config gate, env
override, graceful shutdown), CLI deprecation stub, migration from
legacy schemas.
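The "atomic claim via CAS" item in the kernel bullet above can be illustrated with a conditional `UPDATE`. This is only a sketch: the single-table schema, the `claimed_by` column and the `try_claim` name are hypothetical simplifications, and the real kernel keeps its claim state on `task_runs`.

```python
import sqlite3


def try_claim(conn: sqlite3.Connection, task_id: str, worker: str) -> bool:
    """Claim a task only if it is still 'ready'; returns True for the winner."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE tasks SET status = 'running', claimed_by = ? "
            "WHERE id = ? AND status = 'ready'",
            (worker, task_id),
        )
    # rowcount is 1 only for the caller whose UPDATE matched the 'ready' row.
    return cur.rowcount == 1


# Hypothetical, simplified schema for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT, claimed_by TEXT)")
conn.execute("INSERT INTO tasks VALUES ('t_1', 'ready', NULL)")
conn.commit()

first = try_claim(conn, "t_1", "alice")
second = try_claim(conn, "t_1", "bob")
```

The compare-and-swap lives in the `WHERE status = 'ready'` clause: the second caller's statement matches zero rows, so it learns it lost without any separate read.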
Gateway integration:
- GatewayRunner._kanban_dispatcher_watcher — new asyncio background
task, symmetric with _kanban_notifier_watcher. Runs dispatch_once
via asyncio.to_thread so SQLite WAL never blocks the loop. Sleeps
in 1s slices for snappy shutdown. Respects HERMES_KANBAN_DISPATCH_IN_GATEWAY=0
env override for debugging.
- Config: new `kanban` section in DEFAULT_CONFIG with
`dispatch_in_gateway: true` (default) + `dispatch_interval_seconds: 60`.
Additive; no _config_version bump needed.
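A rough sketch of the watcher loop described in the gateway integration notes, assuming a blocking `dispatch_once` callable and an `asyncio.Event` used as the shutdown signal. The function name and parameters here are illustrative, not the actual `GatewayRunner` method.

```python
import asyncio
from typing import Callable


async def dispatcher_watcher(
    dispatch_once: Callable[[], None],
    stop: asyncio.Event,
    interval_seconds: float = 60.0,
    slice_seconds: float = 1.0,
) -> None:
    """Run dispatch_once off the event loop, then sleep in short slices."""
    while not stop.is_set():
        # Blocking SQLite work runs in a worker thread, not on the loop.
        await asyncio.to_thread(dispatch_once)
        remaining = interval_seconds
        # Sleeping in slices lets shutdown take effect within one slice
        # instead of waiting out the whole interval.
        while remaining > 0 and not stop.is_set():
            step = min(slice_seconds, remaining)
            await asyncio.sleep(step)
            remaining -= step
```

With the defaults, a shutdown request is noticed within about one second even though ticks are 60 seconds apart.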
Forward-compat:
- workflow_template_id / current_step_key columns on tasks (v1 writes
NULL; v2 will use them for routing).
- task_runs holds claim machinery (claim_lock, claim_expires,
worker_pid, last_heartbeat_at) so multi-attempt history is first-
class from day one.
Closes #16102.
Co-authored-by: emozilla <emozilla@nousresearch.com>
2026-04-30 13:36:47 -07:00
    try:
        tasks = kanban_db.list_tasks(
            conn, tenant=tenant, include_archived=include_archived
        )
        # Pre-fetch link counts per task (cheap: one query).
        link_counts: dict[str, dict[str, int]] = {}
        for row in conn.execute(
            "SELECT parent_id, child_id FROM task_links"
        ).fetchall():
            link_counts.setdefault(row["parent_id"], {"parents": 0, "children": 0})[
                "children"
            ] += 1
            link_counts.setdefault(row["child_id"], {"parents": 0, "children": 0})[
                "parents"
            ] += 1

        # Comment + event counts (both cheap aggregates).
        comment_counts: dict[str, int] = {
            r["task_id"]: r["n"]
            for r in conn.execute(
                "SELECT task_id, COUNT(*) AS n FROM task_comments GROUP BY task_id"
            )
        }

        # Progress rollup: for each parent, how many children are done / total.
        # One pass over task_links joined with child status is cheaper than
        # N per-task queries and the plugin uses it to render "N/M".
        progress: dict[str, dict[str, int]] = {}
        for row in conn.execute(
            "SELECT l.parent_id AS pid, t.status AS cstatus "
            "FROM task_links l JOIN tasks t ON t.id = l.child_id"
        ).fetchall():
            p = progress.setdefault(row["pid"], {"done": 0, "total": 0})
            p["total"] += 1
            if row["cstatus"] == "done":
                p["done"] += 1

feat(kanban): generic diagnostics engine for task distress signals (#20332)
* feat(kanban): generic diagnostics engine for task distress signals
Replaces the hallucination-specific ``warnings`` / ``RecoverySection``
surface (shipped in PR #20232) with a reusable diagnostic-rule engine
that covers five distress kinds in v1 and can be extended without
touching UI code. The "something's wrong with this task" signal is
no longer limited to phantom card ids.
Closes the follow-up from #20232 discussion.
New module
----------
``hermes_cli/kanban_diagnostics.py`` — stateless, no-side-effect rule
engine. Each rule is a pure function of
``(task, events, runs, now, config) -> list[Diagnostic]``. Registry
is a simple list; adding a new distress kind is one function + one
import, no UI or API changes required.
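The rule-engine shape described above could look roughly like the following. The `Diagnostic` fields, the `blocked_at` and `blocked_seconds` keys, and both example rules are hypothetical; only the `(task, events, runs, now, config)` signature, the list registry and the broken-rule isolation come from the description.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Diagnostic:
    kind: str
    severity: str  # "warning", "error", or "critical"
    title: str


Rule = Callable[[dict, list, list, float, dict], list]


def stuck_in_blocked(task, events, runs, now, config):
    """Simplified rule: warn once a task has sat in 'blocked' too long."""
    limit = config.get("blocked_seconds", 24 * 3600)
    if task.get("status") == "blocked" and now - task.get("blocked_at", now) >= limit:
        return [Diagnostic("stuck_in_blocked", "warning", "Blocked for over 24h")]
    return []


def always_broken(task, events, runs, now, config):
    """Deliberately buggy rule, present only to show the isolation."""
    raise RuntimeError("rule bug")


# Registry is a plain list: adding a rule means appending one function.
RULES: list = [stuck_in_blocked, always_broken]
_SEVERITY_ORDER = {"critical": 0, "error": 1, "warning": 2}


def run_rules(task, events, runs, now, config, rules=RULES):
    """Evaluate every rule; a rule that raises is skipped, not fatal."""
    found = []
    for rule in rules:
        try:
            found.extend(rule(task, events, runs, now, config))
        except Exception:
            continue  # broken-rule isolation
    return sorted(found, key=lambda d: _SEVERITY_ORDER.get(d.severity, 99))
```

Keeping rules pure (no database writes, `now` passed in) is what makes each one testable with plain dictionaries.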
v1 rule set
-----------
* ``hallucinated_cards`` (error) — folds the existing
``completion_blocked_hallucination`` event into the new surface.
* ``prose_phantom_refs`` (warning) — folds
``suspected_hallucinated_references``.
* ``repeated_spawn_failures`` (error → critical at 2x threshold) —
fires when ``tasks.spawn_failures >= 3``; suggests
``hermes -p <profile> doctor`` / ``auth``.
* ``repeated_crashes`` (error → critical) — fires after N consecutive
``crashed`` run outcomes with no successful completion between;
suggests ``hermes kanban log <id>``.
* ``stuck_in_blocked`` (warning) — fires after 24h in ``blocked``
state with no comments / unblock attempts; suggests commenting.
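As an illustration of the "error → critical at 2x threshold" escalation in the rule list, here is a small sketch. The function name is hypothetical; the default threshold of 3 is taken from the `repeated_spawn_failures` entry.

```python
from typing import Optional


def spawn_failure_severity(spawn_failures: int, threshold: int = 3) -> Optional[str]:
    """Return None below the threshold, 'error' at it, 'critical' at 2x."""
    if spawn_failures < threshold:
        return None
    if spawn_failures >= 2 * threshold:
        return "critical"
    return "error"
```

With the default threshold, three to five failures report as `error` and six or more as `critical`.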
Every diagnostic carries structured ``actions`` (reclaim, reassign,
unblock, cli_hint, comment, open_docs) that render consistently in
both CLI and dashboard. Suggested actions are highlighted; generic
recovery actions (reclaim / reassign) are available on every kind as
fallbacks.
Diagnostics auto-clear when the underlying failure resolves — a
clean ``completed``/``edited`` event drops hallucination diagnostics,
a successful run drops crash diagnostics, a comment drops
stuck-blocked diagnostics. Audit events persist; the badge goes away.
API
---
``plugin_api.py``:
* ``/board`` now attaches ``diagnostics`` (full list) and
``warnings`` (compact summary with ``highest_severity``) per task.
* ``/tasks/{id}`` attaches diagnostics so the drawer's Diagnostics
section auto-opens on flagged tasks.
* NEW ``/diagnostics`` endpoint — fleet-wide listing, filterable by
severity, sorted critical-first.
CLI
---
* NEW ``hermes kanban diagnostics [--severity X] [--task id]
[--json]`` — fleet view or single-task view, matches dashboard rule
output so CLI users see the same picture.
* ``hermes kanban show <id>`` now renders a Diagnostics section near
the top with severity markers + suggested actions.
Dashboard
---------
* Card badge is severity-coloured (⚠ amber warning, !! orange error,
!!! red critical) using ``warnings.highest_severity``.
* Attention strip above the toolbar counts EVERY task with active
diagnostics (not just hallucinations), severity-coloured, lists
affected tasks with Open buttons when expanded.
* Drawer's old ``RecoverySection`` replaced with generic
``DiagnosticsSection`` rendering a card per active diagnostic:
title + detail + structured data (task-id chips when payload keys
look like id lists) + action buttons. Reassign profile picker is
inline per-diagnostic. Clipboard fallback uses ``.catch()`` for
environments where writeText rejects.
* Three-rung severity palette; amber for warning, orange for error,
red for critical. Uses CSS variables so theming is straightforward.
Tests
-----
* NEW ``tests/hermes_cli/test_kanban_diagnostics.py`` — 14 unit tests
covering each rule's positive/negative/threshold paths, severity
sorting, broken-rule isolation, and sqlite3.Row integration.
* Dashboard plugin tests extended: ``/diagnostics`` endpoint (empty,
populated, severity-filtered), ``/board`` exposes both diagnostic
list and compact summary with ``highest_severity``.
* Existing hallucination-specific test (``test_board_surfaces_
warnings_field_for_hallucinated_completions``) updated to reflect
the new contract: warning summary keys by diagnostic kind
(``hallucinated_cards``) not event kind.
379 kanban-suite tests pass (+16 net from this PR).
Live verification
-----------------
Seeded all 5 diagnostic kinds + one clean + one plain-running task
(7 total) into an isolated HERMES_HOME, spun up the dashboard, and
verified:
* Attention strip: shows ``!! 5 tasks need attention`` in the
error-severity orange; Show expands to a list of 5 rows ordered
critical > error > warning.
* Card badges: error tasks render ``!!`` orange, warning tasks
render ``⚠`` amber, clean and plain-running tasks render no badge.
* Each of the 5 rules opens a correctly-coloured, correctly-styled
diagnostic card in the drawer with its specific suggested action.
* Live reassign from a diagnostic card flipped
``broken-ml-worker → alice`` and the drawer refreshed with the
new assignee + the same diagnostic still firing (correct:
spawn_failures counter hasn't reset yet).
* CLI ``hermes kanban diagnostics`` prints all 5 in severity order;
``--severity error`` narrows to 3; ``kanban show <id>`` includes
the Diagnostics block at the top with suggested action hint.
Migration note
--------------
The old ``warnings`` shape (``{count, kinds, latest_at}``) is
preserved on the API but ``kinds`` now keys by diagnostic kind
(``hallucinated_cards``) instead of event kind
(``completion_blocked_hallucination``). ``highest_severity`` is a
new required field. The dashboard was the only consumer and has
been updated in the same commit; external API consumers of the
``warnings`` field will need to update their kind-match logic.
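For external consumers, the compact summary might be derived along these lines. This is a sketch under assumptions: that `kinds` maps each diagnostic kind to a count, and that each diagnostic carries `kind`, `severity` and a timestamp field called `at` here. None of these shapes is confirmed beyond the field names listed in the note.

```python
from collections import Counter
from typing import Optional

_RANK = {"warning": 1, "error": 2, "critical": 3}


def summarize_warnings(diagnostics: list) -> Optional[dict]:
    """Collapse a task's diagnostics into the compact card-badge summary."""
    if not diagnostics:
        return None
    kinds = Counter(d["kind"] for d in diagnostics)
    return {
        "count": len(diagnostics),
        "kinds": dict(kinds),  # keyed by diagnostic kind, not event kind
        "latest_at": max(d["at"] for d in diagnostics),
        "highest_severity": max(
            (d["severity"] for d in diagnostics), key=_RANK.__getitem__
        ),
    }
```

A consumer that previously matched on `completion_blocked_hallucination` would look for `hallucinated_cards` in `kinds` instead.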
* feat(kanban/diagnostics): lead titles with the actual error text
The generic 'Worker crashed N runs in a row' / 'Worker failed to spawn
N times' titles buried the actual cause in the data section. Operators
had to open logs or expand the diagnostic to see WHY the worker is
stuck — rate-limit vs insufficient quota vs bad auth vs context
overflow vs network blip all looked identical at a glance.
New titles:
Agent crashed 3x: openai: 429 Too Many Requests - rate limit reached
Agent crashed 3x: anthropic: 402 insufficient_quota - credit balance
Agent crashed 3x: provider auth error: 401 Unauthorized
Agent spawn failed 4x: insufficient_quota: You exceeded your current
Detail keeps the full error snippet (capped at 500 chars + ellipsis
for tracebacks). Title takes the first line capped at 160 chars.
Fallback title if no error recorded stays honest ('no error recorded').
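The title and detail rules above can be sketched as two small helpers. The helper names and the exact wording of the fallback title are assumptions; the 160 and 500 character caps are the ones stated in the text, applied here to the first line and to the full snippet respectively.

```python
from typing import Optional


def error_title(prefix: str, error: Optional[str], limit: int = 160) -> str:
    """Lead the title with the first line of the error, capped at `limit`."""
    lines = error.strip().splitlines() if error else []
    first_line = lines[0] if lines else ""
    if not first_line:
        return f"{prefix}: no error recorded"
    return f"{prefix}: {first_line[:limit]}"


def error_detail(error: str, limit: int = 500) -> str:
    """Keep the full snippet, truncated with an ellipsis past `limit` chars."""
    if len(error) <= limit:
        return error
    return error[:limit] + "…"
```

For example, `error_title("Agent crashed 3x", "openai: 429 Too Many Requests\nTraceback ...")` keeps only the first line after the prefix, so a long traceback never reaches the card title.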
Tests: 4 new cases covering 429/billing/spawn/truncation. 383 total
pass (+4).
Live-verified on dashboard with 6 seeded scenarios
(rate-limit, billing, auth, context, network, spawn-billing) —
each card title leads with the actionable error text.
2026-05-05 13:32:42 -07:00
        # Diagnostics rollup for this board; see kanban_diagnostics.
        # We get the full structured list per task AND a compact
        # summary for the card badge (so cards don't carry the detail
        # text; the drawer fetches that via /tasks/:id or /diagnostics).
        diagnostics_per_task = _compute_task_diagnostics(conn, task_ids=None)
feat(kanban): hallucination gate + recovery UX for worker-created-card claims (#20232)
Workers completing a kanban task can now claim the ids of cards they
created via an optional ``created_cards`` field on ``kanban_complete``.
The kernel verifies each id exists and was created by the completing
worker's profile; any phantom id blocks the completion with a
``HallucinatedCardsError`` and records a
``completion_blocked_hallucination`` event on the task so the rejected
attempt is auditable. Successful completions also get a non-blocking
prose-scan pass over their ``summary`` + ``result`` that emits a
``suspected_hallucinated_references`` event for any ``t_<hex>``
reference that doesn't resolve.
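The non-blocking prose scan could be approximated as below. This is a sketch, assuming task ids look like `t_` followed by lowercase hex digits; `unresolved_refs` and the `known_ids` set are illustrative stand-ins for whatever lookup the kernel actually performs.

```python
import re

# Assumed id shape: "t_" followed by lowercase hex digits, e.g. t_f08fef91.
_TASK_REF_RE = re.compile(r"\bt_[0-9a-f]+\b")


def unresolved_refs(text: str, known_ids: set) -> list:
    """Return referenced task ids, in first-seen order, that do not exist."""
    seen = []
    for ref in _TASK_REF_RE.findall(text):
        if ref not in known_ids and ref not in seen:
            seen.append(ref)
    return seen
```

An empty result means every reference in the `summary` and `result` text resolved, so no `suspected_hallucinated_references` event would be needed.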
Closes #20017.
Recovery UX (kernel + CLI + dashboard)
--------------------------------------
A structural gate alone isn't enough — operators also need to see and
act on stuck workers, especially when a profile's model is the root
cause. This PR ships the full loop:
* ``kanban_db.reclaim_task(task_id)`` — operator-driven reclaim that
releases an active worker claim immediately (unlike
``release_stale_claims`` which only acts after claim_expires has
passed). Emits a ``reclaimed`` event with ``manual: True`` payload.
* ``kanban_db.reassign_task(task_id, profile, reclaim_first=...)`` —
switch a task to a different profile, optionally reclaiming a stuck
running worker in the same call.
* ``hermes kanban reclaim <id> [--reason ...]`` and
``hermes kanban reassign <id> <profile> [--reclaim] [--reason ...]``
CLI subcommands wired through to the same helpers.
* ``POST /api/plugins/kanban/tasks/{id}/reclaim`` and
``POST /api/plugins/kanban/tasks/{id}/reassign`` endpoints on the
dashboard plugin.
Dashboard surfacing
-------------------
* ⚠ **warning badge** on cards with active hallucination events.
* **attention strip** at the top of the board listing all flagged
tasks; dismissible per session.
* **events callout** in the task drawer — hallucination events render
with a red left border, amber icon, and phantom ids as styled chips.
* **recovery section** in the task drawer with three actions: Reclaim,
Reassign (with profile picker + reclaim-first checkbox), and a
copy-to-clipboard hint for ``hermes -p <profile> model`` since
profile config lives on disk and can't be edited from the browser.
Auto-opens when the task has warnings, collapsed otherwise.
Keyed by task id so state doesn't leak between drawers.
Active-vs-stale rule: warnings clear when a clean ``completed`` or
``edited`` event supersedes the hallucination, so recovery is never
permanently stigmatising — the audit events persist for debugging but
the badge goes away once the worker succeeds.
Skill updates
-------------
* ``skills/devops/kanban-worker/SKILL.md`` documents the
``created_cards`` contract with good/bad examples.
* ``skills/devops/kanban-orchestrator/SKILL.md`` gains a "Recovering
stuck workers" section with the three actions and when to use each.
Tests
-----
* Kernel gate: verified-cards manifest, phantom rejection + audit
event, cross-worker rejection, prose scan positive + negative.
* Recovery helpers: reclaim on running task, reclaim on non-running
returns False, reassign refuses running without reclaim_first,
reassign with reclaim_first succeeds on running.
* API endpoints: warnings field present on /board and /tasks/:id,
warnings cleared after clean completion, reclaim 200 + 409 paths,
reassign 200 + 409 + reclaim_first paths.
* CLI smoke: reclaim + reassign subcommands.
Live-verified end-to-end on a dashboard with seeded scenarios:
attention strip renders, badges land on the right cards, drawer
callout shows phantom chips, Reclaim on a running task flips status to
ready + emits manual reclaimed event + refreshes the drawer,
Reassign swaps the assignee and triggers board refresh.
359/359 kanban-suite tests pass
(test_kanban_{db,cli,boards,core_functionality} + dashboard + tools).
2026-05-05 08:06:55 -07:00
        latest_event_id = conn.execute(
            "SELECT COALESCE(MAX(id), 0) AS m FROM task_events"
        ).fetchone()["m"]

        columns: dict[str, list[dict]] = {c: [] for c in BOARD_COLUMNS}
        if include_archived:
            columns["archived"] = []

feat(kanban): surface task_runs.summary on dashboard cards + ``kanban show``
The kanban-worker skill (built into the gateway dispatcher's spawn
prompt) instructs every worker to hand off via
``kanban_complete(summary=..., metadata=...)``. That writes the summary
onto the closing ``task_runs`` row, NOT onto ``tasks.result`` — the
latter is left NULL unless the caller passes ``result=`` explicitly.
Result: a glance at the dashboard or ``hermes kanban show <id>`` shows
a blank "Result:" section even when the worker did real work, which
on 2026-05-05 caused a Mac false-alarm ("Hermes did nothing") on a
task that had a 10-line completion summary on its run.
This patch surfaces the latest non-null run summary as
``latest_summary`` so the worker's actual handoff lands in front of
operators.
* New helpers ``kanban_db.latest_summary(conn, task_id)`` and
``kanban_db.latest_summaries(conn, task_ids)``. The batch variant
uses a single window-function SELECT so the dashboard board endpoint
doesn't pay an N+1 cost on multi-hundred-task boards.
* CLI ``hermes kanban show <id>`` prints a "Latest summary:" block
when ``tasks.result`` is empty but a run has produced a summary
(the existing "Result:" section still wins when populated, so the
back-compat path for hand-edited results is untouched). JSON output
gains a top-level ``latest_summary`` field.
* Dashboard ``/board`` and ``/tasks/{id}`` now include a
``latest_summary`` field on every task. Cards on /board carry a
200-character preview (cheap to render, plenty for "what did this
worker do?" at a glance); the drawer/detail endpoint returns the
full summary.
* Five new tests cover: empty-runs case, post-complete surface,
newest-of-multiple selection, empty-string skip, batch with
missing tasks + empty input.
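A sketch of how a single window-function query can pick the newest non-empty summary per task. The table layout (`task_runs` with an integer `id`, `task_id` and `summary`) and ordering by `id` are assumptions made for this example, and the function is named `latest_summaries_sketch` to keep it distinct from the real `kanban_db.latest_summaries`.

```python
import sqlite3


def latest_summaries_sketch(conn: sqlite3.Connection, task_ids: list) -> dict:
    """Map task id -> newest non-empty run summary, using one query."""
    if not task_ids:
        return {}
    marks = ",".join("?" for _ in task_ids)
    rows = conn.execute(
        "SELECT task_id, summary FROM ("
        "  SELECT task_id, summary,"
        "         ROW_NUMBER() OVER (PARTITION BY task_id ORDER BY id DESC) AS rn"
        "  FROM task_runs"
        "  WHERE summary IS NOT NULL AND summary != ''"
        f"    AND task_id IN ({marks})"
        ") WHERE rn = 1",
        task_ids,
    ).fetchall()
    # Tasks with no qualifying run are simply absent from the mapping.
    return {task_id: summary for task_id, summary in rows}
```

Filtering out NULL and empty summaries before ranking is what makes the "empty-string skip" case fall back to an older, populated run.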
Smoke-tested locally against the live profile DB on the three
acceptance-criterion targets (t_f08fef91 cron-hygiene-audit,
t_007b7f1c EMA-analysis, t_05746fa4 self-assessment) — all three now
return their populated summaries via both ``latest_summary`` and
``latest_summaries``.
Test plan: 255/255 kanban tests pass + 91/91 dashboard plugin tests
pass. No regression on tasks where ``tasks.result`` is explicitly
populated (the existing "Result:" branch is preserved).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 14:14:25 +00:00
        # Batch-fetch the latest non-null run summary per task in one
        # window-function query (avoids N+1 ``latest_summary`` calls
        # for boards with hundreds of tasks). Truncated to a card-size
        # preview here; the full text is available via /tasks/:id.
        summary_map = kanban_db.latest_summaries(conn, [t.id for t in tasks])

        for t in tasks:
            full = summary_map.get(t.id)
            preview = (
                full[:_CARD_SUMMARY_PREVIEW_CHARS] if full else None
            )
            d = _task_dict(t, latest_summary=preview)
            d["link_counts"] = link_counts.get(t.id, {"parents": 0, "children": 0})
            d["comment_count"] = comment_counts.get(t.id, 0)
            d["progress"] = progress.get(t.id)  # None when the task has no children
feat(kanban): generic diagnostics engine for task distress signals (#20332)
* feat(kanban): generic diagnostics engine for task distress signals
Replaces the hallucination-specific ``warnings`` / ``RecoverySection``
surface (shipped in PR #20232) with a reusable diagnostic-rule engine
that covers five distress kinds in v1 and can be extended without
touching UI code. The "something's wrong with this task" signal is
no longer limited to phantom card ids.
Closes the follow-up from #20232 discussion.
New module
----------
``hermes_cli/kanban_diagnostics.py`` — stateless, no-side-effect rule
engine. Each rule is a pure function of
``(task, events, runs, now, config) -> list[Diagnostic]``. Registry
is a simple list; adding a new distress kind is one function + one
import, no UI or API changes required.
v1 rule set
-----------
* ``hallucinated_cards`` (error) — folds the existing
``completion_blocked_hallucination`` event into the new surface.
* ``prose_phantom_refs`` (warning) — folds
``suspected_hallucinated_references``.
* ``repeated_spawn_failures`` (error → critical at 2x threshold) —
fires when ``tasks.spawn_failures >= 3``; suggests
``hermes -p <profile> doctor`` / ``auth``.
* ``repeated_crashes`` (error → critical) — fires after N consecutive
``crashed`` run outcomes with no successful completion between;
suggests ``hermes kanban log <id>``.
* ``stuck_in_blocked`` (warning) — fires after 24h in ``blocked``
state with no comments / unblock attempts; suggests commenting.
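As an illustration of the rule shape and registry described above, here is a minimal sketch. It is not the contents of ``kanban_diagnostics.py``: the ``Diagnostic`` fields, the ``spawn_failure_threshold`` config key, the title wording, and the ``run_rules`` helper are assumptions made for the example. The rule signature, the ``>= 3`` threshold, the escalation to critical at 2x, and critical-first sorting follow the text.

```python
from dataclasses import dataclass, field


@dataclass
class Diagnostic:
    # Field names are assumed for this sketch.
    kind: str
    severity: str  # "warning", "error" or "critical"
    title: str
    data: dict = field(default_factory=dict)


def repeated_spawn_failures(task, events, runs, now, config):
    """Pure rule: reads its inputs and returns a list, with no side effects."""
    threshold = config.get("spawn_failure_threshold", 3)  # hypothetical key
    failures = task.get("spawn_failures", 0)
    if failures < threshold:
        return []
    severity = "critical" if failures >= 2 * threshold else "error"
    return [
        Diagnostic(
            kind="repeated_spawn_failures",
            severity=severity,
            title=f"Agent spawn failed {failures}x",
            data={"spawn_failures": failures},
        )
    ]


def broken_rule(task, events, runs, now, config):
    """Deliberately failing rule, used to show isolation."""
    raise RuntimeError("boom")


# The registry is a plain list; adding a distress kind appends one function.
RULES = [repeated_spawn_failures]

_SEVERITY_RANK = {"critical": 0, "error": 1, "warning": 2}


def run_rules(task, events, runs, now, config, rules=RULES):
    """Run every rule, skip ones that raise, and sort critical-first."""
    found = []
    for rule in rules:
        try:
            found.extend(rule(task, events, runs, now, config))
        except Exception:
            # A broken rule must not hide the output of the others.
            continue
    return sorted(found, key=lambda d: _SEVERITY_RANK.get(d.severity, 99))
```

Keeping rules pure means the same function can back both the CLI and the dashboard without either layer knowing which kinds exist.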
Every diagnostic carries structured ``actions`` (reclaim, reassign,
unblock, cli_hint, comment, open_docs) that render consistently in
both CLI and dashboard. Suggested actions are highlighted; generic
recovery actions (reclaim / reassign) are available on every kind as
fallbacks.
Diagnostics auto-clear when the underlying failure resolves — a
clean ``completed``/``edited`` event drops hallucination diagnostics,
a successful run drops crash diagnostics, a comment drops
stuck-blocked diagnostics. Audit events persist; the badge goes away.
API
---
``plugin_api.py``:
* ``/board`` now attaches ``diagnostics`` (full list) and
``warnings`` (compact summary with ``highest_severity``) per task.
* ``/tasks/{id}`` attaches diagnostics so the drawer's Diagnostics
section auto-opens on flagged tasks.
* NEW ``/diagnostics`` endpoint — fleet-wide listing, filterable by
severity, sorted critical-first.
CLI
---
* NEW ``hermes kanban diagnostics [--severity X] [--task id]
[--json]`` — fleet view or single-task view, matches dashboard rule
output so CLI users see the same picture.
* ``hermes kanban show <id>`` now renders a Diagnostics section near
the top with severity markers + suggested actions.
Dashboard
---------
* Card badge is severity-coloured (⚠ amber warning, !! orange error,
!!! red critical) using ``warnings.highest_severity``.
* Attention strip above the toolbar counts EVERY task with active
diagnostics (not just hallucinations), severity-coloured, lists
affected tasks with Open buttons when expanded.
* Drawer's old ``RecoverySection`` replaced with generic
``DiagnosticsSection`` rendering a card per active diagnostic:
title + detail + structured data (task-id chips when payload keys
look like id lists) + action buttons. Reassign profile picker is
inline per-diagnostic. Clipboard fallback uses ``.catch()`` for
environments where writeText rejects.
* Three-rung severity palette: amber for warning, orange for error,
red for critical. Uses CSS variables so theming is straightforward.
Tests
-----
* NEW ``tests/hermes_cli/test_kanban_diagnostics.py`` — 14 unit tests
covering each rule's positive/negative/threshold paths, severity
sorting, broken-rule isolation, and sqlite3.Row integration.
* Dashboard plugin tests extended: ``/diagnostics`` endpoint (empty,
populated, severity-filtered), ``/board`` exposes both diagnostic
list and compact summary with ``highest_severity``.
* Existing hallucination-specific test (``test_board_surfaces_
warnings_field_for_hallucinated_completions``) updated to reflect
the new contract: warning summary keys by diagnostic kind
(``hallucinated_cards``) not event kind.
379 kanban-suite tests pass (+16 net from this PR).
Live verification
-----------------
Seeded all 5 diagnostic kinds + one clean + one plain-running task
(7 total) into an isolated HERMES_HOME, spun up the dashboard, and
verified:
* Attention strip: shows ``!! 5 tasks need attention`` in the
error-severity orange; Show expands to a list of 5 rows ordered
critical > error > warning.
* Card badges: error tasks render ``!!`` orange, warning tasks
render ``⚠`` amber, clean and plain-running tasks render no badge.
* Each of the 5 rules opens a correctly-coloured, correctly-styled
diagnostic card in the drawer with its specific suggested action.
* Live reassign from a diagnostic card flipped
``broken-ml-worker → alice`` and the drawer refreshed with the
new assignee + the same diagnostic still firing (correct:
spawn_failures counter hasn't reset yet).
* CLI ``hermes kanban diagnostics`` prints all 5 in severity order;
``--severity error`` narrows to 3; ``kanban show <id>`` includes
the Diagnostics block at the top with suggested action hint.
Migration note
--------------
The old ``warnings`` shape (``{count, kinds, latest_at}``) is
preserved on the API but ``kinds`` now keys by diagnostic kind
(``hallucinated_cards``) instead of event kind
(``completion_blocked_hallucination``). ``highest_severity`` is a
new required field. The dashboard was the only consumer and has
been updated in the same commit; external API consumers of the
``warnings`` field will need to update their kind-match logic.
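For external consumers adjusting to the new shape, a compact summary along these lines may help as a mental model. This is a sketch under assumptions, not the plugin's ``_warnings_summary_from_diagnostics``: the per-diagnostic keys (``kind``, ``severity``, ``at``) and the choice to store a per-kind count in ``kinds`` are hypothetical. Only the four top-level fields and keying by diagnostic kind come from the note above.

```python
_SEVERITY_RANK = {"warning": 1, "error": 2, "critical": 3}


def warnings_summary_sketch(diags):
    """Collapse a list of diagnostic dicts into a compact badge summary."""
    kinds = {}
    for d in diags:
        # Keyed by diagnostic kind (e.g. "hallucinated_cards"),
        # not by the underlying event kind.
        kinds[d["kind"]] = kinds.get(d["kind"], 0) + 1
    return {
        "count": len(diags),
        "kinds": kinds,
        "latest_at": max((d["at"] for d in diags), default=None),
        "highest_severity": max(
            (d["severity"] for d in diags),
            key=_SEVERITY_RANK.__getitem__,
            default=None,
        ),
    }
```

A consumer that previously matched on ``completion_blocked_hallucination`` would, under this model, look for ``hallucinated_cards`` in ``kinds`` instead.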
* feat(kanban/diagnostics): lead titles with the actual error text
The generic 'Worker crashed N runs in a row' / 'Worker failed to spawn
N times' titles buried the actual cause in the data section. Operators
had to open logs or expand the diagnostic to see WHY the worker is
stuck — rate-limit vs insufficient quota vs bad auth vs context
overflow vs network blip all looked identical at a glance.
New titles:
Agent crashed 3x: openai: 429 Too Many Requests - rate limit reached
Agent crashed 3x: anthropic: 402 insufficient_quota - credit balance
Agent crashed 3x: provider auth error: 401 Unauthorized
Agent spawn failed 4x: insufficient_quota: You exceeded your current
Detail keeps the full error snippet (capped at 500 chars + ellipsis
for tracebacks). Title takes the first line capped at 160 chars.
Fallback title if no error recorded stays honest ('no error recorded').
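The title and detail capping described above could look roughly like this. The helper names, the ``prefix`` argument, and the use of three ASCII dots as the ellipsis are assumptions for the sketch. It also assumes the 160-character cap applies to the error text rather than to the whole title. The caps themselves and the fallback wording are taken from the text.

```python
def format_error_title(prefix, error, title_cap=160):
    """Lead the title with the first line of the recorded error."""
    if not error or not error.strip():
        return f"{prefix}: no error recorded"
    first_line = error.strip().splitlines()[0]
    return f"{prefix}: {first_line[:title_cap]}"


def format_error_detail(error, detail_cap=500):
    """Keep the full snippet, truncating long tracebacks with an ellipsis."""
    if not error:
        return ""
    if len(error) <= detail_cap:
        return error
    return error[:detail_cap] + "..."
```

For example, ``format_error_title("Agent crashed 3x", "openai: 429 Too Many Requests\nTraceback ...")`` yields a title that leads with the provider error and leaves the traceback to the detail field.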
Tests: 4 new cases covering 429/billing/spawn/truncation. 383 total
pass (+4).
Live-verified on dashboard with 6 seeded scenarios
(rate-limit, billing, auth, context, network, spawn-billing) —
each card title leads with the actionable error text.
2026-05-05 13:32:42 -07:00
            diags = diagnostics_per_task.get(t.id)
            if diags:
                # Full list goes into the payload so the drawer can render
                # without a second round-trip. The board-level badge only
                # needs the summary.
                d["diagnostics"] = diags
                d["warnings"] = _warnings_summary_from_diagnostics(diags)
            col = t.status if t.status in columns else "todo"
            columns[col].append(d)

        # Stable per-column ordering already applied by list_tasks
        # (priority DESC, created_at ASC), keep as-is.

        # List of known tenants for the UI filter dropdown.
        tenants = [
            r["tenant"]
            for r in conn.execute(
                "SELECT DISTINCT tenant FROM tasks WHERE tenant IS NOT NULL ORDER BY tenant"
            )
        ]
        # List of distinct assignees for the lane-by-profile sub-grouping.
        assignees = [
            r["assignee"]
            for r in conn.execute(
                "SELECT DISTINCT assignee FROM tasks WHERE assignee IS NOT NULL "
                "AND status != 'archived' ORDER BY assignee"
            )
        ]

        return {
            "columns": [
                {"name": name, "tasks": columns[name]} for name in columns.keys()
            ],
            "tenants": tenants,
            "assignees": assignees,
            "latest_event_id": int(latest_event_id),
            "now": int(time.time()),
        }
    finally:
        conn.close()


# ---------------------------------------------------------------------------
# GET /tasks/:id
# ---------------------------------------------------------------------------

@router.get("/tasks/{task_id}")
feat(kanban): multi-project boards — one install, many kanbans (#19653)
Adds first-class board support to kanban so users can separate unrelated
streams of work (projects, repos, domains) into isolated queues. Single-
project users stay on the 'default' board and see no UI change.
Isolation model
---------------
- Each board is a directory at `~/.hermes/kanban/boards/<slug>/` with
its own `kanban.db`, `workspaces/`, and `logs/`. The 'default' board
keeps its legacy path (`~/.hermes/kanban.db`) for back-compat — fresh
installs and pre-boards users get zero migration.
- Workers spawned by the dispatcher have `HERMES_KANBAN_BOARD` pinned in
their env alongside the existing `HERMES_KANBAN_DB` /
`HERMES_KANBAN_WORKSPACES_ROOT` pins, so workers physically cannot see
other boards' tasks.
- The gateway's single dispatcher loop now sweeps every board per tick;
per-tick cost is a few extra filesystem stats.
- CAS concurrency guarantees are preserved per-board (each board is its
own SQLite DB, same WAL+IMMEDIATE machinery as before).
CLI
---
hermes kanban boards list|create|switch|show|rename|rm
hermes kanban --board <slug> <any-subcommand>
Board resolution order: `--board` flag → `HERMES_KANBAN_BOARD` env →
`~/.hermes/kanban/current` file → `default`. Slug validation is strict:
lowercase alphanumerics + hyphens + underscores, 1-64 chars, starts with
alphanumeric. Uppercase is auto-downcased; slashes / dots / `..` /
control chars are rejected so boards can't name their way out of the
boards/ directory.
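The slug rules and resolution order above can be sketched as follows. This is an illustration under assumptions, not the shipped validator: ``normalize_slug`` and ``resolve_board`` are hypothetical names, and ``current_file`` stands for the already-read, stripped contents of the ``~/.hermes/kanban/current`` file rather than a path.

```python
import re

# Lowercase alphanumerics, hyphens and underscores; 1-64 chars;
# first character must be alphanumeric.
_SLUG_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,63}")


def normalize_slug(raw):
    """Downcase the slug, then reject anything outside the allowed shape."""
    slug = raw.lower()
    if not _SLUG_RE.fullmatch(slug):
        # Slashes, dots, "..", control chars and over-long names end up here.
        raise ValueError(f"invalid board slug: {raw!r}")
    return slug


def resolve_board(flag=None, env=None, current_file=None):
    """Pick the active board: --board flag, then env, then file, then default."""
    for candidate in (flag, env, current_file):
        if candidate:
            return normalize_slug(candidate)
    return "default"
```

Because the pattern is anchored with ``fullmatch`` and excludes ``/`` and ``.``, a slug cannot resolve to a path outside the ``boards/`` directory once it has passed validation.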
Passive discoverability: when more than one board exists, `hermes kanban
list` prints a one-line header ("Board: foo (2 other boards …)") so
users who stumble across multi-project never have to hunt for the
feature. Invisible for single-board installs.
Dashboard
---------
- New `BoardSwitcher` component at the top of the Kanban tab: dropdown
with all boards + task counts, `+ New board` button, `Archive`
button (non-default only). Hidden entirely when only `default` exists
and is empty — single-project users never see it.
- New `NewBoardDialog` modal: slug / display name / description / icon
+ "switch to this board after creating" checkbox.
- Selected board persists to `localStorage` so browser users don't
shift the CLI's active board out from under a terminal they left open.
- New `?board=<slug>` query param on every existing endpoint plus a
new `/boards` CRUD surface (`GET /boards`, `POST /boards`,
`PATCH /boards/<slug>`, `DELETE /boards/<slug>`,
`POST /boards/<slug>/switch`).
- Events WebSocket is pinned to a board at connection time; switching
opens a fresh WS against the new board.
Also fixes a pre-existing bug in the plugin's tenant / assignee
filters: the SDK's `Select` uses `onValueChange(value)`, not
native `onChange(event)`, so those filters silently didn't work.
New `selectChangeHandler` helper wires both signatures.
Tests
-----
49 new tests in `tests/hermes_cli/test_kanban_boards.py` covering:
slug validation (valid / invalid / auto-downcase), path resolution
(default = legacy path, named = `boards/<slug>/`, env var override),
current-board resolution chain (env > file > default), board CRUD +
archive / hard-delete, per-board connection isolation (tasks don't
leak), worker spawn env injection (`HERMES_KANBAN_BOARD`,
`HERMES_KANBAN_DB`, `HERMES_KANBAN_WORKSPACES_ROOT` all point at the
right board), and end-to-end CLI surface.
Regression surface: all 264 pre-existing kanban tests continue to pass.
Live-tested via the dashboard: created 3 boards (default,
hermes-agent, atm10-server), created tasks on each via both CLI
(`--board <slug> create`) and dashboard (inline create on the Ready
column), confirmed zero cross-board leakage, confirmed `BoardSwitcher`
+ `NewBoardDialog` work end-to-end in the browser.
2026-05-04 04:42:38 -07:00
def get_task(task_id: str, board: Optional[str] = Query(None)):
    board = _resolve_board(board)
    conn = _conn(board=board)
    try:
        task = kanban_db.get_task(conn, task_id)
        if task is None:
            raise HTTPException(status_code=404, detail=f"task {task_id} not found")
        # Drawer/detail view returns the FULL summary (no truncation) so
        # operators can read the complete worker handoff without making
        # a second round-trip. Cards on /board carry a 200-char preview.
        full_summary = kanban_db.latest_summary(conn, task_id)
        task_d = _task_dict(task, latest_summary=full_summary)
feat(kanban): generic diagnostics engine for task distress signals (#20332)
* feat(kanban): generic diagnostics engine for task distress signals
Replaces the hallucination-specific ``warnings`` / ``RecoverySection``
surface (shipped in PR #20232) with a reusable diagnostic-rule engine
that covers five distress kinds in v1 and can be extended without
touching UI code. The "something's wrong with this task" signal is
no longer limited to phantom card ids.
Closes the follow-up from #20232 discussion.
New module
----------
``hermes_cli/kanban_diagnostics.py`` — stateless, no-side-effect rule
engine. Each rule is a pure function of
``(task, events, runs, now, config) -> list[Diagnostic]``. Registry
is a simple list; adding a new distress kind is one function + one
import, no UI or API changes required.
v1 rule set
-----------
* ``hallucinated_cards`` (error) — folds the existing
``completion_blocked_hallucination`` event into the new surface.
* ``prose_phantom_refs`` (warning) — folds
``suspected_hallucinated_references``.
* ``repeated_spawn_failures`` (error → critical at 2x threshold) —
fires when ``tasks.spawn_failures >= 3``; suggests
``hermes -p <profile> doctor`` / ``auth``.
* ``repeated_crashes`` (error → critical) — fires after N consecutive
``crashed`` run outcomes with no successful completion between;
suggests ``hermes kanban log <id>``.
* ``stuck_in_blocked`` (warning) — fires after 24h in ``blocked``
state with no comments / unblock attempts; suggests commenting.
Every diagnostic carries structured ``actions`` (reclaim, reassign,
unblock, cli_hint, comment, open_docs) that render consistently in
both CLI and dashboard. Suggested actions are highlighted; generic
recovery actions (reclaim / reassign) are available on every kind as
fallbacks.
Diagnostics auto-clear when the underlying failure resolves — a
clean ``completed``/``edited`` event drops hallucination diagnostics,
a successful run drops crash diagnostics, a comment drops
stuck-blocked diagnostics. Audit events persist; the badge goes away.
API
---
``plugin_api.py``:
* ``/board`` now attaches ``diagnostics`` (full list) and
``warnings`` (compact summary with ``highest_severity``) per task.
* ``/tasks/{id}`` attaches diagnostics so the drawer's Diagnostics
section auto-opens on flagged tasks.
* NEW ``/diagnostics`` endpoint — fleet-wide listing, filterable by
severity, sorted critical-first.
CLI
---
* NEW ``hermes kanban diagnostics [--severity X] [--task id]
[--json]`` — fleet view or single-task view, matches dashboard rule
output so CLI users see the same picture.
* ``hermes kanban show <id>`` now renders a Diagnostics section near
the top with severity markers + suggested actions.
Dashboard
---------
* Card badge is severity-coloured (⚠ amber warning, !! orange error,
!!! red critical) using ``warnings.highest_severity``.
* Attention strip above the toolbar counts EVERY task with active
diagnostics (not just hallucinations), severity-coloured, lists
affected tasks with Open buttons when expanded.
* Drawer's old ``RecoverySection`` replaced with generic
``DiagnosticsSection`` rendering a card per active diagnostic:
title + detail + structured data (task-id chips when payload keys
look like id lists) + action buttons. Reassign profile picker is
inline per-diagnostic. Clipboard fallback uses ``.catch()`` for
environments where writeText rejects.
* Three-rung severity palette; amber for warning, orange for error,
red for critical. Uses CSS variables so theming is straightforward.
Tests
-----
* NEW ``tests/hermes_cli/test_kanban_diagnostics.py`` — 14 unit tests
covering each rule's positive/negative/threshold paths, severity
sorting, broken-rule isolation, and sqlite3.Row integration.
* Dashboard plugin tests extended: ``/diagnostics`` endpoint (empty,
populated, severity-filtered), ``/board`` exposes both diagnostic
list and compact summary with ``highest_severity``.
* Existing hallucination-specific test (``test_board_surfaces_
warnings_field_for_hallucinated_completions``) updated to reflect
the new contract: warning summary keys by diagnostic kind
(``hallucinated_cards``) not event kind.
379 kanban-suite tests pass (+16 net from this PR).
Live verification
-----------------
Seeded all 5 diagnostic kinds + one clean + one plain-running task
(7 total) into an isolated HERMES_HOME, spun up the dashboard, and
verified:
* Attention strip: shows ``!! 5 tasks need attention`` in the
error-severity orange; Show expands to a list of 5 rows ordered
critical > error > warning.
* Card badges: error tasks render ``!!`` orange, warning tasks
render ``⚠`` amber, clean and plain-running tasks render no badge.
* Each of the 5 rules opens a correctly-coloured, correctly-styled
diagnostic card in the drawer with its specific suggested action.
* Live reassign from a diagnostic card flipped
``broken-ml-worker → alice`` and the drawer refreshed with the
new assignee + the same diagnostic still firing (correct:
spawn_failures counter hasn't reset yet).
* CLI ``hermes kanban diagnostics`` prints all 5 in severity order;
``--severity error`` narrows to 3; ``kanban show <id>`` includes
the Diagnostics block at the top with suggested action hint.
Migration note
--------------
The old ``warnings`` shape (``{count, kinds, latest_at}``) is
preserved on the API but ``kinds`` now keys by diagnostic kind
(``hallucinated_cards``) instead of event kind
(``completion_blocked_hallucination``). ``highest_severity`` is a
new required field. The dashboard was the only consumer and has
been updated in the same commit; external API consumers of the
``warnings`` field will need to update their kind-match logic.
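For external consumers, the kind-match change can be sketched as follows. This is a minimal illustration, not the plugin's code: the helper name is hypothetical, and whether ``kinds`` is a mapping or a list is an assumption, so the check works for either.

```python
# Hypothetical consumer-side helper; not part of the kanban plugin.
OLD_EVENT_KIND = "completion_blocked_hallucination"  # pre-change key
NEW_DIAGNOSTIC_KIND = "hallucinated_cards"           # post-change key


def has_hallucination_warning(warnings):
    """Return True if a ``warnings`` summary flags hallucinated cards.

    ``warnings`` is the ``{count, kinds, latest_at, highest_severity}``
    summary, or None when the task is clean. ``in`` works whether
    ``kinds`` is a dict keyed by kind or a plain list of kinds.
    """
    if not warnings:
        return False
    kinds = warnings.get("kinds") or {}
    return NEW_DIAGNOSTIC_KIND in kinds
```

A consumer that still matches on the old event kind silently stops seeing the warning, which is why the match has to move to the diagnostic kind.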
* feat(kanban/diagnostics): lead titles with the actual error text
The generic 'Worker crashed N runs in a row' / 'Worker failed to spawn
N times' titles buried the actual cause in the data section. Operators
had to open logs or expand the diagnostic to see WHY the worker is
stuck — rate-limit vs insufficient quota vs bad auth vs context
overflow vs network blip all looked identical at a glance.
New titles:
Agent crashed 3x: openai: 429 Too Many Requests - rate limit reached
Agent crashed 3x: anthropic: 402 insufficient_quota - credit balance
Agent crashed 3x: provider auth error: 401 Unauthorized
Agent spawn failed 4x: insufficient_quota: You exceeded your current
Detail keeps the full error snippet (capped at 500 chars + ellipsis
for tracebacks). Title takes the first line capped at 160 chars.
Fallback title if no error recorded stays honest ('no error recorded').
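The title/detail rule above can be sketched roughly like this. The function name, the constants, and the exact fallback wording are assumptions for illustration; only the 160- and 500-character caps and the "first line" rule come from the message.

```python
TITLE_MAX = 160   # cap on the first error line used in the title
DETAIL_MAX = 500  # cap on the full snippet kept in the detail


def summarize_error(prefix, error):
    """Build (title, detail) for a diagnostic from a raw error string.

    Hypothetical helper: ``prefix`` is e.g. "Agent crashed 3x".
    """
    if not error or not error.strip():
        # Stay honest when nothing was captured.
        return f"{prefix}: no error recorded", ""
    text = error.strip()
    first_line = text.splitlines()[0].strip()
    title = f"{prefix}: {first_line[:TITLE_MAX]}"
    # Long tracebacks are cut and marked with an ellipsis.
    detail = text if len(text) <= DETAIL_MAX else text[:DETAIL_MAX] + "..."
    return title, detail
```

Keeping the cap on the first line rather than on the whole title means the ``Agent crashed Nx`` prefix is never truncated away.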
Tests: 4 new cases covering 429/billing/spawn/truncation. 383 total
pass (+4).
Live-verified on dashboard with 6 seeded scenarios
(rate-limit, billing, auth, context, network, spawn-billing) —
each card title leads with the actionable error text.
2026-05-05 13:32:42 -07:00
        # Attach diagnostics so the drawer's Diagnostics section can
        # render recovery actions without a second round-trip.
        diags = _compute_task_diagnostics(conn, task_ids=[task_id])
        diag_list = diags.get(task_id) or []
        if diag_list:
            task_d["diagnostics"] = diag_list
            task_d["warnings"] = _warnings_summary_from_diagnostics(diag_list)
feat(kanban): durable multi-profile collaboration board (#17805)
Salvage of PR #16100 onto current main (after emozilla's #17514 fix
that unblocks plugin Pydantic body validation). History preserved on
the standing `feat/kanban-standing` branch; this squashes the 22
iterative commits into one clean landing.
What this lands:
- SQLite kernel (hermes_cli/kanban_db.py) — durable task board with
tasks, task_links, task_runs, task_comments, task_events,
kanban_notify_subs tables. WAL mode, atomic claim via CAS,
tenant-namespaced, skills JSON array per task, max-runtime timeouts,
worker heartbeats, idempotency keys, circuit breaker on repeated
spawn failures, crash detection via /proc/<pid>/status, run history
preserved across attempts.
- Dispatcher — runs inside the gateway by default
(`kanban.dispatch_in_gateway: true`). Ticks every 60s, reclaims
stale claims, promotes ready tasks, spawns `hermes -p <assignee>
chat -q "work kanban task <id>"` with HERMES_KANBAN_TASK +
HERMES_KANBAN_WORKSPACE env. Auto-loads `--skills kanban-worker`
plus any per-task skills. Health telemetry warns on stuck ready
queue.
- Structured tool surface (tools/kanban_tools.py) — 7 tools
(kanban_show, kanban_complete, kanban_block, kanban_heartbeat,
kanban_comment, kanban_create, kanban_link). Gated on
HERMES_KANBAN_TASK via check_fn so zero schema footprint in normal
sessions.
- System-prompt guidance (agent/prompt_builder.py KANBAN_GUIDANCE)
injected only when kanban tools are active.
- Dashboard plugin (plugins/kanban/dashboard/) — Linear-style board
UI: triage/todo/ready/running/blocked/done columns, drag-drop,
inline create, task drawer with markdown, comments, run history,
dependency editor, bulk ops, lanes-by-profile grouping, WS-driven
live refresh. Matches active dashboard theme via CSS variables.
- CLI — `hermes kanban init|create|list|show|assign|link|unlink|
claim|comment|complete|block|unblock|archive|tail|dispatch|context|
init|gc|watch|stats|notify|log|heartbeat|runs|assignees` +
`/kanban` slash in-session.
- Worker + orchestrator skills (skills/devops/kanban-worker +
kanban-orchestrator) — pattern library for good summary/metadata
shapes, retry diagnostics, block-reason examples, fan-out patterns.
- Per-task force-loaded skills — `--skill <name>` (repeatable),
stored as JSON, threaded through to dispatcher argv as one
`--skills X` pair per skill alongside the built-in kanban-worker.
Dashboard + CLI + tool parity.
- Deprecation of standalone `hermes kanban daemon` — stub exits 2
with migration guidance; `--force` escape hatch for headless hosts.
- Docs (website/docs/user-guide/features/kanban.md + kanban-tutorial.md)
with 11 dashboard screenshots walking through four user stories
(Solo Dev, Fleet Farming, Role Pipeline, Circuit Breaker).
- Tests (251 passing): kernel schema + migration + CAS atomicity,
dispatcher logic, circuit breaker, crash detection, max-runtime
timeouts, claim lifecycle, tenant isolation, idempotency keys, per-
task skills round-trip + validation + dispatcher argv, tool surface
(7 tools × round-trip + error paths), dashboard REST (CRUD + bulk
+ links + warnings), gateway-embedded dispatcher (config gate, env
override, graceful shutdown), CLI deprecation stub, migration from
legacy schemas.
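As an illustration of the "atomic claim via CAS" idea, here is a minimal sketch using only the standard ``sqlite3`` module. It is deliberately simplified: the real kernel keeps claim state on ``task_runs``, while this sketch assumes a single hypothetical ``tasks`` table with ``status`` and ``claim_lock`` columns.

```python
import sqlite3


def try_claim(conn, task_id, worker):
    """Compare-and-swap claim: only one caller can flip ready -> running.

    The WHERE clause is the "compare"; the UPDATE is the "swap".
    Returns True for the single winner, False for everyone else.
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE tasks SET status = 'running', claim_lock = ? "
            "WHERE id = ? AND status = 'ready' AND claim_lock IS NULL",
            (worker, task_id),
        )
    return cur.rowcount == 1


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT, claim_lock TEXT)"
    )
    conn.execute("INSERT INTO tasks VALUES ('t_1', 'ready', NULL)")
    first = try_claim(conn, "t_1", "worker-a")
    second = try_claim(conn, "t_1", "worker-b")
    conn.close()
    return first, second
```

Because the check and the write happen in one statement, a second worker racing for the same task matches zero rows and backs off without needing an application-level lock.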
Gateway integration:
- GatewayRunner._kanban_dispatcher_watcher — new asyncio background
task, symmetric with _kanban_notifier_watcher. Runs dispatch_once
via asyncio.to_thread so SQLite WAL never blocks the loop. Sleeps
in 1s slices for snappy shutdown. Respects HERMES_KANBAN_DISPATCH_IN_GATEWAY=0
env override for debugging.
- Config: new `kanban` section in DEFAULT_CONFIG with
`dispatch_in_gateway: true` (default) + `dispatch_interval_seconds: 60`.
Additive — no _config_version bump needed.
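The watcher pattern described above (blocking tick off-loaded to a thread, sleep in 1s slices) might look roughly like this. It is a sketch under assumptions: the function name, the ``stop`` event, and the parameters are hypothetical, and only ``asyncio.to_thread`` plus the sliced sleep come from the description.

```python
import asyncio


async def dispatcher_watcher(dispatch_once, stop, interval=60.0):
    """Run ``dispatch_once`` every ``interval`` seconds until ``stop`` is set.

    ``dispatch_once`` is a blocking callable (it touches SQLite), so it
    runs in a worker thread and never blocks the event loop.
    """
    while not stop.is_set():
        await asyncio.to_thread(dispatch_once)
        # Sleep in slices of at most 1s so shutdown is noticed quickly.
        slept = 0.0
        while slept < interval and not stop.is_set():
            await asyncio.sleep(min(1.0, interval - slept))
            slept += 1.0
```

With a 60s interval, a single ``asyncio.sleep(60)`` would delay shutdown by up to a minute; slicing bounds that delay to about one second.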
Forward-compat:
- workflow_template_id / current_step_key columns on tasks (v1 writes
NULL; v2 will use them for routing).
- task_runs holds claim machinery (claim_lock, claim_expires,
worker_pid, last_heartbeat_at) so multi-attempt history is first-
class from day one.
Closes #16102.
Co-authored-by: emozilla <emozilla@nousresearch.com>
2026-04-30 13:36:47 -07:00
        return {
feat(kanban): hallucination gate + recovery UX for worker-created-card claims (#20232)
Workers completing a kanban task can now claim the ids of cards they
created via an optional ``created_cards`` field on ``kanban_complete``.
The kernel verifies each id exists and was created by the completing
worker's profile; any phantom id blocks the completion with a
``HallucinatedCardsError`` and records a
``completion_blocked_hallucination`` event on the task so the rejected
attempt is auditable. Successful completions also get a non-blocking
prose-scan pass over their ``summary`` + ``result`` that emits a
``suspected_hallucinated_references`` event for any ``t_<hex>``
reference that doesn't resolve.
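The non-blocking prose scan can be pictured with a small sketch. The regex, the function name, and the ``known_ids`` set are assumptions for illustration; the message only states that ``t_<hex>`` references which do not resolve are reported.

```python
import re

# Assumed id shape: "t_" followed by lowercase hex digits.
TASK_REF = re.compile(r"\bt_[0-9a-f]+\b")


def unresolved_refs(text, known_ids):
    """Return task-id-looking references in ``text`` that are not known.

    Order of first appearance is preserved and duplicates are dropped.
    """
    missing = []
    for ref in TASK_REF.findall(text or ""):
        if ref not in known_ids and ref not in missing:
            missing.append(ref)
    return missing
```

Unlike the ``created_cards`` gate, a scan like this only produces an audit signal; it does not reject the completion.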
Closes #20017.
Recovery UX (kernel + CLI + dashboard)
--------------------------------------
A structural gate alone isn't enough — operators also need to see and
act on stuck workers, especially when a profile's model is the root
cause. This PR ships the full loop:
* ``kanban_db.reclaim_task(task_id)`` — operator-driven reclaim that
releases an active worker claim immediately (unlike
``release_stale_claims`` which only acts after claim_expires has
passed). Emits a ``reclaimed`` event with ``manual: True`` payload.
* ``kanban_db.reassign_task(task_id, profile, reclaim_first=...)`` —
switch a task to a different profile, optionally reclaiming a stuck
running worker in the same call.
* ``hermes kanban reclaim <id> [--reason ...]`` and
``hermes kanban reassign <id> <profile> [--reclaim] [--reason ...]``
CLI subcommands wired through to the same helpers.
* ``POST /api/plugins/kanban/tasks/{id}/reclaim`` and
``POST /api/plugins/kanban/tasks/{id}/reassign`` endpoints on the
dashboard plugin.
Dashboard surfacing
-------------------
* ⚠ **warning badge** on cards with active hallucination events.
* **attention strip** at the top of the board listing all flagged
tasks; dismissible per session.
* **events callout** in the task drawer — hallucination events render
with a red left border, amber icon, and phantom ids as styled chips.
* **recovery section** in the task drawer with three actions: Reclaim,
Reassign (with profile picker + reclaim-first checkbox), and a
copy-to-clipboard hint for ``hermes -p <profile> model`` since
profile config lives on disk and can't be edited from the browser.
Auto-opens when the task has warnings, collapsed otherwise.
Keyed by task id so state doesn't leak between drawers.
Active-vs-stale rule: warnings clear when a clean ``completed`` or
``edited`` event supersedes the hallucination, so recovery is never
permanently stigmatising — the audit events persist for debugging but
the badge goes away once the worker succeeds.
Skill updates
-------------
* ``skills/devops/kanban-worker/SKILL.md`` documents the
``created_cards`` contract with good/bad examples.
* ``skills/devops/kanban-orchestrator/SKILL.md`` gains a "Recovering
stuck workers" section with the three actions and when to use each.
Tests
-----
* Kernel gate: verified-cards manifest, phantom rejection + audit
event, cross-worker rejection, prose scan positive + negative.
* Recovery helpers: reclaim on running task, reclaim on non-running
returns False, reassign refuses running without reclaim_first,
reassign with reclaim_first succeeds on running.
* API endpoints: warnings field present on /board and /tasks/:id,
warnings cleared after clean completion, reclaim 200 + 409 paths,
reassign 200 + 409 + reclaim_first paths.
* CLI smoke: reclaim + reassign subcommands.
Live-verified end-to-end on a dashboard with seeded scenarios:
attention strip renders, badges land on the right cards, drawer
callout shows phantom chips, Reclaim on a running task flips status to
ready + emits manual reclaimed event + refreshes the drawer,
Reassign swaps the assignee and triggers board refresh.
359/359 kanban-suite tests pass
(test_kanban_{db,cli,boards,core_functionality} + dashboard + tools).
2026-05-05 08:06:55 -07:00
            "task": task_d,
            "comments": [_comment_dict(c) for c in kanban_db.list_comments(conn, task_id)],
            "events": [_event_dict(e) for e in kanban_db.list_events(conn, task_id)],
            "links": _links_for(conn, task_id),
            "runs": [_run_dict(r) for r in kanban_db.list_runs(conn, task_id)],
        }
    finally:
        conn.close()


# ---------------------------------------------------------------------------
# POST /tasks
# ---------------------------------------------------------------------------

class CreateTaskBody(BaseModel):
    title: str
    body: Optional[str] = None
    assignee: Optional[str] = None
    tenant: Optional[str] = None
    priority: int = 0
    workspace_kind: str = "scratch"
    workspace_path: Optional[str] = None
    parents: list[str] = Field(default_factory=list)
    triage: bool = False
    idempotency_key: Optional[str] = None
    max_runtime_seconds: Optional[int] = None
    skills: Optional[list[str]] = None


@router.post("/tasks")
feat(kanban): multi-project boards — one install, many kanbans (#19653)
Adds first-class board support to kanban so users can separate unrelated
streams of work (projects, repos, domains) into isolated queues. Single-
project users stay on the 'default' board and see no UI change.
Isolation model
---------------
- Each board is a directory at `~/.hermes/kanban/boards/<slug>/` with
its own `kanban.db`, `workspaces/`, and `logs/`. The 'default' board
keeps its legacy path (`~/.hermes/kanban.db`) for back-compat — fresh
installs and pre-boards users get zero migration.
- Workers spawned by the dispatcher have `HERMES_KANBAN_BOARD` pinned in
their env alongside the existing `HERMES_KANBAN_DB` /
`HERMES_KANBAN_WORKSPACES_ROOT` pins, so workers physically cannot see
other boards' tasks.
- The gateway's single dispatcher loop now sweeps every board per tick;
per-tick cost is a few extra filesystem stats.
- CAS concurrency guarantees are preserved per-board (each board is its
own SQLite DB, same WAL+IMMEDIATE machinery as before).
CLI
---
hermes kanban boards list|create|switch|show|rename|rm
hermes kanban --board <slug> <any-subcommand>
Board resolution order: `--board` flag → `HERMES_KANBAN_BOARD` env →
`~/.hermes/kanban/current` file → `default`. Slug validation is strict:
lowercase alphanumerics + hyphens + underscores, 1-64 chars, starts with
alphanumeric. Uppercase is auto-downcased; slashes / dots / `..` /
control chars are rejected so boards can't name their way out of the
boards/ directory.
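The slug rule and the resolution order can be sketched together. This is an illustrative reading of the rules above, not the shipped code: the function names and the regex are assumptions, and the env and file values are passed in as arguments so the sketch stays self-contained.

```python
import re

# Lowercase alphanumerics, hyphens, underscores; 1-64 chars;
# first character must be alphanumeric.
_SLUG_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,63}")


def normalize_slug(raw):
    """Downcase ``raw`` and validate it as a board slug."""
    slug = raw.lower()
    if not _SLUG_RE.fullmatch(slug):
        # Slashes, dots, "..", and control characters all fail here.
        raise ValueError(f"invalid board slug: {raw!r}")
    return slug


def resolve_board(flag=None, env=None, current_file=None):
    """Pick the board: --board flag, then env, then current file, then default."""
    for candidate in (flag, env, current_file):
        if candidate:
            return normalize_slug(candidate)
    return "default"
```

Validating with ``fullmatch`` against an allow-list is what keeps a slug from escaping the boards/ directory: anything containing a path separator or a dot never reaches the filesystem.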
Passive discoverability: when more than one board exists, `hermes kanban
list` prints a one-line header ("Board: foo (2 other boards …)") so
users who stumble across multi-project never have to hunt for the
feature. Invisible for single-board installs.
Dashboard
---------
- New `BoardSwitcher` component at the top of the Kanban tab: dropdown
with all boards + task counts, `+ New board` button, `Archive`
button (non-default only). Hidden entirely when only `default` exists
and is empty — single-project users never see it.
- New `NewBoardDialog` modal: slug / display name / description / icon
+ "switch to this board after creating" checkbox.
- Selected board persists to `localStorage` so browser users don't
shift the CLI's active board out from under a terminal they left open.
- New `?board=<slug>` query param on every existing endpoint plus a
new `/boards` CRUD surface (`GET /boards`, `POST /boards`,
`PATCH /boards/<slug>`, `DELETE /boards/<slug>`,
`POST /boards/<slug>/switch`).
- Events WebSocket is pinned to a board at connection time; switching
opens a fresh WS against the new board.
Also fixes a pre-existing bug in the plugin's tenant / assignee
filters: the SDK's `Select` uses `onValueChange(value)`, not
native `onChange(event)`, so those filters silently didn't work.
New `selectChangeHandler` helper wires both signatures.
Tests
-----
49 new tests in `tests/hermes_cli/test_kanban_boards.py` covering:
slug validation (valid / invalid / auto-downcase), path resolution
(default = legacy path, named = `boards/<slug>/`, env var override),
current-board resolution chain (env > file > default), board CRUD +
archive / hard-delete, per-board connection isolation (tasks don't
leak), worker spawn env injection (`HERMES_KANBAN_BOARD`,
`HERMES_KANBAN_DB`, `HERMES_KANBAN_WORKSPACES_ROOT` all point at the
right board), and end-to-end CLI surface.
Regression surface: all 264 pre-existing kanban tests continue to pass.
Live-tested via the dashboard: created 3 boards (default,
hermes-agent, atm10-server), created tasks on each via both CLI
(`--board <slug> create`) and dashboard (inline create on the Ready
column), confirmed zero cross-board leakage, confirmed `BoardSwitcher`
+ `NewBoardDialog` work end-to-end in the browser.
2026-05-04 04:42:38 -07:00
def create_task(payload: CreateTaskBody, board: Optional[str] = Query(None)):
    board = _resolve_board(board)
    conn = _conn(board=board)
    try:
        task_id = kanban_db.create_task(
            conn,
            title=payload.title,
            body=payload.body,
            assignee=payload.assignee,
            created_by="dashboard",
            workspace_kind=payload.workspace_kind,
            workspace_path=payload.workspace_path,
            tenant=payload.tenant,
            priority=payload.priority,
            parents=payload.parents,
            triage=payload.triage,
            idempotency_key=payload.idempotency_key,
            max_runtime_seconds=payload.max_runtime_seconds,
            skills=payload.skills,
        )
        task = kanban_db.get_task(conn, task_id)
        body: dict[str, Any] = {"task": _task_dict(task) if task else None}
        # Surface a dispatcher-presence warning so the UI can show a
        # banner when a `ready` task would otherwise sit idle because no
        # gateway is running (or dispatch_in_gateway=false). Only emit
        # for ready+assigned tasks; triage/todo are expected to wait,
        # and unassigned tasks can't be dispatched regardless.
        if task and task.status == "ready" and task.assignee:
            try:
                from hermes_cli.kanban import _check_dispatcher_presence
                running, message = _check_dispatcher_presence()
                if not running and message:
                    body["warning"] = message
            except Exception:
                # Probe failure must never block the create itself.
                pass
        return body
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    finally:
        conn.close()


# ---------------------------------------------------------------------------
# PATCH /tasks/:id (status / assignee / priority / title / body)
# ---------------------------------------------------------------------------

class UpdateTaskBody(BaseModel):
    status: Optional[str] = None
    assignee: Optional[str] = None
    priority: Optional[int] = None
    title: Optional[str] = None
    body: Optional[str] = None
    result: Optional[str] = None
    block_reason: Optional[str] = None
    # Structured handoff fields — forwarded to complete_task when status
    # transitions to 'done'. Dashboard parity with ``hermes kanban
    # complete --summary ... --metadata ...``.
    summary: Optional[str] = None
    metadata: Optional[dict] = None


@router.patch("/tasks/{task_id}")
def update_task(task_id: str, payload: UpdateTaskBody, board: Optional[str] = Query(None)):
    board = _resolve_board(board)
    conn = _conn(board=board)
    try:
        task = kanban_db.get_task(conn, task_id)
        if task is None:
            raise HTTPException(status_code=404, detail=f"task {task_id} not found")

        # --- assignee ----------------------------------------------------
        if payload.assignee is not None:
            try:
                ok = kanban_db.assign_task(
                    conn, task_id, payload.assignee or None,
                )
            except RuntimeError as e:
                raise HTTPException(status_code=409, detail=str(e))
            if not ok:
                raise HTTPException(status_code=404, detail="task not found")

        # --- status -------------------------------------------------------
        if payload.status is not None:
            s = payload.status
            ok = True
            if s == "done":
                ok = kanban_db.complete_task(
                    conn, task_id,
                    result=payload.result,
                    summary=payload.summary,
                    metadata=payload.metadata,
                )
            elif s == "blocked":
                ok = kanban_db.block_task(conn, task_id, reason=payload.block_reason)
            elif s == "ready":
                # Re-open a blocked task, or just an explicit status set.
                current = kanban_db.get_task(conn, task_id)
                if current and current.status == "blocked":
                    ok = kanban_db.unblock_task(conn, task_id)
                else:
                    # Direct status write for drag-drop (todo -> ready etc).
                    ok = _set_status_direct(conn, task_id, "ready")
            elif s == "archived":
                ok = kanban_db.archive_task(conn, task_id)
2026-05-04 14:47:13 +08:00
            elif s == "running":
                raise HTTPException(
                    status_code=400,
                    detail="Cannot set status to 'running' directly; use the dispatcher/claim path",
                )
            elif s in ("todo", "triage"):
                ok = _set_status_direct(conn, task_id, s)
            else:
                raise HTTPException(status_code=400, detail=f"unknown status: {s}")
            if not ok:
                raise HTTPException(
                    status_code=409,
                    detail=f"status transition to {s!r} not valid from current state",
                )

        # --- priority -----------------------------------------------------
        if payload.priority is not None:
            with kanban_db.write_txn(conn):
                conn.execute(
                    "UPDATE tasks SET priority = ? WHERE id = ?",
                    (int(payload.priority), task_id),
                )
                conn.execute(
                    "INSERT INTO task_events (task_id, kind, payload, created_at) "
                    "VALUES (?, 'reprioritized', ?, ?)",
                    (task_id, json.dumps({"priority": int(payload.priority)}),
                     int(time.time())),
                )

        # --- title / body -------------------------------------------------
        if payload.title is not None or payload.body is not None:
            with kanban_db.write_txn(conn):
                sets, vals = [], []
                if payload.title is not None:
                    if not payload.title.strip():
                        raise HTTPException(status_code=400, detail="title cannot be empty")
                    sets.append("title = ?")
                    vals.append(payload.title.strip())
                if payload.body is not None:
                    sets.append("body = ?")
                    vals.append(payload.body)
                vals.append(task_id)
                conn.execute(
                    f"UPDATE tasks SET {', '.join(sets)} WHERE id = ?", vals,
                )
                conn.execute(
                    "INSERT INTO task_events (task_id, kind, payload, created_at) "
                    "VALUES (?, 'edited', NULL, ?)",
                    (task_id, int(time.time())),
                )

        updated = kanban_db.get_task(conn, task_id)
        return {"task": _task_dict(updated) if updated else None}
    finally:
        conn.close()
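The status branch of the handler above amounts to a small routing table from the requested status to a kernel verb. The following is a minimal sketch of that routing as a pure function, useful for reasoning about it in isolation. The function name `plan_status_change` and the returned verb strings are hypothetical labels introduced for illustration; the real handler calls the `kanban_db` functions directly and raises `HTTPException` rather than `ValueError`.

```python
from typing import Optional

# Statuses written directly with no structured verb (per the handler above).
DIRECT_WRITE = ("todo", "triage")


def plan_status_change(requested: str, current: Optional[str]) -> str:
    """Return a label for the verb the handler would route this request to.

    Hypothetical helper: the labels mirror the kanban_db call names but this
    function performs no database work.
    """
    if requested == "done":
        return "complete_task"
    if requested == "blocked":
        return "block_task"
    if requested == "ready":
        # A blocked task is re-opened via unblock; anything else is a direct write.
        return "unblock_task" if current == "blocked" else "set_status_direct"
    if requested == "archived":
        return "archive_task"
    if requested == "running":
        # The handler answers 400 here; 'running' is owned by the dispatcher/claim path.
        raise ValueError("cannot set status to 'running' directly")
    if requested in DIRECT_WRITE:
        return "set_status_direct"
    raise ValueError(f"unknown status: {requested}")
```

Under these assumptions, `plan_status_change("ready", "blocked")` routes to the unblock verb, while the same request from `todo` falls through to the direct write, which is where the parent-completion guard applies.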


def _set_status_direct(
    conn: sqlite3.Connection, task_id: str, new_status: str,
) -> bool:
    """Direct status write for drag-drop moves that aren't covered by the
    structured complete/block/unblock/archive verbs (e.g. todo<->ready,
    running<->ready). Appends a ``status`` event row for the live feed.

    When this transitions OFF ``running`` to anything other than the
    terminal verbs above (which own their own run closing), we close the
    active run with outcome='reclaimed' so attempt history isn't
    orphaned. ``running -> ready`` via drag-drop is the common case
    (user yanking a stuck worker back to the queue).
    """
    with kanban_db.write_txn(conn):
        # Snapshot current state so we know whether to close a run.
        prev = conn.execute(
            "SELECT status, current_run_id FROM tasks WHERE id = ?",
            (task_id,),
        ).fetchone()
        if prev is None:
            return False
2026-05-04 20:18:40 +08:00

        # Guard: don't allow promoting to 'ready' unless all parents are done.
        # Prevents the dispatcher from spawning a child whose upstream work
        # hasn't completed (e.g. T4 dispatched while T3 is still blocked).
        if new_status == "ready":
            parent_statuses = conn.execute(
                "SELECT t.status FROM tasks t "
                "JOIN task_links l ON l.parent_id = t.id "
                "WHERE l.child_id = ?",
                (task_id,),
            ).fetchall()
            if parent_statuses and not all(
                p["status"] == "done" for p in parent_statuses
            ):
                return False

        was_running = prev["status"] == "running"

        cur = conn.execute(
            "UPDATE tasks SET status = ?, "
            " claim_lock = CASE WHEN ? = 'running' THEN claim_lock ELSE NULL END, "
            " claim_expires = CASE WHEN ? = 'running' THEN claim_expires ELSE NULL END, "
            " worker_pid = CASE WHEN ? = 'running' THEN worker_pid ELSE NULL END "
            "WHERE id = ?",
            (new_status, new_status, new_status, new_status, task_id),
        )
        if cur.rowcount != 1:
            return False
        run_id = None
        if was_running and new_status != "running" and prev["current_run_id"]:
            run_id = kanban_db._end_run(
                conn, task_id,
                outcome="reclaimed", status="reclaimed",
                summary=f"status changed to {new_status} (dashboard/direct)",
            )
        conn.execute(
            "INSERT INTO task_events (task_id, run_id, kind, payload, created_at) "
            "VALUES (?, ?, 'status', ?, ?)",
            (task_id, run_id, json.dumps({"status": new_status}), int(time.time())),
        )
        # If we re-opened something, children may have gone stale.
        if new_status in ("done", "ready"):
            kanban_db.recompute_ready(conn)
        return True
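The parent guard in `_set_status_direct` can be exercised on its own against an in-memory database. The sketch below reuses the same join over `tasks` and `task_links`, but the two-column schema, the `demo_conn` fixture, and the helper name `parents_all_done` are assumptions made for illustration; the real tables in `kanban_db` carry many more columns.

```python
import sqlite3


def parents_all_done(conn: sqlite3.Connection, task_id: str) -> bool:
    """True when every parent of task_id is 'done' (vacuously true with no parents)."""
    rows = conn.execute(
        "SELECT t.status FROM tasks t "
        "JOIN task_links l ON l.parent_id = t.id "
        "WHERE l.child_id = ?",
        (task_id,),
    ).fetchall()
    return all(r["status"] == "done" for r in rows)


def demo_conn() -> sqlite3.Connection:
    """Build a tiny, hypothetical schema: T3 (blocked) is the parent of T4."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # allows r["status"] lookups, as in the guard
    conn.executescript(
        """
        CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT NOT NULL);
        CREATE TABLE task_links (parent_id TEXT NOT NULL, child_id TEXT NOT NULL);
        INSERT INTO tasks VALUES ('T3', 'blocked'), ('T4', 'todo'), ('T5', 'todo');
        INSERT INTO task_links VALUES ('T3', 'T4');
        """
    )
    return conn
```

With this fixture, T4 cannot be promoted while T3 is blocked, T5 (no parents) can be promoted immediately, and T4 becomes promotable once T3 is marked done.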


# ---------------------------------------------------------------------------
# Comments
# ---------------------------------------------------------------------------

class CommentBody(BaseModel):
    body: str
    author: Optional[str] = "dashboard"


@router.post("/tasks/{task_id}/comments")
feat(kanban): multi-project boards — one install, many kanbans (#19653)
Adds first-class board support to kanban so users can separate unrelated
streams of work (projects, repos, domains) into isolated queues. Single-
project users stay on the 'default' board and see no UI change.
Isolation model
---------------
- Each board is a directory at `~/.hermes/kanban/boards/<slug>/` with
its own `kanban.db`, `workspaces/`, and `logs/`. The 'default' board
keeps its legacy path (`~/.hermes/kanban.db`) for back-compat — fresh
installs and pre-boards users get zero migration.
- Workers spawned by the dispatcher have `HERMES_KANBAN_BOARD` pinned in
their env alongside the existing `HERMES_KANBAN_DB` /
`HERMES_KANBAN_WORKSPACES_ROOT` pins, so workers physically cannot see
other boards' tasks.
- The gateway's single dispatcher loop now sweeps every board per tick;
per-tick cost is a few extra filesystem stats.
- CAS concurrency guarantees are preserved per-board (each board is its
own SQLite DB, same WAL+IMMEDIATE machinery as before).
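The path rules in the isolation model above can be sketched as a single lookup. This is an illustration only: `board_db_path` is a hypothetical name, `hermes_home` stands in for `~/.hermes`, and any environment-variable override is deliberately left out.

```python
from pathlib import Path


def board_db_path(slug: str, hermes_home: Path) -> Path:
    """Return the SQLite file for a board (hypothetical helper).

    'default' keeps the legacy location; every other board lives under
    kanban/boards/<slug>/ next to its own workspaces/ and logs/.
    """
    if slug == "default":
        return hermes_home / "kanban.db"
    return hermes_home / "kanban" / "boards" / slug / "kanban.db"
```

Because each board resolves to a distinct database file, a connection opened for one board has no tables in common with another, which is what makes the per-board CAS guarantees independent.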
CLI
---
hermes kanban boards list|create|switch|show|rename|rm
hermes kanban --board <slug> <any-subcommand>
Board resolution order: `--board` flag → `HERMES_KANBAN_BOARD` env →
`~/.hermes/kanban/current` file → `default`. Slug validation is strict:
lowercase alphanumerics + hyphens + underscores, 1-64 chars, starts with
alphanumeric. Uppercase is auto-downcased; slashes / dots / `..` /
control chars are rejected so boards can't name their way out of the
boards/ directory.
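A minimal sketch of the slug rule and the resolution order described above. The function names, the regular expression, and the decision to strip surrounding whitespace from candidates are assumptions made for illustration; the actual implementation in `hermes_cli` may differ in detail.

```python
import re
from typing import Mapping, Optional

# 1-64 chars, first char alphanumeric, then lowercase alphanumerics, '-' or '_'.
_SLUG_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,63}")


def normalize_slug(raw: str) -> str:
    """Downcase and validate a board slug; raise ValueError when invalid."""
    slug = raw.lower()
    if not _SLUG_RE.fullmatch(slug):
        # Slashes, dots and control characters never match the pattern,
        # so a slug cannot escape the boards/ directory.
        raise ValueError(f"invalid board slug: {raw!r}")
    return slug


def resolve_board(
    flag: Optional[str],
    env: Mapping[str, str],
    current_file_text: Optional[str],
) -> str:
    """Resolve the active board: --board flag, then env, then current file, then default."""
    for candidate in (flag, env.get("HERMES_KANBAN_BOARD"), current_file_text):
        # Assumption: blank or whitespace-only values count as unset.
        if candidate and candidate.strip():
            return normalize_slug(candidate.strip())
    return "default"
```

Passing the environment and the file contents in as arguments, rather than reading them inside the function, keeps the precedence logic easy to test without touching the filesystem.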
Passive discoverability: when more than one board exists, `hermes kanban
list` prints a one-line header ("Board: foo (2 other boards …)") so
users who stumble across multi-project never have to hunt for the
feature. Invisible for single-board installs.
Dashboard
---------
- New `BoardSwitcher` component at the top of the Kanban tab: dropdown
with all boards + task counts, `+ New board` button, `Archive`
button (non-default only). Hidden entirely when only `default` exists
and is empty — single-project users never see it.
- New `NewBoardDialog` modal: slug / display name / description / icon
+ "switch to this board after creating" checkbox.
- Selected board persists to `localStorage` so browser users don't
shift the CLI's active board out from under a terminal they left open.
- New `?board=<slug>` query param on every existing endpoint plus a
new `/boards` CRUD surface (`GET /boards`, `POST /boards`,
`PATCH /boards/<slug>`, `DELETE /boards/<slug>`,
`POST /boards/<slug>/switch`).
- Events WebSocket is pinned to a board at connection time; switching
opens a fresh WS against the new board.
Also fixes a pre-existing bug in the plugin's tenant / assignee
filters: the SDK's `Select` uses `onValueChange(value)`, not
native `onChange(event)`, so those filters silently didn't work.
New `selectChangeHandler` helper wires both signatures.
Tests
-----
49 new tests in `tests/hermes_cli/test_kanban_boards.py` covering:
slug validation (valid / invalid / auto-downcase), path resolution
(default = legacy path, named = `boards/<slug>/`, env var override),
current-board resolution chain (env > file > default), board CRUD +
archive / hard-delete, per-board connection isolation (tasks don't
leak), worker spawn env injection (`HERMES_KANBAN_BOARD`,
`HERMES_KANBAN_DB`, `HERMES_KANBAN_WORKSPACES_ROOT` all point at the
right board), and end-to-end CLI surface.
Regression surface: all 264 pre-existing kanban tests continue to pass.
Live-tested via the dashboard: created 3 boards (default,
hermes-agent, atm10-server), created tasks on each via both CLI
(`--board <slug> create`) and dashboard (inline create on the Ready
column), confirmed zero cross-board leakage, confirmed `BoardSwitcher`
+ `NewBoardDialog` work end-to-end in the browser.
2026-05-04 04:42:38 -07:00
def add_comment(task_id: str, payload: CommentBody, board: Optional[str] = Query(None)):
    if not payload.body.strip():
        raise HTTPException(status_code=400, detail="body is required")
    board = _resolve_board(board)
    conn = _conn(board=board)
    try:
        if kanban_db.get_task(conn, task_id) is None:
            raise HTTPException(status_code=404, detail=f"task {task_id} not found")
        kanban_db.add_comment(
            conn, task_id, author=payload.author or "dashboard", body=payload.body,
        )
        return {"ok": True}
    finally:
        conn.close()


# ---------------------------------------------------------------------------
# Links
# ---------------------------------------------------------------------------


class LinkBody(BaseModel):
    parent_id: str
    child_id: str


@router.post("/links")
feat(kanban): multi-project boards — one install, many kanbans (#19653)
Adds first-class board support to kanban so users can separate unrelated
streams of work (projects, repos, domains) into isolated queues. Single-
project users stay on the 'default' board and see no UI change.
Isolation model
---------------
- Each board is a directory at `~/.hermes/kanban/boards/<slug>/` with
its own `kanban.db`, `workspaces/`, and `logs/`. The 'default' board
keeps its legacy path (`~/.hermes/kanban.db`) for back-compat — fresh
installs and pre-boards users get zero migration.
- Workers spawned by the dispatcher have `HERMES_KANBAN_BOARD` pinned in
their env alongside the existing `HERMES_KANBAN_DB` /
`HERMES_KANBAN_WORKSPACES_ROOT` pins, so workers physically cannot see
other boards' tasks.
- The gateway's single dispatcher loop now sweeps every board per tick;
per-tick cost is a few extra filesystem stats.
- CAS concurrency guarantees are preserved per-board (each board is its
own SQLite DB, same WAL+IMMEDIATE machinery as before).
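
The on-disk layout described in the bullets above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `hermes_cli/kanban_db.py`: the helper names `board_db_path` and `worker_env` are hypothetical, and `~/.hermes` is taken as the root only because the quoted paths use it.

```python
from pathlib import Path

# Assumed root directory, taken from the paths quoted in the commit message.
HERMES_HOME = Path.home() / ".hermes"


def board_db_path(slug: str, home: Path = HERMES_HOME) -> Path:
    """Return the SQLite file for a board (hypothetical helper)."""
    if slug == "default":
        # The default board keeps its legacy location for back-compat.
        return home / "kanban.db"
    return home / "kanban" / "boards" / slug / "kanban.db"


def worker_env(slug: str, home: Path = HERMES_HOME) -> dict[str, str]:
    """Env pins for a worker on a named board (illustrative only)."""
    if slug == "default":
        # The message does not say where the default board's workspaces
        # live, so this sketch declines to guess.
        raise ValueError("default-board workspace layout is not specified here")
    root = home / "kanban" / "boards" / slug
    return {
        "HERMES_KANBAN_BOARD": slug,
        "HERMES_KANBAN_DB": str(root / "kanban.db"),
        "HERMES_KANBAN_WORKSPACES_ROOT": str(root / "workspaces"),
    }
```

Because every pin points inside one board directory, a worker started with this environment has no path to another board's database.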
CLI
---
hermes kanban boards list|create|switch|show|rename|rm
hermes kanban --board <slug> <any-subcommand>
Board resolution order: `--board` flag → `HERMES_KANBAN_BOARD` env →
`~/.hermes/kanban/current` file → `default`. Slug validation is strict:
lowercase alphanumerics + hyphens + underscores, 1-64 chars, starts with
alphanumeric. Uppercase is auto-downcased; slashes / dots / `..` /
control chars are rejected so boards can't name their way out of the
boards/ directory.
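
The resolution chain and slug rules above might look something like this. It is a sketch only: `normalize_slug` and `resolve_board` are hypothetical names, the regular expression is one reading of the stated rules, and the assumption that the `current` file holds a bare slug is not confirmed by the commit message.

```python
import os
import re
from pathlib import Path
from typing import Optional

# 1-64 chars, lowercase alphanumerics plus '-' and '_', starting alphanumeric.
_SLUG_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,63}")

# Assumed location of the "current board" marker, per the path quoted above.
_CURRENT_FILE = Path.home() / ".hermes" / "kanban" / "current"


def normalize_slug(raw: str) -> str:
    """Downcase a slug, then reject anything outside the allowed alphabet."""
    slug = raw.lower()
    if not _SLUG_RE.fullmatch(slug):
        # Slashes, dots, '..' and control characters all fail the match.
        raise ValueError(f"invalid board slug: {raw!r}")
    return slug


def resolve_board(flag: Optional[str] = None,
                  current_file: Path = _CURRENT_FILE) -> str:
    """Pick the active board: flag, then env, then current file, then default."""
    if flag:
        return normalize_slug(flag)
    env = os.environ.get("HERMES_KANBAN_BOARD")
    if env:
        return normalize_slug(env)
    try:
        saved = current_file.read_text(encoding="utf-8").strip()
    except OSError:
        saved = ""  # a missing or unreadable file falls through to the default
    return normalize_slug(saved) if saved else "default"
```

Validating the slug at every entry point, not just at board creation, is what keeps a crafted `--board` value from escaping the `boards/` directory.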
Passive discoverability: when more than one board exists, `hermes kanban
list` prints a one-line header ("Board: foo (2 other boards …)") so
users who stumble across multi-project never have to hunt for the
feature. Invisible for single-board installs.
Dashboard
---------
- New `BoardSwitcher` component at the top of the Kanban tab: dropdown
with all boards + task counts, `+ New board` button, `Archive`
button (non-default only). Hidden entirely when only `default` exists
and is empty — single-project users never see it.
- New `NewBoardDialog` modal: slug / display name / description / icon
+ "switch to this board after creating" checkbox.
- Selected board persists to `localStorage` so browser users don't
shift the CLI's active board out from under a terminal they left open.
- New `?board=<slug>` query param on every existing endpoint plus a
new `/boards` CRUD surface (`GET /boards`, `POST /boards`,
`PATCH /boards/<slug>`, `DELETE /boards/<slug>`,
`POST /boards/<slug>/switch`).
- Events WebSocket is pinned to a board at connection time; switching
opens a fresh WS against the new board.
Also fixes a pre-existing bug in the plugin's tenant / assignee
filters: the SDK's `Select` uses `onValueChange(value)`, not
native `onChange(event)`, so those filters silently didn't work.
New `selectChangeHandler` helper wires both signatures.
Tests
-----
49 new tests in `tests/hermes_cli/test_kanban_boards.py` covering:
slug validation (valid / invalid / auto-downcase), path resolution
(default = legacy path, named = `boards/<slug>/`, env var override),
current-board resolution chain (env > file > default), board CRUD +
archive / hard-delete, per-board connection isolation (tasks don't
leak), worker spawn env injection (`HERMES_KANBAN_BOARD`,
`HERMES_KANBAN_DB`, `HERMES_KANBAN_WORKSPACES_ROOT` all point at the
right board), and end-to-end CLI surface.
Regression surface: all 264 pre-existing kanban tests continue to pass.
Live-tested via the dashboard: created 3 boards (default,
hermes-agent, atm10-server), created tasks on each via both CLI
(`--board <slug> create`) and dashboard (inline create on the Ready
column), confirmed zero cross-board leakage, confirmed `BoardSwitcher`
+ `NewBoardDialog` work end-to-end in the browser.
2026-05-04 04:42:38 -07:00
def add_link(payload: LinkBody, board: Optional[str] = Query(None)):
    board = _resolve_board(board)
    conn = _conn(board=board)
    try:
        kanban_db.link_tasks(conn, payload.parent_id, payload.child_id)
        return {"ok": True}
    except ValueError as e:
        raise HTTPException(status_code=400, detail=str(e))
    finally:
        conn.close()


@router.delete("/links")
def delete_link(
    parent_id: str = Query(...),
    child_id: str = Query(...),
    board: Optional[str] = Query(None),
):
    board = _resolve_board(board)
    conn = _conn(board=board)
    try:
        ok = kanban_db.unlink_tasks(conn, parent_id, child_id)
        return {"ok": bool(ok)}
    finally:
        conn.close()


# ---------------------------------------------------------------------------
# Bulk actions (multi-select on the board)
# ---------------------------------------------------------------------------


class BulkTaskBody(BaseModel):
    ids: list[str]
    status: Optional[str] = None
    assignee: Optional[str] = None  # "" or None = unassign
    priority: Optional[int] = None
    archive: bool = False
2026-05-05 11:07:13 +08:00
    result: Optional[str] = None
    summary: Optional[str] = None
    metadata: Optional[dict] = None


@router.post("/tasks/bulk")
def bulk_update(payload: BulkTaskBody, board: Optional[str] = Query(None)):
feat(kanban): durable multi-profile collaboration board (#17805)
Salvage of PR #16100 onto current main (after emozilla's #17514 fix
that unblocks plugin Pydantic body validation). History preserved on
the standing `feat/kanban-standing` branch; this squashes the 22
iterative commits into one clean landing.
What this lands:
- SQLite kernel (hermes_cli/kanban_db.py) — durable task board with
tasks, task_links, task_runs, task_comments, task_events,
kanban_notify_subs tables. WAL mode, atomic claim via CAS,
tenant-namespaced, skills JSON array per task, max-runtime timeouts,
worker heartbeats, idempotency keys, circuit breaker on repeated
spawn failures, crash detection via /proc/<pid>/status, run history
preserved across attempts.
- Dispatcher — runs inside the gateway by default
(`kanban.dispatch_in_gateway: true`). Ticks every 60s, reclaims
stale claims, promotes ready tasks, spawns `hermes -p <assignee>
chat -q "work kanban task <id>"` with HERMES_KANBAN_TASK +
HERMES_KANBAN_WORKSPACE env. Auto-loads `--skills kanban-worker`
plus any per-task skills. Health telemetry warns on stuck ready
queue.
- Structured tool surface (tools/kanban_tools.py) — 7 tools
(kanban_show, kanban_complete, kanban_block, kanban_heartbeat,
kanban_comment, kanban_create, kanban_link). Gated on
HERMES_KANBAN_TASK via check_fn so zero schema footprint in normal
sessions.
- System-prompt guidance (agent/prompt_builder.py KANBAN_GUIDANCE)
injected only when kanban tools are active.
- Dashboard plugin (plugins/kanban/dashboard/) — Linear-style board
UI: triage/todo/ready/running/blocked/done columns, drag-drop,
inline create, task drawer with markdown, comments, run history,
dependency editor, bulk ops, lanes-by-profile grouping, WS-driven
live refresh. Matches active dashboard theme via CSS variables.
- CLI — `hermes kanban init|create|list|show|assign|link|unlink|
claim|comment|complete|block|unblock|archive|tail|dispatch|context|
init|gc|watch|stats|notify|log|heartbeat|runs|assignees` +
`/kanban` slash in-session.
- Worker + orchestrator skills (skills/devops/kanban-worker +
kanban-orchestrator) — pattern library for good summary/metadata
shapes, retry diagnostics, block-reason examples, fan-out patterns.
- Per-task force-loaded skills — `--skill <name>` (repeatable),
stored as JSON, threaded through to dispatcher argv as one
`--skills X` pair per skill alongside the built-in kanban-worker.
Dashboard + CLI + tool parity.
- Deprecation of standalone `hermes kanban daemon` — stub exits 2
with migration guidance; `--force` escape hatch for headless hosts.
- Docs (website/docs/user-guide/features/kanban.md + kanban-tutorial.md)
with 11 dashboard screenshots walking through four user stories
(Solo Dev, Fleet Farming, Role Pipeline, Circuit Breaker).
- Tests (251 passing): kernel schema + migration + CAS atomicity,
dispatcher logic, circuit breaker, crash detection, max-runtime
timeouts, claim lifecycle, tenant isolation, idempotency keys, per-
task skills round-trip + validation + dispatcher argv, tool surface
(7 tools × round-trip + error paths), dashboard REST (CRUD + bulk
+ links + warnings), gateway-embedded dispatcher (config gate, env
override, graceful shutdown), CLI deprecation stub, migration from
legacy schemas.
Gateway integration:
- GatewayRunner._kanban_dispatcher_watcher — new asyncio background
task, symmetric with _kanban_notifier_watcher. Runs dispatch_once
via asyncio.to_thread so SQLite WAL never blocks the loop. Sleeps
in 1s slices for snappy shutdown. Respects HERMES_KANBAN_DISPATCH_IN_GATEWAY=0
env override for debugging.
- Config: new `kanban` section in DEFAULT_CONFIG with
`dispatch_in_gateway: true` (default) + `dispatch_interval_seconds: 60`.
Additive: no _config_version bump needed.
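The watcher loop described above can be sketched roughly as follows. This is a minimal illustration, not the actual `GatewayRunner._kanban_dispatcher_watcher`: the function name, its parameters, and the `stop` event are assumptions made for the example; only `dispatch_once`, `asyncio.to_thread`, the 60s interval, and the 1s sleep slices come from the text.

```python
import asyncio
from typing import Callable


async def dispatcher_watcher(
    dispatch_once: Callable[[], None],
    stop: asyncio.Event,
    interval_seconds: float = 60.0,
    slice_seconds: float = 1.0,
) -> int:
    """Call ``dispatch_once`` off-loop once per interval until ``stop`` is set.

    Returns the number of completed ticks (hypothetical, handy for tests).
    """
    ticks = 0
    while not stop.is_set():
        # Blocking SQLite work runs in a worker thread, not on the event loop.
        await asyncio.to_thread(dispatch_once)
        ticks += 1
        # Sleep in short slices so a shutdown request is noticed quickly.
        remaining = interval_seconds
        while remaining > 0 and not stop.is_set():
            step = min(slice_seconds, remaining)
            await asyncio.sleep(step)
            remaining -= step
    return ticks
```

Setting `stop` from the event loop ends the loop within roughly one sleep slice instead of a full interval.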
Forward-compat:
- workflow_template_id / current_step_key columns on tasks (v1 writes
NULL; v2 will use them for routing).
- task_runs holds claim machinery (claim_lock, claim_expires,
worker_pid, last_heartbeat_at) so multi-attempt history is first-
class from day one.
Closes #16102.
Co-authored-by: emozilla <emozilla@nousresearch.com>
2026-04-30 13:36:47 -07:00
    """Apply the same patch to every id in ``payload.ids``.

    This is an *independent* iteration: per-task failures don't abort
    siblings. Returns per-id outcome so the UI can surface partials.
    """
    ids = [i for i in (payload.ids or []) if i]
    if not ids:
        raise HTTPException(status_code=400, detail="ids is required")
    results: list[dict] = []
feat(kanban): multi-project boards — one install, many kanbans (#19653)
Adds first-class board support to kanban so users can separate unrelated
streams of work (projects, repos, domains) into isolated queues. Single-
project users stay on the 'default' board and see no UI change.
Isolation model
---------------
- Each board is a directory at `~/.hermes/kanban/boards/<slug>/` with
its own `kanban.db`, `workspaces/`, and `logs/`. The 'default' board
keeps its legacy path (`~/.hermes/kanban.db`) for back-compat — fresh
installs and pre-boards users get zero migration.
- Workers spawned by the dispatcher have `HERMES_KANBAN_BOARD` pinned in
their env alongside the existing `HERMES_KANBAN_DB` /
`HERMES_KANBAN_WORKSPACES_ROOT` pins, so workers physically cannot see
other boards' tasks.
- The gateway's single dispatcher loop now sweeps every board per tick;
per-tick cost is a few extra filesystem stats.
- CAS concurrency guarantees are preserved per-board (each board is its
own SQLite DB, same WAL+IMMEDIATE machinery as before).
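A rough sketch of the path layout and worker environment pinning described above. The helper names and the `hermes_home` / `workspaces_root` parameters are hypothetical; only the directory layout, the legacy default path, and the three environment variable names come from the text. The default board's workspaces location is not stated, so the sketch takes it as an argument rather than guessing.

```python
import os
from pathlib import Path


def board_db_path(slug: str, hermes_home: Path) -> Path:
    """Return the SQLite path for a board; 'default' keeps its legacy path."""
    if slug == "default":
        return hermes_home / "kanban.db"
    return hermes_home / "kanban" / "boards" / slug / "kanban.db"


def worker_env(slug: str, hermes_home: Path, workspaces_root: Path) -> dict[str, str]:
    """Build the environment for a spawned worker, pinned to one board."""
    env = dict(os.environ)
    env["HERMES_KANBAN_BOARD"] = slug
    env["HERMES_KANBAN_DB"] = str(board_db_path(slug, hermes_home))
    env["HERMES_KANBAN_WORKSPACES_ROOT"] = str(workspaces_root)
    return env
```

Because the worker only ever receives one database path, it has no handle on any other board's tasks.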
CLI
---
hermes kanban boards list|create|switch|show|rename|rm
hermes kanban --board <slug> <any-subcommand>
Board resolution order: `--board` flag → `HERMES_KANBAN_BOARD` env →
`~/.hermes/kanban/current` file → `default`. Slug validation is strict:
lowercase alphanumerics + hyphens + underscores, 1-64 chars, starts with
alphanumeric. Uppercase is auto-downcased; slashes / dots / `..` /
control chars are rejected so boards can't name their way out of the
boards/ directory.
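The resolution order and slug rules above could look something like this. It is a sketch under stated assumptions: `normalize_slug` and `resolve_board` are illustrative names, and the regular expression is one reading of "lowercase alphanumerics + hyphens + underscores, 1-64 chars, starts with alphanumeric", not the project's actual validator.

```python
import re
from typing import Optional

# One leading alphanumeric plus up to 63 more characters = 1-64 chars total.
_SLUG_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,63}")


def normalize_slug(raw: str) -> str:
    """Downcase the slug, then reject anything outside the allowed alphabet."""
    slug = raw.lower()
    if not _SLUG_RE.fullmatch(slug):
        raise ValueError(f"invalid board slug: {raw!r}")
    return slug


def resolve_board(
    flag: Optional[str],
    env_value: Optional[str],
    current_file_value: Optional[str],
) -> str:
    """First non-empty source wins: flag, then env, then current file."""
    for candidate in (flag, env_value, current_file_value):
        if candidate:
            return normalize_slug(candidate)
    return "default"
```

Since slashes and dots are not in the allowed alphabet, inputs such as `../etc` fail validation before any path is built.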
Passive discoverability: when more than one board exists, `hermes kanban
list` prints a one-line header ("Board: foo (2 other boards …)") so
users who stumble across multi-project never have to hunt for the
feature. Invisible for single-board installs.
Dashboard
---------
- New `BoardSwitcher` component at the top of the Kanban tab: dropdown
with all boards + task counts, `+ New board` button, `Archive`
button (non-default only). Hidden entirely when only `default` exists
and is empty — single-project users never see it.
- New `NewBoardDialog` modal: slug / display name / description / icon
+ "switch to this board after creating" checkbox.
- Selected board persists to `localStorage` so browser users don't
shift the CLI's active board out from under a terminal they left open.
- New `?board=<slug>` query param on every existing endpoint plus a
new `/boards` CRUD surface (`GET /boards`, `POST /boards`,
`PATCH /boards/<slug>`, `DELETE /boards/<slug>`,
`POST /boards/<slug>/switch`).
- Events WebSocket is pinned to a board at connection time; switching
opens a fresh WS against the new board.
Also fixes a pre-existing bug in the plugin's tenant / assignee
filters: the SDK's `Select` uses `onValueChange(value)`, not
native `onChange(event)`, so those filters silently didn't work.
New `selectChangeHandler` helper wires both signatures.
Tests
-----
49 new tests in `tests/hermes_cli/test_kanban_boards.py` covering:
slug validation (valid / invalid / auto-downcase), path resolution
(default = legacy path, named = `boards/<slug>/`, env var override),
current-board resolution chain (env > file > default), board CRUD +
archive / hard-delete, per-board connection isolation (tasks don't
leak), worker spawn env injection (`HERMES_KANBAN_BOARD`,
`HERMES_KANBAN_DB`, `HERMES_KANBAN_WORKSPACES_ROOT` all point at the
right board), and end-to-end CLI surface.
Regression surface: all 264 pre-existing kanban tests continue to pass.
Live-tested via the dashboard: created 3 boards (default,
hermes-agent, atm10-server), created tasks on each via both CLI
(`--board <slug> create`) and dashboard (inline create on the Ready
column), confirmed zero cross-board leakage, confirmed `BoardSwitcher`
+ `NewBoardDialog` work end-to-end in the browser.
2026-05-04 04:42:38 -07:00
    board = _resolve_board(board)
    conn = _conn(board=board)
    try:
        for tid in ids:
            entry: dict[str, Any] = {"id": tid, "ok": True}
            try:
                task = kanban_db.get_task(conn, tid)
                if task is None:
                    entry.update(ok=False, error="not found")
                    results.append(entry)
                    continue
                if payload.archive:
                    if not kanban_db.archive_task(conn, tid):
                        entry.update(ok=False, error="archive refused")
                if payload.status is not None and not payload.archive:
                    s = payload.status
                    if s == "done":
2026-05-05 11:07:13 +08:00
                        ok = kanban_db.complete_task(
                            conn, tid,
                            result=payload.result,
                            summary=payload.summary,
                            metadata=payload.metadata,
                        )
                    elif s == "blocked":
                        ok = kanban_db.block_task(conn, tid)
                    elif s == "ready":
                        cur = kanban_db.get_task(conn, tid)
                        if cur and cur.status == "blocked":
                            ok = kanban_db.unblock_task(conn, tid)
                        else:
                            ok = _set_status_direct(conn, tid, "ready")
                    elif s in ("todo", "running", "triage"):
                        ok = _set_status_direct(conn, tid, s)
                    else:
                        entry.update(ok=False, error=f"unknown status {s!r}")
                        results.append(entry)
                        continue
                    if not ok:
                        entry.update(ok=False, error=f"transition to {s!r} refused")
                if payload.assignee is not None:
                    try:
                        if not kanban_db.assign_task(
                            conn, tid, payload.assignee or None,
                        ):
                            entry.update(ok=False, error="assign refused")
                    except RuntimeError as e:
                        entry.update(ok=False, error=str(e))
                if payload.priority is not None:
                    with kanban_db.write_txn(conn):
                        conn.execute(
                            "UPDATE tasks SET priority = ? WHERE id = ?",
                            (int(payload.priority), tid),
                        )
                        conn.execute(
                            "INSERT INTO task_events (task_id, kind, payload, created_at) "
                            "VALUES (?, 'reprioritized', ?, ?)",
                            (tid, json.dumps({"priority": int(payload.priority)}),
                             int(time.time())),
                        )
            except Exception as e:  # defensive: one bad id shouldn't kill the batch
                entry.update(ok=False, error=str(e))
            results.append(entry)
        return {"results": results}
    finally:
        conn.close()

feat(kanban): generic diagnostics engine for task distress signals (#20332)
* feat(kanban): generic diagnostics engine for task distress signals
Replaces the hallucination-specific ``warnings`` / ``RecoverySection``
surface (shipped in PR #20232) with a reusable diagnostic-rule engine
that covers five distress kinds in v1 and can be extended without
touching UI code. The "something's wrong with this task" signal is
no longer limited to phantom card ids.
Closes the follow-up from #20232 discussion.
New module
----------
``hermes_cli/kanban_diagnostics.py`` — stateless, no-side-effect rule
engine. Each rule is a pure function of
``(task, events, runs, now, config) -> list[Diagnostic]``. Registry
is a simple list; adding a new distress kind is one function + one
import, no UI or API changes required.
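As an illustration of the rule-engine shape described above, here is a minimal sketch. The `Diagnostic` fields, the `run_rules` helper, the `blocked_since` / `stuck_blocked_seconds` names, and the contents of `SEVERITY_ORDER` are assumptions for the example; the real module may differ. The sample rule is also simplified: it ignores the "no comments / unblock attempts" condition.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

# Assumed ordering, least to most severe.
SEVERITY_ORDER = ["warning", "error", "critical"]


@dataclass
class Diagnostic:
    kind: str
    severity: str
    title: str
    data: dict[str, Any] = field(default_factory=dict)


# A rule is a pure function of (task, events, runs, now, config).
Rule = Callable[[dict, list, list, float, dict], list[Diagnostic]]


def stuck_in_blocked(task, events, runs, now, config):
    """Sample rule: warn once a task has sat in 'blocked' long enough."""
    if task.get("status") != "blocked":
        return []
    blocked_for = now - task.get("blocked_since", now)
    if blocked_for < config.get("stuck_blocked_seconds", 24 * 3600):
        return []
    return [Diagnostic("stuck_in_blocked", "warning", "Task stuck in blocked")]


# The registry is a plain list; adding a rule means appending a function.
RULES: list[Rule] = [stuck_in_blocked]


def run_rules(task, events, runs, now, config, rules=None):
    """Run every rule, isolating failures, and sort most severe first."""
    found: list[Diagnostic] = []
    for rule in RULES if rules is None else rules:
        try:
            found.extend(rule(task, events, runs, now, config))
        except Exception:
            # A broken rule must not hide the findings of the others.
            continue
    rank = {s: i for i, s in enumerate(SEVERITY_ORDER)}
    found.sort(key=lambda d: rank.get(d.severity, -1), reverse=True)
    return found
```

Keeping rules free of side effects means the same registry can serve both the CLI and the dashboard API without coordination.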
v1 rule set
-----------
* ``hallucinated_cards`` (error) — folds the existing
``completion_blocked_hallucination`` event into the new surface.
* ``prose_phantom_refs`` (warning) — folds
``suspected_hallucinated_references``.
* ``repeated_spawn_failures`` (error → critical at 2x threshold) —
fires when ``tasks.spawn_failures >= 3``; suggests
``hermes -p <profile> doctor`` / ``auth``.
* ``repeated_crashes`` (error → critical) — fires after N consecutive
``crashed`` run outcomes with no successful completion between;
suggests ``hermes kanban log <id>``.
* ``stuck_in_blocked`` (warning) — fires after 24h in ``blocked``
state with no comments / unblock attempts; suggests commenting.
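The severity escalation for `repeated_spawn_failures` might be expressed as below. The function name is hypothetical, and reading "critical at 2x threshold" as "greater than or equal to twice the threshold" is an assumption; the threshold of 3 is the one stated above.

```python
from typing import Optional


def spawn_failure_severity(spawn_failures: int, threshold: int = 3) -> Optional[str]:
    """Return None below the threshold, 'error' at it, 'critical' at 2x."""
    if spawn_failures < threshold:
        return None
    if spawn_failures >= 2 * threshold:
        return "critical"
    return "error"
```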
Every diagnostic carries structured ``actions`` (reclaim, reassign,
unblock, cli_hint, comment, open_docs) that render consistently in
both CLI and dashboard. Suggested actions are highlighted; generic
recovery actions (reclaim / reassign) are available on every kind as
fallbacks.
Diagnostics auto-clear when the underlying failure resolves — a
clean ``completed``/``edited`` event drops hallucination diagnostics,
a successful run drops crash diagnostics, a comment drops
stuck-blocked diagnostics. Audit events persist; the badge goes away.
API
---
``plugin_api.py``:
* ``/board`` now attaches ``diagnostics`` (full list) and
``warnings`` (compact summary with ``highest_severity``) per task.
* ``/tasks/{id}`` attaches diagnostics so the drawer's Diagnostics
section auto-opens on flagged tasks.
* NEW ``/diagnostics`` endpoint — fleet-wide listing, filterable by
severity, sorted critical-first.
CLI
---
* NEW ``hermes kanban diagnostics [--severity X] [--task id]
[--json]`` — fleet view or single-task view, matches dashboard rule
output so CLI users see the same picture.
* ``hermes kanban show <id>`` now renders a Diagnostics section near
the top with severity markers + suggested actions.
Dashboard
---------
* Card badge is severity-coloured (⚠ amber warning, !! orange error,
!!! red critical) using ``warnings.highest_severity``.
* Attention strip above the toolbar counts EVERY task with active
diagnostics (not just hallucinations), severity-coloured, lists
affected tasks with Open buttons when expanded.
* Drawer's old ``RecoverySection`` replaced with generic
``DiagnosticsSection`` rendering a card per active diagnostic:
title + detail + structured data (task-id chips when payload keys
look like id lists) + action buttons. Reassign profile picker is
inline per-diagnostic. Clipboard fallback uses ``.catch()`` for
environments where writeText rejects.
* Three-rung severity palette; amber for warning, orange for error,
red for critical. Uses CSS variables so theming is straightforward.
Tests
-----
* NEW ``tests/hermes_cli/test_kanban_diagnostics.py`` — 14 unit tests
covering each rule's positive/negative/threshold paths, severity
sorting, broken-rule isolation, and sqlite3.Row integration.
* Dashboard plugin tests extended: ``/diagnostics`` endpoint (empty,
populated, severity-filtered), ``/board`` exposes both diagnostic
list and compact summary with ``highest_severity``.
* Existing hallucination-specific test (``test_board_surfaces_
warnings_field_for_hallucinated_completions``) updated to reflect
the new contract: warning summary keys by diagnostic kind
(``hallucinated_cards``) not event kind.
379 kanban-suite tests pass (+16 net from this PR).
Live verification
-----------------
Seeded all 5 diagnostic kinds + one clean + one plain-running task
(7 total) into an isolated HERMES_HOME, spun up the dashboard, and
verified:
* Attention strip: shows ``!! 5 tasks need attention`` in the
error-severity orange; Show expands to a list of 5 rows ordered
critical > error > warning.
* Card badges: error tasks render ``!!`` orange, warning tasks
render ``⚠`` amber, clean and plain-running tasks render no badge.
* Each of the 5 rules opens a correctly-coloured, correctly-styled
diagnostic card in the drawer with its specific suggested action.
* Live reassign from a diagnostic card flipped
``broken-ml-worker → alice`` and the drawer refreshed with the
new assignee + the same diagnostic still firing (correct:
spawn_failures counter hasn't reset yet).
* CLI ``hermes kanban diagnostics`` prints all 5 in severity order;
``--severity error`` narrows to 3; ``kanban show <id>`` includes
the Diagnostics block at the top with suggested action hint.
Migration note
--------------
The old ``warnings`` shape (``{count, kinds, latest_at}``) is
preserved on the API but ``kinds`` now keys by diagnostic kind
(``hallucinated_cards``) instead of event kind
(``completion_blocked_hallucination``). ``highest_severity`` is a
new required field. The dashboard was the only consumer and has
been updated in the same commit; external API consumers of the
``warnings`` field will need to update their kind-match logic.
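For external consumers, one way to handle the key change is a small compatibility check like the following. The helper name is made up for illustration, and treating `kinds` as a container keyed by kind name is an assumption based on the shape described above; the two kind mappings are the ones listed in the v1 rule set.

```python
from typing import Optional

# Old event kinds and the diagnostic kinds that replace them.
EVENT_KIND_TO_DIAGNOSTIC_KIND = {
    "completion_blocked_hallucination": "hallucinated_cards",
    "suspected_hallucinated_references": "prose_phantom_refs",
}


def has_kind(warnings: Optional[dict], event_kind: str) -> bool:
    """True if ``warnings['kinds']`` mentions the kind under either naming."""
    if not warnings:
        return False
    kinds = warnings.get("kinds") or {}
    new_kind = EVENT_KIND_TO_DIAGNOSTIC_KIND.get(event_kind, event_kind)
    return event_kind in kinds or new_kind in kinds
```

A consumer written this way keeps working against both the old and the new payloads during a rollout.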
* feat(kanban/diagnostics): lead titles with the actual error text
The generic 'Worker crashed N runs in a row' / 'Worker failed to spawn
N times' titles buried the actual cause in the data section. Operators
had to open logs or expand the diagnostic to see WHY the worker is
stuck — rate-limit vs insufficient quota vs bad auth vs context
overflow vs network blip all looked identical at a glance.
New titles:
Agent crashed 3x: openai: 429 Too Many Requests - rate limit reached
Agent crashed 3x: anthropic: 402 insufficient_quota - credit balance
Agent crashed 3x: provider auth error: 401 Unauthorized
Agent spawn failed 4x: insufficient_quota: You exceeded your current
Detail keeps the full error snippet (capped at 500 chars + ellipsis
for tracebacks). Title takes the first line capped at 160 chars.
Fallback title if no error recorded stays honest ('no error recorded').
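The title and detail rules above could be sketched as follows. The function names, the constant names, and the use of an ASCII `...` as the ellipsis are assumptions; the 160 and 500 character caps and the fallback text come from the description.

```python
from typing import Optional

TITLE_LINE_MAX = 160
DETAIL_MAX = 500


def build_title(prefix: str, count: int, error: Optional[str]) -> str:
    """Lead the title with the first line of the recorded error, if any."""
    text = (error or "").strip()
    if text:
        first_line = text.splitlines()[0][:TITLE_LINE_MAX]
    else:
        first_line = "no error recorded"
    return f"{prefix} {count}x: {first_line}"


def build_detail(error: Optional[str]) -> str:
    """Keep the error snippet, truncating long tracebacks with an ellipsis."""
    text = (error or "").strip()
    if len(text) > DETAIL_MAX:
        return text[:DETAIL_MAX] + "..."
    return text
```

For example, `build_title("Agent spawn failed", 4, None)` yields the honest fallback rather than an empty suffix.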
Tests: 4 new cases covering 429/billing/spawn/truncation. 383 total
pass (+4).
Live-verified on dashboard with 6 seeded scenarios
(rate-limit, billing, auth, context, network, spawn-billing) —
each card title leads with the actionable error text.
2026-05-05 13:32:42 -07:00
# ---------------------------------------------------------------------------
# Diagnostics: fleet-wide distress signals (hallucinations, crashes,
# spawn failures, stuck-blocked). See hermes_cli.kanban_diagnostics for
# the rule engine.
# ---------------------------------------------------------------------------

@router.get("/diagnostics")
def list_diagnostics(
    board: Optional[str] = Query(None, description="Kanban board slug (omit for current)"),
    severity: Optional[str] = Query(
        None,
        description="Filter by severity: warning|error|critical",
    ),
):
    """Return ``[{task_id, task_title, task_status, task_assignee,
    diagnostics: [...]}, ...]`` for every task on the board with at
    least one active diagnostic.

    Severity-filterable so the UI can render "just the critical ones"
    or the CLI can grep. Useful for the board-header attention strip
    AND for ``hermes kanban diagnostics`` which shells to this
    endpoint when the dashboard's running, or invokes the engine
    directly when it isn't.
    """
    board = _resolve_board(board)
    conn = _conn(board=board)
    try:
        diags_by_task = _compute_task_diagnostics(conn, task_ids=None)
        if not diags_by_task:
            return {"diagnostics": [], "count": 0}

        # Narrow by severity if asked.
        if severity:
            filtered: dict[str, list[dict]] = {}
            for tid, dl in diags_by_task.items():
                keep = [d for d in dl if d.get("severity") == severity]
                if keep:
                    filtered[tid] = keep
            diags_by_task = filtered
            if not diags_by_task:
                return {"diagnostics": [], "count": 0}

        # Pull the task rows we need in one query so we can include
        # titles/statuses without a per-task lookup.
        ids = list(diags_by_task.keys())
        placeholders = ",".join(["?"] * len(ids))
        rows = {
            r["id"]: r
            for r in conn.execute(
                f"SELECT id, title, status, assignee FROM tasks WHERE id IN ({placeholders})",
                tuple(ids),
            ).fetchall()
        }

        out = []
        for tid, dl in diags_by_task.items():
            r = rows.get(tid)
            out.append({
                "task_id": tid,
                "task_title": r["title"] if r else None,
                "task_status": r["status"] if r else None,
                "task_assignee": r["assignee"] if r else None,
                "diagnostics": dl,
            })
        # Sort: highest severity first, then most recent.
        from hermes_cli.kanban_diagnostics import SEVERITY_ORDER
        sev_idx = {s: i for i, s in enumerate(SEVERITY_ORDER)}
        def _sort_key(row):
            top = row["diagnostics"][0]
            return (
                -sev_idx.get(top.get("severity"), -1),
                -(top.get("last_seen_at") or 0),
            )
        out.sort(key=_sort_key)

        return {
            "diagnostics": out,
            "count": sum(len(d["diagnostics"]) for d in out),
        }
    finally:
        conn.close()

feat(kanban): hallucination gate + recovery UX for worker-created-card claims (#20232)
Workers completing a kanban task can now claim the ids of cards they
created via an optional ``created_cards`` field on ``kanban_complete``.
The kernel verifies each id exists and was created by the completing
worker's profile; any phantom id blocks the completion with a
``HallucinatedCardsError`` and records a
``completion_blocked_hallucination`` event on the task so the rejected
attempt is auditable. Successful completions also get a non-blocking
prose-scan pass over their ``summary`` + ``result`` that emits a
``suspected_hallucinated_references`` event for any ``t_<hex>``
reference that doesn't resolve.
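The non-blocking prose scan could be approximated like this. The regular expression is a guess at what a ``t_<hex>`` reference looks like (the id length is not stated), and `unresolved_refs` with its `known_ids` argument is a hypothetical helper, not the kernel's function.

```python
import re
from typing import Iterable, Optional

# Assumed id shape: "t_" followed by one or more lowercase hex digits.
_TASK_REF_RE = re.compile(r"\bt_[0-9a-f]+\b")


def unresolved_refs(
    summary: Optional[str],
    result: Optional[str],
    known_ids: Iterable[str],
) -> list[str]:
    """Return task-id-looking references that do not resolve, in first-seen order."""
    known = set(known_ids)
    text = f"{summary or ''}\n{result or ''}"
    missing: list[str] = []
    for ref in _TASK_REF_RE.findall(text):
        if ref not in known and ref not in missing:
            missing.append(ref)
    return missing
```

A non-empty return value would correspond to recording a ``suspected_hallucinated_references`` event while still letting the completion through.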
Closes #20017.
Recovery UX (kernel + CLI + dashboard)
--------------------------------------
A structural gate alone isn't enough — operators also need to see and
act on stuck workers, especially when a profile's model is the root
cause. This PR ships the full loop:
* ``kanban_db.reclaim_task(task_id)`` — operator-driven reclaim that
releases an active worker claim immediately (unlike
``release_stale_claims`` which only acts after claim_expires has
passed). Emits a ``reclaimed`` event with ``manual: True`` payload.
* ``kanban_db.reassign_task(task_id, profile, reclaim_first=...)`` —
switch a task to a different profile, optionally reclaiming a stuck
running worker in the same call.
* ``hermes kanban reclaim <id> [--reason ...]`` and
``hermes kanban reassign <id> <profile> [--reclaim] [--reason ...]``
CLI subcommands wired through to the same helpers.
* ``POST /api/plugins/kanban/tasks/{id}/reclaim`` and
``POST /api/plugins/kanban/tasks/{id}/reassign`` endpoints on the
dashboard plugin.
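A client for the reassign endpoint might build its request as below. The base URL and the `reassign_request` helper are placeholders for illustration; the path, the `board` query parameter, and the body fields (`profile`, `reclaim_first`, `reason`) follow the endpoint as described here.

```python
import json
from typing import Optional
from urllib.parse import quote, urlencode
from urllib.request import Request


def reassign_request(
    base_url: str,
    task_id: str,
    profile: Optional[str],
    reclaim_first: bool = False,
    reason: Optional[str] = None,
    board: Optional[str] = None,
) -> Request:
    """Build (but do not send) a POST to the reassign endpoint."""
    url = (
        f"{base_url.rstrip('/')}/api/plugins/kanban/tasks/"
        f"{quote(task_id, safe='')}/reassign"
    )
    if board:
        url += "?" + urlencode({"board": board})
    body = {"profile": profile, "reclaim_first": reclaim_first, "reason": reason}
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it with `urllib.request.urlopen(req)` raises `urllib.error.HTTPError` on a 409, which is the response for an unknown id or a still-running task when `reclaim_first` is false.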
Dashboard surfacing
-------------------
* ⚠ **warning badge** on cards with active hallucination events.
* **attention strip** at the top of the board listing all flagged
tasks; dismissible per session.
* **events callout** in the task drawer — hallucination events render
with a red left border, amber icon, and phantom ids as styled chips.
* **recovery section** in the task drawer with three actions: Reclaim,
Reassign (with profile picker + reclaim-first checkbox), and a
copy-to-clipboard hint for ``hermes -p <profile> model`` since
profile config lives on disk and can't be edited from the browser.
Auto-opens when the task has warnings, collapsed otherwise.
Keyed by task id so state doesn't leak between drawers.
Active-vs-stale rule: warnings clear when a clean ``completed`` or
``edited`` event supersedes the hallucination, so recovery is never
permanently stigmatising — the audit events persist for debugging but
the badge goes away once the worker succeeds.
Skill updates
-------------
* ``skills/devops/kanban-worker/SKILL.md`` documents the
``created_cards`` contract with good/bad examples.
* ``skills/devops/kanban-orchestrator/SKILL.md`` gains a "Recovering
stuck workers" section with the three actions and when to use each.
Tests
-----
* Kernel gate: verified-cards manifest, phantom rejection + audit
event, cross-worker rejection, prose scan positive + negative.
* Recovery helpers: reclaim on running task, reclaim on non-running
returns False, reassign refuses running without reclaim_first,
reassign with reclaim_first succeeds on running.
* API endpoints: warnings field present on /board and /tasks/:id,
warnings cleared after clean completion, reclaim 200 + 409 paths,
reassign 200 + 409 + reclaim_first paths.
* CLI smoke: reclaim + reassign subcommands.
Live-verified end-to-end on a dashboard with seeded scenarios:
attention strip renders, badges land on the right cards, drawer
callout shows phantom chips, Reclaim on a running task flips status to
ready + emits manual reclaimed event + refreshes the drawer,
Reassign swaps the assignee and triggers board refresh.
359/359 kanban-suite tests pass
(test_kanban_{db,cli,boards,core_functionality} + dashboard + tools).
2026-05-05 08:06:55 -07:00
|
|
|
# ---------------------------------------------------------------------------
|
|
|
|
|
# Recovery actions — reclaim a running claim, reassign to a new profile
|
|
|
|
|
# ---------------------------------------------------------------------------
|
|
|
|
|
|
|
|
|
|
class ReclaimBody(BaseModel):
|
|
|
|
|
reason: Optional[str] = None
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
@router.post("/tasks/{task_id}/reclaim")
|
|
|
|
|
def reclaim_task_endpoint(
|
|
|
|
|
task_id: str,
|
|
|
|
|
payload: ReclaimBody,
|
|
|
|
|
board: Optional[str] = Query(None),
|
|
|
|
|
):
|
|
|
|
|
"""Release an active worker claim on a running task.
|
|
|
|
|
|
|
|
|
|
Used by the dashboard recovery popover when an operator wants to
|
|
|
|
|
abort a stuck worker (e.g. one that keeps hallucinating card ids)
|
|
|
|
|
without waiting for the claim TTL. Maps 1:1 to
|
|
|
|
|
``hermes kanban reclaim <task_id> --reason ...``.
|
|
|
|
|
"""
|
|
|
|
|
board = _resolve_board(board)
|
|
|
|
|
conn = _conn(board=board)
|
|
|
|
|
try:
|
|
|
|
|
ok = kanban_db.reclaim_task(conn, task_id, reason=payload.reason)
|
|
|
|
|
if not ok:
|
|
|
|
|
raise HTTPException(
|
|
|
|
|
status_code=409,
|
|
|
|
|
detail=(
|
|
|
|
|
f"cannot reclaim {task_id}: not in a claimable state "
|
|
|
|
|
"(not running, or unknown id)"
|
|
|
|
|
),
|
|
|
|
|
)
|
|
|
|
|
return {"ok": True, "task_id": task_id}
|
|
|
|
|
finally:
|
|
|
|
|
conn.close()


class ReassignBody(BaseModel):
    profile: Optional[str] = None  # "" or None = unassign
    reclaim_first: bool = False
    reason: Optional[str] = None


@router.post("/tasks/{task_id}/reassign")
def reassign_task_endpoint(
    task_id: str,
    payload: ReassignBody,
    board: Optional[str] = Query(None),
):
    """Reassign a task to a different profile, optionally reclaiming first.

    Used by the dashboard recovery popover when an operator wants to
    retry a task with a different worker profile (e.g. switch to a
    smarter model after the assigned profile keeps hallucinating).
    Maps 1:1 to ``hermes kanban reassign <task_id> <profile> [--reclaim]``.
    """
    board = _resolve_board(board)
    conn = _conn(board=board)
    try:
        ok = kanban_db.reassign_task(
            conn, task_id,
            payload.profile or None,
            reclaim_first=bool(payload.reclaim_first),
            reason=payload.reason,
        )
        if not ok:
            raise HTTPException(
                status_code=409,
                detail=(
                    f"cannot reassign {task_id}: unknown id, or still "
                    "running (pass reclaim_first=true to release the claim first)"
                ),
            )
        return {"ok": True, "task_id": task_id, "assignee": payload.profile or None}
    finally:
        conn.close()


# ---------------------------------------------------------------------------
# Plugin config (read dashboard.kanban.* defaults from config.yaml)
# ---------------------------------------------------------------------------

@router.get("/config")
def get_config():
    """Return kanban dashboard preferences from ~/.hermes/config.yaml.

    Reads the ``dashboard.kanban`` section if present; defaults otherwise.
    Used by the UI to pre-select tenant filters, toggle markdown rendering,
    or set column-width preferences without a round-trip per page load.
    """
    try:
        from hermes_cli.config import load_config
        cfg = load_config() or {}
    except Exception:
        cfg = {}
    dash_cfg = (cfg.get("dashboard") or {})
    # dashboard.kanban may itself be a dict; fall back to {}.
    k_cfg = dash_cfg.get("kanban") or {}
    return {
        "default_tenant": k_cfg.get("default_tenant") or "",
        "lane_by_profile": bool(k_cfg.get("lane_by_profile", True)),
        "include_archived_by_default": bool(k_cfg.get("include_archived_by_default", False)),
        "render_markdown": bool(k_cfg.get("render_markdown", True)),
    }

feat(kanban-dashboard): per-platform home-channel notification toggles (#19864)
* revert: auto-subscribe gateway chat on tool-driven kanban_create (#19718)
Reverts ff3d2773e2. Teknium reviewed the merged PR and decided this
behavior isn't wanted — tool-driven kanban_create should not mirror
the slash-command path's auto-subscribe. Orchestrators that want
their originating chat notified can call kanban_notify-subscribe
explicitly; we're not going to make it implicit.
* feat(kanban-dashboard): per-platform home-channel notification toggles
Adds a "Notify home channels" section to the task drawer in the kanban
dashboard plugin. Each platform where the user has set a home channel
(/sethome, TELEGRAM_HOME_CHANNEL env var, gateway.platforms.<p>.home_channel
in config.yaml) gets a toggle pill. Toggling on writes a kanban_notify_subs
row keyed to that platform's home (chat_id + thread_id); toggling off
removes it. The existing gateway notifier watcher delivers completed /
blocked / gave_up events without any new plumbing — this is purely a GUI
surface over existing machinery.
Replaces the reverted auto-subscribe behavior from #19718 with an explicit,
per-task, per-platform, user-controlled opt-in. No implicit subscription
on tool-driven kanban_create; no CLI commands; no slash commands. Just a
toggle in the drawer.
Backend (plugins/kanban/dashboard/plugin_api.py):
- GET /api/plugins/kanban/home-channels[?task_id=X]
Returns every platform with a configured home, plus a per-entry
subscribed: bool relative to task_id (false when task_id omitted).
Reads the live GatewayConfig via load_gateway_config() so env-var
overlays stay honored.
- POST /api/plugins/kanban/tasks/:id/home-subscribe/:platform
Idempotent add_notify_sub keyed to the platform's home.
- DELETE /api/plugins/kanban/tasks/:id/home-subscribe/:platform
remove_notify_sub for the same tuple.
- 404 when the platform has no home configured, or task_id doesn't
exist (POST only).
Frontend (plugins/kanban/dashboard/dist/index.js):
- TaskDrawer fetches /home-channels on open, keyed on task_id.
- HomeSubsSection renders nothing when zero platforms have a home (so
users who haven't set one up don't see an empty UI block).
- Optimistic toggle with busy flag + revert-on-failure. One pill per
platform; ✓ prefix and --on class indicate the subscribed state.
CSS (plugins/kanban/dashboard/dist/style.css):
- .hermes-kanban-home-subs flex row + .hermes-kanban-home-sub pill
style + --on subscribed variant (subtle ring-colored background).
Live-tested against a dashboard with TELEGRAM + DISCORD_BOT_TOKEN /
HOME_CHANNEL env vars set: drawer shows both pills, toggling each
flips its visual state AND writes/removes the correct kanban_notify_subs
row (verified via direct DB read).
Tests (tests/plugins/test_kanban_dashboard_plugin.py, 11 new, 53/53
pass total):
- home-channels lists only platforms with a home (slack with a
token but no home is excluded)
- no task_id -> all subscribed=false
- subscribe creates notify_sub row with correct chat/thread/platform
- subscribed=true reflected in subsequent GET
- idempotent re-subscribe
- unknown platform -> 404
- unknown task -> 404
- unsubscribe removes the row
- telegram + discord subscribe/unsubscribe independent
- zero homes -> empty list
2026-05-04 12:31:21 -07:00

# ---------------------------------------------------------------------------
# Home-channel subscriptions (per-task, per-platform toggles)
# ---------------------------------------------------------------------------
#
# Home channels are a first-class gateway concept — each configured platform
# can have exactly one (chat_id, thread_id, name) it considers "home". The
# dashboard surfaces these as per-task toggles so a user can opt a specific
# task into receiving terminal notifications (completed / blocked / gave_up)
# at their telegram/discord/slack home, without touching the CLI.
#
# The wire format mirrors kanban_db.add_notify_sub — (task_id, platform,
# chat_id, thread_id) — so toggle-on creates exactly the same row the
# `/kanban create` slash command would, and the existing gateway notifier
# watcher delivers events without any additional plumbing.


def _configured_home_channels() -> list[dict]:
    """Return every platform that has a home_channel set, fully hydrated.

    Reads the live GatewayConfig so env-var overlays (``TELEGRAM_HOME_CHANNEL``
    etc.) are honored alongside config.yaml. Returns platforms in a stable
    order and drops platforms without a home.
    """
    try:
        from gateway.config import load_gateway_config
    except Exception:
        return []
    try:
        gw_cfg = load_gateway_config()
    except Exception:
        return []
    result: list[dict] = []
    for platform, pcfg in gw_cfg.platforms.items():
        if not pcfg or not pcfg.home_channel:
            continue
        hc = pcfg.home_channel
        result.append({
            "platform": platform.value,
            "chat_id": hc.chat_id,
            "thread_id": hc.thread_id or "",
            "name": hc.name or "Home",
        })
    # Stable order for deterministic UI — platform name alphabetical.
    result.sort(key=lambda r: r["platform"])
    return result


def _home_sub_matches(sub: dict, home: dict) -> bool:
    """True if a notify_subs row corresponds to the given home channel."""
    return (
        sub.get("platform") == home["platform"]
        and str(sub.get("chat_id", "")) == str(home["chat_id"])
        and str(sub.get("thread_id") or "") == str(home["thread_id"] or "")
    )


@router.get("/home-channels")
def get_home_channels(
    task_id: Optional[str] = Query(None),
    board: Optional[str] = Query(None),
):
    """List every platform with a home channel, plus whether *task_id*
    (if given) is currently subscribed to that home.

    When ``task_id`` is omitted, every entry's ``subscribed`` is ``false``
    — useful for the "no task selected" state of the UI.
    """
    homes = _configured_home_channels()
    subscribed_homes: set[tuple[str, str, str]] = set()
    if task_id:
        board = _resolve_board(board)
        conn = _conn(board=board)
        try:
            subs = kanban_db.list_notify_subs(conn, task_id)
        finally:
            conn.close()
        for sub in subs:
            key = (
                str(sub.get("platform") or ""),
                str(sub.get("chat_id") or ""),
                str(sub.get("thread_id") or ""),
            )
            subscribed_homes.add(key)
    result = []
    for home in homes:
        # Stringify like the sub keys above so a non-str chat_id still matches.
        key = (
            str(home["platform"] or ""),
            str(home["chat_id"] or ""),
            str(home["thread_id"] or ""),
        )
        result.append({**home, "subscribed": key in subscribed_homes})
    return {"home_channels": result}


@router.post("/tasks/{task_id}/home-subscribe/{platform}")
def subscribe_home(task_id: str, platform: str, board: Optional[str] = Query(None)):
    """Subscribe *task_id* to notifications routed to *platform*'s home channel.

    Idempotent — re-subscribing is a no-op at the DB layer. 404 if the
    platform has no home channel configured. 404 if the task doesn't exist.
    """
    homes = _configured_home_channels()
    home = next((h for h in homes if h["platform"] == platform), None)
    if not home:
        raise HTTPException(
            status_code=404,
            detail=f"No home channel configured for platform {platform!r}. "
                   f"Set one from the messenger via /sethome, or configure "
                   f"gateway.platforms.{platform}.home_channel in config.yaml.",
        )
    board = _resolve_board(board)
    conn = _conn(board=board)
    try:
        task = kanban_db.get_task(conn, task_id)
        if task is None:
            raise HTTPException(status_code=404, detail=f"task {task_id} not found")
        kanban_db.add_notify_sub(
            conn,
            task_id=task_id,
            platform=platform,
            chat_id=home["chat_id"],
            thread_id=home["thread_id"] or None,
        )
        return {"ok": True, "task_id": task_id, "home_channel": home}
    finally:
        conn.close()


@router.delete("/tasks/{task_id}/home-subscribe/{platform}")
def unsubscribe_home(task_id: str, platform: str, board: Optional[str] = Query(None)):
    """Remove any notify subscription on *task_id* that matches *platform*'s home."""
    homes = _configured_home_channels()
    home = next((h for h in homes if h["platform"] == platform), None)
    if not home:
        raise HTTPException(
            status_code=404,
            detail=f"No home channel configured for platform {platform!r}.",
        )
    board = _resolve_board(board)
    conn = _conn(board=board)
    try:
        kanban_db.remove_notify_sub(
            conn,
            task_id=task_id,
            platform=platform,
            chat_id=home["chat_id"],
            thread_id=home["thread_id"] or None,
        )
        return {"ok": True, "task_id": task_id, "home_channel": home}
    finally:
        conn.close()


# ---------------------------------------------------------------------------
# Stats (per-profile / per-status counts + oldest-ready age)
# ---------------------------------------------------------------------------

@router.get("/stats")
feat(kanban): multi-project boards — one install, many kanbans (#19653)
Adds first-class board support to kanban so users can separate unrelated
streams of work (projects, repos, domains) into isolated queues. Single-
project users stay on the 'default' board and see no UI change.
Isolation model
---------------
- Each board is a directory at `~/.hermes/kanban/boards/<slug>/` with
its own `kanban.db`, `workspaces/`, and `logs/`. The 'default' board
keeps its legacy path (`~/.hermes/kanban.db`) for back-compat — fresh
installs and pre-boards users get zero migration.
- Workers spawned by the dispatcher have `HERMES_KANBAN_BOARD` pinned in
their env alongside the existing `HERMES_KANBAN_DB` /
`HERMES_KANBAN_WORKSPACES_ROOT` pins, so workers physically cannot see
other boards' tasks.
- The gateway's single dispatcher loop now sweeps every board per tick;
per-tick cost is a few extra filesystem stats.
- CAS concurrency guarantees are preserved per-board (each board is its
own SQLite DB, same WAL+IMMEDIATE machinery as before).
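
The path layout in the isolation model above can be sketched as follows. This is a minimal illustration under stated assumptions: `board_db_path` and `DEFAULT_BOARD` are hypothetical names, and the Hermes home directory is passed in explicitly rather than discovered the way the real code does it.

```python
from pathlib import Path

DEFAULT_BOARD = "default"  # hypothetical constant name


def board_db_path(slug: str, hermes_home: Path) -> Path:
    """Map a board slug to its SQLite file (illustrative only)."""
    if slug == DEFAULT_BOARD:
        # The default board keeps the legacy location for back-compat.
        return hermes_home / "kanban.db"
    # Named boards each get their own directory under kanban/boards/.
    return hermes_home / "kanban" / "boards" / slug / "kanban.db"


home = Path.home() / ".hermes"
print(board_db_path("default", home))
print(board_db_path("hermes-agent", home))
```

Because each slug resolves to a distinct file, per-board isolation follows from opening a separate SQLite connection per path.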
CLI
---
hermes kanban boards list|create|switch|show|rename|rm
hermes kanban --board <slug> <any-subcommand>
Board resolution order: `--board` flag → `HERMES_KANBAN_BOARD` env →
`~/.hermes/kanban/current` file → `default`. Slug validation is strict:
lowercase alphanumerics + hyphens + underscores, 1-64 chars, starts with
alphanumeric. Uppercase is auto-downcased; slashes / dots / `..` /
control chars are rejected so boards can't name their way out of the
boards/ directory.
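
A rough sketch of the slug rule and the resolution chain described above, assuming a single regular expression is enough to express the rule. The names `normalize_slug` and `resolve_board` are hypothetical, the caller is assumed to have already read `~/.hermes/kanban/current` into a string, and stripping surrounding whitespace is an assumption of this sketch, not a documented behaviour.

```python
import re
from typing import Mapping, Optional

# Starts with an alphanumeric, then up to 63 more of [a-z0-9_-]: 1-64 chars.
_SLUG_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,63}")


def normalize_slug(raw: str) -> str:
    """Lowercase the slug, then reject anything outside the allowed alphabet."""
    slug = raw.lower()
    if not _SLUG_RE.fullmatch(slug):
        raise ValueError(f"invalid board slug: {raw!r}")
    return slug


def resolve_board(
    flag: Optional[str],
    env: Mapping[str, str],
    current_file_text: Optional[str],
) -> str:
    """Pick the first non-empty source: flag, env var, current file, default."""
    for candidate in (flag, env.get("HERMES_KANBAN_BOARD"), current_file_text):
        if candidate and candidate.strip():
            return normalize_slug(candidate.strip())
    return "default"


print(normalize_slug("Hermes-Agent"))
print(resolve_board(None, {"HERMES_KANBAN_BOARD": "atm10-server"}, "other"))
```

Slashes, dots and `..` fail the match because none of those characters are in the allowed set, which is what keeps a slug from escaping the boards directory.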
Passive discoverability: when more than one board exists, `hermes kanban
list` prints a one-line header ("Board: foo (2 other boards …)") so
users who stumble across multi-project never have to hunt for the
feature. Invisible for single-board installs.
Dashboard
---------
- New `BoardSwitcher` component at the top of the Kanban tab: dropdown
with all boards + task counts, `+ New board` button, `Archive`
button (non-default only). Hidden entirely when only `default` exists
and is empty — single-project users never see it.
- New `NewBoardDialog` modal: slug / display name / description / icon
+ "switch to this board after creating" checkbox.
- Selected board persists to `localStorage` so browser users don't
shift the CLI's active board out from under a terminal they left open.
- New `?board=<slug>` query param on every existing endpoint plus a
new `/boards` CRUD surface (`GET /boards`, `POST /boards`,
`PATCH /boards/<slug>`, `DELETE /boards/<slug>`,
`POST /boards/<slug>/switch`).
- Events WebSocket is pinned to a board at connection time; switching
opens a fresh WS against the new board.
Also fixes a pre-existing bug in the plugin's tenant / assignee
filters: the SDK's `Select` uses `onValueChange(value)`, not
native `onChange(event)`, so those filters silently didn't work.
New `selectChangeHandler` helper wires both signatures.
Tests
-----
49 new tests in `tests/hermes_cli/test_kanban_boards.py` covering:
slug validation (valid / invalid / auto-downcase), path resolution
(default = legacy path, named = `boards/<slug>/`, env var override),
current-board resolution chain (env > file > default), board CRUD +
archive / hard-delete, per-board connection isolation (tasks don't
leak), worker spawn env injection (`HERMES_KANBAN_BOARD`,
`HERMES_KANBAN_DB`, `HERMES_KANBAN_WORKSPACES_ROOT` all point at the
right board), and end-to-end CLI surface.
Regression surface: all 264 pre-existing kanban tests continue to pass.
Live-tested via the dashboard: created 3 boards (default,
hermes-agent, atm10-server), created tasks on each via both CLI
(`--board <slug> create`) and dashboard (inline create on the Ready
column), confirmed zero cross-board leakage, confirmed `BoardSwitcher`
+ `NewBoardDialog` work end-to-end in the browser.
2026-05-04 04:42:38 -07:00
def get_stats(board: Optional[str] = Query(None)):
    """Per-status + per-assignee counts + oldest-ready age.

    Designed for the dashboard HUD and for router profiles that need to
    answer "is this specialist overloaded?" without scanning the whole
    board themselves.
    """
feat(kanban): multi-project boards — one install, many kanbans (#19653)
Adds first-class board support to kanban so users can separate unrelated
streams of work (projects, repos, domains) into isolated queues. Single-
project users stay on the 'default' board and see no UI change.
Isolation model
---------------
- Each board is a directory at `~/.hermes/kanban/boards/<slug>/` with
its own `kanban.db`, `workspaces/`, and `logs/`. The 'default' board
keeps its legacy path (`~/.hermes/kanban.db`) for back-compat — fresh
installs and pre-boards users get zero migration.
- Workers spawned by the dispatcher have `HERMES_KANBAN_BOARD` pinned in
their env alongside the existing `HERMES_KANBAN_DB` /
`HERMES_KANBAN_WORKSPACES_ROOT` pins, so workers physically cannot see
other boards' tasks.
- The gateway's single dispatcher loop now sweeps every board per tick;
per-tick cost is a few extra filesystem stats.
- CAS concurrency guarantees are preserved per-board (each board is its
own SQLite DB, same WAL+IMMEDIATE machinery as before).
CLI
---
hermes kanban boards list|create|switch|show|rename|rm
hermes kanban --board <slug> <any-subcommand>
Board resolution order: `--board` flag → `HERMES_KANBAN_BOARD` env →
`~/.hermes/kanban/current` file → `default`. Slug validation is strict:
lowercase alphanumerics + hyphens + underscores, 1-64 chars, starts with
alphanumeric. Uppercase is auto-downcased; slashes / dots / `..` /
control chars are rejected so boards can't name their way out of the
boards/ directory.
Passive discoverability: when more than one board exists, `hermes kanban
list` prints a one-line header ("Board: foo (2 other boards …)") so
users who stumble across multi-project never have to hunt for the
feature. Invisible for single-board installs.
Dashboard
---------
- New `BoardSwitcher` component at the top of the Kanban tab: dropdown
with all boards + task counts, `+ New board` button, `Archive`
button (non-default only). Hidden entirely when only `default` exists
and is empty — single-project users never see it.
- New `NewBoardDialog` modal: slug / display name / description / icon
+ "switch to this board after creating" checkbox.
- Selected board persists to `localStorage` so browser users don't
shift the CLI's active board out from under a terminal they left open.
- New `?board=<slug>` query param on every existing endpoint plus a
new `/boards` CRUD surface (`GET /boards`, `POST /boards`,
`PATCH /boards/<slug>`, `DELETE /boards/<slug>`,
`POST /boards/<slug>/switch`).
- Events WebSocket is pinned to a board at connection time; switching
opens a fresh WS against the new board.
Also fixes a pre-existing bug in the plugin's tenant / assignee
filters: the SDK's `Select` uses `onValueChange(value)`, not
native `onChange(event)`, so those filters silently didn't work.
New `selectChangeHandler` helper wires both signatures.
Tests
-----
49 new tests in `tests/hermes_cli/test_kanban_boards.py` covering:
slug validation (valid / invalid / auto-downcase), path resolution
(default = legacy path, named = `boards/<slug>/`, env var override),
current-board resolution chain (env > file > default), board CRUD +
archive / hard-delete, per-board connection isolation (tasks don't
leak), worker spawn env injection (`HERMES_KANBAN_BOARD`,
`HERMES_KANBAN_DB`, `HERMES_KANBAN_WORKSPACES_ROOT` all point at the
right board), and end-to-end CLI surface.
Regression surface: all 264 pre-existing kanban tests continue to pass.
Live-tested via the dashboard: created 3 boards (default,
hermes-agent, atm10-server), created tasks on each via both CLI
(`--board <slug> create`) and dashboard (inline create on the Ready
column), confirmed zero cross-board leakage, confirmed `BoardSwitcher`
+ `NewBoardDialog` work end-to-end in the browser.
2026-05-04 04:42:38 -07:00
    board = _resolve_board(board)
    conn = _conn(board=board)
feat(kanban): durable multi-profile collaboration board (#17805)
Salvage of PR #16100 onto current main (after emozilla's #17514 fix
that unblocks plugin Pydantic body validation). History preserved on
the standing `feat/kanban-standing` branch; this squashes the 22
iterative commits into one clean landing.
What this lands:
- SQLite kernel (hermes_cli/kanban_db.py) — durable task board with
tasks, task_links, task_runs, task_comments, task_events,
kanban_notify_subs tables. WAL mode, atomic claim via CAS,
tenant-namespaced, skills JSON array per task, max-runtime timeouts,
worker heartbeats, idempotency keys, circuit breaker on repeated
spawn failures, crash detection via /proc/<pid>/status, run history
preserved across attempts.
- Dispatcher — runs inside the gateway by default
(`kanban.dispatch_in_gateway: true`). Ticks every 60s, reclaims
stale claims, promotes ready tasks, spawns `hermes -p <assignee>
chat -q "work kanban task <id>"` with HERMES_KANBAN_TASK +
HERMES_KANBAN_WORKSPACE env. Auto-loads `--skills kanban-worker`
plus any per-task skills. Health telemetry warns on stuck ready
queue.
- Structured tool surface (tools/kanban_tools.py) — 7 tools
(kanban_show, kanban_complete, kanban_block, kanban_heartbeat,
kanban_comment, kanban_create, kanban_link). Gated on
HERMES_KANBAN_TASK via check_fn so zero schema footprint in normal
sessions.
- System-prompt guidance (agent/prompt_builder.py KANBAN_GUIDANCE)
injected only when kanban tools are active.
- Dashboard plugin (plugins/kanban/dashboard/) — Linear-style board
UI: triage/todo/ready/running/blocked/done columns, drag-drop,
inline create, task drawer with markdown, comments, run history,
dependency editor, bulk ops, lanes-by-profile grouping, WS-driven
live refresh. Matches active dashboard theme via CSS variables.
- CLI — `hermes kanban init|create|list|show|assign|link|unlink|
claim|comment|complete|block|unblock|archive|tail|dispatch|context|
init|gc|watch|stats|notify|log|heartbeat|runs|assignees` +
`/kanban` slash in-session.
- Worker + orchestrator skills (skills/devops/kanban-worker +
kanban-orchestrator) — pattern library for good summary/metadata
shapes, retry diagnostics, block-reason examples, fan-out patterns.
- Per-task force-loaded skills — `--skill <name>` (repeatable),
stored as JSON, threaded through to dispatcher argv as one
`--skills X` pair per skill alongside the built-in kanban-worker.
Dashboard + CLI + tool parity.
- Deprecation of standalone `hermes kanban daemon` — stub exits 2
with migration guidance; `--force` escape hatch for headless hosts.
- Docs (website/docs/user-guide/features/kanban.md + kanban-tutorial.md)
with 11 dashboard screenshots walking through four user stories
(Solo Dev, Fleet Farming, Role Pipeline, Circuit Breaker).
- Tests (251 passing): kernel schema + migration + CAS atomicity,
dispatcher logic, circuit breaker, crash detection, max-runtime
timeouts, claim lifecycle, tenant isolation, idempotency keys, per-
task skills round-trip + validation + dispatcher argv, tool surface
(7 tools × round-trip + error paths), dashboard REST (CRUD + bulk
+ links + warnings), gateway-embedded dispatcher (config gate, env
override, graceful shutdown), CLI deprecation stub, migration from
legacy schemas.
Gateway integration:
- GatewayRunner._kanban_dispatcher_watcher — new asyncio background
task, symmetric with _kanban_notifier_watcher. Runs dispatch_once
via asyncio.to_thread so SQLite WAL never blocks the loop. Sleeps
in 1s slices for snappy shutdown. Respects HERMES_KANBAN_DISPATCH_IN_GATEWAY=0
env override for debugging.
- Config: new `kanban` section in DEFAULT_CONFIG with
`dispatch_in_gateway: true` (default) + `dispatch_interval_seconds: 60`.
Additive — no _config_version bump needed.
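A minimal sketch of the watcher pattern described above, assuming a blocking `dispatch_once` callable and an `asyncio.Event` used as the shutdown signal. The name `dispatcher_watcher` and its parameters are illustrative only, not the actual `GatewayRunner` method:

```python
import asyncio
from typing import Callable


async def dispatcher_watcher(
    dispatch_once: Callable[[], object],
    stop: asyncio.Event,
    interval_seconds: float = 60.0,
) -> int:
    """Call ``dispatch_once`` every ``interval_seconds`` until ``stop`` is set.

    Hypothetical stand-in for the gateway watcher; returns the number of ticks run.
    """
    ticks = 0
    while not stop.is_set():
        # The blocking SQLite work runs in a worker thread, off the event loop.
        await asyncio.to_thread(dispatch_once)
        ticks += 1
        # Sleep in slices of at most 1s so a shutdown request is noticed quickly.
        remaining = interval_seconds
        while remaining > 0 and not stop.is_set():
            step = min(1.0, remaining)
            await asyncio.sleep(step)
            remaining -= step
    return ticks
```

With this shape, setting `stop` should end the watcher within about one second even when the configured interval is 60s.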
Forward-compat:
- workflow_template_id / current_step_key columns on tasks (v1 writes
NULL; v2 will use them for routing).
- task_runs holds claim machinery (claim_lock, claim_expires,
worker_pid, last_heartbeat_at) so multi-attempt history is first-
class from day one.
Closes #16102.
Co-authored-by: emozilla <emozilla@nousresearch.com>
2026-04-30 13:36:47 -07:00
    try:
        return kanban_db.board_stats(conn)
    finally:
        conn.close()


@router.get("/assignees")
feat(kanban): multi-project boards — one install, many kanbans (#19653)
Adds first-class board support to kanban so users can separate unrelated
streams of work (projects, repos, domains) into isolated queues. Single-
project users stay on the 'default' board and see no UI change.
Isolation model
---------------
- Each board is a directory at `~/.hermes/kanban/boards/<slug>/` with
its own `kanban.db`, `workspaces/`, and `logs/`. The 'default' board
keeps its legacy path (`~/.hermes/kanban.db`) for back-compat — fresh
installs and pre-boards users get zero migration.
- Workers spawned by the dispatcher have `HERMES_KANBAN_BOARD` pinned in
their env alongside the existing `HERMES_KANBAN_DB` /
`HERMES_KANBAN_WORKSPACES_ROOT` pins, so workers physically cannot see
other boards' tasks.
- The gateway's single dispatcher loop now sweeps every board per tick;
per-tick cost is a few extra filesystem stats.
- CAS concurrency guarantees are preserved per-board (each board is its
own SQLite DB, same WAL+IMMEDIATE machinery as before).
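The per-board claim guarantee can be illustrated roughly as follows. This is a sketch under stated assumptions, not the code in `hermes_cli/kanban_db.py`: the one-table schema, the `claimed_by` column, and the helper names `open_board_db` / `try_claim` are all hypothetical (the real kernel keeps claim state in `task_runs`).

```python
import sqlite3


def open_board_db(path: str) -> sqlite3.Connection:
    """Open one board's database (simplified, hypothetical schema)."""
    # isolation_level=None puts sqlite3 in autocommit mode, so BEGIN/COMMIT
    # below are issued explicitly rather than implicitly by the driver.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id TEXT PRIMARY KEY, status TEXT NOT NULL, claimed_by TEXT)"
    )
    return conn


def try_claim(conn: sqlite3.Connection, task_id: str, worker: str) -> bool:
    """Compare-and-swap: claim the task only if it is still 'ready'."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
    try:
        cur = conn.execute(
            "UPDATE tasks SET status = 'running', claimed_by = ? "
            "WHERE id = ? AND status = 'ready'",
            (worker, task_id),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    # Exactly one row changes for the winner; a late claimer changes none.
    return cur.rowcount == 1
```

Because each board has its own database file, a claim on one board never contends for another board's write lock.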
CLI
---
hermes kanban boards list|create|switch|show|rename|rm
hermes kanban --board <slug> <any-subcommand>
Board resolution order: `--board` flag → `HERMES_KANBAN_BOARD` env →
`~/.hermes/kanban/current` file → `default`. Slug validation is strict:
lowercase alphanumerics + hyphens + underscores, 1-64 chars, starts with
alphanumeric. Uppercase is auto-downcased; slashes / dots / `..` /
control chars are rejected so boards can't name their way out of the
boards/ directory.
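The resolution order and slug rules above could be sketched like this. The function names, the regex, and the `current_file` parameter are assumptions made for illustration; the actual helpers may be shaped differently:

```python
import re
from pathlib import Path
from typing import Mapping, Optional

# Starts with an alphanumeric, then up to 63 more of [a-z0-9_-]: 1-64 chars total.
_SLUG_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,63}")


def normalize_slug(raw: str) -> str:
    """Downcase and validate a board slug; raise ValueError when invalid."""
    slug = raw.lower()
    if not _SLUG_RE.fullmatch(slug):
        raise ValueError(f"invalid board slug: {raw!r}")
    return slug


def resolve_board(
    flag: Optional[str], env: Mapping[str, str], current_file: Path
) -> str:
    """Resolve the active board: --board flag, then env, then file, then 'default'."""
    if flag:
        return normalize_slug(flag)
    from_env = env.get("HERMES_KANBAN_BOARD")
    if from_env:
        return normalize_slug(from_env)
    try:
        saved = current_file.read_text(encoding="utf-8").strip()
    except OSError:
        saved = ""  # a missing or unreadable file falls through to the default
    return normalize_slug(saved) if saved else "default"
```

Slashes and dots fall outside the allowed character set, so a slug such as `../etc` fails validation before it is ever joined onto the `boards/` path.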
Passive discoverability: when more than one board exists, `hermes kanban
list` prints a one-line header ("Board: foo (2 other boards …)") so
users who stumble across multi-project never have to hunt for the
feature. Invisible for single-board installs.
Dashboard
---------
- New `BoardSwitcher` component at the top of the Kanban tab: dropdown
with all boards + task counts, `+ New board` button, `Archive`
button (non-default only). Hidden entirely when only `default` exists
and is empty — single-project users never see it.
- New `NewBoardDialog` modal: slug / display name / description / icon
+ "switch to this board after creating" checkbox.
- Selected board persists to `localStorage` so browser users don't
shift the CLI's active board out from under a terminal they left open.
- New `?board=<slug>` query param on every existing endpoint plus a
new `/boards` CRUD surface (`GET /boards`, `POST /boards`,
`PATCH /boards/<slug>`, `DELETE /boards/<slug>`,
`POST /boards/<slug>/switch`).
- Events WebSocket is pinned to a board at connection time; switching
opens a fresh WS against the new board.
Also fixes a pre-existing bug in the plugin's tenant / assignee
filters: the SDK's `Select` uses `onValueChange(value)`, not
native `onChange(event)`, so those filters silently didn't work.
New `selectChangeHandler` helper wires both signatures.
Tests
-----
49 new tests in `tests/hermes_cli/test_kanban_boards.py` covering:
slug validation (valid / invalid / auto-downcase), path resolution
(default = legacy path, named = `boards/<slug>/`, env var override),
current-board resolution chain (env > file > default), board CRUD +
archive / hard-delete, per-board connection isolation (tasks don't
leak), worker spawn env injection (`HERMES_KANBAN_BOARD`,
`HERMES_KANBAN_DB`, `HERMES_KANBAN_WORKSPACES_ROOT` all point at the
right board), and end-to-end CLI surface.
Regression surface: all 264 pre-existing kanban tests continue to pass.
Live-tested via the dashboard: created 3 boards (default,
hermes-agent, atm10-server), created tasks on each via both CLI
(`--board <slug> create`) and dashboard (inline create on the Ready
column), confirmed zero cross-board leakage, confirmed `BoardSwitcher`
+ `NewBoardDialog` work end-to-end in the browser.
2026-05-04 04:42:38 -07:00
def get_assignees(board: Optional[str] = Query(None)):
    """Known profiles + per-profile task counts.

    Returns the union of ``~/.hermes/profiles/*`` on disk and every
    distinct assignee currently used on the board. The dashboard uses
    this to populate its assignee dropdown so a freshly-created profile
    appears in the picker before it's been given any task.
    """
    board = _resolve_board(board)
    conn = _conn(board=board)
    try:
        return {"assignees": kanban_db.known_assignees(conn)}
    finally:
        conn.close()


# ---------------------------------------------------------------------------
# Worker log (read-only; file written by _default_spawn)
# ---------------------------------------------------------------------------

@router.get("/tasks/{task_id}/log")
def get_task_log(
    task_id: str,
    tail: Optional[int] = Query(None, ge=1, le=2_000_000),
    board: Optional[str] = Query(None),
):
    """Return the worker's stdout/stderr log.

    ``tail`` caps the response size (bytes) so the dashboard drawer
    doesn't paginate megabytes into the browser. Returns 404 if the task
    has never spawned. The on-disk log is rotated at 2 MiB per
    ``_rotate_worker_log`` — a single ``.log.1`` is kept, no further
    generations, so disk usage per task is bounded at ~4 MiB.
    """
    board = _resolve_board(board)
    conn = _conn(board=board)
    try:
        task = kanban_db.get_task(conn, task_id)
    finally:
        conn.close()
    if task is None:
        raise HTTPException(status_code=404, detail=f"task {task_id} not found")
    content = kanban_db.read_worker_log(task_id, tail_bytes=tail, board=board)
    log_path = kanban_db.worker_log_path(task_id, board=board)
    size = log_path.stat().st_size if log_path.exists() else 0
    return {
        "task_id": task_id,
        "path": str(log_path),
        "exists": content is not None,
        "size_bytes": size,
        "content": content or "",
        # Truncated when the on-disk file was larger than the tail cap.
        "truncated": bool(tail and size > tail),
    }

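The log payload returned above can be sketched in isolation. This is a minimal illustration, not the plugin's code: `read_tail` is a hypothetical stand-in for `kanban_db.read_worker_log` (whose real behavior is not shown here), and `log_payload` is an assumed helper name. It shows how `exists`, `size_bytes`, and `truncated` relate to each other.

```python
from pathlib import Path
from typing import Optional


def read_tail(path: Path, tail_bytes: Optional[int]) -> Optional[str]:
    """Hypothetical stand-in for kanban_db.read_worker_log.

    Returns None when the file is missing, otherwise the last
    `tail_bytes` bytes (or the whole file when no cap is given).
    """
    if not path.exists():
        return None
    data = path.read_bytes()
    if tail_bytes and len(data) > tail_bytes:
        data = data[-tail_bytes:]
    return data.decode("utf-8", errors="replace")


def log_payload(task_id: str, log_path: Path, tail: Optional[int]) -> dict:
    """Build the same response shape as the endpoint above."""
    content = read_tail(log_path, tail)
    size = log_path.stat().st_size if log_path.exists() else 0
    return {
        "task_id": task_id,
        "path": str(log_path),
        "exists": content is not None,
        "size_bytes": size,
        "content": content or "",
        # Truncated only when a tail cap was given and the file exceeds it.
        "truncated": bool(tail and size > tail),
    }
```

Note that `size_bytes` always reports the full on-disk size, so a client can tell how much of the log was cut off when `truncated` is true.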
# ---------------------------------------------------------------------------
# Dispatch nudge (optional quick-path so the UI doesn't wait 60 s)
# ---------------------------------------------------------------------------

@router.post("/dispatch")
def dispatch(
    dry_run: bool = Query(False),
    max_n: int = Query(8, alias="max"),
    board: Optional[str] = Query(None),
):
    board = _resolve_board(board)
    conn = _conn(board=board)
    try:
        result = kanban_db.dispatch_once(
            conn, dry_run=dry_run, max_spawn=max_n, board=board,
        )
        # DispatchResult is a dataclass.
        try:
            return asdict(result)
        except TypeError:
            return {"result": str(result)}
    finally:
        conn.close()
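The serialization fallback at the end of `dispatch` can be illustrated on its own. The sketch below assumes a hypothetical `DispatchResult` shape (the real dataclass lives in `kanban_db` and its fields are not shown here) and a made-up helper name, `serialize_result`. It relies on the documented behavior that `dataclasses.asdict` raises `TypeError` when given something that is not a dataclass instance.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, List


@dataclass
class DispatchResult:
    """Hypothetical shape for illustration; the real fields may differ."""
    spawned: List[str] = field(default_factory=list)
    dry_run: bool = False


def serialize_result(result: Any) -> dict:
    """Return a JSON-friendly dict, mirroring the endpoint's fallback."""
    try:
        # Works for dataclass instances, recursing into nested dataclasses.
        return asdict(result)
    except TypeError:
        # Anything else is reported as its string form.
        return {"result": str(result)}
```

The fallback keeps the endpoint returning a valid JSON object even if `dispatch_once` were to return something other than a dataclass instance.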
feat(kanban): multi-project boards — one install, many kanbans (#19653)
Adds first-class board support to kanban so users can separate unrelated
streams of work (projects, repos, domains) into isolated queues. Single-
project users stay on the 'default' board and see no UI change.
Isolation model
---------------
- Each board is a directory at `~/.hermes/kanban/boards/<slug>/` with
its own `kanban.db`, `workspaces/`, and `logs/`. The 'default' board
keeps its legacy path (`~/.hermes/kanban.db`) for back-compat — fresh
installs and pre-boards users get zero migration.
- Workers spawned by the dispatcher have `HERMES_KANBAN_BOARD` pinned in
their env alongside the existing `HERMES_KANBAN_DB` /
`HERMES_KANBAN_WORKSPACES_ROOT` pins, so workers physically cannot see
other boards' tasks.
- The gateway's single dispatcher loop now sweeps every board per tick;
per-tick cost is a few extra filesystem stats.
- CAS concurrency guarantees are preserved per-board (each board is its
own SQLite DB, same WAL+IMMEDIATE machinery as before).
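A rough sketch of the path layout described above, assuming the Hermes home directory is passed in explicitly. `sketch_db_path` and `sketch_worker_env` are hypothetical helpers written for illustration; they are not the functions in `kanban_db`. The commit message does not say where the default board keeps its workspaces, so the sketch only sets the workspaces root for named boards.

```python
from pathlib import Path

DEFAULT_BOARD = "default"


def sketch_db_path(board: str, hermes_home: Path) -> Path:
    """Map a board slug to its SQLite file (illustrative only)."""
    if board == DEFAULT_BOARD:
        # Legacy location kept for back-compat.
        return hermes_home / "kanban.db"
    return hermes_home / "kanban" / "boards" / board / "kanban.db"


def sketch_worker_env(board: str, hermes_home: Path) -> dict:
    """Build the env pins a spawned worker would receive (illustrative only)."""
    env = {
        "HERMES_KANBAN_BOARD": board,
        "HERMES_KANBAN_DB": str(sketch_db_path(board, hermes_home)),
    }
    if board != DEFAULT_BOARD:
        # Named boards keep workspaces next to their own database.
        root = hermes_home / "kanban" / "boards" / board / "workspaces"
        env["HERMES_KANBAN_WORKSPACES_ROOT"] = str(root)
    return env
```

Because every pin is derived from the one slug, a worker that only reads its environment has no path to another board's database.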
CLI
---
hermes kanban boards list|create|switch|show|rename|rm
hermes kanban --board <slug> <any-subcommand>
Board resolution order: `--board` flag → `HERMES_KANBAN_BOARD` env →
`~/.hermes/kanban/current` file → `default`. Slug validation is strict:
lowercase alphanumerics + hyphens + underscores, 1-64 chars, starts with
alphanumeric. Uppercase is auto-downcased; slashes / dots / `..` /
control chars are rejected so boards can't name their way out of the
boards/ directory.
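The slug rules and the resolution chain above could look roughly like this. It is a sketch under stated assumptions: `normalize_slug` and `resolve_board` are hypothetical names, the regular expression is one reading of the stated rules, and stripping whitespace from the `current` file is an assumption.

```python
import re
from pathlib import Path
from typing import Mapping, Optional

# 1-64 chars, lowercase alphanumerics plus '-' and '_', starting alphanumeric.
_SLUG_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,63}")


def normalize_slug(raw: str) -> str:
    """Downcase and validate a board slug; raise ValueError when invalid."""
    slug = raw.lower()
    if not _SLUG_RE.fullmatch(slug):
        raise ValueError(f"invalid board slug: {raw!r}")
    return slug


def resolve_board(flag: Optional[str], env: Mapping[str, str],
                  current_file: Path) -> str:
    """Resolve the active board: flag, then env, then current file, then default."""
    if flag:
        return normalize_slug(flag)
    env_value = env.get("HERMES_KANBAN_BOARD")
    if env_value:
        return normalize_slug(env_value)
    if current_file.is_file():
        text = current_file.read_text(encoding="utf-8").strip()
        if text:
            return normalize_slug(text)
    return "default"
```

Since slashes and dots never match the pattern, a validated slug can be joined onto the boards directory without any further path checks.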
Passive discoverability: when more than one board exists, `hermes kanban
list` prints a one-line header ("Board: foo (2 other boards …)") so
users who stumble across multi-project never have to hunt for the
feature. Invisible for single-board installs.
Dashboard
---------
- New `BoardSwitcher` component at the top of the Kanban tab: dropdown
with all boards + task counts, `+ New board` button, `Archive`
button (non-default only). Hidden entirely when only `default` exists
and is empty — single-project users never see it.
- New `NewBoardDialog` modal: slug / display name / description / icon
+ "switch to this board after creating" checkbox.
- Selected board persists to `localStorage` so browser users don't
shift the CLI's active board out from under a terminal they left open.
- New `?board=<slug>` query param on every existing endpoint plus a
new `/boards` CRUD surface (`GET /boards`, `POST /boards`,
`PATCH /boards/<slug>`, `DELETE /boards/<slug>`,
`POST /boards/<slug>/switch`).
- Events WebSocket is pinned to a board at connection time; switching
opens a fresh WS against the new board.
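As an illustration of pinning the events stream to a board at connection time, a client might build its WebSocket URL as below. The `token`, `since` and `board` query parameters are the ones the handler reads; `events_ws_url` and the base URL passed to it are hypothetical, since the plugin's mount prefix is not given here.

```python
from typing import Optional
from urllib.parse import urlencode


def events_ws_url(base: str, token: str, since: int = 0,
                  board: Optional[str] = None) -> str:
    """Build the /events URL; omitting `board` leaves the choice to the server."""
    params = {"token": token, "since": str(since)}
    if board:
        params["board"] = board
    return f"{base.rstrip('/')}/events?{urlencode(params)}"
```

Switching boards then means closing the socket and opening a new one with a different `board` value and a cursor that belongs to that board.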
Also fixes a pre-existing bug in the plugin's tenant / assignee
filters: the SDK's `Select` uses `onValueChange(value)`, not
native `onChange(event)`, so those filters silently didn't work.
New `selectChangeHandler` helper wires both signatures.
Tests
-----
49 new tests in `tests/hermes_cli/test_kanban_boards.py` covering:
slug validation (valid / invalid / auto-downcase), path resolution
(default = legacy path, named = `boards/<slug>/`, env var override),
current-board resolution chain (env > file > default), board CRUD +
archive / hard-delete, per-board connection isolation (tasks don't
leak), worker spawn env injection (`HERMES_KANBAN_BOARD`,
`HERMES_KANBAN_DB`, `HERMES_KANBAN_WORKSPACES_ROOT` all point at the
right board), and end-to-end CLI surface.
Regression surface: all 264 pre-existing kanban tests continue to pass.
Live-tested via the dashboard: created 3 boards (default,
hermes-agent, atm10-server), created tasks on each via both CLI
(`--board <slug> create`) and dashboard (inline create on the Ready
column), confirmed zero cross-board leakage, confirmed `BoardSwitcher`
+ `NewBoardDialog` work end-to-end in the browser.
2026-05-04 04:42:38 -07:00
# ---------------------------------------------------------------------------
# Boards CRUD (multi-project support)
# ---------------------------------------------------------------------------

class CreateBoardBody(BaseModel):
    slug: str
    name: Optional[str] = None
    description: Optional[str] = None
    icon: Optional[str] = None
    color: Optional[str] = None
    switch: bool = False


class RenameBoardBody(BaseModel):
    name: Optional[str] = None
    description: Optional[str] = None
    icon: Optional[str] = None
    color: Optional[str] = None


def _board_counts(slug: str) -> dict[str, int]:
    """Return ``{status: count}`` for a board. Safe on an empty DB."""
    try:
        path = kanban_db.kanban_db_path(board=slug)
        if not path.exists():
            return {}
        conn = kanban_db.connect(board=slug)
        try:
            rows = conn.execute(
                "SELECT status, COUNT(*) AS n FROM tasks GROUP BY status"
            ).fetchall()
            return {r["status"]: int(r["n"]) for r in rows}
        finally:
            conn.close()
    except Exception:
        return {}


@router.get("/boards")
def list_boards(include_archived: bool = Query(False)):
    """Return every board on disk with task counts and the active slug."""
    boards = kanban_db.list_boards(include_archived=include_archived)
    current = kanban_db.get_current_board()
    for b in boards:
        b["is_current"] = (b["slug"] == current)
        b["counts"] = _board_counts(b["slug"])
        b["total"] = sum(b["counts"].values())
    return {"boards": boards, "current": current}


@router.post("/boards")
def create_board_endpoint(payload: CreateBoardBody):
    """Create a new board. Idempotent: ``slug`` collision returns existing."""
    try:
        meta = kanban_db.create_board(
            payload.slug,
            name=payload.name,
            description=payload.description,
            icon=payload.icon,
            color=payload.color,
        )
    except ValueError as exc:
        raise HTTPException(status_code=400, detail=str(exc))
    if payload.switch:
        try:
            kanban_db.set_current_board(meta["slug"])
        except ValueError as exc:
            raise HTTPException(status_code=400, detail=str(exc))
    return {"board": meta, "current": kanban_db.get_current_board()}


@router.patch("/boards/{slug}")
def rename_board(slug: str, payload: RenameBoardBody):
    """Update a board's display metadata (slug is immutable; create a new one to rename the directory)."""
    try:
        normed = kanban_db._normalize_board_slug(slug)
    except ValueError as exc:
        raise HTTPException(status_code=400, detail=str(exc))
    if not normed or not kanban_db.board_exists(normed):
        raise HTTPException(status_code=404, detail=f"board {slug!r} does not exist")
    meta = kanban_db.write_board_metadata(
        normed,
        name=payload.name,
        description=payload.description,
        icon=payload.icon,
        color=payload.color,
    )
    return {"board": meta}


@router.delete("/boards/{slug}")
def delete_board(slug: str, delete: bool = Query(False, description="Hard-delete instead of archive")):
    """Archive (default) or hard-delete a board."""
    try:
        res = kanban_db.remove_board(slug, archive=not delete)
    except ValueError as exc:
        raise HTTPException(status_code=400, detail=str(exc))
    return {"result": res, "current": kanban_db.get_current_board()}


@router.post("/boards/{slug}/switch")
def switch_board(slug: str):
    """Persist ``slug`` as the active board for subsequent CLI / slash calls.

    Dashboard users pick boards via a client-side ``localStorage``; this
    endpoint is for ``/kanban boards switch`` parity so gateway slash
    commands and the CLI share the same current-board pointer.
    """
    try:
        normed = kanban_db._normalize_board_slug(slug)
    except ValueError as exc:
        raise HTTPException(status_code=400, detail=str(exc))
    if not normed or not kanban_db.board_exists(normed):
        raise HTTPException(status_code=404, detail=f"board {slug!r} does not exist")
    kanban_db.set_current_board(normed)
    return {"current": normed}


# ---------------------------------------------------------------------------
# WebSocket: /events?since=<event_id>
# ---------------------------------------------------------------------------

# Poll interval for the event tail loop. SQLite WAL + 300 ms polling is
# the simplest and most robust approach; it adds a fraction of a percent
# of CPU and has no shared state to synchronize across workers.
_EVENT_POLL_SECONDS = 0.3


@router.websocket("/events")
async def stream_events(ws: WebSocket):
    # Enforce the dashboard session token as a query param: browsers can't
    # set Authorization on a WS upgrade. This matches how the PTY bridge
    # authenticates in hermes_cli/web_server.py.
    token = ws.query_params.get("token")
    if not _check_ws_token(token):
        await ws.close(code=http_status.WS_1008_POLICY_VIOLATION)
        return
    await ws.accept()
    try:
        since_raw = ws.query_params.get("since", "0")
        try:
            cursor = int(since_raw)
        except ValueError:
            cursor = 0

        # Board selection: pinned at the WS handshake; re-subscribe to
        # switch boards. Changing boards mid-stream would require
        # reconciling two cursors, so the UI just opens a new WS on
        # board change.
        ws_board_raw = ws.query_params.get("board")
        try:
            ws_board = kanban_db._normalize_board_slug(ws_board_raw) if ws_board_raw else None
        except ValueError:
            ws_board = None

        def _fetch_new(cursor_val: int) -> tuple[int, list[dict]]:
            conn = kanban_db.connect(board=ws_board)
            try:
                rows = conn.execute(
                    "SELECT id, task_id, run_id, kind, payload, created_at "
                    "FROM task_events WHERE id > ? ORDER BY id ASC LIMIT 200",
                    (cursor_val,),
                ).fetchall()
                out: list[dict] = []
                new_cursor = cursor_val
                for r in rows:
                    try:
                        payload = json.loads(r["payload"]) if r["payload"] else None
                    except Exception:
                        payload = None
                    out.append({
                        "id": r["id"],
                        "task_id": r["task_id"],
                        "run_id": r["run_id"],
                        "kind": r["kind"],
                        "payload": payload,
                        "created_at": r["created_at"],
                    })
                    new_cursor = r["id"]
                return new_cursor, out
            finally:
                conn.close()

        while True:
            cursor, events = await asyncio.to_thread(_fetch_new, cursor)
            if events:
                await ws.send_json({"events": events, "cursor": cursor})
            await asyncio.sleep(_EVENT_POLL_SECONDS)
    except WebSocketDisconnect:
        return
2026-05-07 01:11:28 +02:00
    except asyncio.CancelledError:
        # Normal shutdown path: dashboard process exit (Ctrl-C) cancels the
        # websocket task while it is sleeping in the poll loop.
        # CancelledError is a BaseException in 3.8+ so the bare Exception
        # handler below would not catch it; without this clause Uvicorn
        # surfaces the cancellation as an application traceback. Quiet it.
        return
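The reasoning in the comment above can be checked in isolation. The sketch below uses hypothetical names (`poll_forever`, `demo`) and a plain sleep loop in place of the real event tail. It shows that a cancelled task reaches the dedicated `asyncio.CancelledError` clause and that `CancelledError` is not a subclass of `Exception` on Python 3.8 and later.

```python
import asyncio


async def poll_forever(log: list) -> None:
    """Stand-in for the poll loop: sleeps until the task is cancelled."""
    try:
        while True:
            await asyncio.sleep(0.01)
    except asyncio.CancelledError:
        # Reached on task.cancel(); returning here ends the task quietly.
        log.append("cancelled")
        return
    except Exception:
        # Never reached for cancellation on Python 3.8+.
        log.append("exception")


async def demo() -> list:
    log: list = []
    task = asyncio.create_task(poll_forever(log))
    await asyncio.sleep(0.03)
    task.cancel()
    await asyncio.gather(task, return_exceptions=True)
    log.append(issubclass(asyncio.CancelledError, Exception))
    return log
```

Swallowing the cancellation is only reasonable at the outermost level of a handler like this one, where nothing further up needs to observe it.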
2026-04-30 13:36:47 -07:00
    except Exception as exc:  # defensive: never crash the dashboard worker
        log.warning("Kanban event stream error: %s", exc)
        try:
            await ws.close()
        except Exception:
            pass