mirror of
https://github.com/patriceckhart/zot.git
synced 2026-06-27 22:06:31 +02:00
A /swarm subsystem for long-running parallel subagents. Each agent runs in its own subprocess against a fresh git worktree (branch swarm/<id>) with its own persistent session file and unix-socket inbox; the parent zot stays in the main session and pokes / observes them via the dashboard.

Highlights:

- New internal/swarm package: Agent, Spawn/Resume/Kill/Remove, event log (events.jsonl), inbox protocol (listen/dial), worktree manager, exec runner that spawns "zot --swarm-agent ...".
- New internal/agent/swarm_agent.go: daemon-mode child entry point. Reuses the standard agent loop but persists turns to the supervisor-chosen session.json and streams events as JSONL on stdout. Mirror to events.jsonl is dormant while the supervisor's stdout pipe is alive so events do not get double-written.
- Resume reattaches in place: reuses the same worktree, session, branch and inbox path; carries forward the prior transcript replayed from events.jsonl. Resume no longer re-fires the original Task as a fresh user turn -- that was producing "agent busy; send cancel first" races.
- core.NewSessionAtPath plus an openOrCreateSession fallback so the child actually persists its session.json at the supervisor-chosen path on first spawn instead of running with sess==nil.
- Dashboard in internal/agent/modes/swarm_dialog.go + swarm_slash.go: list / new / kill / remove / resume / logs / send subcommands plus an interactive picker. Transcript view is /btw-style: an always-on inline editor at the bottom, streaming auto-follow, inline busy spinner with the agent's current activity such as "thinking" or "tool: edit". /model inside the spawn editor pops the global model picker.
- Per-session scope: each spawn is stamped with the host session's id and only shows in that session's /swarm dashboard. Pre-upgrade agents -- empty session_id -- remain visible everywhere as a safety net. The active scope is re-applied whenever loadSession swaps sessions.
- Resolve falls back to the provider's default model when the persisted cfg.Model is no longer in the catalogue, warns on stderr, and rewrites config.json so the next launch is silent.
- ReadEventLog folds back-to-back same-type identical-payload events within 250ms so events.jsonl files polluted by the old supervisor + mirror double-write read back cleanly.
- DrawLog gains an idle no-op fast path: identical buffer plus identical cursor = emit nothing, so the terminal's cursor blink keeps ticking in dialogs whose underlying agent is idle.

Slash UX:

- New /swarm command with subcommands; the suggester picks it up.
- README.md documents the full dashboard, CLI, and persistence story, and explicitly notes that /session export does NOT bundle subagents -- their worktree and unix-socket inbox cannot round-trip through a .zotsession.

Tests cover: SpawnReq + Resume lifecycle, session-id scoping + persistence, default-child-args spawn vs resume contract, NewSessionAtPath at a fixed path, model fallback when the configured model is gone, swarm dialog behaviour -- auto-open editor, /model in spawn editor, transcript grows without internal scroll, busy spinner, multi-message send -- event-log dedup, swarm emitter dormant-until-orphan, and the DrawLog idle no-op + change-breaks-fast-path invariants.
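The event-log folding described above can be illustrated with a minimal sketch. It is not the actual ReadEventLog code: the Event type, its fields, and the foldDuplicates and sampleEvents helpers are hypothetical names chosen for illustration, and the real events.jsonl schema is not shown here. The sketch assumes timestamps are non-decreasing and compares each event with the last one that was kept.

```go
package main

import (
	"fmt"
	"time"
)

// Event is a hypothetical, simplified log record used only for this sketch.
type Event struct {
	Type    string
	Payload string
	At      time.Time
}

// dedupWindow mirrors the 250ms window named in the commit message.
const dedupWindow = 250 * time.Millisecond

// foldDuplicates drops an event when the last kept event has the same type
// and payload and was recorded at most window earlier.
func foldDuplicates(events []Event, window time.Duration) []Event {
	out := make([]Event, 0, len(events))
	for _, ev := range events {
		if n := len(out); n > 0 {
			prev := out[n-1]
			if prev.Type == ev.Type && prev.Payload == ev.Payload &&
				ev.At.Sub(prev.At) <= window {
				continue // treated as a double-write: keep only the first copy
			}
		}
		out = append(out, ev)
	}
	return out
}

// sampleEvents builds a small log containing one double-written event.
func sampleEvents() []Event {
	t0 := time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC)
	return []Event{
		{Type: "text", Payload: "hello", At: t0},
		{Type: "text", Payload: "hello", At: t0.Add(10 * time.Millisecond)}, // folded
		{Type: "text", Payload: "hello", At: t0.Add(2 * time.Second)},       // outside window, kept
		{Type: "tool", Payload: "edit", At: t0.Add(2*time.Second + 5*time.Millisecond)},
	}
}

func main() {
	for _, ev := range foldDuplicates(sampleEvents(), dedupWindow) {
		fmt.Println(ev.Type, ev.Payload)
	}
}
```

Comparing against the last kept event rather than the last raw event means a genuinely repeated message that arrives later than the window still survives; whether the real implementation makes the same choice is an assumption.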
69 lines
2.5 KiB
Go
package swarm

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
	"runtime"
)

// maxUnixSocketPath is the conservative platform-portable path limit
// for unix sockets. macOS allows 104, Linux 108 (including the NUL
// terminator). We pick 100 so the path itself plus a small filename
// tail stays under both caps with a safety margin.
const maxUnixSocketPath = 100

// inboxSocketPath returns a per-agent unix-socket path that's short
// enough to actually work (see maxUnixSocketPath) and unique per
// swarm root so two zot instances on the same machine don't collide.
//
// Strategy:
//
//  1. Try <root>/agents/<id>/in.sock. This is the obvious place and
//     puts everything next to the durable state; on most setups it
//     fits.
//  2. If that's too long, fall back to <tmp>/zot-swarm-<roothash>/<id>.sock.
//     We hash root rather than embedding it so the tmp directory name
//     stays short. The first 8 hex chars of the SHA-1 are plenty:
//     collisions only matter within a single user's tmp dir and we
//     already create a dedicated subdir.
//  3. If the agent id itself pushes that path over the limit, replace
//     the id with its own 8-hex-char hash:
//     <tmp>/zot-swarm-<roothash>/<idhash>.sock.
//  4. If even /tmp is somehow too long (chroots, containers), give
//     up with a clear error so the caller surfaces it instead of
//     leaving the user wondering why follow-ups don't work.
func inboxSocketPath(root, agentID string) (string, error) {
	primary := filepath.Join(root, "agents", agentID, "in.sock")
	if len(primary) <= maxUnixSocketPath {
		return primary, nil
	}
	tmp := os.TempDir()
	dir := filepath.Join(tmp, "zot-swarm-"+rootTag(root))
	if err := os.MkdirAll(dir, 0o700); err != nil {
		return "", fmt.Errorf("socket tmp dir: %w", err)
	}
	candidate := filepath.Join(dir, agentID+".sock")
	if len(candidate) <= maxUnixSocketPath {
		return candidate, nil
	}
	// Last-resort: use just the short hash of the id so even very long
	// task slugs fit. We surface the original id in the meta.json /
	// events log; the socket path is purely transport.
	short := shortHash(agentID)
	candidate = filepath.Join(dir, short+".sock")
	if len(candidate) <= maxUnixSocketPath {
		return candidate, nil
	}
	return "", fmt.Errorf("unix socket path too long even after shortening (%s, %d > %d, GOOS=%s)",
		candidate, len(candidate), maxUnixSocketPath, runtime.GOOS)
}

// rootTag returns a stable 8-hex-char tag for the swarm root. Used
// in the tmp-dir name so two parallel zot instances with different
// roots don't share sockets.
func rootTag(root string) string { return shortHash(root) }

func shortHash(s string) string {
	sum := sha1.Sum([]byte(s))
	return hex.EncodeToString(sum[:4])
}
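To see why the length cap matters, the following standalone sketch tries to bind a unix socket at a deliberately oversized path. It is a minimal illustration, assuming a Linux or macOS host; longSocketPath and tryListen are hypothetical helpers and are not part of the swarm package.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
	"strings"
)

// longSocketPath builds a path well past both the 104- and 108-byte caps
// mentioned in the maxUnixSocketPath comment.
func longSocketPath() string {
	return filepath.Join(os.TempDir(), strings.Repeat("x", 200)+".sock")
}

// tryListen binds a unix socket at path, closes it again, and returns
// whichever error occurred (nil if the bind and close both succeeded).
func tryListen(path string) error {
	ln, err := net.Listen("unix", path)
	if err != nil {
		return err
	}
	return ln.Close()
}

func main() {
	p := longSocketPath()
	// The bind is expected to fail because the path does not fit in the
	// fixed-size socket address structure.
	fmt.Printf("len=%d err=%v\n", len(p), tryListen(p))
}
```

A failure like this only appears at bind time, which is why inboxSocketPath checks the length up front and falls back to a shorter location instead of returning a path that cannot be listened on.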