colibri-daemon ↔ colibri-glasspane integration contract
Attribution: Sam & Hermes
This is the binding contract between the two core Rust crates in the colibri workspace. It defines the socket API, pane-to-session identity mapping, state flow, unified vocabulary, the snapshot contract, and the boot sequence. Both crates MUST implement their side of this contract; changes here require both crates to be updated in lockstep.
Architecture summary
┌─────────────────────────────────────────────────────────────────┐
│ colibri-daemon (always-on service) │
│ │
│ ┌──────────┐ ┌───────────┐ ┌──────────┐ ┌───────────────┐ │
│ │ Spawner │ │ Sessions │ │ Heartbeat │ │ Socket Server │ │
│ │ (agents) │ │ (JSONL) │ │ (30s) │ │ (Unix domain) │──┼──► Colibri TUI / web
│ └─────┬─────┘ └───────────┘ └─────┬─────┘ └───────┬───────┘ │ Zed / optional Herdr Linux
│ │ │ │ │
│ │ stdout JSONL │ poll_exit() │ query │
│ ▼ ▼ ▼ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ DaemonState.glasspane: RwLock<PaneSupervisor> │ │
│ │ (colibri-glasspane — owned, embedded in daemon) │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
- colibri-daemon owns the PaneSupervisor. Glasspane is NOT a separate process; it is a crate compiled into the daemon binary.
- colibri-glasspane provides the state machine (apply_pi_event / fold_pi_events), the PiJsonlIngestor, SupervisedPane, PaneSupervisor, and the GlasspaneSnapshot wire type. It has no I/O dependencies beyond what the daemon gives it (lines of text with timestamps).
1. Socket API shape
The daemon opens a Unix domain socket (path from DaemonConfig.socket_path).
All communication is newline-delimited JSON (one JSON object per line,
matching the existing watchdog convention).
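As an illustration of the framing described above, here is a minimal client sketch using only the Rust standard library. The helper name send_command is hypothetical and is not part of colibri-client; it assumes one request line is answered by exactly one response line.

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::os::unix::net::UnixStream;
use std::path::Path;

/// Hypothetical helper: send one newline-delimited JSON command and read
/// back one response line. `command_json` must be a single-line JSON object
/// without a trailing newline.
fn send_command(socket_path: &Path, command_json: &str) -> io::Result<String> {
    let mut stream = UnixStream::connect(socket_path)?;

    // One JSON object per line: terminate the request with '\n'.
    stream.write_all(command_json.as_bytes())?;
    stream.write_all(b"\n")?;
    stream.flush()?;

    // Assumption: the response is likewise a single line.
    let mut reader = BufReader::new(stream);
    let mut response = String::new();
    reader.read_line(&mut response)?;
    Ok(response.trim_end().to_string())
}
```

A caller would pass the path from DaemonConfig.socket_path and a line such as {"cmd":"status"}; parsing the returned line into a typed response is left out of this sketch.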
Wire types (defined in colibri-daemon/src/lib.rs)
Inbound: ColibriCommand (tagged by cmd field):
| cmd value | Parameters | Purpose |
|---|---|---|
| status | none | Health check: agent count, session count |
| glasspane-snapshot | none | Full PaneSupervisor.snapshot_at(...) |
| list-sessions | none | Enumerate sessions (id, turn_count, bytes) |
| spawn-agent | provider, model, session_id?, system_prompt? | Spawn agent, attach pane to glasspane. For provider:"local", model is treated as the executable path and no API key is required. |
| kill-agent | agent_id | SIGKILL agent, ingest error event |
| get-session | session_id | Full session dump (turns + prompt) |
| compact-session | session_id | Compact oldest turns in a session |
Outbound: ColibriResponse:
{
"ok": true,
"error": null,
"data": { ... }
}
data is null when ok is false.
Glasspane-facing commands
Two commands produce or consume glasspane state:
glasspane-snapshot — reads the PaneSupervisor under its RwLock and
returns a GlasspaneSnapshot:
Client: {"cmd":"glasspane-snapshot"}\n
Server: {"ok":true,"data":{
"schema":"clawdie.glasspane.snapshot.v1",
"host":"domedog",
"observed_at":"2026-05-27T12:00:00.000Z",
"panes":[
{"id":"abc-123","agent":"pi","state":"working",
"pi_session_id":"019e5e59-...","last_event_at":"...",
"cwd":"/repo","stalled":false}
]
}}\n
spawn-agent — creates a daemon session, spawns an agent subprocess,
attaches a new pane to PaneSupervisor, and wires agent stdout into the
glasspane ingestion pipeline:
Client: {"cmd":"spawn-agent","provider":"deepseek","model":"deepseek-chat",
"session_id":"sess-1","system_prompt":"You are a helpful assistant."}\n
Server: {"ok":true,"data":{"agent_id":"a1b2-c3d4","status":"running"}}\n
Operator CLI smoke helpers
colibri-client also ships small binaries for manual display-client and SSH
smoke tests without hand-writing socket JSON:
# Inspect daemon state
colibri --socket "$COLIBRI_DAEMON_SOCKET" status
colibri --socket "$COLIBRI_DAEMON_SOCKET" snapshot
# Spawn a deterministic no-network Pi JSONL emitter through the daemon
colibri --socket "$COLIBRI_DAEMON_SOCKET" \
spawn-local target/release/colibri-smoke-agent
# Stop the spawned local agent
colibri --socket "$COLIBRI_DAEMON_SOCKET" kill <agent_id>
colibri-smoke-agent emits session, turn_start, queue_update,
turn_start, and turn_end, so colibri-tui should show:
Idle → Working → Blocked → Working → Done
What the daemon sends TO glasspane
The daemon is the sole writer into the PaneSupervisor. It calls:
| Daemon action | Glasspane method called |
|---|---|
| cmd_spawn_agent: attach new pane | attach_pane_at(agent_id, binary_name, SystemTime::now) |
| stream_agent_stdout_to_glasspane: each JSONL line | ingest_line_at(agent_id, line, SystemTime::now) |
| heartbeat: agent exited successfully | ingest_line_at(agent_id, '{"type":"agent_end"}', now) |
| heartbeat: agent exited with error | ingest_line_at(agent_id, '{"type":"error"}', now) |
| cmd_kill_agent: forced kill | ingest_line_at(agent_id, '{"type":"error"}', now) |
What glasspane exposes TO the daemon
The daemon is the sole reader of the PaneSupervisor:
| Daemon action | Glasspane method called |
|---|---|
| cmd_glasspane_snapshot | snapshot_at(host, now, DEFAULT_STALL_AFTER) |
| cmd_status (agent count lookup) | not glasspane; reads state.agents directly |
Glasspane does NOT push events to the daemon. It is a passive state accumulator. The daemon feeds it events; the daemon reads snapshots. Glasspane has no threads, no channels, no timers of its own.
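The ownership model above can be sketched with standard-library types. This is a simplified stand-in, not the real crate: the actual daemon appears to use an async RwLock, while this sketch uses std::sync::RwLock, and the method names on DaemonState (on_agent_spawned, on_stdout_event, on_snapshot_command) are hypothetical.

```rust
use std::collections::BTreeMap;
use std::sync::RwLock;

/// Stand-in for the glasspane supervisor: plain data, no threads or timers.
#[derive(Default)]
struct PaneSupervisor {
    // pane id -> most recent event type seen on that pane
    last_event: BTreeMap<String, Option<String>>,
}

impl PaneSupervisor {
    fn attach_pane(&mut self, pane_id: &str) {
        self.last_event.entry(pane_id.to_string()).or_insert(None);
    }

    /// Returns false when the pane is unknown, so the caller can log it.
    fn ingest_event(&mut self, pane_id: &str, event_type: &str) -> bool {
        match self.last_event.get_mut(pane_id) {
            Some(slot) => {
                *slot = Some(event_type.to_string());
                true
            }
            None => false,
        }
    }

    /// Read-only view, ordered by pane id (BTreeMap iterates in key order).
    fn snapshot(&self) -> Vec<(String, Option<String>)> {
        self.last_event
            .iter()
            .map(|(id, event)| (id.clone(), event.clone()))
            .collect()
    }
}

/// Stand-in for DaemonState: the daemon owns the supervisor behind a lock.
#[derive(Default)]
struct DaemonState {
    glasspane: RwLock<PaneSupervisor>,
}

impl DaemonState {
    // Write paths: only the daemon mutates the supervisor.
    fn on_agent_spawned(&self, agent_id: &str) {
        self.glasspane.write().unwrap().attach_pane(agent_id);
    }

    fn on_stdout_event(&self, agent_id: &str, event_type: &str) -> bool {
        self.glasspane.write().unwrap().ingest_event(agent_id, event_type)
    }

    // Read path: snapshots take only the read lock.
    fn on_snapshot_command(&self) -> Vec<(String, Option<String>)> {
        self.glasspane.read().unwrap().snapshot()
    }
}
```

The point of the design is that every call into the supervisor originates from a daemon code path, so the supervisor itself never needs a channel or a timer.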
2. Pane-to-session mapping
Two distinct identity spaces exist and MUST NOT be conflated:
| Concept | Owner | ID namespace | Example |
|---|---|---|---|
| Agent ID | colibri-daemon | UUIDv4 (Uuid::new_v4()) | "a1b2c3d4-e5f6-..." |
| Pane ID | colibri-daemon | same as Agent ID | "a1b2c3d4-e5f6-..." |
| Session ID | colibri-daemon | caller-supplied or UUIDv4 | "sess-1" or "019e5e59-..." |
| Pi session ID | Pi agent (JSONL) | Pi --mode json header id field | "019e5e59-6645-7e21-aca2-b57ccf0f8578" |
The mapping chain
Agent ID == Pane ID ──► Session ID ──► Pi session ID
(daemon) (daemon) (daemon) (glasspane, discovered)
│ │ │ │
│ 1:1 │ n:1 │ 1:1 │ discovered from
│ │ (multiple │ (one Pi agent │ Pi JSONL header
│ │ agents │ per session; │ `{"type":"session",
│ │ can share │ but sessions │ "id":"..."}`
│ │ a session) │ can be reused)
│ │ │
▼ ▼ ▼
AgentHandle SupervisedPane Session
(DashMap) (PaneSupervisor) (DashMap)
How the mapping is created
- spawn-agent command arrives → daemon generates agent_id (UUIDv4).
- Daemon creates or resolves session_id.
- Spawner::spawn() returns AgentHandle { id: agent_id }.
- Daemon inserts handle into state.agents keyed by agent_id.
- Daemon calls state.glasspane.attach_pane_at(agent_id, agent_binary, now), which creates a SupervisedPane with pane.id == agent_id.
- Daemon spawns stream_agent_stdout_to_glasspane(state, agent_id, stdout).
- When glasspane ingests a {"type":"session","id":"pi-xxx","cwd":"/repo"} line, it captures pi_session_id and cwd on the SupervisedPane.
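The last two steps, where the pane exists under the daemon-assigned ID before the Pi session ID is known, can be sketched as follows. The types PaneIdentity and Panes and the helper json_str_field are hypothetical; the field extraction is deliberately naive (no escape handling) and stands in for whatever JSON parsing the real ingestor uses.

```rust
use std::collections::BTreeMap;

/// Identity fields discovered from the Pi session header (hypothetical type).
#[derive(Debug, Default)]
struct PaneIdentity {
    pi_session_id: Option<String>,
    cwd: Option<String>,
}

/// Naive extraction of a string field from a flat, single-line JSON object.
/// Illustrative only: it does not handle escapes or whitespace after ':'.
fn json_str_field(line: &str, key: &str) -> Option<String> {
    let needle = format!("\"{}\":\"", key);
    let start = line.find(&needle)? + needle.len();
    let end = line[start..].find('"')? + start;
    Some(line[start..end].to_string())
}

/// Panes keyed by the daemon-assigned agent ID (pane.id == agent_id).
#[derive(Default)]
struct Panes {
    by_id: BTreeMap<String, PaneIdentity>,
}

impl Panes {
    /// The pane exists as soon as the daemon attaches it, before any JSONL arrives.
    fn attach(&mut self, agent_id: &str) {
        self.by_id.entry(agent_id.to_string()).or_default();
    }

    /// Only a session header updates the discovered identity fields.
    fn ingest_line(&mut self, agent_id: &str, line: &str) {
        if json_str_field(line, "type").as_deref() != Some("session") {
            return;
        }
        if let Some(pane) = self.by_id.get_mut(agent_id) {
            pane.pi_session_id = json_str_field(line, "id");
            pane.cwd = json_str_field(line, "cwd");
        }
    }
}
```

Keying the map by agent ID means a lookup never depends on anything the agent has emitted, which is why the pane can be attached before the first line of stdout.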
Why separate Pane ID from Pi session ID?
- The daemon controls agent lifecycle (spawn, kill, restart). It needs a stable ID that it assigns before the Pi agent emits its first JSONL line.
- The Pi agent's session ID is internal to the agent — it cannot be known until the JSONL stream begins.
- Glasspane tracks both: pane.id is the daemon-assigned key; pane.pi_session_id is the discovered Pi header field. Tests enforce pane.id != pane.pi_session_id when both are present.
Agent-to-pane lifetime
- Agent spawn → pane attached (attach_pane_at).
- Agent stdout lines → pane ingests (ingest_line_at).
- Agent exit (natural or killed) → daemon ingests a final lifecycle event (agent_end or error) into the pane.
- Pane is never removed from PaneSupervisor currently. In Phase 4, the daemon may prune panes after a configurable retention window.
3. State flow
Daemon lifecycle events → Glasspane AgentState
| Daemon lifecycle event | Ingested Pi event type | Resulting AgentState | Notes |
|---|---|---|---|
| Agent subprocess spawned | (pane attached, state = Idle) | Idle | SupervisedPane::new defaults to Idle |
| Agent emits session header | session / session_started | Idle | Also captures pi_session_id and cwd |
| Agent emits turn/message/tool | turn_start, message_start, tool_execution_*, etc. | Working | Any of 14 event types |
| Agent emits compaction events | auto_compaction_*, compaction_* | Working | Compaction is active work |
| Agent emits retry events | auto_retry_* | Working | Retry is active work |
| Agent awaits steering/approval | queue_update | Blocked | Operator attention needed (dashboard headline) |
| Turn/task complete | turn_end / agent_end | Done | Agent reached a completion point |
| Agent emits explicit error | error | Error | Terminal failure state |
| Agent subprocess exits (0) | daemon injects agent_end | Done | Heartbeat detected normal exit |
| Agent subprocess exits (!=0) | daemon injects error | Error | Heartbeat detected crash/error exit |
| Agent killed externally | daemon injects error | Error | kill-agent command |
State transition diagram
┌──────────┐
attach ───────►│ Idle │◄──── session / session_started
└────┬─────┘
│ turn_start, message_*, tool_execution_*,
│ auto_compaction_*, auto_retry_*
▼
queue_update ─────► ┌──────────┐ ◄──── any working event
(steering needed) │ Working │
└────┬─────┘
│ turn_end / agent_end
▼
┌──────────┐
│ Done │
└──────────┘
┌──────────┐
│ Error │◄──── error (from agent or daemon)
└──────────┘
┌──────────┐
│ Blocked │──► turn_start etc. ──► Working
└──────────┘
▲
│ queue_update
┌────┴─────┐
│ Working │
└──────────┘
- Blocked is entered from Working or any other state when queue_update arrives. It transitions back to Working on any working-type event (the agent resumed after receiving steering input).
- Done and Error are terminal-ish: they are not reset by subsequent events unless a new session header appears (which would restart at Idle).
- Unknown event types preserve the current state, which keeps the state machine forward-compatible with future Pi event taxonomy additions.
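A minimal sketch of these transition rules, following the signatures listed in the vocabulary section. It is one reading of the rules rather than the crate's actual code; in particular it assumes Done and Error ignore every event except a new session header, and that fold_pi_events starts from Idle.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AgentState {
    Idle,
    Working,
    Blocked,
    Done,
    Error,
}

/// Pure transition function: current state + Pi event type -> next state.
fn apply_pi_event(current: AgentState, event_type: &str) -> AgentState {
    // A new session header restarts the pane at Idle, even from Done/Error.
    if matches!(event_type, "session" | "session_started") {
        return AgentState::Idle;
    }
    // Assumption: Done and Error are sticky until a new session header.
    if matches!(current, AgentState::Done | AgentState::Error) {
        return current;
    }
    match event_type {
        "agent_start" | "turn_start"
        | "message_start" | "message_update" | "message_end"
        | "tool_execution_start" | "tool_execution_update" | "tool_execution_end"
        | "auto_compaction_start" | "auto_compaction_end"
        | "compaction_start" | "compaction_end"
        | "auto_retry_start" | "auto_retry_end" => AgentState::Working,
        "queue_update" => AgentState::Blocked,
        "turn_end" | "agent_end" => AgentState::Done,
        "error" => AgentState::Error,
        // Unknown event types preserve the current state.
        _ => current,
    }
}

/// Fold a sequence of event types, starting from Idle (assumed initial state).
fn fold_pi_events<'a>(events: impl IntoIterator<Item = &'a str>) -> AgentState {
    events.into_iter().fold(AgentState::Idle, apply_pi_event)
}
```

Because the function is pure, the smoke-agent sequence (session, turn_start, queue_update, turn_start, turn_end) can be checked in a unit test without a daemon or a socket.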
Daemon background loop ↔ Glasspane
| Loop tick | Interval | Glasspane interaction |
|---|---|---|
| Heartbeat | 30s | Polls AgentHandle::poll_exit(). On exit, injects agent_end or error event into glasspane via ingest_line_at. |
| Session rotation | 60s | Checks session byte/turn thresholds. Triggers compaction. No direct glasspane interaction, but agent compaction emits auto_compaction_* events that flow through stdout → glasspane. |
| Memory handoff | 120s | Currently a stub. Future: produce shared context summaries. No glasspane interaction yet. |
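The heartbeat row can be illustrated with a small sketch that maps a non-blocking exit poll to the line that would be injected. It assumes poll_exit behaves like std::process::Child::try_wait, which the source does not confirm; the function names are hypothetical.

```rust
use std::io;
use std::process::{Child, ExitStatus};

/// Map a polled exit status to the JSONL line the heartbeat would inject.
/// `None` means the subprocess is still running and nothing is injected.
fn lifecycle_line_for_exit(polled: Option<ExitStatus>) -> Option<&'static str> {
    match polled {
        None => None,
        Some(status) if status.success() => Some(r#"{"type":"agent_end"}"#),
        // Non-zero exit code or termination by signal.
        Some(_) => Some(r#"{"type":"error"}"#),
    }
}

/// Non-blocking check of a child process (stand-in for AgentHandle::poll_exit).
fn poll_child(child: &mut Child) -> io::Result<Option<&'static str>> {
    Ok(lifecycle_line_for_exit(child.try_wait()?))
}
```

In the daemon, the returned line would be passed to ingest_line_at for the pane whose ID matches the agent, so the pane ends in Done or Error even if the agent never emitted a final event itself.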
Stalled detection
Stalled is derived in the snapshot layer, not stored as mutable state:
pub fn is_stalled_at(&self, now: SystemTime, stall_after: Duration) -> bool {
if !matches!(self.state(), AgentState::Working | AgentState::Blocked) {
return false;
}
let silence_since = self.last_event_at().unwrap_or(self.started_at);
now.duration_since(silence_since)
.is_ok_and(|silent_for| silent_for >= stall_after)
}
- Only Working and Blocked panes can be stalled.
- DEFAULT_STALL_AFTER is 4 hours.
- Done and Error panes are never stalled (they've already reached a terminal state).
- The daemon heartbeat's agent_stall_timeout (300s default) is a separate concept: it detects dead subprocesses, not semantic stalling. The heartbeat timeout triggers event injection; glasspane stalled is a display concern.
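To make the derivation concrete, the following sketch wraps the same check in a stand-in struct so it can be exercised with fixed timestamps. PaneClock is hypothetical and holds only the three fields the check needs; the real SupervisedPane has more.

```rust
use std::time::{Duration, SystemTime};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AgentState {
    Idle,
    Working,
    Blocked,
    Done,
    Error,
}

/// 4 hours, matching the documented default.
const DEFAULT_STALL_AFTER: Duration = Duration::from_secs(4 * 60 * 60);

/// Hypothetical stand-in for the timing fields of a supervised pane.
struct PaneClock {
    state: AgentState,
    started_at: SystemTime,
    last_event_at: Option<SystemTime>,
}

impl PaneClock {
    /// Derived on demand from `now`; nothing is stored on the pane.
    fn is_stalled_at(&self, now: SystemTime, stall_after: Duration) -> bool {
        if !matches!(self.state, AgentState::Working | AgentState::Blocked) {
            return false;
        }
        // Fall back to the attach time if no event has been accepted yet.
        let silence_since = self.last_event_at.unwrap_or(self.started_at);
        now.duration_since(silence_since)
            .is_ok_and(|silent_for| silent_for >= stall_after)
    }
}
```

Passing now as a parameter is what keeps the check testable: a test can pin the clock instead of sleeping for four hours.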
4. Unified API vocabulary
Consistent naming across colibri-daemon and colibri-glasspane. Every term
below means exactly one thing.
Glasspane / supervision namespace
| Term | Rust type / fn | Owner | Meaning |
|---|---|---|---|
| pane | SupervisedPane, Pane | glasspane | One agent occupying one supervision slot |
| pane.id | PaneId (type alias String) | glasspane | Daemon-assigned unique ID (= agent ID) |
| attach_pane_at | PaneSupervisor::attach_pane_at | glasspane | Register a new pane in the supervisor |
| ingest_line_at | PaneSupervisor::ingest_line_at | glasspane | Feed one JSONL line at a wall-clock time |
| ingest_jsonl_reader_at | PaneSupervisor::ingest_jsonl_reader_at | glasspane | Feed a BufRead to the supervisor |
| snapshot_at | PaneSupervisor::snapshot_at | glasspane | Produce GlasspaneSnapshot for all panes |
| state | AgentState enum | glasspane | Semantic agent state (5 variants) |
| stalled | Pane::stalled (derived bool) | glasspane | Event silence exceeds stall_after threshold |
| pi_session_id | Option<String> on Pane | glasspane | Pi session ID captured from JSONL header |
| last_event_at | Option<SystemTime> | glasspane | Wall-clock time of last accepted Pi event |
| cwd | Option<String> on Pane | glasspane | Working directory from Pi session header |
| apply_pi_event | fn(AgentState, &str) -> AgentState | glasspane | Pure state transition function |
| fold_pi_events | fn(Iterator<&str>) -> AgentState | glasspane | Fold a sequence of event types |
| DEFAULT_STALL_AFTER | Duration (4 hours) | glasspane | Default stall silence threshold |
| GLASSPANE_SNAPSHOT_SCHEMA | &str = "clawdie.glasspane.snapshot.v1" | glasspane | Schema constant for all snapshots |
Daemon / lifecycle namespace
| Term | Rust type / fn | Owner | Meaning |
|---|---|---|---|
| agent | AgentHandle | daemon | Running agent subprocess handle |
| agent.id | String (UUIDv4) | daemon | Same value as pane.id |
| session | Session | daemon | JSONL-backed conversation store |
| session.id | String | daemon | Caller-supplied or generated session key |
| spawn | Spawner::spawn | daemon | Launch agent subprocess with retry/backoff |
| kill | AgentHandle::kill | daemon | SIGKILL agent, update status |
| poll_exit | AgentHandle::poll_exit | daemon | Non-blocking exit check (heartbeat) |
| compact | Session::compact_oldest_turns | daemon | Compaction triggered by byte/turn thresholds |
| prune | Session::prune_to | daemon | Aggressive pruning after compaction |
| heartbeat | fn heartbeat in daemon loop | daemon | 30s tick: check exits, detect stalls |
| session_rotation | fn session_rotation in daemon loop | daemon | 60s tick: compact/prune sessions |
| memory_handoff | fn memory_handoff in daemon loop | daemon | 120s tick: cross-agent context sharing |
Provider namespace
| Term | Rust type / fn | Owner | Meaning |
|---|---|---|---|
| provider | Provider enum | daemon | LLM backend (DeepSeek, OpenRouter, Anthropic) |
| provider (socket) | "deepseek", "openrouter", "anthropic", "local" | daemon | String form in spawn-agent command (local is no-network/fake-agent smoke only; model = executable path) |
Socket command namespace
| Term | Wire cmd value | Owner | Meaning |
|---|---|---|---|
| status | "status" | daemon | Health check |
| glasspane-snapshot | "glasspane-snapshot" | daemon/glasspane | Read full supervision snapshot |
| list-sessions | "list-sessions" | daemon | Enumerate sessions |
| spawn-agent | "spawn-agent" | daemon | Spawn agent + attach pane |
| kill-agent | "kill-agent" | daemon | Kill agent + ingest error event |
| get-session | "get-session" | daemon | Dump full session |
| compact-session | "compact-session" | daemon | Manual compaction trigger |
Event taxonomy (colibri-pi-events, shared)
These are the Pi --mode json type field values recognized by apply_pi_event:
| Event type | Maps to state | Notes |
|---|---|---|
| session | Idle | Captures pi_session_id and cwd |
| session_started | Idle | Alternative header form |
| agent_start | Working | Agent lifecycle begin |
| turn_start | Working | Turn/task begin |
| message_start | Working | LLM message streaming begin |
| message_update | Working | LLM message streaming chunk |
| message_end | Working | LLM message streaming complete |
| tool_execution_start | Working | Tool invocation begin |
| tool_execution_update | Working | Tool invocation progress |
| tool_execution_end | Working | Tool invocation complete |
| auto_compaction_start | Working | Automatic context compaction begin |
| auto_compaction_end | Working | Automatic context compaction complete |
| compaction_start | Working | Legacy compaction begin |
| compaction_end | Working | Legacy compaction complete |
| auto_retry_start | Working | Automatic retry begin |
| auto_retry_end | Working | Automatic retry complete |
| queue_update | Blocked | Steering/approval/input required |
| turn_end | Done | Turn/task complete |
| agent_end | Done | Agent lifecycle complete |
| error | Error | Terminal failure |
5. Contract: clawdie.glasspane.snapshot.v1
Where it is defined
Defined in colibri-glasspane/src/lib.rs:
pub const GLASSPANE_SNAPSHOT_SCHEMA: &str = "clawdie.glasspane.snapshot.v1";
Rust type
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct GlasspaneSnapshot {
pub schema: String, // "clawdie.glasspane.snapshot.v1"
pub host: String, // Hostname from DaemonConfig.host
pub observed_at: String, // RFC 3339 with milliseconds (e.g. "2026-05-27T12:00:00.000Z")
pub panes: Vec<Pane>, // All supervised panes
}
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct Pane {
pub id: String, // Daemon-assigned pane ID (= agent ID)
pub agent: String, // Agent binary name (e.g. "pi", "hermes-agent")
pub state: AgentState, // "idle" | "working" | "blocked" | "done" | "error"
pub pi_session_id: Option<String>, // Pi session ID from JSONL header
pub last_event_at: Option<String>, // RFC 3339 of last accepted event
pub cwd: Option<String>, // Working directory from Pi session header
pub stalled: bool, // Derived: event silence >= DEFAULT_STALL_AFTER
}
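Since observed_at and last_event_at are strings while the supervisor works with SystemTime, some conversion to RFC 3339 with milliseconds is needed. The sketch below shows one dependency-free way to do it, using the well-known days-to-civil-date arithmetic; it is an assumption about the shape of the conversion, and the real crate may well use a date/time library instead. The names civil_from_days and rfc3339_millis are hypothetical.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Convert days since 1970-01-01 to a proleptic Gregorian (year, month, day).
fn civil_from_days(days: i64) -> (i64, i64, i64) {
    let z = days + 719_468; // shift epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year eras
    let doe = z.rem_euclid(146_097); // day of era [0, 146096]
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // [0, 399]
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // March-based day of year
    let mp = (5 * doy + 2) / 153; // March-based month [0, 11]
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    // January and February belong to the following civil year.
    let year = yoe + era * 400 + i64::from(month <= 2);
    (year, month, day)
}

/// Format a SystemTime as UTC RFC 3339 with millisecond precision.
/// Returns None for times before the Unix epoch.
fn rfc3339_millis(t: SystemTime) -> Option<String> {
    let since_epoch = t.duration_since(UNIX_EPOCH).ok()?;
    let secs = since_epoch.as_secs();
    let (year, month, day) = civil_from_days((secs / 86_400) as i64);
    let rem = secs % 86_400;
    Some(format!(
        "{:04}-{:02}-{:02}T{:02}:{:02}:{:02}.{:03}Z",
        year,
        month,
        day,
        rem / 3_600,
        (rem % 3_600) / 60,
        rem % 60,
        since_epoch.subsec_millis()
    ))
}
```

Keeping the wire fields as pre-formatted strings means display clients do not need to agree on an epoch or a timestamp unit; they only need to parse RFC 3339.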
Where it is produced
Exactly one place: PaneSupervisor::snapshot_at(host, observed_at, stall_after)
in colibri-glasspane. Called by cmd_glasspane_snapshot in
colibri-daemon/src/socket.rs.
// In cmd_glasspane_snapshot:
let snapshot = state.glasspane.read().await.snapshot_at(
state.config.host.clone(),
SystemTime::now(),
DEFAULT_STALL_AFTER,
);
Where it is consumed
| Consumer | Transport | Phase | Purpose |
|---|---|---|---|
| colibri CLI / colibri-tui | Unix socket | 4 | Native operator dashboard and smoke surface |
| Herdr (Linux/macOS optional) | Unix socket/bridge | 4 | Optional external display client, not source |
| Zed / web board | HTTP / SSE | 4 | Web-based supervision view |
| colibri-orchestrator | In-memory / socket | 5 | Route/dispatch work across panes |
Wire shape (JSON)
{
"schema": "clawdie.glasspane.snapshot.v1",
"host": "domedog",
"observed_at": "2026-05-27T12:00:00.123Z",
"panes": [
{
"id": "a1b2c3d4-e5f6-...",
"agent": "pi",
"state": "working",
"pi_session_id": "019e5e59-6645-7e21-aca2-b57ccf0f8578",
"last_event_at": "2026-05-27T11:59:58.456Z",
"cwd": "/home/clawdija/clawdie-ai",
"stalled": false
}
]
}
Serialization rules:
- pi_session_id, last_event_at, and cwd are omitted when None (#[serde(skip_serializing_if = "Option::is_none")]).
- stalled is omitted when false (#[serde(skip_serializing_if = "skip_false")]).
- AgentState serializes as lowercase: "idle", "working", "blocked", "done", "error".
Promotion path
The schema constant and types currently live in colibri-glasspane. Once a
second consumer (a display client binary separate from the daemon) needs to
deserialize GlasspaneSnapshot, the types should be promoted to a shared
colibri-contracts crate. Until then, the crate boundary is sufficient — the
daemon depends on colibri-glasspane and links it directly.
6. Boot sequence
Which starts first?
colibri-daemon starts first and starts alone. colibri-glasspane is a
library crate, not a process — it is compiled into the daemon binary.
Startup order
1. CLI parses args, loads DaemonConfig (env or toml)
│
2. DaemonState::new(config) ──► PaneSupervisor::new() (empty BTreeMap)
│
3. Daemon background loop spawned (tokio::spawn)
├── heartbeat tick (30s)
├── session_rotation tick (60s)
└── memory_handoff tick (120s)
│
4. socket::serve(state, shutdown_rx) ← BLOCKING
├── Binds Unix socket at config.socket_path
├── Accepts connections
└── Dispatches ColibriCommand variants
│
5. External clients (colibri CLI/TUI, optional Herdr Linux/macOS, web) connect
and send commands
Clean boot checklist (in sequence)
- Remove stale socket file if it exists.
- Create parent directory for socket if needed.
- Bind UnixListener.
- Spawn daemon loop task.
- Enter accept loop; the daemon is now ready.
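The first three checklist items can be sketched with the standard library's blocking UnixListener. The function name bind_clean is hypothetical, and the real daemon presumably uses an async listener; the order of operations is what matters here.

```rust
use std::fs;
use std::io;
use std::os::unix::net::UnixListener;
use std::path::Path;

/// Hypothetical helper covering the first three boot steps.
fn bind_clean(socket_path: &Path) -> io::Result<UnixListener> {
    // 1. Remove a stale socket file left by a previous run, if any.
    match fs::remove_file(socket_path) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }

    // 2. Create the parent directory if needed.
    if let Some(parent) = socket_path.parent() {
        fs::create_dir_all(parent)?;
    }

    // 3. Bind the listener; the caller then spawns the loop task and accepts.
    UnixListener::bind(socket_path)
}
```

Note that unconditionally removing the file assumes no other daemon instance is currently serving on that path; a stricter variant could try to connect first and refuse to start if something answers.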
Shutdown sequence
- Daemon receives shutdown_rx.recv() (from SIGINT/SIGTERM or explicit shutdown command).
- Socket server breaks its accept loop.
- Daemon loop task breaks its select loop.
- Socket file is removed.
- All agent subprocesses are killed via AgentHandle::kill().
- Process exits.
No discovery needed
There is no service discovery between daemon and glasspane because glasspane is
embedded. External clients discover the daemon by connecting to the well-known
Unix socket path (DaemonConfig.socket_path).
Cross-reference
| Document | Relationship |
|---|---|
| docs/COLIBRI-GLASSPANE-DESIGN.md | Glasspane capability design, phase plan, non-goals |
| crates/colibri-client/src/lib.rs | Phase-4 typed Unix-socket client for display/UI consumers |
| docs/HERDR-VS-COLIBRI-GRAPH.md | Hybrid boundary: Herdr as Linux display client |
| crates/colibri-daemon/src/socket.rs | Socket server implementation |
| crates/colibri-daemon/src/daemon.rs | Daemon background loop + heartbeat |
| crates/colibri-daemon/src/lib.rs | Wire types: ColibriCommand, ColibriResponse |
| crates/colibri-daemon/src/spawner.rs | Agent subprocess spawner |
| crates/colibri-glasspane/src/lib.rs | State machine, supervisor, snapshot contract |