Commit graph

18 commits

Author SHA1 Message Date
Sam & Claude
b32c3acaed fix(daemon): handle SIGTERM + liveness-aware socket cleanup (Sam & Claude)
Some checks failed
CI / rust (pull_request) Has been cancelled
CI / markdown (pull_request) Has been cancelled
The rc.d "rm stale socket on prestart" fix (07e4660) was a band-aid over
two daemon-side defects that surfaced on the live FreeBSD host:

1. colibri-daemon never handled SIGTERM. main.rs awaited only ctrl_c()
   (SIGINT), so `service stop`/`restart` — which sends SIGTERM via
   daemon(8) to the child — killed it on the default disposition with no
   cleanup. The graceful path (socket removal, agent reaping) never ran,
   leaking the socket file and orphaning spawned agents across restarts.
   Now wait_for_shutdown_signal() selects on SIGTERM or SIGINT, so the
   same graceful path runs on a normal service stop. New integration test
   (tests/sigterm_shutdown.rs) spawns the binary, sends SIGTERM, and
   asserts the socket is removed.

2. Stale-socket cleanup had no liveness check — both the daemon
   (socket.rs) and the rc prestart would unconditionally rm the socket
   before bind, which could delete a *running* instance's socket if
   rc.subr's pid detection misfires and starts a second daemon. Cleanup
   now probes first (clear_stale_socket): connect succeeds -> refuse to
   start; refused/dead -> remove and bind. Unit-tested for absent, stale,
   and live cases.

With the daemon owning safe socket cleanup, the rc prestart no longer
removes the socket (only stale pidfiles), eliminating the restart-time
clobber hazard. This also makes the SIGTERM shutdown described in
ISO-SERVICE-LAYOUT.md (PR #75) actually true.

Gates: cargo fmt --check, clippy -D warnings, cargo test --workspace all
green on Linux; sh -n on the rc script OK. FreeBSD runtime validation
still pending per FREEBSD-BUILD-LANE-HANDOFF.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 08:37:06 +02:00
Sam & Claude
df5fbab051 fix(rc): FreeBSD rc.d deep-audit — cost mode naming, chmod cleanup, health check, docs (Sam & Hermes)
Some checks failed
CI / rust (pull_request) Has been cancelled
CI / markdown (pull_request) Has been cancelled
Six bugs found in deep-dive analysis of FreeBSD rc.d/rc.conf after the
live-copy-safe fix (7d23905):

1. colibri_cost_mode → colibri_daemon_cost_mode: naming broke rc.subr
   ${name}_ convention — operator setting colibri_daemon_cost_mode=fast
   in rc.conf was silently ignored. Fixed in rc.d, staging script,
   rc.conf.sample, and all docs.

2. Removed redundant chmod 660 on socket in poststart: Rust code already
   sets 0770 with documented rationale. The poststart override to 0660
   was conflicting, fragile, and had no comment.

3. Removed unnecessary chmod 644 on pidfile in poststart: pidfile lives
   in a 0750 directory — world-readable permission is pointless and
   security-negative.

4. Fixed ISO-SERVICE-LAYOUT.md: socket perms were wrong (said 750, actual
   770), colibri-daemon.pid was labeled supervisor pidfile (it's the
   child), supervisor pidfile was missing entirely, shutdown behavior
   didn't mention custom stop_cmd targeting the supervisor.

5. health_cmd now checks for non-empty daemon response instead of just
   connectvity — a hung daemon accepting connections but returning
   garbage was reported healthy.

6. rc.conf.sample hostname path: $ (hostname) → $(/bin/hostname) for
   consistency with rc.d script and early-boot PATH safety.

Checks: sh -n OK, cargo fmt --check OK, cargo clippy clean,
cargo test --workspace 207 passed.
2026-06-15 08:28:20 +02:00
8cf36a4264 docs(rc): simplify colibri daemon service comments (Sam & Codex)
Some checks failed
CI / rust (pull_request) Has been cancelled
CI / markdown (pull_request) Has been cancelled
Keep the FreeBSD rc.d comments focused on the intended service contract instead of historical repair details.\n\nChecks: sh -n packaging/freebsd/colibri_daemon.in; git diff --check.
2026-06-15 07:54:51 +02:00
d77a2a524d fix(rc): set PATH for local test agent spawns (Sam & Codex)
Some checks failed
CI / rust (pull_request) Has been cancelled
CI / markdown (pull_request) Has been cancelled
Ensure rc.d-launched colibri-daemon can resolve local provider commands such as colibri-test-agent from /usr/local/bin.\n\nChecks: sh -n packaging/freebsd/colibri_daemon.in; git diff --check.
2026-06-15 07:36:37 +02:00
9891d06144 feat(rc): rename test agent and load provider env (Sam & Codex)
Rename the local deterministic launch helper from colibri-smoke-agent to colibri-test-agent, update CLI/TUI/tests/docs, and teach the FreeBSD rc.d service to source /usr/local/etc/colibri/provider.env plus set a service PATH for local spawns.\n\nChecks: cargo fmt --check; ./scripts/check-format.sh; git diff --check; cargo check -p colibri-daemon -p colibri-client -p colibri-glasspane-tui; cargo check -p colibri-client --bins; cargo test -p colibri-client --test live_socket_check -- --nocapture.
2026-06-15 07:35:44 +02:00
07e4660a95 fix(rc): clear stale colibri socket before start (Sam & Codex)
Some checks failed
CI / rust (pull_request) Has been cancelled
CI / markdown (pull_request) Has been cancelled
Remove stale socket and pidfiles in prestart while rc.d still has root privileges. This handles live USB repair cases where a previous corrupt/root-started daemon left /var/run/colibri/colibri.sock owned by root, causing the colibri user to fail unlink with EPERM and bind with EADDRINUSE.\n\nChecks: sh -n packaging/freebsd/colibri_daemon.in; git diff --check.
2026-06-14 22:39:52 +02:00
7d239053ed fix(rc): make colibri_daemon script live-copy safe (Sam & Codex)
Make the FreeBSD rc.d source safe to copy directly onto the live USB: avoid rc.subr's *_program command override, avoid double privilege drop via daemon(8) -u, and keep pid/socket chmod fixes in the source script.\n\nChecks: sh -n packaging/freebsd/colibri_daemon.in; git diff --check.
2026-06-14 22:08:54 +02:00
Sam & Claude
1f2377d4dd cleanup: drop the experimental clawdie mini-binary
Some checks failed
CI / markdown (pull_request) Has been cancelled
CI / rust (pull_request) Has been cancelled
The `clawdie` crate (Telegram+DeepSeek mini-agent over the control-plane core)
was an experimental operator-lane candidate. Per the agent-harness
consolidation, the live USB runs colibri_daemon + the zot agent, and the
deployed `service clawdie` is a reserved name, not this binary — so the
mini-binary is dead weight. Remove it and its now-orphaned docs.

- delete crates/clawdie (leaf crate; nothing depended on it)
- delete packaging/freebsd/clawdie.in (its rc.d candidate)
- delete docs/CLAWDIE-AGENT-WIKI.md + docs/CLAWDIE-BUILD.md (only described it)
- drop it from workspace members + Cargo.lock; tidy the strip-profile comment
- README: 11 → 10 crates, remove the clawdie row
- COLIBRI-TOKENOMICS-TRIFECTA: drop the stale clawdie-lane scope note

No "relay" existed in this repo (already gone). zot is untouched. The Clawdie
brand, the clawdie operator user, and the reserved deployed `service clawdie`
name are unaffected — this only removes the experimental Rust mini-binary.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 19:19:07 +02:00
6e78ea630d docs: clarify Herdr as optional Linux display (Sam & Codex)
Some checks failed
CI / rust (pull_request) Has been cancelled
CI / markdown (pull_request) Has been cancelled
Cleans stale Herdr socket/API naming after the Colibri socket rename, preserves Herdr as an optional Linux/macOS display client, marks the clawdie mini-binary service as experimental rather than ISO/deployed-service contract, and removes old internal session logs.\n\nChecks: ./scripts/check-format.sh; cargo fmt --check; git diff --check; sh -n packaging/freebsd/colibri_daemon.in packaging/freebsd/clawdie.in
2026-06-13 12:29:11 +02:00
Sam & Claude
1df13e06af fix(rc.d): supervisor-aware stop + bring clawdie.in to parity (Sam & Claude)
Follow-up to #17. Two issues with the daemon(8) `-r` + child-pidfile pattern:

1. Stop semantics (both services): with `-r`, rc.subr's default stop sends
   SIGTERM to the *child* pid — and the still-running daemon(8) supervisor
   respawns it ~1s later, so `service … stop` never actually stops it. Fix:
   add a `-P` supervisor pidfile and a custom stop_cmd that SIGTERMs the
   supervisor (which forwards to the child and exits without restarting),
   waits up to 30s, SIGKILL fallback, then cleans pidfiles. Child pidfile +
   unique procname are kept for accurate start/status.

2. clawdie.in parity: it still carried the pre-#17 pattern (`-P ${pidfile}`
   as the only pidfile + procname="/usr/sbin/daemon"), so `service clawdie
   status` could match tailscaled/colibri_daemon on a stale pidfile. Brought
   it to the same shape as colibri_daemon.in: child pidfile, procname="clawdie",
   supervisor pidfile, stop_cmd, socket-ready poststart, socket cleanup
   poststop, and a `health` command.

Packaging-only — no Rust touched, no rebuild needed. `sh -n` clean on both;
stop algorithm exercised standalone (kills supervisor, idempotent). FreeBSD
start/stop/status/restart validation still owed on OSA.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 10:49:19 +02:00
4f4d36b0ea fix(rc.d): use child pidfile + unique procname for colibri-daemon (Sam & Hermes)
Fix three bugs identified in live USB diagnostics (COLIBRI-XFCE-HANDOFF-04.JUN.2026):

1. procname collision: 'colibri-daemon' instead of '/usr/sbin/daemon'
   so service status finds OUR process, not tailscaled or any other
   daemon(8)-managed service.

2. pidfile flag: -p (child pidfile) instead of -P (supervisor pidfile)
   so the pidfile holds the colibri-daemon PID, which rc.subr can
   match against the unique procname.

3. Cascading bypass: fixing procname prevents rc.subr from skipping
   daemon(8) entirely. Process tree should now be:
   rc.d → daemon(8) → colibri-daemon (with crash restart)
2026-06-04 09:53:55 +02:00
Sam & Claude
98b232cc0a fix(clawdie): set COLIBRI_DB_PATH so the service doesn't crash-loop at boot (Sam & Claude)
clawdie.in exported COLIBRI_DAEMON_DATA_DIR/SOCKET/HOST but not COLIBRI_DB_PATH.
On FreeBSD, default_db_path() then falls back to /var/db/colibri/colibri.sqlite
— the full Colibri daemon's DB, owned colibri:colibri (0750). clawdie runs as
the clawdie user, so Store::open() hits EACCES; DaemonState::new() panics
(daemon.rs:35) and daemon(8) -r restart-loops forever. The service would never
serve, despite a correct-looking rc.d.

Fix: add a clawdie_db_path var (default ${clawdie_data_dir}/clawdie.sqlite) and
export COLIBRI_DB_PATH from prestart, keeping clawdie's DB in its own
clawdie-owned dir. No collision with the colibri daemon's DB.

Reproduced + verified on Linux:
- COLIBRI_DB_PATH unwritable  → panic "failed to open coordination store … Permission denied", exit 101
- COLIBRI_DB_PATH in data dir → sqlite created, runs clean

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 14:56:41 +02:00
Sam & Claude
25c7f16600 Merge remote-tracking branch 'origin/main'
# Conflicts:
#	Cargo.toml
2026-06-02 09:26:46 +02:00
Sam & Claude
aea2fbe60e feat: add clawdie — simplified operator agent in one small binary (Sam & Claude)
New `clawdie` crate: the operator-friendly face of Colibri. One small Rust
binary that reuses the proven control-plane core (glasspane supervision +
Herdr Unix-socket API + coordination loop — the "split brain") and puts a
DeepSeek-backed Telegram bot in front of it. That is the entire out-of-the-box
surface.

Deliberately lifted vs. the full control plane: cost modes, quota accounting,
context budgets, multi-provider fallback (OpenRouter/Anthropic), per-user
limits. One DeepSeek key serves both the chat lane and the daemon routing.

- crates/clawdie: main (core + bridge wiring), telegram (long-poll bridge),
  deepseek (minimal one-key chat), build.rs (bakes CLAWDIE_TG_TOKEN +
  CLAWDIE_DEEPSEEK_KEY build flags; runtime env overrides).
- packaging/freebsd/clawdie.in: rc.d service, daemon(8)-supervised, restart on
  crash, dedicated clawdie user — starts as a service like Clawdie-AI.
- release profile strips symbols (binary ~7.6 MB stripped).
- docs/CLAWDIE-AGENT-WIKI.md (mindmap), docs/CLAWDIE-BUILD.md (build + ISO +
  next-build XFCE USB fixes), README workspace table.

Build/clippy/fmt green; headless start smoke-tested (socket + sessions bind).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 08:31:59 +02:00
3235f8c00e feat: ISO service hardening — rc.d + log rotation + layout docs
Hardens the FreeBSD service for production readiness:

- rc.d: post-start socket health check (waits up to 10s), post-stop
  socket cleanup, 'health' extra command that probes socket with
  a status command via nc.

- newsyslog: log rotation at 1MB, 7 compressed archives,
  colibri:colibri ownership.

- staging: copies newsyslog config into image root, updated
  staging report to list all installed files.

- docs/ISO-SERVICE-LAYOUT.md: filesystem layout, boot/shutdown
  behavior, startup validation commands, config knobs, secrets
  policy, log rotation details.

Shell syntax: sh -n clean on both scripts.
Workspace tests: all green.
2026-05-31 16:48:48 +02:00
f3a221330b docs: add ISO acceptance tracker and staging helper 2026-05-27 22:52:59 +02:00
d5744e8414 fix: rc.d colibri_daemon runs under daemon(8) (Sam & Claude)
colibri-daemon runs in the foreground and writes no pidfile, so the previous
`command=/usr/local/bin/colibri-daemon` would hang `service start` and leave
status/stop unable to track it. Run it under daemon(8): -P supervisor pidfile,
-r restart on crash, -u colibri privilege drop, -o logfile for the tracing
stdout. start_precmd recreates the colibri-owned /var/run/colibri (tmpfs),
/var/db/colibri, and the log dir each start. Still review-only / not installed;
needs an on-FreeBSD `service` test (osa-smoke rec #4).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 21:48:47 +02:00
123kupola
db5737bcdb feat: add prototype rc.d service + fix plan wording (Sam & Hermes)
packaging/freebsd/colibri_daemon.in: FreeBSD rc.d service file
for review only, not installed. Uses /var/db/colibri,
/var/run/colibri, COLIBRI_DB_PATH. /tmp smoke test comes first.
2026-05-27 19:30:51 +02:00