Cleans stale Herdr socket/API naming after the Colibri socket rename, preserves Herdr as an optional Linux/macOS display client, marks the clawdie mini-binary service as experimental rather than ISO/deployed-service contract, and removes old internal session logs.\n\nChecks: ./scripts/check-format.sh; cargo fmt --check; git diff --check; sh -n packaging/freebsd/colibri_daemon.in packaging/freebsd/clawdie.in
9.9 KiB
Multiagent Development Workflow Improvements
This document describes three improvements to the Colibri multiagent development workflow, implemented to enhance coordination, tracking, and validation across agents.
Overview
The Colibri project uses a multiagent development model where different AI agents work together on a complex TypeScript → Rust migration. These improvements standardize handoffs, automate proof gate validation, and ensure cross-platform parity.
1. Agent Handoff Protocol
File: doc/<FEATURE>-HANDOFF.md
Purpose
Standardizes how agents transfer context, reducing context loss and creating clear accountability for proof gates and platform validations.
How It Works
- Before Handoff: Agent validates tests, proof gates, and cross-platform evidence
- Handoff Template: JSON structure captures context, limitations, and next steps
- Handoff History: Timestamped log of all handoffs for traceability
Example Handoff Entry
### 2026-05-27T12:30:00Z - Glasspane Integration
- **Agent From**: hermes
- **Agent To**: sam
- **Focus Area**: colibri-glasspane TUI and client integration
- **Proof Gates**: Gate #5 complete, daemon socket API working
- **Evidence**:
- `manifests/2026-05-26-osa-watchdog-host-status.json`
- `crates/colibri-glasspane-tui/` created
- **Known Limitations**: PTY launch not yet validated on FreeBSD
- **Next Steps**:
- Complete colibri-client crate
- Add glasspane-snapshot command to socket API
- Live FreeBSD PTY validation
Benefits
- ✅ Reduces context loss between agents
- ✅ Creates accountability for proof gates
- ✅ Prevents re-work due to unclear handoffs
- ✅ Provides audit trail for complex migrations
2. Proof Gate Tracker
Files: tools/proof-gate-tracker.rs, tools/README.md
Purpose
Automated validation tool that checks the status of all 6 migration proof gates, providing instant visibility into which gates are passing/failing.
Proof Gates Tracked
| Gate | Description | Critical | Status |
|---|---|---|---|
| #1 | Contract round-tripping (11 golden tests) | ✅ | Automated |
| #2 | DeepSeek live cache manifest | ✅ | Automated |
| #3 | Runtime inventory parity | ✅ | Automated |
| #4 | Cross-platform build/test | ✅ | Automated |
| #5 | Watchdog socket reader | ✅ | Automated |
| #6 | Caller inventory & retirement | ⏸️ | Precondition check |
Usage
# Build and run
cargo run --release --bin proof-gate-tracker
# Build and run directly
cargo build --release --bin proof-gate-tracker
./target/release/proof-gate-tracker
Example Output
🔍 Colibri Proof Gate Tracker
═════════════════════════════
✅ gate-1-contracts 🔒: 6/6 golden fixtures valid
✅ gate-2-cache-manifest 🔒: osa: 3584 cache hit tokens; domedog: 3584 cache hit tokens
✅ gate-3-runtime-inventory 🔒: osa: FreeBSD (pi: 0.75.5); domedog: Linux (pi: 0.75.5)
✅ gate-4-cross-platform 🔒: cargo check --workspace passed
✅ gate-5-watchdog 🔒: osa watchdog socket read successful (mode: auto)
⏭️ gate-6-caller-inventory ⏸️: Caller inventory documented, awaiting production verification
═════════════════════════════
Summary: 5/6 gates passing
Critical: 5/5 gates passing
✅ All critical gates are passing!
Exit Codes
0: All critical gates passing1: Some critical gates failing
Benefits
- ✅ Instant visibility into gate status
- ✅ Prevents breaking previously-passing gates
- ✅ Can be integrated into CI/CD pipelines
- ✅ Reduces manual checklist fatigue
CI/CD Integration
# Example GitHub Actions
- name: Validate Proof Gates
run: cargo run --release --bin proof-gate-tracker
3. Cross-Platform Smoke Test Matrix
File: tests/platform-matrix.rs
Purpose
Automated integration tests that validate cross-platform parity, catching platform-specific regressions early and providing clear evidence matrices.
Platforms Tested
| Platform | Host | Status | Notes |
|---|---|---|---|
| FreeBSD | osa.smilepowered.org | ✅ Active | Primary target, FreeBSD 15.0-RELEASE-p8 |
| Linux | domedog | ✅ Active | Development/testing, Linux 6.8.0 |
| Linux | debby | ⏸️ Partial | Node 24, no Pi yet |
Tests Included
all_platforms_validate_core_features: Validates all platforms have valid manifestsfreebsd_specific_tests: FreeBSD-specific validations (osa)linux_specific_tests: Linux-specific validations (domedog, debby)cache_economics_parity: Verifies cache hit rate consistency across platforms
Usage
# Run all platform matrix tests
cargo test --test platform-matrix
# Run with output
cargo test --test platform-matrix -- --nocapture
# Run specific test
cargo test --test platform-matrix all_platforms_validate_core_features -- --nocapture
Example Output
╔══════════════════════════════════════════════════════════════╗
║ Cross-Platform Smoke Test Matrix ║
╚══════════════════════════════════════════════════════════════╝
=== Platform: FreeBSD (host: osa) ===
✅ deepseek-cache-hit: cache_hit_observed=true, 3584 tokens cached, model=deepseek-v4-flash
✅ runtime-inventory: os=FreeBSD 15.0-RELEASE-p8 x86_64, pi=0.75.5, package_manager=pkg
✅ watchdog-socket-read: source=watchdog-socket, mode=auto
✅ contract-roundtrip: 2/2 manifest files round-trip successfully
=== Platform: Linux (host: domedog) ===
✅ deepseek-cache-hit: cache_hit_observed=true, 3584 tokens cached, model=deepseek-v4-flash
✅ runtime-inventory: os=Linux 6.8.0-117-generic x86_64, pi=0.75.5, package_manager=apt
✅ watchdog-socket-read: Not applicable on Linux (watchdog socket only on FreeBSD)
✅ contract-roundtrip: 2/2 manifest files round-trip successfully
╔══════════════════════════════════════════════════════════════╗
║ Summary ║
╚══════════════════════════════════════════════════════════════╝
Total tests: 16
Passed: 16 ✅
Failed: 0 ❌
✅ FreeBSD validation passed
✅ domedog Linux validation passed
✅ debby Linux validation passed
✅ Average cache hit rate: 97.9%
Benefits
- ✅ Compares Linux and FreeBSD behavior and surfaces platform drift; FreeBSD runtime proof still requires FreeBSD validation
- ✅ Catches platform-specific regressions early
- ✅ Provides clear evidence matrix for each gate
- ✅ Reduces manual "is this working on osa?" checks
Impact Summary
| Improvement | Primary Benefit | Implementation Effort | Risk |
|---|---|---|---|
| Agent Handoff Protocol | Reduces context loss, improves accountability | Low (1-2 hours) | Very Low |
| Proof Gate Tracker | Instant visibility, prevents gate regressions | Medium (4-6 hours) | Low |
| Platform Matrix | Cross-platform parity, automated validation | Medium (3-5 hours) | Low |
Integration Recommendations
1. Pre-Commit Hooks
Add to .git/hooks/pre-commit:
#!/bin/bash
# Run proof gate tracker before committing
cargo run --release --bin proof-gate-tracker
if [ $? -ne 0 ]; then
echo "❌ Proof gates failing - commit blocked"
exit 1
fi
echo "✅ All proof gates passing"
2. CI/CD Pipeline
Add to your CI configuration:
jobs:
validate:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: actions-rs/toolchain@v1
- name: Validate Proof Gates
run: cargo run --release --bin proof-gate-tracker
- name: Run Platform Matrix Tests
run: cargo test --test platform-matrix -- --nocapture
3. Agent Workflow
When an agent completes work:
- Run
proof-gate-trackerand verify all critical gates pass - Run
cargo test --workspaceand verify all tests pass - Update
doc/<FEATURE>-HANDOFF.mdwith handoff details - Commit and push with clear handoff message
Maintenance
Adding New Proof Gates
- Add check function to
tools/proof-gate-tracker.rs - Add gate to the
gatesvector - Mark as
critical: trueorcritical: false - Update this documentation
Adding New Platforms
- Add platform to
PlatformMatrixtests intests/platform-matrix.rs - Add validation criteria
- Update
doc/<FEATURE>-HANDOFF.mdplatform matrix - Commit manifests to
manifests/directory
Updating Handoff Protocol
- Modify
doc/<FEATURE>-HANDOFF.mdtemplate - Update handoff history format
- Communicate changes to all agents
Conclusion
These three improvements are low-risk, high-value enhancements that significantly improve how the multiagent team works together without requiring refactoring core logic. They standardize communication, automate validation, and ensure cross-platform consistency throughout the complex TypeScript → Rust migration.