colibri/docs/MULTIAGENT-WORKFLOW-IMPROVEMENTS.md
Sam & Claude 6e78ea630d
2026-06-13 12:29:11 +02:00


Multiagent Development Workflow Improvements

This document describes three improvements to the Colibri multiagent development workflow: an agent handoff protocol, a proof gate tracker, and a cross-platform smoke test matrix. They were implemented to improve coordination, tracking, and validation across agents.

Overview

The Colibri project uses a multiagent development model where different AI agents work together on a complex TypeScript → Rust migration. These improvements standardize handoffs, automate proof gate validation, and check cross-platform parity.


1. Agent Handoff Protocol

File: doc/<FEATURE>-HANDOFF.md

Purpose

Standardizes how agents transfer context, reducing context loss and creating clear accountability for proof gates and platform validations.

How It Works

  • Before Handoff: Agent validates tests, proof gates, and cross-platform evidence
  • Handoff Template: JSON structure captures context, limitations, and next steps
  • Handoff History: Timestamped log of all handoffs for traceability

Example Handoff Entry

### 2026-05-27T12:30:00Z - Glasspane Integration
- **Agent From**: hermes
- **Agent To**: sam
- **Focus Area**: colibri-glasspane TUI and client integration
- **Proof Gates**: Gate #5 complete, daemon socket API working
- **Evidence**:
  - `manifests/2026-05-26-osa-watchdog-host-status.json`
  - `crates/colibri-glasspane-tui/` created
- **Known Limitations**: PTY launch not yet validated on FreeBSD
- **Next Steps**:
  - Complete colibri-client crate
  - Add glasspane-snapshot command to socket API
  - Live FreeBSD PTY validation
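
The entry format above can also be modeled as a small data type, which makes the "validate before handoff" step mechanical. The following is a minimal sketch, not the project's implementation: the `Handoff` struct, its field names, and the `validate` rules are assumptions derived from the labels in the example entry.

```rust
/// Hypothetical in-memory form of one handoff entry. Field names mirror the
/// labels of the example entry; they are not taken from the Colibri sources.
#[derive(Debug, Clone)]
pub struct Handoff {
    pub timestamp: String, // ISO 8601, e.g. "2026-05-27T12:30:00Z"
    pub title: String,
    pub agent_from: String,
    pub agent_to: String,
    pub focus_area: String,
    pub proof_gates: String,
    pub evidence: Vec<String>,
    pub known_limitations: String,
    pub next_steps: Vec<String>,
}

impl Handoff {
    /// Assumed pre-handoff rules: both agents named, and at least one
    /// evidence item and one next step recorded.
    pub fn validate(&self) -> Result<(), String> {
        if self.agent_from.trim().is_empty() || self.agent_to.trim().is_empty() {
            return Err("both agents must be named".to_string());
        }
        if self.evidence.is_empty() {
            return Err("at least one evidence item is required".to_string());
        }
        if self.next_steps.is_empty() {
            return Err("at least one next step is required".to_string());
        }
        Ok(())
    }

    /// Render the entry in the Markdown layout of the example entry.
    pub fn to_markdown(&self) -> String {
        let mut out = format!("### {} - {}\n", self.timestamp, self.title);
        out += &format!("- **Agent From**: {}\n", self.agent_from);
        out += &format!("- **Agent To**: {}\n", self.agent_to);
        out += &format!("- **Focus Area**: {}\n", self.focus_area);
        out += &format!("- **Proof Gates**: {}\n", self.proof_gates);
        out += "- **Evidence**:\n";
        for item in &self.evidence {
            out += &format!("  - {}\n", item);
        }
        out += &format!("- **Known Limitations**: {}\n", self.known_limitations);
        out += "- **Next Steps**:\n";
        for step in &self.next_steps {
            out += &format!("  - {}\n", step);
        }
        out
    }
}
```

An agent would call `validate()` first and append the output of `to_markdown()` to the handoff history only when it returns `Ok(())`.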

Benefits

  • Reduces context loss between agents
  • Creates accountability for proof gates
  • Prevents re-work due to unclear handoffs
  • Provides audit trail for complex migrations

2. Proof Gate Tracker

Files: tools/proof-gate-tracker.rs, tools/README.md

Purpose

Automated validation tool that checks the status of all 6 migration proof gates, providing instant visibility into which gates are passing/failing.

Proof Gates Tracked

| Gate | Description | Critical | Status |
| --- | --- | --- | --- |
| #1 | Contract round-tripping (11 golden tests) | Yes | Automated |
| #2 | DeepSeek live cache manifest | Yes | Automated |
| #3 | Runtime inventory parity | Yes | Automated |
| #4 | Cross-platform build/test | Yes | Automated |
| #5 | Watchdog socket reader | Yes | Automated |
| #6 | Caller inventory & retirement | No | ⏸️ Precondition check |

Usage

# Build and run in one step
cargo run --release --bin proof-gate-tracker

# Or build first, then run the binary directly
cargo build --release --bin proof-gate-tracker
./target/release/proof-gate-tracker

Example Output

🔍 Colibri Proof Gate Tracker
═════════════════════════════

✅ gate-1-contracts 🔒: 6/6 golden fixtures valid
✅ gate-2-cache-manifest 🔒: osa: 3584 cache hit tokens; domedog: 3584 cache hit tokens
✅ gate-3-runtime-inventory 🔒: osa: FreeBSD (pi: 0.75.5); domedog: Linux (pi: 0.75.5)
✅ gate-4-cross-platform 🔒: cargo check --workspace passed
✅ gate-5-watchdog 🔒: osa watchdog socket read successful (mode: auto)
⏭️  gate-6-caller-inventory ⏸️: Caller inventory documented, awaiting production verification

═════════════════════════════
Summary: 5/6 gates passing
Critical: 5/5 gates passing

✅ All critical gates are passing!

Exit Codes

  • 0: All critical gates passing
  • 1: At least one critical gate failing
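
The relationship between gate results, the two summary lines, and the exit code can be sketched as follows. This is an illustration under stated assumptions rather than the contents of tools/proof-gate-tracker.rs: the `Status`, `GateResult`, and `Summary` types and the `summarize` and `exit_code` functions are hypothetical names.

```rust
/// Hypothetical outcome of a single gate check.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Pass,
    Fail,
    Skipped, // e.g. a precondition check that is not yet applicable
}

#[derive(Debug, Clone)]
pub struct GateResult {
    pub name: &'static str,
    pub critical: bool,
    pub status: Status,
}

/// Counts behind the "Summary" and "Critical" lines of the report.
#[derive(Debug, PartialEq, Eq)]
pub struct Summary {
    pub passing: usize,
    pub total: usize,
    pub critical_passing: usize,
    pub critical_total: usize,
}

pub fn summarize(results: &[GateResult]) -> Summary {
    let mut s = Summary {
        passing: 0,
        total: results.len(),
        critical_passing: 0,
        critical_total: 0,
    };
    for r in results {
        let passed = r.status == Status::Pass;
        if passed {
            s.passing += 1;
        }
        if r.critical {
            s.critical_total += 1;
            if passed {
                s.critical_passing += 1;
            }
        }
    }
    s
}

/// 0 when every critical gate passes, 1 otherwise. Non-critical gates
/// never affect the exit code.
pub fn exit_code(summary: &Summary) -> i32 {
    if summary.critical_passing == summary.critical_total {
        0
    } else {
        1
    }
}
```

With five passing critical gates and one skipped non-critical gate, this yields 5/6 overall, 5/5 critical, and exit code 0, which matches the shape of the example output. A binary would typically finish with `std::process::exit(exit_code(&summary))`.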

Benefits

  • Instant visibility into gate status
  • Catches regressions in previously passing gates
  • Can be integrated into CI/CD pipelines
  • Reduces manual checklist fatigue

CI/CD Integration

# Example GitHub Actions
- name: Validate Proof Gates
  run: cargo run --release --bin proof-gate-tracker

3. Cross-Platform Smoke Test Matrix

File: tests/platform-matrix.rs

Purpose

Automated integration tests that validate cross-platform parity, catching platform-specific regressions early and providing clear evidence matrices.

Platforms Tested

| Platform | Host | Status | Notes |
| --- | --- | --- | --- |
| FreeBSD | osa.smilepowered.org | Active | Primary target, FreeBSD 15.0-RELEASE-p8 |
| Linux | domedog | Active | Development/testing, Linux 6.8.0 |
| Linux | debby | ⏸️ Partial | Node 24, no Pi yet |

Tests Included

  • all_platforms_validate_core_features: Validates all platforms have valid manifests
  • freebsd_specific_tests: FreeBSD-specific validations (osa)
  • linux_specific_tests: Linux-specific validations (domedog, debby)
  • cache_economics_parity: Verifies cache hit rate consistency across platforms
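
To make the idea behind cache_economics_parity concrete, here is a minimal sketch of one way such a check could work. It rests on two assumptions that are not confirmed by tests/platform-matrix.rs: that hit rate is defined as cached prompt tokens divided by all prompt tokens, and that "consistency" means the spread between platforms stays within a tolerance. The function names are hypothetical.

```rust
/// Cache hit rate as a fraction in [0, 1], under the assumed definition
/// hit / (hit + miss). Returns None when no prompt tokens were recorded.
pub fn hit_rate(hit_tokens: u64, miss_tokens: u64) -> Option<f64> {
    let total = hit_tokens + miss_tokens;
    if total == 0 {
        None
    } else {
        Some(hit_tokens as f64 / total as f64)
    }
}

/// True when the highest and lowest per-platform rates differ by at most
/// `tolerance`. An empty slice is treated as trivially consistent.
pub fn rates_consistent(rates: &[f64], tolerance: f64) -> bool {
    if rates.is_empty() {
        return true;
    }
    let min = rates.iter().cloned().fold(f64::INFINITY, f64::min);
    let max = rates.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    max - min <= tolerance
}

/// Arithmetic mean of the per-platform rates, or None for an empty slice.
pub fn average_rate(rates: &[f64]) -> Option<f64> {
    if rates.is_empty() {
        None
    } else {
        Some(rates.iter().sum::<f64>() / rates.len() as f64)
    }
}
```

A parity test would compute one rate per platform manifest and assert `rates_consistent` with a tolerance chosen by the team; the value of that tolerance is a policy decision, not something this sketch prescribes.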

Usage

# Run all platform matrix tests
cargo test --test platform-matrix

# Run with output
cargo test --test platform-matrix -- --nocapture

# Run specific test
cargo test --test platform-matrix all_platforms_validate_core_features -- --nocapture

Example Output

╔══════════════════════════════════════════════════════════════╗
║           Cross-Platform Smoke Test Matrix                    ║
╚══════════════════════════════════════════════════════════════╝

=== Platform: FreeBSD (host: osa) ===
  ✅ deepseek-cache-hit: cache_hit_observed=true, 3584 tokens cached, model=deepseek-v4-flash
  ✅ runtime-inventory: os=FreeBSD 15.0-RELEASE-p8 x86_64, pi=0.75.5, package_manager=pkg
  ✅ watchdog-socket-read: source=watchdog-socket, mode=auto
  ✅ contract-roundtrip: 2/2 manifest files round-trip successfully

=== Platform: Linux (host: domedog) ===
  ✅ deepseek-cache-hit: cache_hit_observed=true, 3584 tokens cached, model=deepseek-v4-flash
  ✅ runtime-inventory: os=Linux 6.8.0-117-generic x86_64, pi=0.75.5, package_manager=apt
  ✅ watchdog-socket-read: Not applicable on Linux (watchdog socket only on FreeBSD)
  ✅ contract-roundtrip: 2/2 manifest files round-trip successfully

╔══════════════════════════════════════════════════════════════╗
║                        Summary                               ║
╚══════════════════════════════════════════════════════════════╝
Total tests: 16
Passed: 16 ✅
Failed: 0 ❌

✅ FreeBSD validation passed
✅ domedog Linux validation passed
✅ debby Linux validation passed
✅ Average cache hit rate: 97.9%

Benefits

  • Compares Linux and FreeBSD behavior and surfaces platform drift; FreeBSD runtime proof still requires FreeBSD validation
  • Catches platform-specific regressions early
  • Provides clear evidence matrix for each gate
  • Reduces manual "is this working on osa?" checks

Impact Summary

| Improvement | Primary Benefit | Implementation Effort | Risk |
| --- | --- | --- | --- |
| Agent Handoff Protocol | Reduces context loss, improves accountability | Low (1-2 hours) | Very Low |
| Proof Gate Tracker | Instant visibility, prevents gate regressions | Medium (4-6 hours) | Low |
| Platform Matrix | Cross-platform parity, automated validation | Medium (3-5 hours) | Low |

Integration Recommendations

1. Pre-Commit Hooks

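Add to .git/hooks/pre-commit and make the file executable (chmod +x .git/hooks/pre-commit):

#!/bin/sh
# Run the proof gate tracker before committing; a non-zero exit blocks the commit.
# /bin/sh is used because bash is not installed at /bin/bash on FreeBSD.
if ! cargo run --release --bin proof-gate-tracker; then
    echo "❌ Proof gates failing - commit blocked"
    exit 1
fi
echo "✅ All proof gates passing"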

2. CI/CD Pipeline

Add to your CI configuration:

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Install the stable Rust toolchain
      - uses: dtolnay/rust-toolchain@stable
      - name: Validate Proof Gates
        run: cargo run --release --bin proof-gate-tracker
      - name: Run Platform Matrix Tests
        run: cargo test --test platform-matrix -- --nocapture

3. Agent Workflow

When an agent completes work:

  1. Run proof-gate-tracker and verify all critical gates pass
  2. Run cargo test --workspace and verify all tests pass
  3. Update doc/<FEATURE>-HANDOFF.md with handoff details
  4. Commit and push with clear handoff message

Maintenance

Adding New Proof Gates

  1. Add check function to tools/proof-gate-tracker.rs
  2. Add gate to the gates vector
  3. Mark as critical: true or critical: false
  4. Update this documentation
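
The steps above can be sketched as follows. The real layout of tools/proof-gate-tracker.rs may differ: the `Gate` struct, the `check` function-pointer field, the two placeholder check functions, and the gate names are all hypothetical and only illustrate the "check function + vector entry + critical flag" pattern.

```rust
/// Hypothetical gate definition: a name, a criticality flag, and a check
/// function that returns a detail message on success or a reason on failure.
pub struct Gate {
    pub name: &'static str,
    pub critical: bool,
    pub check: fn() -> Result<String, String>,
}

// Step 1: add a check function for the new gate (placeholder bodies).
fn check_example_ready() -> Result<String, String> {
    Ok("placeholder check passed".to_string())
}

fn check_example_pending() -> Result<String, String> {
    Err("precondition not met yet".to_string())
}

// Steps 2 and 3: add the gate to the vector and set its critical flag.
pub fn gates() -> Vec<Gate> {
    vec![
        Gate { name: "gate-x-example-ready", critical: true, check: check_example_ready },
        Gate { name: "gate-y-example-pending", critical: false, check: check_example_pending },
    ]
}

/// Run every gate and report whether all critical gates passed.
/// Failures of non-critical gates are printed but do not change the result.
pub fn run_all(gates: &[Gate]) -> bool {
    let mut critical_ok = true;
    for gate in gates {
        match (gate.check)() {
            Ok(detail) => println!("PASS {}: {}", gate.name, detail),
            Err(reason) => {
                println!("FAIL {}: {}", gate.name, reason);
                if gate.critical {
                    critical_ok = false;
                }
            }
        }
    }
    critical_ok
}
```

Keeping the check behind a plain function pointer means a new gate touches only two places: its own function and one line in the vector.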

Adding New Platforms

  1. Add platform to PlatformMatrix tests in tests/platform-matrix.rs
  2. Add validation criteria
  3. Update doc/<FEATURE>-HANDOFF.md platform matrix
  4. Commit manifests to manifests/ directory

Updating Handoff Protocol

  1. Modify doc/<FEATURE>-HANDOFF.md template
  2. Update handoff history format
  3. Communicate changes to all agents

Conclusion

These three improvements are low-risk, high-value changes that improve how the multiagent team works together without requiring a refactor of core logic. They standardize communication, automate validation, and check cross-platform consistency throughout the TypeScript → Rust migration.