vault provision: daemon needs login→unlock→fetch→lock per call (no standing session) #89

Closed
opened 2026-06-19 22:31:46 +02:00 by clawdie · 2 comments
Owner

Problem

colibri-vault::provision() assumes the caller already holds an unlocked vault session (BW_SESSION), but the caller is now the long-running colibri-daemon, which has none (bw status = unauthenticated). So provisioning can't authenticate today — and the only way to make a standing session work would be to keep the daemon's vault unlocked indefinitely, which is a security smell and conflicts with the host-holds-bootstrap invariant.

The proven shell helper (clawdie-vault-fetch) did this correctly: login --apikey → unlock --passwordenv → fetch → lock per run, with a trap that locks on every exit.

Fix (recommended)

Mirror the shell helper inside the crate (or the hook). On each provision call:

  1. bw config server + bw login --apikey (tolerate already-logged-in)
  2. bw unlock --raw --passwordenv to get a short-lived session
  3. fetch
  4. bw lock on completion (incl. error paths)
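The steps above could be sketched in Rust roughly as follows. This is an illustration, not the `colibri-vault` code: `Bw`, `RealBw`, `Bootstrap`, `LockGuard`, and `with_session` are hypothetical names, the `bw config server` step is left out on the assumption that the server is already configured, and the "already logged in" check assumes that is how the CLI reports an existing login. The `Drop` guard plays the role of the shell helper's `trap`.

```rust
use std::process::Command;

/// Hypothetical seam over the `bw` CLI so the flow can be tested with a fake.
pub trait Bw {
    /// Runs `bw <args>` with extra environment variables; Ok(stdout) or Err(stderr).
    fn run(&self, args: &[&str], env: &[(&str, &str)]) -> Result<String, String>;
}

/// Runs the real `bw` binary found on PATH.
pub struct RealBw;

impl Bw for RealBw {
    fn run(&self, args: &[&str], env: &[(&str, &str)]) -> Result<String, String> {
        let out = Command::new("bw")
            .args(args)
            .envs(env.iter().copied())
            .output()
            .map_err(|e| e.to_string())?;
        if out.status.success() {
            Ok(String::from_utf8_lossy(&out.stdout).trim().to_string())
        } else {
            Err(String::from_utf8_lossy(&out.stderr).trim().to_string())
        }
    }
}

/// Bootstrap credentials (hypothetical struct; values come from the provider env file).
pub struct Bootstrap {
    pub client_id: String,
    pub client_secret: String,
    pub password: String,
}

/// Runs `bw lock` when dropped, so success, error, and panic paths all lock.
struct LockGuard<'a, B: Bw>(&'a B);

impl<B: Bw> Drop for LockGuard<'_, B> {
    fn drop(&mut self) {
        // Best effort: there is nothing useful to do if locking itself fails.
        let _ = self.0.run(&["lock"], &[]);
    }
}

/// login -> unlock -> fetch -> lock for a single provision call.
pub fn with_session<B: Bw, T>(
    bw: &B,
    creds: &Bootstrap,
    fetch: impl FnOnce(&str) -> Result<T, String>,
) -> Result<T, String> {
    let api = [
        ("BW_CLIENTID", creds.client_id.as_str()),
        ("BW_CLIENTSECRET", creds.client_secret.as_str()),
    ];
    if let Err(e) = bw.run(&["login", "--apikey"], &api) {
        // Assumption: an existing login is reported with this wording.
        if !e.to_lowercase().contains("already logged in") {
            return Err(e);
        }
    }
    // Armed before unlocking, so a failed unlock or fetch still ends in `bw lock`.
    let _guard = LockGuard(bw);
    let pw = [("BW_PASSWORD", creds.password.as_str())];
    let session = bw.run(&["unlock", "--raw", "--passwordenv", "BW_PASSWORD"], &pw)?;
    fetch(&session)
}
```

The `fetch` closure receives the short-lived session key and would pass it to whatever performs the actual item lookup; the key goes out of scope when `with_session` returns.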

Read bootstrap creds (BW_CLIENTID/BW_CLIENTSECRET/BW_PASSWORD) from the daemon's provider env file. Never hold a standing unlocked session.
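Reading the three bootstrap values might look like the sketch below. The file format is an assumption (plain `KEY=VALUE` lines, `#` comments, optional double quotes around values), and `parse_env`, `BootstrapCreds`, and `bootstrap_creds` are hypothetical names. Error messages name the missing key only, never a value.

```rust
use std::collections::HashMap;

/// Parses KEY=VALUE lines, skipping blank lines and `#` comments (assumed format).
pub fn parse_env(text: &str) -> HashMap<String, String> {
    text.lines()
        .map(str::trim)
        .filter(|l| !l.is_empty() && !l.starts_with('#'))
        // Split on the first '=' only, so values may themselves contain '='.
        .filter_map(|l| l.split_once('='))
        .map(|(k, v)| (k.trim().to_string(), v.trim().trim_matches('"').to_string()))
        .collect()
}

/// The three bootstrap values named in the issue (hypothetical struct).
pub struct BootstrapCreds {
    pub client_id: String,
    pub client_secret: String,
    pub password: String,
}

/// Extracts the bootstrap credentials, failing if any key is absent.
pub fn bootstrap_creds(text: &str) -> Result<BootstrapCreds, String> {
    let env = parse_env(text);
    let get = |k: &str| {
        env.get(k)
            .cloned()
            .ok_or_else(|| format!("missing {k} in provider env"))
    };
    Ok(BootstrapCreds {
        client_id: get("BW_CLIENTID")?,
        client_secret: get("BW_CLIENTSECRET")?,
        password: get("BW_PASSWORD")?,
    })
}
```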

Acceptance

The daemon provisions a tenant with no pre-existing session, and the vault is left locked afterward (verify bw status). Bootstrap creds stay host-side; only the resolved .env enters the jail.
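The "left locked" check could be automated along these lines. `bw status` prints a JSON object whose `status` field is `unauthenticated`, `locked`, or `unlocked`; the sketch below pulls that field out with plain string handling to stay dependency-free, where a real check would more likely use a JSON parser. `status_field` and `is_locked` are hypothetical names.

```rust
/// Extracts the value of the top-level "status" field from `bw status` output.
/// Naive string scan: assumes the value is a plain string without escaped quotes.
pub fn status_field(json: &str) -> Option<&str> {
    let key = "\"status\"";
    let rest = &json[json.find(key)? + key.len()..];
    let rest = rest
        .trim_start()
        .strip_prefix(':')?
        .trim_start()
        .strip_prefix('"')?;
    let end = rest.find('"')?;
    Some(&rest[..end])
}

/// True only when the vault reports itself as logged in but locked.
pub fn is_locked(bw_status_output: &str) -> bool {
    status_field(bw_status_output) == Some("locked")
}
```

An acceptance test would run a provision, capture the output of `bw status`, and assert `is_locked(&output)`.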

See docs/HIVE-ONBOARDING.md (layered-soul) — "security invariant".

🤖 Generated with Claude Code

Author
Owner

Trade-offs for the fix

Option A — unlock-per-call (login→unlock→fetch→lock each provision; mirrors the proven shell helper)

  • ✅ Vault unlocked only for the brief provision window and locked right after: smallest exposure.
  • ✅ Already proven (the smoke test used exactly this).
  • ✅ No decrypted vault material lingers between spawns.
  • ❌ Daemon provider env holds the master password at rest (host-side, but it is the master key).
  • ❌ Adds per-spawn latency; bw is process-global → concurrent provisions race, so a mutex is needed.

Option B — standing session (unlock once at startup, hold BW_SESSION)

  • ✅ Fast; master password could be entered once at boot, not stored.
  • ❌ Vault stays unlocked for the daemon's whole uptime → host compromise = all tenants readable. Largest blast radius, worst for multi-tenant. Session expiry handling needed.

Option C — org service-account / scoped API key (no master password; the design doc's "operator's one key")

  • ✅ No master password on the daemon; scope to needed collections → minimal blast radius. Best destination.
  • ❌ Requires Vaultwarden org + service accounts; verify Vaultwarden parity. Bigger move, likely overengineering for MVP.

Recommendation: A + serialization mutex for the MVP (proven, least-exposure at-rest). Treat C as the destination once multi-tenant is real and you want the master password off the host. Avoid B.
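A minimal sketch of the serialization mutex, assuming provisioning runs on ordinary threads inside a single daemon process (`BW_LOCK` and `serialized` are hypothetical names). If the daemon is async, an async-aware mutex would be the closer fit, since a `std::sync::Mutex` guard should not be held across an await point.

```rust
use std::sync::{Mutex, MutexGuard};

/// One process-wide lock guarding every `bw` call sequence (hypothetical name).
static BW_LOCK: Mutex<()> = Mutex::new(());

/// Runs `provision` while holding the lock, so two login/unlock/fetch/lock
/// sequences never interleave within this process.
pub fn serialized<T>(provision: impl FnOnce() -> T) -> T {
    // A poisoned lock only means an earlier provision panicked; the protected
    // data is `()`, so it is safe to take the guard and continue.
    let _guard: MutexGuard<'_, ()> = BW_LOCK.lock().unwrap_or_else(|e| e.into_inner());
    provision()
}
```

This only serializes callers inside the one process; anything else on the host that runs `bw` as the same user is not covered.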

🤖 Generated with Claude Code

clawdie added the first-proof blocker label 2026-06-20 06:38:04 +02:00
Author
Owner

Resolved by #94 (fix(vault): use tenant collection names with per-call unlock) — verified: tenant id is now passed as the Vaultwarden collection name (#88), and colibri-vault does per-call login→unlock→fetch→lock from the daemon's provider env, locking on both success and error paths (#89). Closing.

🤖 Generated with Claude Code

Reference: clawdie/colibri#89