Merged
4 changes: 2 additions & 2 deletions README.md
@@ -76,7 +76,7 @@ Dual-licensed under **MIT OR Apache-2.0**, at your choice.
|---|---|
| [`CLAUDE.md`](CLAUDE.md) | Project-specific rules: docs layout, /create-pr workflow in worktrees, terminology-source-of-truth, branch push policy, idempotent-remote-setup invariants, runbook-fix-fold-back policy. **Read first, every session.** |
| [`docs/arch.md`](docs/arch.md) | Single source of truth for component inventory (K1–K11), trust boundaries, HDKD actor tree, per-actor binding ceremonies. When the per-doc detail outgrows arch.md, link outward — never duplicate. |
| [`docs/spec/plans/development-stages.md`](docs/spec/plans/development-stages.md) | The 8-stage build plan. Each stage has a `harness/stage-N-done.sh` gate; never self-grade — run the gate. |
| [`docs/archived/development-stages-v2-2026-04.md`](docs/archived/development-stages-v2-2026-04.md) | The 8-stage build plan (archived; superseded by the milestone roadmap below). Each stage has a `harness/stage-N-done.sh` gate; never self-grade — run the gate. |
| [`docs/plan/execution-plan.md`](docs/plan/execution-plan.md) | Orchestration runbook (ralph, team, ultraqa workflows). |
| [`docs/spec/broker-and-operator-dev-guide.md`](docs/spec/broker-and-operator-dev-guide.md) | Inner edit-build-test loop for broker + operator-side code. Use this before suggesting changes to the broker's run-time behavior. |

@@ -97,7 +97,7 @@ These are non-negotiable. Violating them produces broken PRs / corrupted state.
### Per-session protocol

1. `jj log --limit 10 && cat harness/progress.json && bash harness/init.sh $(jq -r .current_stage harness/progress.json)`
2. Read the stage contract for the current stage in `docs/spec/plans/development-stages.md`.
2. Read the milestone scope for the current milestone in `docs/plan/milestones-roadmap.md` (the v1/v2 stage framing is archived at `docs/archived/development-stages-v2-2026-04.md`).
3. Pick the HIGHEST-PRIORITY incomplete deliverable from `harness/features.json`.
4. Implement ONE deliverable, run `cargo test -p <crate>`, `jj describe`, update `harness/features.json`, `jj new`.

2 changes: 1 addition & 1 deletion TODOS.md
@@ -144,4 +144,4 @@ milestone. Filing as TODOs to prevent "post-MVP" from becoming "never":
- Stage 5b: agentic fallback + audit trail + fallback→PR + `/agentkeys-record-scraper` skill usage
- Stage 6: npm package + install.sh + README polish + DX docs
- Stage 8: production hardening (daemon memory hygiene + CLI defensive features)
- Pattern 4 (Heima) audit submission infrastructure — see docs/spec/plans/development-stages.md Stage 9
- Pattern 4 (Heima) audit submission infrastructure — see docs/archived/development-stages-v2-2026-04.md Stage 9
12 changes: 6 additions & 6 deletions crates/agentkeys-cli/src/lib.rs
@@ -1281,23 +1281,23 @@ pub async fn cmd_scope(
fn format_provision_error(err: &ProvisionError) -> String {
match err {
ProvisionError::InProgress { active_service } => format!(
"Problem: Another provision is running for {}.\nCause: Provisioner serializes calls per daemon.\nFix: Wait and retry.\nDocs: https://github.com/litentry/agentKeys/blob/main/docs/spec/plans/development-stages.md",
"Problem: Another provision is running for {}.\nCause: Provisioner serializes calls per daemon.\nFix: Wait and retry.\nDocs: https://github.com/litentry/agentKeys/blob/main/docs/archived/development-stages-v2-2026-04.md",
active_service
),
ProvisionError::Tripwire { kind, step, .. } => format!(
"Problem: A script step timed out at '{}'.\nCause: The target site's DOM may have changed (tripwire: {:?}).\nFix: Open an issue at https://github.com/litentry/agentKeys/issues with the logs.\nDocs: https://github.com/litentry/agentKeys/blob/main/docs/spec/plans/development-stages.md",
"Problem: A script step timed out at '{}'.\nCause: The target site's DOM may have changed (tripwire: {:?}).\nFix: Open an issue at https://github.com/litentry/agentKeys/issues with the logs.\nDocs: https://github.com/litentry/agentKeys/blob/main/docs/archived/development-stages-v2-2026-04.md",
step, kind
),
ProvisionError::StoreFailed { obtained_key_masked, .. } => format!(
"Problem: Credential provisioned but storage failed.\nCause: Backend store_credential returned an error.\nFix: Manually store the key with `agentkeys store <service> <key>`. Masked key for reference: {}.\nDocs: https://github.com/litentry/agentKeys/blob/main/docs/spec/plans/development-stages.md",
"Problem: Credential provisioned but storage failed.\nCause: Backend store_credential returned an error.\nFix: Manually store the key with `agentkeys store <service> <key>`. Masked key for reference: {}.\nDocs: https://github.com/litentry/agentKeys/blob/main/docs/archived/development-stages-v2-2026-04.md",
obtained_key_masked
),
ProvisionError::VerificationFailed { service, reason } => format!(
"Problem: Key verification failed for {}.\nCause: {}.\nFix: Re-run with --force to attempt a fresh provision.\nDocs: https://github.com/litentry/agentKeys/blob/main/docs/spec/plans/development-stages.md",
"Problem: Key verification failed for {}.\nCause: {}.\nFix: Re-run with --force to attempt a fresh provision.\nDocs: https://github.com/litentry/agentKeys/blob/main/docs/archived/development-stages-v2-2026-04.md",
service, reason
),
other => format!(
"Problem: Provision failed.\nCause: {}.\nFix: Check logs and retry.\nDocs: https://github.com/litentry/agentKeys/blob/main/docs/spec/plans/development-stages.md",
"Problem: Provision failed.\nCause: {}.\nFix: Check logs and retry.\nDocs: https://github.com/litentry/agentKeys/blob/main/docs/archived/development-stages-v2-2026-04.md",
other
),
}
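Every arm above repeats the same docs URL, so each path move touches six string literals in this file. A minimal sketch of one way to centralize it, assuming a hypothetical `DOCS_URL` constant and a hypothetical `render_error` helper (neither exists in the diff):

```rust
/// Hypothetical constant: the single place to update when the docs move again.
const DOCS_URL: &str =
    "https://github.com/litentry/agentKeys/blob/main/docs/archived/development-stages-v2-2026-04.md";

/// Hypothetical helper: renders the four-line Problem/Cause/Fix/Docs message
/// used by the error strings in this file.
fn render_error(problem: &str, cause: &str, fix: &str) -> String {
    format!(
        "Problem: {}\nCause: {}\nFix: {}\nDocs: {}",
        problem, cause, fix, DOCS_URL
    )
}

fn main() {
    // Mirrors the InProgress arm, with the service name already substituted.
    let msg = render_error(
        "Another provision is running for openrouter.",
        "Provisioner serializes calls per daemon.",
        "Wait and retry.",
    );
    println!("{msg}");
}
```

With a helper like this, each match arm would pass only its own problem, cause, and fix text, and a future docs rename would become a one-line change.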
@@ -1339,7 +1339,7 @@ pub async fn cmd_provision(
],
other => {
return Err(anyhow!(
"Problem: Service '{}' not supported.\nCause: Only 'openrouter' is supported in Stage 5a.\nFix: Use a supported service name.\nDocs: https://github.com/litentry/agentKeys/blob/main/docs/spec/plans/development-stages.md",
"Problem: Service '{}' not supported.\nCause: Only 'openrouter' is supported in Stage 5a.\nFix: Use a supported service name.\nDocs: https://github.com/litentry/agentKeys/blob/main/docs/archived/development-stages-v2-2026-04.md",
other
));
}
2 changes: 1 addition & 1 deletion crates/agentkeys-cli/tests/cli_tests.rs
@@ -1552,7 +1552,7 @@ async fn cli_provision_error_format() {
assert!(result.is_err());
match result.unwrap_err() {
ProvisionError::InProgress { .. } => {
let formatted = "Problem: Another provision is running for openrouter.\nCause: Provisioner serializes calls per daemon.\nFix: Wait and retry.\nDocs: https://github.com/litentry/agentKeys/blob/main/docs/spec/plans/development-stages.md";
let formatted = "Problem: Another provision is running for openrouter.\nCause: Provisioner serializes calls per daemon.\nFix: Wait and retry.\nDocs: https://github.com/litentry/agentKeys/blob/main/docs/archived/development-stages-v2-2026-04.md";
assert!(
formatted.contains("Problem:"),
"missing Problem: in: {formatted}"
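The test in this hunk asserts against a hard-coded copy of the message and checks labels one at a time. A rough sketch of a stricter check, assuming a hypothetical `first_missing_label` helper (not part of the repository), that verifies all four labels are present and appear in order:

```rust
/// Hypothetical helper: returns the first label that is missing or out of
/// order, or `None` when the message follows Problem/Cause/Fix/Docs.
fn first_missing_label(msg: &str) -> Option<&'static str> {
    const LABELS: [&str; 4] = ["Problem:", "Cause:", "Fix:", "Docs:"];
    let mut rest = msg;
    for label in LABELS {
        match rest.find(label) {
            // Continue searching only in the text after this label,
            // which is what enforces the ordering.
            Some(idx) => rest = &rest[idx + label.len()..],
            None => return Some(label),
        }
    }
    None
}

fn main() {
    let formatted = "Problem: Another provision is running for openrouter.\n\
                     Cause: Provisioner serializes calls per daemon.\n\
                     Fix: Wait and retry.\n\
                     Docs: https://github.com/litentry/agentKeys/blob/main/docs/archived/development-stages-v2-2026-04.md";
    assert_eq!(first_missing_label(formatted), None);
}
```

Feeding this helper the output of `format_provision_error` rather than a literal would also exercise the real formatting code; whether that function is reachable from the test crate is an assumption.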
2 changes: 1 addition & 1 deletion docs/arch.md
@@ -2264,7 +2264,7 @@ The full bring-up runbook lives in [`scripts/setup-broker-host.sh`](../scripts/s
- **Per-endpoint request/response shapes.** Each endpoint surface has its own canonical doc — broker endpoints in `plan/v2-issues/issue-v2-stage-1-foundation.md`; signer in `signer-protocol.md`; workers in per-worker READMEs under each crate.
- **Per-step environment-variable inventory.** That's `operator-runbook.md`.
- **Detailed threat model for K3 retroactive confidentiality.** That's `threat-model-key-custody.md`.
- **Stage-by-stage build progression history.** That's `plans/development-stages.md` + `plan/v2-issues/`.
- **Stage-by-stage build progression history.** That's `plan/milestones-roadmap.md` §1 (Phase 0, which links onward to the archived stage plan) + `plan/v2-issues/`.
- **MetaMask / Foundry tooling instructions.** Retired in v2 — operators no longer hold local EVM keys unless they want to (`identity_type = evm` is supported but not required).
- **v3+ hardening** (per-(user, service) KEK, wrap-and-rewrap, ZK-proven cap minting, threshold-MPC signer, per-operator K3) — tracked separately as v3+ issues. v2 ships the design described here.

14 changes: 7 additions & 7 deletions docs/plan/execution-plan.md
@@ -34,7 +34,7 @@ Then create `CLAUDE.md` in the repo root encoding the harness workflow (read pro

**Invoke:**
```
/oh-my-claudecode:ralph "Implement Stage 0 per docs/spec/plans/development-stages.md: create Cargo workspace skeleton (7 crates), harness artifacts (init.sh, progress.json, features.json, stage-0-done.sh), agentkeys-types crate (all types from docs/spec/credential-backend-interface.md), agentkeys-core crate (CredentialBackend trait with 15 methods, PaymentRail trait, canonical CBOR serialization, OTP derivation, test vectors). 8 tests must pass. Tag stage-0-done when done."
/oh-my-claudecode:ralph "Implement Stage 0 per docs/archived/development-stages-v2-2026-04.md: create Cargo workspace skeleton (7 crates), harness artifacts (init.sh, progress.json, features.json, stage-0-done.sh), agentkeys-types crate (all types from docs/spec/credential-backend-interface.md), agentkeys-core crate (CredentialBackend trait with 15 methods, PaymentRail trait, canonical CBOR serialization, OTP derivation, test vectors). 8 tests must pass. Tag stage-0-done when done."
```

**Deliverables:** Cargo workspace compiles, 8 tests pass, harness artifacts exist, `bash harness/stage-0-done.sh` exits 0.
@@ -47,7 +47,7 @@ The largest stage: 37 tests, 10 stories. Ralph loops through them.

**Invoke:**
```
/oh-my-claudecode:ralph "Implement Stage 1 per docs/spec/plans/development-stages.md: agentkeys-mock-server (axum + rusqlite) with 7 SQLite tables, 15 REST endpoints implementing every CredentialBackend method, identity linking, master key custody, TTL/single-use enforcement, MockHttpClient connection. 37 tests must pass. See docs/archived/eng-review-test-plan.md for the full test matrix including property tests (pair-code collision, nonce uniqueness) and integrity tests (tamper detection, OTP replay). Tag stage-1-done when done."
/oh-my-claudecode:ralph "Implement Stage 1 per docs/archived/development-stages-v2-2026-04.md: agentkeys-mock-server (axum + rusqlite) with 7 SQLite tables, 15 REST endpoints implementing every CredentialBackend method, identity linking, master key custody, TTL/single-use enforcement, MockHttpClient connection. 37 tests must pass. See docs/archived/eng-review-test-plan.md for the full test matrix including property tests (pair-code collision, nonce uniqueness) and integrity tests (tamper detection, OTP replay). Tag stage-1-done when done."
```

**Deliverables:** Mock server starts on port 8090, all 37 tests pass, curl smoke test works, `bash harness/stage-1-done.sh` exits 0.
@@ -60,7 +60,7 @@ The one parallelization opportunity. Stages 2 (CLI, 14 tests) and 3 (Daemon+MCP,

**Invoke:**
```
/oh-my-claudecode:team 2:executor "Two parallel stages for AgentKeys. AGENT 1: Implement Stage 2 (CLI Core) per docs/spec/plans/development-stages.md — 10 CLI commands in agentkeys-cli, 14 tests, keyring session storage, error messaging spec, --help with examples. AGENT 2: Implement Stage 3 (Daemon + MCP) per docs/spec/plans/development-stages.md — agentkeys-daemon binary with MCP tools (get_credential, list_credentials), kernel hardening (memfd_secret, seccomp, caps), 13 tests. Use AGENTKEYS_SESSION env var as test seam (NOT the production bootstrap). Both agents: read harness/progress.json first, commit per deliverable, tag stage-N-done when complete."
/oh-my-claudecode:team 2:executor "Two parallel stages for AgentKeys. AGENT 1: Implement Stage 2 (CLI Core) per docs/archived/development-stages-v2-2026-04.md — 10 CLI commands in agentkeys-cli, 14 tests, keyring session storage, error messaging spec, --help with examples. AGENT 2: Implement Stage 3 (Daemon + MCP) per docs/archived/development-stages-v2-2026-04.md — agentkeys-daemon binary with MCP tools (get_credential, list_credentials), kernel hardening (memfd_secret, seccomp, caps), 13 tests. Use AGENTKEYS_SESSION env var as test seam (NOT the production bootstrap). Both agents: read harness/progress.json first, commit per deliverable, tag stage-N-done when complete."
```

**Deliverables:** Both `stage-2-done.sh` and `stage-3-done.sh` exit 0. `cargo test --workspace` passes all 72 tests (8+37+14+13).
@@ -73,7 +73,7 @@ The cross-component integration stage. Modifies both daemon (pair-on-startup) an

**Invoke:**
```
/oh-my-claudecode:ralph "Implement Stage 4 per docs/spec/plans/development-stages.md: child-initiates rendezvous pairing (daemon generates keypair → open_auth_request → register_rendezvous → display pair code → long-poll), CLI approve command (fetch_auth_request → display OTP → user confirms → approve_auth_request), recovery flow (--recover with AgentIdentity resolution via identity graph). 11 tests must pass. Two-terminal pair E2E must work. Tag stage-4-done."
/oh-my-claudecode:ralph "Implement Stage 4 per docs/archived/development-stages-v2-2026-04.md: child-initiates rendezvous pairing (daemon generates keypair → open_auth_request → register_rendezvous → display pair code → long-poll), CLI approve command (fetch_auth_request → display OTP → user confirms → approve_auth_request), recovery flow (--recover with AgentIdentity resolution via identity graph). 11 tests must pass. Two-terminal pair E2E must work. Tag stage-4-done."
```

**Deliverables:** Pair flow works across two terminals, recovery preserves credentials, 11 tests pass.
@@ -86,7 +86,7 @@ Mixed Rust+TypeScript stage. Playwright browser automation for OpenRouter signup

**Invoke:**
```
/oh-my-claudecode:ralph "Implement Stage 5 per docs/spec/plans/development-stages.md: agentkeys-provisioner Rust orchestrator (spawn TS subprocess, IPC via stdin/stdout JSON, encrypt API key to shielding key, store_credential), provisioner-scripts/lib/email.ts (Gmail IMAP plus-addressing for verification codes), provisioner-scripts/scrapers/openrouter.ts (Playwright signup flow using email.ts). MCP tool: agentkeys.provision(service). 9 tests must pass. Tag stage-5-done."
/oh-my-claudecode:ralph "Implement Stage 5 per docs/archived/development-stages-v2-2026-04.md: agentkeys-provisioner Rust orchestrator (spawn TS subprocess, IPC via stdin/stdout JSON, encrypt API key to shielding key, store_credential), provisioner-scripts/lib/email.ts (Gmail IMAP plus-addressing for verification codes), provisioner-scripts/scrapers/openrouter.ts (Playwright signup flow using email.ts). MCP tool: agentkeys.provision(service). 9 tests must pass. Tag stage-5-done."
```

**Deliverables:** Orchestrator IPC tests pass, email client tests pass, live OpenRouter provision works (manual verification by human).
@@ -99,7 +99,7 @@ Packaging and documentation polish.

**Invoke:**
```
/oh-my-claudecode:ralph "Implement Stage 6 per docs/spec/plans/development-stages.md: @agentkeys/daemon npm package with postinstall binary selection (linux-x64, linux-arm64, darwin-x64, darwin-arm64), install.sh curl script, README with quickstart, docs/how-it-works.md, docs/security-model.md, CHANGELOG, LICENSE (MIT OR Apache-2.0), per-subcommand --help with examples. 7 tests must pass. Tag stage-6-done."
/oh-my-claudecode:ralph "Implement Stage 6 per docs/archived/development-stages-v2-2026-04.md: @agentkeys/daemon npm package with postinstall binary selection (linux-x64, linux-arm64, darwin-x64, darwin-arm64), install.sh curl script, README with quickstart, docs/how-it-works.md, docs/security-model.md, CHANGELOG, LICENSE (MIT OR Apache-2.0), per-subcommand --help with examples. 7 tests must pass. Tag stage-6-done."
```

**Advance:** `bash harness/advance-stage.sh 6 7`
@@ -133,7 +133,7 @@ bash harness/advance-stage.sh N N+1 # advance to next stage
|---|---|
| Ralph session dies mid-story | Re-invoke `/ralph` with same PRD. Ralph reads progress.json + git log and resumes from last completed story. |
| One team agent fails, other succeeds | Invoke `/ralph` individually for the failed stage. The successful stage's work is already committed. |
| Test seems like a spec bug | `credential-backend-interface.md` > `development-stages.md` > `eng-review-test-plan.md` (priority order). Fix spec if genuinely wrong, then re-run. |
| Test seems like a spec bug | `credential-backend-interface.md` > `development-stages-v2-2026-04.md` > `eng-review-test-plan.md` (priority order). Fix spec if genuinely wrong, then re-run. |
| Playwright breaks on live site | Update selectors in `openrouter.ts`. Rust IPC tests still pass (mock subprocess). |
| Stage 7 E2E keeps failing after 5 ultraqa cycles | Human diagnoses root cause. Likely a cross-component integration issue that needs manual architectural judgment. |

4 changes: 2 additions & 2 deletions docs/spec/broker-and-operator-dev-guide.md
@@ -195,7 +195,7 @@ When you don't want to talk to Heima at all, run [foundry](https://book.getfound
anvil --chain-id 31337 --port 8545
```

Then `AGENTKEYS_CHAIN=anvil` in your operator env makes every `cast send` hit anvil instead of Heima. The deployer wallet is whichever anvil-prefunded key you point at via `HEIMA_DEPLOYER_KEY` / `HEIMA_DEPLOYER_KEY_FILE`. Anvil's mempool is single-tenant — none of the [PR #102 nonce-contention issues](./plans/issue-101-ci-auto-deploy.md) bite locally.
Then `AGENTKEYS_CHAIN=anvil` in your operator env makes every `cast send` hit anvil instead of Heima. The deployer wallet is whichever anvil-prefunded key you point at via `HEIMA_DEPLOYER_KEY` / `HEIMA_DEPLOYER_KEY_FILE`. Anvil's mempool is single-tenant — none of the [PR #102 nonce-contention issues](../ci-setup.md) bite locally.

### 4.4 Editing `setup-broker-host.sh`

@@ -333,4 +333,4 @@ Switch with `--chain` on any harness script. Contract addresses for `heima` and
- [`docs/ci-setup.md`](../ci-setup.md) — no-LLM CI + auto-deploy of test broker (issue #101 / PR #102).
- [`docs/spec/signer-protocol.md`](./signer-protocol.md) — wire contract for the signer (TEE swap-in target).
- [`docs/spec/credential-backend-interface.md`](./credential-backend-interface.md) — the `CredentialBackend` trait; what the broker's storage plug-ins must implement.
- [`docs/spec/plans/development-stages.md`](./plans/development-stages.md) — the staged build plan + harness gates.
- [`docs/archived/development-stages-v2-2026-04.md`](../archived/development-stages-v2-2026-04.md) — the staged build plan + harness gates.
2 changes: 1 addition & 1 deletion docs/spec/heima-gaps-vs-desired-architecture.md
@@ -24,7 +24,7 @@ Related docs:
- [`plans/issue-74-step-1c-device-key-auth.md`](../plan/issue-74-step-1c-device-key-auth.md) — device-key auth on `/dev/*`, planned.
- [`docs/wiki/blockchain-tee-architecture.md`](../wiki/blockchain-tee-architecture.md) — canonical desired architecture (four rules).
- [`docs/wiki/key-security.md`](../wiki/key-security.md) — TEE key security model.
- [`plans/development-stages.md`](./plans/development-stages.md) — stage roadmap; this gap list is the critical path for Stage 6 and Stage 7.
- [`development-stages-v2-2026-04.md`](../archived/development-stages-v2-2026-04.md) — stage roadmap (archived); this gap list is the critical path for Stage 6 and Stage 7.
- [`ses-email-architecture.md`](./ses-email-architecture.md) — Stage 6 email spec; depends on gaps §2, §3, §5.

## 1a. Status snapshot (added 2026-05-09)