
feat: outcome tracking + semantic search + read-before-write #4

Merged: mabry1985 merged 1 commit into main from feat/outcomes-and-semantic-search on May 17, 2026
mabry1985 commented May 17, 2026

Summary

Three additive expansions driven by deep research into the broader DR/ADR ecosystem:

  1. Outcome tracking — close the feedback loop between an accepted decision and what was actually observed in practice. Outcomes are first-class entities at dr/outcomes/, gated to post-handoff. Five new MCP tools (dr_record_outcome, dr_set_outcome_status, dr_update_outcome, dr_list_outcomes, dr_get_outcome).
  2. Semantic search — embeddings cache at .dr/cache/embeddings.json, populated on every dr_accept_decision. dr_search_decisions returns semantic results when the cache is populated and falls back to substring matching when it is not. dr_reindex_embeddings backfills or rebuilds the cache. The OPENAI_EMBEDDING_MODEL env var selects the embedding model; setting it to "none" disables embeddings.
  3. Read-before-write — the deciding agent now retrieves prior accepted decisions before proposing a new one. Hits ≥ 0.85 either suppress the new DR or get cited via related_decisions. Operationalizes what the AgenticAKM literature names but doesn't ship.
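
The OPENAI_EMBEDDING_MODEL behavior described in item 2 could be resolved roughly as below. This is a minimal sketch, not the code in server/src/embeddings/client.ts: the function name, the return shape, and the default model are all assumptions, since the PR does not state which model applies when the variable is unset.

```typescript
// Hypothetical shape: the PR does not show the real config type.
type EmbeddingConfig =
  | { enabled: false }
  | { enabled: true; model: string };

// Assumed default; the PR does not name the model used when the variable is unset.
const ASSUMED_DEFAULT_MODEL = "text-embedding-3-small";

// Takes the environment as a parameter so it can be tested without touching process.env.
function resolveEmbeddingConfig(
  env: Record<string, string | undefined>,
): EmbeddingConfig {
  const raw = env["OPENAI_EMBEDDING_MODEL"]?.trim();

  // "none" disables embeddings; search then falls back to substring matching.
  if (raw === "none") {
    return { enabled: false };
  }
  // Unset or empty: fall back to the assumed default model.
  if (raw === undefined || raw === "") {
    return { enabled: true, model: ASSUMED_DEFAULT_MODEL };
  }
  // Any other value is treated as a model name override.
  return { enabled: true, model: raw };
}
```

In a Node process this would typically be called as resolveEmbeddingConfig(process.env).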

What's missing in the broader ecosystem (and why we built this now)

  • No tool we surveyed (adr-tools, log4brains, Backstage TechDocs, MCP ADR Analysis Server, ...) ships outcome tracking. Most leave "Consequences" as a write-once field at decision time.
  • Backstage uses full-text search only; semantic similarity over DRs isn't standard anywhere.
  • The agentic-AKM literature (AgenticAKM, AgDR, Pollick's "ADR Comeback") names read-before-write as the obvious next step but no tool operationalizes it.

Test plan

  • npm run typecheck (clean)
  • npm run test:unit — 77 tests pass (was 48; +29 new in unit-outcomes, unit-embeddings)
  • npm run test:flow — 11 tests pass (was 2; +9 new in flow-outcomes and flow-search, plus one added assertion: the happy-path test now checks the embeddings_indexed event)
  • npm run build — both bundles build
  • CLI --help mentions OPENAI_EMBEDDING_MODEL
  • CI passes
  • End-to-end smoke test with a real OPENAI_API_KEY (deferred; flow tests cover the deterministic path)

Files

  • server/src/embeddings/ — new module (client, text, indexer)
  • server/src/tools/outcomes.ts, server/src/tools/search.ts — new tool surfaces
  • server/src/schemas/index.ts: OutcomeSchema, EmbeddingCacheSchema, new event kinds, next_outcome_seq
  • schemas/outcome.schema.json + updated event.schema.json, state.schema.json
  • docs/how-to/track-outcomes.md, docs/how-to/search-decisions.md, docs/explanation/research-notes.md
  • server/src/cli/agents/deciding.ts — read-before-write step inserted into the prompt

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added outcome tracking for post-handoff observations tied to decisions.
    • Implemented semantic search for decisions using embeddings with substring fallback.
    • Added "read-before-write" search phase during decision creation to reference similar existing decisions.
    • Updated rendering to include outcomes in generated artifacts.
  • Documentation

    • Added how-to guides for tracking outcomes and searching decisions.
    • Documented new outcome and search tools in MCP tools reference.
    • Updated data model documentation with the new Outcome entity.
  • Tests

    • Added comprehensive tests for outcome lifecycle and search functionality.


Three additive expansions driven by deep research into the broader DR/ADR
ecosystem (see docs/explanation/research-notes.md):

1. Outcome tracking — close the feedback loop between an accepted decision
   and what was actually observed in practice. Outcomes are first-class
   entities at dr/outcomes/, with status pending|validated|invalidated|
   inconclusive, optional metric and evidence. Five new MCP tools
   (dr_record_outcome / set_outcome_status / update_outcome / list / get)
   gated to post-handoff only.

2. Semantic search — embeddings cache at .dr/cache/embeddings.json,
   populated on every dr_accept_decision (and on dr_update_decision when
   the result is accepted). dr_search_decisions returns semantic results
   when the cache is populated, substring fallback when not, "empty" when
   no decisions match the status filter. dr_reindex_embeddings backfills
   or rebuilds. OPENAI_EMBEDDING_MODEL env var; "none" disables.

3. Read-before-write — the deciding agent now retrieves prior accepted
   decisions before proposing a new one. The prompt mandates a
   dr_search_decisions call per prospective topic; hits ≥ 0.85 either
   suppress the new DR or get cited via related_decisions.
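
As an illustration of item 1, the outcome record and its post-handoff gate might look like the sketch below. The status values and the field names id, decision_id, observation, metric, and evidence come from this PR; the field types, the RecordContext shape, the function name, and the "pending" default are assumptions.

```typescript
type OutcomeStatus = "pending" | "validated" | "invalidated" | "inconclusive";

interface Outcome {
  id: string;
  decision_id: string;
  status: OutcomeStatus;
  observation: string;
  metric?: string;   // assumed to be free text
  evidence?: string; // assumed to be free text
}

// Hypothetical view of the pipeline state needed for the gate.
interface RecordContext {
  projectHandedOff: boolean;
  decisionStatus: string;
}

type NewOutcome = Omit<Outcome, "status"> & { status?: OutcomeStatus };

function recordOutcome(ctx: RecordContext, input: NewOutcome): Outcome {
  // Outcomes are gated to post-handoff only.
  if (!ctx.projectHandedOff) {
    throw new Error("outcomes can only be recorded after handoff");
  }
  // An outcome describes what happened after a decision was accepted.
  if (ctx.decisionStatus !== "accepted") {
    throw new Error("outcomes can only be recorded against accepted decisions");
  }
  // Assumption: a new outcome starts as "pending" unless a status is given.
  return { ...input, status: input.status ?? "pending" };
}
```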

Tests: 77 unit + 11 flow = 88 green. New JSON schema mirror for outcomes
plus event/state schema extensions. Full docs in Diátaxis style: two new
how-tos, a research-notes explanation, plus updates to data-model, cli,
and mcp-tools references.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
mabry1985 merged commit 1687b08 into main on May 17, 2026
2 of 3 checks passed

coderabbitai Bot commented May 17, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 923511d0-9a70-4018-96c2-6830f719c981

📥 Commits

Reviewing files that changed from the base of the PR and between 620986c and af9917c.

📒 Files selected for processing (34)
  • docs/README.md
  • docs/explanation/research-notes.md
  • docs/how-to/search-decisions.md
  • docs/how-to/track-outcomes.md
  • docs/reference/cli.md
  • docs/reference/data-model.md
  • docs/reference/mcp-tools.md
  • schemas/README.md
  • schemas/event.schema.json
  • schemas/outcome.schema.json
  • schemas/state.schema.json
  • server/src/cli/agents/deciding.ts
  • server/src/cli/index.ts
  • server/src/embeddings/client.ts
  • server/src/embeddings/index.ts
  • server/src/embeddings/text.ts
  • server/src/render/html.ts
  • server/src/render/markdown.ts
  • server/src/schemas/index.ts
  • server/src/storage/paths.ts
  • server/src/storage/store.ts
  • server/src/tools/decisions.ts
  • server/src/tools/index.ts
  • server/src/tools/outcomes.ts
  • server/src/tools/pipeline.ts
  • server/src/tools/render.ts
  • server/src/tools/search.ts
  • server/src/util.ts
  • server/tests/flow-outcomes.test.ts
  • server/tests/flow-poc-pipeline.test.ts
  • server/tests/flow-search.test.ts
  • server/tests/helpers/mock-openai.ts
  • server/tests/unit-embeddings.test.ts
  • server/tests/unit-outcomes.test.ts

Walkthrough

This PR introduces three major interconnected capabilities: post-handoff outcome tracking via a new first-class entity with CRUD tools, semantic decision search using embeddings with substring fallback, and read-before-write enforcement for the deciding agent to reuse existing decisions. The changes expand the pipeline from five to six entity types and add an embedding cache layer for decision vectors, with comprehensive documentation, test coverage, and integration into existing rendering and agent workflows.
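
The fallback behavior of dr_search_decisions can be pictured as a small mode selector. This is a sketch under assumptions: the three modes and the warning-on-fallback behavior are described in this PR, while the function name, the input shape, and the warning texts are hypothetical.

```typescript
type SearchMode = "semantic" | "substring" | "empty";

// Hypothetical summary of what the search tool knows before ranking.
interface ModeInput {
  candidateCount: number;    // decisions matching the status filter
  embeddingsEnabled: boolean; // false when OPENAI_EMBEDDING_MODEL is "none"
  cachedVectorCount: number;  // entries in .dr/cache/embeddings.json
}

function chooseSearchMode(input: ModeInput): { mode: SearchMode; warnings: string[] } {
  // Nothing matches the status filter: report "empty" rather than searching.
  if (input.candidateCount === 0) {
    return { mode: "empty", warnings: [] };
  }
  // Embeddings disabled: substring fallback, with a warning.
  if (!input.embeddingsEnabled) {
    return { mode: "substring", warnings: ["embeddings disabled; using substring search"] };
  }
  // Cache missing or empty: substring fallback, with a hint to reindex.
  if (input.cachedVectorCount === 0) {
    return {
      mode: "substring",
      warnings: ["embeddings cache is empty; run dr_reindex_embeddings"],
    };
  }
  return { mode: "semantic", warnings: [] };
}
```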

Changes

Outcome Tracking and Decision Search

| Layer / File(s) | Summary |
| --- | --- |
| Outcome Entity & Storage Foundation: schemas/outcome.schema.json, schemas/state.schema.json, schemas/event.schema.json, server/src/schemas/index.ts, server/src/storage/paths.ts, server/src/storage/store.ts | New Outcome JSON schema with id/decision_id/status/observation fields; extended PipelineState with next_outcome_seq counter; new event kinds for outcome lifecycle and embeddings indexing; Zod schemas for outcomes and embedding cache; storage paths and Store CRUD methods for outcome records, markdown files, and embeddings cache. |
| Embeddings Infrastructure: server/src/embeddings/client.ts, server/src/embeddings/text.ts, server/src/embeddings/index.ts, server/src/tools/decisions.ts | OpenAI client configuration with environment variable override and disabling support; text composition from decision fields and SHA-256 hashing; indexDecision function that builds cache entries with timestamp; cosine similarity computation; hooks into dr_accept_decision and dr_update_decision to trigger indexing on acceptance. |
| Decision Search & Reindex Tools: server/src/tools/search.ts, server/src/tools/index.ts | dr_search_decisions accepts query, limit, min_score, status filter; performs semantic ranking when embeddings cache exists, falls back to substring search with warnings when cache missing or embeddings disabled; dr_reindex_embeddings re-embeds all accepted decisions, handles cache initialization and model compatibility; both tools registered via registerSearchTools(). |
| Outcome Management Tools & Rendering: server/src/tools/outcomes.ts, server/src/render/html.ts, server/src/render/markdown.ts, server/src/util.ts | Five outcome tools: dr_record_outcome (with project handed-off + decision accepted invariants), dr_set_outcome_status (with no-op detection), dr_update_outcome (patches fields, tracks changed fields in event), dr_list_outcomes (optional filters), dr_get_outcome; HTML rendering with outcome status pills and decision cross-links; Markdown rendering per outcome with optional decision link; outcomeId utility for O####-slug formatting. |
| Deciding Agent Read-Before-Write: server/src/cli/agents/deciding.ts, server/src/tools/render.ts | Agent prompt updated to search for related decisions (score ≥0.85 threshold) before proposing new decisions, with guidance to cite/suppress/relate based on hits; render tool loads outcomes, passes to decision/project markdown, writes per-outcome markdown files, includes outcomes counts in HTML header and project summary. |
| Documentation & References: docs/README.md, docs/explanation/research-notes.md, docs/how-to/track-outcomes.md, docs/how-to/search-decisions.md, docs/reference/data-model.md, docs/reference/mcp-tools.md, docs/reference/cli.md, schemas/README.md | Research notes document explains DR/ADR discipline, outcome tracking rationale, semantic search with fallback, agent enforcement, and roadmap; how-to guides cover outcome recording workflow and search tool behavior; data model docs define entity count (5→6), filesystem layout (dr/outcomes/, .dr/cache/embeddings.json), schema tables, and event/ID conventions; MCP tools docs list all five outcome tools and search tools with inputs/outputs; CLI reference adds OPENAI_EMBEDDING_MODEL environment variable. |
| Test Infrastructure & Comprehensive Suites: server/tests/helpers/mock-openai.ts, server/tests/unit-embeddings.test.ts, server/tests/unit-outcomes.test.ts, server/tests/flow-outcomes.test.ts, server/tests/flow-search.test.ts, server/tests/flow-poc-pipeline.test.ts | Mock OpenAI extended with MockOpenAIOptions.embeddingsFor for deterministic test embeddings; unit tests cover cosine similarity edge cases, text composition formatting, config resolution, and indexing with cache reuse; outcome schema/storage tests validate schemas and Store CRUD; integration tests verify outcome lifecycle (record/update/status/list/get), search fallback behavior, semantic ranking, reindexing, and embedding indexing on decision acceptance. |
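
For readers unfamiliar with the ranking step, cosine similarity over cached vectors can be sketched as follows. This is illustrative only: the function names, the cache shape (decision id to vector), and the choice to return 0 for mismatched or zero-length vectors are assumptions, not the code in server/src/embeddings/index.ts.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|).
// Assumption: mismatched lengths, empty vectors, and zero vectors score 0.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) return 0;
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

interface Hit {
  id: string;
  score: number;
}

// Rank cached decision vectors against a query vector, keeping only hits at
// or above minScore and returning at most `limit` of them, best first.
function rankBySimilarity(
  query: number[],
  cache: Record<string, number[]>,
  minScore = 0,
  limit = 10,
): Hit[] {
  return Object.entries(cache)
    .map(([id, vector]) => ({ id, score: cosineSimilarity(query, vector) }))
    .filter((hit) => hit.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

The minScore and limit parameters mirror the min_score and limit inputs that dr_search_decisions accepts; their defaults here are placeholders.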

Sequence Diagram(s)

sequenceDiagram
  participant Agent as Deciding Agent
  participant Search as dr_search_decisions
  participant Propose as dr_propose_decision
  participant Store as Store
  participant Embeddings as embeddings cache
  Agent->>Search: query topic (read-before-write)
  Search->>Store: load decisions by status
  Search->>Embeddings: retrieve cached vectors
  Embeddings-->>Search: vectors or empty
  alt embeddings available and non-empty
    Search->>Search: compute query embedding
    Search->>Search: cosine similarity ranking
    Search->>Search: filter by min_score ≥0.85
  else fallback
    Search->>Search: substring search
    Search-->>Agent: warnings included
  end
  Search-->>Agent: results with mode/scores
  alt score ≥0.85 match found
    Agent->>Agent: cite/suppress/relate decision
  else no match or low score
    Agent->>Propose: proceed with new decision
    Propose->>Store: create decision
    Propose->>Store: emit decision_proposed
  end
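
The branch at the end of the diagram amounts to a threshold check on search hits. In this PR the rule lives in the deciding agent's prompt rather than in code, so the sketch below is only a restatement of that rule; the type and function names are hypothetical, and the choice between citing and suppressing is left to the agent.

```typescript
// Threshold named in the PR: hits at or above 0.85 count as strong matches.
const READ_BEFORE_WRITE_THRESHOLD = 0.85;

interface SearchHit {
  id: string;
  score: number;
}

type Plan =
  | { action: "propose" }
  | { action: "cite_or_suppress"; related_decisions: string[] };

function planFromHits(
  hits: SearchHit[],
  threshold: number = READ_BEFORE_WRITE_THRESHOLD,
): Plan {
  // Collect the ids of prior decisions that match strongly enough.
  const strong = hits.filter((hit) => hit.score >= threshold).map((hit) => hit.id);

  // No strong match: the agent proceeds with a new decision.
  if (strong.length === 0) {
    return { action: "propose" };
  }
  // Strong matches: either drop the new DR or cite these via related_decisions.
  return { action: "cite_or_suppress", related_decisions: strong };
}
```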

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes
