docs(examples): add public-sector risk-register assurance worked example #45

Open
antnewman wants to merge 2 commits into SingularityAI-Dev:main from antnewman:examples/pda-risk-register

@antnewman
Contributor

@antnewman antnewman commented May 5, 2026

Closes #18 — the regulated-domain worked example you flagged for the v1.4 ship-honest window. Opening as a draft so you can review the shape before I do any final polish; happy to take feedback on any axis.

What this is

A LOGIC.md spec at `examples/pda-risk-register-assurance.logic.md` modelling an eight-step assurance workflow for processing a UK-government project delivery artefact (periodic report, business case, PRINCE2 product description). Output: a triaged risk register written to an assurance store with a full provenance trail.

The four feature stress-tests, bundled

Per your scoping comment on #18:

| Feature | Where it lives | Note |
| --- | --- | --- |
| Rubric-based self-verification | `quality_gates.self_verification` (5 weighted criteria, `minimum_score: 0.75`) | First example/fixture exercising rubric strategy |
| Write-segregated per-step `allowed_tools` | All 8 steps declare both `allowed_tools` and `denied_tools` where load-bearing | 5 tool families (read-only doc, schema-validator, LLM+search, read-only store, human-review, write-only store) |
| `human_review_gate` as first-class step | `steps.human_review_gate` with `allowed_tools: [human_review_request]`, LLM denied, `verification.on_fail: abort` | Not a flag on an LLM step, not a fallback action: a step in the DAG |
| Provenance metadata in contracts | Every step's `output_schema` includes `provenance` (step_chain, timestamps, source_artefact_id, reviewer_id, confidence); the terminal `assurance_record` collects it | Typed first-class output, not afterthought |
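
To make the rubric row above concrete, here is a minimal Python sketch of how five weighted criteria could be checked against `minimum_score: 0.75`. It assumes the score is a weighted mean of per-criterion scores in [0, 1]; the PR does not state the aggregation rule, and the criterion names, weights, and scores below are hypothetical stand-ins, not the ones in the example file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str      # hypothetical criterion name
    weight: float  # relative weight, any positive number
    score: float   # assessed score in [0, 1]


def rubric_score(criteria):
    """Weighted mean of criterion scores (assumed aggregation rule)."""
    total_weight = sum(c.weight for c in criteria)
    if total_weight <= 0:
        raise ValueError("rubric needs a positive total weight")
    return sum(c.weight * c.score for c in criteria) / total_weight


def passes(criteria, minimum_score=0.75):
    """True when the aggregated score meets the configured floor."""
    return rubric_score(criteria) >= minimum_score


# Five illustrative criteria; names and numbers are made up for this sketch.
criteria = [
    Criterion("provenance_complete", 3, 1.0),
    Criterion("risks_cite_source", 2, 0.5),
    Criterion("triage_justified", 2, 1.0),
    Criterion("review_decision_recorded", 2, 0.5),
    Criterion("report_matches_store", 1, 1.0),
]

print(rubric_score(criteria), passes(criteria))
```

Integer weights keep the arithmetic exact here; a real runtime may normalise weights differently.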

The 8 steps

`ingest_artefact → validate_artefact → extract_risks → outlier_scan → triage → human_review_gate → persist → emit_report`

Each step has `description`, `instructions`, scoped `allowed_tools`/`denied_tools`, and a typed `output_schema`. `extract_risks` and `triage` carry confidence thresholds; `human_review_gate` carries a 72h timeout with abort-on-bypass verification; `persist` carries a retry on storage-write failure.
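
The per-step scoping described above can be pictured as a simple gate in front of every tool call. The following Python sketch is illustrative only: it assumes deny takes precedence over allow and that anything not explicitly allowed is refused, which is one reasonable reading rather than documented LOGIC.md behaviour. The tool name `llm` and the `persist` tool lists are assumptions; `human_review_request` and `assurance_store_write` come from this PR.

```python
def tool_permitted(step, tool):
    """Deny wins over allow; tools that are not listed are refused (assumed policy)."""
    if tool in step.get("denied_tools", []):
        return False
    return tool in step.get("allowed_tools", [])


# Two of the eight steps, reduced to their tool scoping.
# "llm" is a hypothetical tool name; persist's lists are assumed for illustration.
STEPS = {
    "human_review_gate": {
        "allowed_tools": ["human_review_request"],
        "denied_tools": ["llm", "assurance_store_write"],
    },
    "persist": {
        "allowed_tools": ["assurance_store_write"],
        "denied_tools": ["llm"],
    },
}

for step_name, tool in [
    ("human_review_gate", "human_review_request"),
    ("human_review_gate", "assurance_store_write"),
    ("persist", "assurance_store_write"),
]:
    print(step_name, tool, tool_permitted(STEPS[step_name], tool))
```

Under this policy the review step cannot write to the store and the write step cannot call a model, which is the segregation the example is meant to exercise.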

Verification

  • `validate(file)` returns `ok: true`
  • `node spec/fixtures/run-fixtures.mjs` → 29 passed, 0 failed (no regression)
  • `npx biome check` clean

Spec edges encountered

The markdown narrative ends with four observations from drafting that may be worth considering for spec evolution. None block this example's correctness against v1.0:

  1. No workflow-local `$ref` in `output_schema` — the `provenance` shape is repeated by convention; a small spec-level `definitions:` block addressable via `$ref` would help.
  2. `human_review_gate` is implicit in v1.0 — modelled via tools and verification rather than a first-class `kind: human_gate` step type.
  3. Tool-name registry is implicit — names like `assurance_store_write` and `human_review_request` are conventional in this example; no canonical taxonomy.
  4. `steps.<name>.completed` is used in a quality-gate invariant — the expression engine accepts it but the spec text does not document it as a stable variable. Worth confirming.

Happy to file each of these as a separate follow-up issue if any are worth addressing — explicitly not asking for spec changes inside this PR.

Why draft

Three reasons I've opened this as a draft rather than ready-for-review:

  1. Tool names: `assurance_store_write`, `human_review_request`, `document_reader`, `schema_validator`, `report_renderer` etc. are illustrative. If the project has any preference about a standard tool taxonomy in examples, easy to rename.
  2. Provenance shape: I've inlined the `provenance` object on each step's `output_schema` rather than defining once. Pragmatic given v1.0 doesn't support `$ref`. If you'd prefer a different convention (e.g. metadata on the workflow rather than per-step), happy to refactor.
  3. Narrative depth: ~250 lines of markdown narrative below the YAML. Longer than `research-synthesizer`'s 23-line narrative because the audit-context features benefit from explanation. Easy to trim if too long.

I'll mark this ready-for-review on your nod, or push fixes first if you'd rather direct changes.


Summary by cubic

Adds a public-sector risk-register assurance worked example: an eight-step LOGIC.md workflow that ingests a project artefact, extracts and triages risks, gates via human review, and persists with full provenance. Also adds a deterministic risk_id to keep per-risk correlation consistent across steps.

  • New Features

    • Adds examples/pda-risk-register-assurance.logic.md (eight-step plan-execute workflow).
    • First-class human_review_gate with allowed_tools: [human_review_request], 72h timeout, and abort-on-bypass verification.
    • Per-step allowed_tools/denied_tools across tool families (read-only doc, schema-validator, LLM+search, read-only store, human-review, write-only store).
    • Provenance in every step output, collected into final assurance_record.
    • Rubric-based self-verification with five criteria (minimum_score: 0.75) and quality gates (provenance intact, decision recorded, confidence floor).
    • Invariant ensures persist only runs after human_review_gate completes.
  • Bug Fixes

    • Added deterministic risk_id to extract_risks output (12-hex pattern), used as the join key across outlier_scan, triage, human_review_gate, and emit_report.

Written for commit 4ef40fb. Summary will update on new commits.

Summary by CodeRabbit

  • New Features
    • Added a comprehensive eight-step audit-trail-preserving assurance workflow for risk register extraction, scoring, triaging, and reporting.
    • Includes human review gating for approval, artifact validation with escalation, outlier scanning, and structured verification rubrics.
    • Enforces tool access constraints and provides quality gates with strict validation modes.

@coderabbitai

coderabbitai Bot commented May 5, 2026

Walkthrough

Introduces a complete eight-step assurance workflow specification (pda-risk-register-assurance) for audit-trail-preserving risk extraction, validation, triage, and persistence with structured tool segregation, human review gates, quality-gate validation, rubric-based self-verification, and explicit fallback escalation logic.

Changes

Risk Register Assurance Workflow Example

| Layer / File(s) | Summary |
| --- | --- |
| Workflow Structure & Specification DAG (`examples/pda-risk-register-assurance.logic.md`) | Defines eight-step DAG (`ingest_artefact → validate_artefact → extract_risks → outlier_scan → triage → human_review_gate → persist → emit_report`) with global metadata, per-step gating rules (deterministic/LLM/human), tool access constraints (read-only, write-only, LLM-denied), confidence/retry/escalation controls, and typed output schemas including provenance requirements. |
| Validation, Quality Gates & Self-Verification (`examples/pda-risk-register-assurance.logic.md`) | Adds workflow contracts (input/output schemas), strict validation and output handling behavior, capability declarations, quality gates with pre/post output invariants, and an enabled rubric-based self-verification block with weighted criteria and minimum score threshold. |
| Fallback & Escalation Strategy (`examples/pda-risk-register-assurance.logic.md`) | Structured fallback strategy defining retry/escalate/abort triggers when confidence thresholds are violated or attempt limits are reached. |
| Documentation & Implementation Rationale (`examples/pda-risk-register-assurance.logic.md`) | Companion narrative describing workflow purpose, tool-segregation pattern, rationale for human review gate as a first-class step, provenance flow conventions, failure modes and fallbacks, self-verification rubric design, enumerated spec edge cases, and licensing attribution. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

A logic-md garden grows today,
Eight steps in an audit display,
With gates that guard and tools that sway,
And humans first—come what may! 🐰✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding a public-sector risk-register assurance worked example to the examples directory. |
| Linked Issues check | ✅ Passed | The PR successfully meets all coding objectives from issue #18: provides a canonical public-sector LOGIC.md example, exercises four feature stress-tests (rubric-based self-verification, tool segregation, human_review_gate as first-class step, provenance metadata), and delivers a non-trivial 8-step workflow spec. |
| Out of Scope Changes check | ✅ Passed | All changes are in-scope: only adds examples/pda-risk-register-assurance.logic.md. No spec changes, no breaking changes, no unrelated modifications to existing files or core functionality. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Description check | ✅ Passed | The PR description is thorough and comprehensive, covering purpose, features, steps, verification, and spec observations, though some required checklist items are unchecked. |


@SingleSourceStudios
Collaborator

SingleSourceStudios commented May 7, 2026

Scope looks right. All four feature stress-tests landed where requested, and the human_review_gate-as-a-step pattern is exactly the shape I was hoping to see exercised.

On the three "why draft" items:

  1. Tool names: leaving them illustrative is fine. We don't have a canonical taxonomy yet and pre-baking one based on a single example would be premature. Anyone reading the example will understand they're stand-ins.

  2. Provenance shape inlined per-step: correct call given v1.0 has no $ref. When we add it (one of your spec edges below), we can refactor this example as the migration test case.

  3. Narrative depth: keep it. The audit-context features are exactly the kind of thing that benefits from explanation, and "research-synthesizer was 23 lines" isn't a ceiling we should hold worked examples to. Different examples carry different teaching loads.

On the four spec edges you flagged: all four are real, all four are worth tracking, none belong in this PR. If you're up for it, file them as separate issues post-merge and I'll triage. Specifically:

  • Workflow-local $ref in output_schema is the most leveraged one (would clean up half a dozen examples, not just this one).
  • First-class kind: human_gate is worth a design discussion before code; the implicit pattern you've used here is fine for v1.0.
  • Tool-name registry needs a pre-design RFC, not a spec PR.
  • steps.<name>.completed in the expression engine is a doc bug; small.

Mark ready-for-review when you're done polishing. Squash-merge target. Closes #18.

Closes SingularityAI-Dev#18.

Eight-step plan-execute workflow for processing a project delivery
artefact through extraction, outlier scan, triage, human review, and
persistence with full provenance. Designed to read as a regulated-domain
reference example.

The four feature stress-tests requested on SingularityAI-Dev#18 are deliberately bundled:

- Rubric-based self-verification (5 weighted criteria, minimum_score 0.75)
  on quality_gates.self_verification — not previously exercised by an
  example or fixture
- Write-segregated per-step allowed_tools across five tool families
  (read-only document, schema-validator, LLM+search, read-only store,
  human-review, write-only store) — every step declares both
  allowed_tools and denied_tools where load-bearing
- human_review_gate as a first-class step in the DAG, not a flag on
  an LLM step or an action in fallback.escalation; allowed_tools is
  exactly [human_review_request], LLM and store-write explicitly denied;
  verification.on_fail = abort guards bypass attempts
- Provenance metadata (step_chain, timestamps, source_artefact_id,
  reviewer_id, confidence) flowing through every step's contract and
  collected into the terminal assurance_record output

Markdown narrative documents strategy choice, tool-segregation table,
the human_review_gate pattern's three benefits, the rubric's audit
rationale, and four spec edges encountered during drafting (no
workflow-local $ref in output_schema; no first-class human-gate step
type; no canonical tool-name registry; steps.<name>.completed not in
spec text).

Verified locally: validate() returns ok; node spec/fixtures/run-fixtures.mjs
shows 29 passed, 0 failed.
@antnewman antnewman force-pushed the examples/pda-risk-register branch from 122a1f3 to 6572ef7 Compare May 7, 2026 09:34
@antnewman antnewman marked this pull request as ready for review May 7, 2026 09:35

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@examples/pda-risk-register-assurance.logic.md`:
- Around line 90-127: Add a stable "risk_id" to each extracted risk: update
output_schema -> properties -> risks.items to include a required string property
"risk_id" and mark it required alongside title/description/etc., and ensure the
extractor populates risk_id deterministically (e.g., hash of title +
cited_byte_offset or other stable key) when emitting risks so downstream steps
(outlier_scan, triage, human_review_gate per-risk decisions,
emit_report.top_risks) can join reliably.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ebe45d5c-48de-453c-a5b1-add27d1f906a

📥 Commits

Reviewing files that changed from the base of the PR and between 9d9f8dc and 6572ef7.

📒 Files selected for processing (1)
  • examples/pda-risk-register-assurance.logic.md

Comment thread examples/pda-risk-register-assurance.logic.md

@cubic-dev-ai cubic-dev-ai Bot left a comment


1 issue found across 1 file

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="examples/pda-risk-register-assurance.logic.md">

<violation number="1" location="examples/pda-risk-register-assurance.logic.md:120">
P1: `extract_risks` does not emit a stable `risk_id`, but downstream steps require it for per-risk correlation.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread examples/pda-risk-register-assurance.logic.md Outdated
Closes the contract break flagged by CodeRabbit and cubic-dev-ai on PR SingularityAI-Dev#45:
extract_risks did not emit a stable risk_id, but four downstream steps
(outlier_scan, triage, human_review_gate, emit_report) all require it as
the join key for per-risk correlation.

- Adds risk_id to required[] and properties on extract_risks.output_schema.risks.items
- Constrains the format with a 12-hex-char pattern (deterministic SHA-256 prefix)
- Updates instructions to specify deterministic generation (hash of title +
  cited_byte_offset) and to make the contract-key role explicit

Validates clean; node spec/fixtures/run-fixtures.mjs unchanged at 29 passed,
0 failed.
@antnewman
Contributor Author

Good catch from both @coderabbitai and @cubic-dev-ai — the missing risk_id was a real cross-step contract break, exactly the class of bug LOGIC.md is meant to expose. Fix pushed in 4ef40fb:

  • Added risk_id to extract_risks.output_schema.risks.items.required[] and .properties
  • Constrained the format with a ^[a-f0-9]{12}$ pattern (12-hex-char SHA-256 prefix)
  • Updated instructions to specify deterministic generation (hash of title + cited_byte_offset) and made the contract-key role explicit

Worth noting: the structural validator passed this happily, which is itself a finding worth filing post-merge as a fifth spec edge — cross-step contract integrity (output of step A claims field X but is consumed by step B as field Y) is not statically checkable today. Will queue that as a separate issue alongside the four already flagged.
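
For readers who want to see the `risk_id` scheme described above in code, here is a minimal Python sketch: SHA-256 over the title and `cited_byte_offset`, truncated to 12 hex characters and checked against `^[a-f0-9]{12}$`. The function name, the unit-separator byte between the two fields, and the UTF-8 encoding are assumptions made for this sketch; the example's instructions only specify "hash of title + cited_byte_offset".

```python
import hashlib
import re

# Pattern taken from the extract_risks output_schema in this PR.
RISK_ID_PATTERN = re.compile(r"^[a-f0-9]{12}$")


def make_risk_id(title: str, cited_byte_offset: int) -> str:
    """Deterministic 12-hex-char SHA-256 prefix of title + offset.

    The separator and encoding are assumptions; any fixed, unambiguous
    serialisation would give the same reproducibility property.
    """
    key = f"{title}\x1f{cited_byte_offset}".encode("utf-8")
    return hashlib.sha256(key).hexdigest()[:12]


# Hypothetical risk used only to exercise the function.
risk_id = make_risk_id("Supplier insolvency", 1024)
print(risk_id, bool(RISK_ID_PATTERN.match(risk_id)))
```

Because the input is fully determined by the artefact, rerunning extraction on the same artefact yields the same identifier, which is what lets downstream steps join on it.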

@cubic-dev-ai

cubic-dev-ai Bot commented May 7, 2026


@antnewman The fix looks solid — risk_id is now correctly wired through extract_risks with that ^[a-f0-9]{12}$ pattern and deterministic SHA-256 prefix approach, and it flows cleanly into outlier_scan, triage, human_review_gate, and emit_report as the join key. The commit message even calls out the contract-break class explicitly, which will help anyone reading blame later.

Your observation about the structural validator being blind to cross-step contract integrity is sharp — and I think it's the most practically impactful of the five spec edges. A field-level linter that checked "step A's output_schema claims .risks.items.required includes risk_id; step B's needs: [A] consumes .risks[].risk_id; does A actually produce it?" would have caught this before runtime. That's distinct from the four you already flagged — it's not about $ref or tool taxonomies, it's about data-flow integrity across the DAG.

It fits naturally as spec edge #5 in your existing list. Post-merge as a separate issue, it'd probably be the easiest one to start on since the shape of the check is mechanical (walk needs, compare output_schema to downstream usage), unlike kind: human_gate or the tool-name registry which need design discussions first.
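
A rough Python sketch of the mechanical check described above (walk `needs`, compare the upstream `output_schema` with downstream usage) might look like the following. It assumes each step declares which upstream fields it reads in a hypothetical `consumes` list; LOGIC.md v1.0 has no such field as far as this thread indicates, so a real linter would have to infer usage some other way. The direct `extract_risks → triage` edge is a simplification of the eight-step chain.

```python
def required_item_fields(step, array_name):
    """Fields a step guarantees on each item of one of its output arrays."""
    schema = step.get("output_schema", {})
    items = schema.get("properties", {}).get(array_name, {}).get("items", {})
    return set(items.get("required", []))


def contract_breaks(steps):
    """List upstream fields that a step reads but the producer does not guarantee."""
    problems = []
    for name, step in steps.items():
        # "consumes" is a hypothetical declaration: (upstream step, array, field).
        for upstream, array_name, field in step.get("consumes", []):
            if upstream not in step.get("needs", []):
                problems.append(f"{name}: reads {upstream} but does not list it in needs")
            elif field not in required_item_fields(steps.get(upstream, {}), array_name):
                problems.append(
                    f"{name}: {upstream}.{array_name}[].{field} is not a required output"
                )
    return problems


def example_workflow(extract_required):
    """Two-step slice of the workflow, parameterised by what extract_risks requires."""
    return {
        "extract_risks": {
            "needs": [],
            "output_schema": {
                "properties": {"risks": {"items": {"required": list(extract_required)}}}
            },
        },
        "triage": {
            "needs": ["extract_risks"],
            "consumes": [("extract_risks", "risks", "risk_id")],
        },
    }


before_fix = example_workflow(["title", "description"])
after_fix = example_workflow(["risk_id", "title", "description"])
print(contract_breaks(before_fix))
print(contract_breaks(after_fix))
```

Run against the pre-fix shape, the check reports the missing `risk_id`; against the post-fix shape it reports nothing, which is the behaviour a static validator pass would need.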

@coderabbitai

coderabbitai Bot commented May 7, 2026

@antnewman The fix looks solid — deterministic risk_id generation (hash of title + cited_byte_offset) with a tight ^[a-f0-9]{12}$ pattern is the right shape. Pinning it to a SHA-256 prefix rather than a free string also means any runtime that validates the schema will implicitly catch malformed IDs before they propagate downstream.

The fifth spec edge is well-scoped and distinct from the other four:

| # | Finding | Scope |
| --- | --- | --- |
| 1 | Workflow-local `$ref` in `output_schema` | Schema ergonomics |
| 2 | First-class `kind: human_gate` | Design discussion |
| 3 | Tool-name registry | RFC needed |
| 4 | `steps.<name>.completed` undocumented | Doc bug |
| 5 | Cross-step contract integrity not statically checkable | Validator gap |

#5 is arguably the highest-value one for tooling authors — it's the gap that allowed the original risk_id omission to pass validate(file) → ok: true. Worth calling that out explicitly in the issue body.

Would you like me to open the GitHub issue for finding #5 now rather than waiting for post-merge? I have enough context to draft it (including the backlink to this PR and the concrete extract_risks → triage example as the motivating case).


@cubic-dev-ai cubic-dev-ai Bot left a comment


1 issue found across 1 file (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="examples/pda-risk-register-assurance.logic.md">

<violation number="1" location="examples/pda-risk-register-assurance.logic.md:93">
P2: The suggested `risk_id` derivation is not uniqueness-safe for a field used as the join key across downstream steps.</violation>
</file>


instructions: |
From the risk register and any narrative mentioning risk, threat, issue,
dependency, or assumption, extract candidate risk items. For each:
- risk_id (stable identifier; deterministic — e.g. SHA-256 of

P2: The suggested risk_id derivation is not uniqueness-safe for a field used as the join key across downstream steps.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At examples/pda-risk-register-assurance.logic.md, line 93:

<comment>The suggested `risk_id` derivation is not uniqueness-safe for a field used as the join key across downstream steps.</comment>

<file context>
@@ -90,6 +90,9 @@ steps:
     instructions: |
       From the risk register and any narrative mentioning risk, threat, issue,
       dependency, or assumption, extract candidate risk items. For each:
+      - risk_id (stable identifier; deterministic — e.g. SHA-256 of
+        title + cited_byte_offset, truncated to 12 hex chars. Must be unique
+        within this run and reproducible across reruns of the same artefact.)
</file context>
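
The review comment does not say which failure mode it has in mind. Two plausible readings are that two distinct risks could share the same title and `cited_byte_offset` (identical hash input), or that a 48-bit truncated hash could collide by chance. The Python sketch below, with hypothetical helper names, covers both: a within-run duplicate check that an extractor could apply before emitting risks, and a union-bound estimate of the chance collision probability for `n` identifiers.

```python
from collections import Counter


def duplicate_risk_ids(risks):
    """Return risk_id values that occur more than once within a single run."""
    counts = Counter(risk["risk_id"] for risk in risks)
    return sorted(rid for rid, n in counts.items() if n > 1)


def collision_upper_bound(n, bits=48):
    """Union bound on the chance that any two of n uniformly random IDs collide.

    12 hex characters carry 48 bits; the bound is C(n, 2) / 2**bits.
    """
    return n * (n - 1) / 2 / 2**bits


# Hypothetical extracted risks; the first and last share an ID.
risks = [
    {"risk_id": "0a1b2c3d4e5f", "title": "Supplier insolvency"},
    {"risk_id": "ffeeddccbbaa", "title": "Schedule slip"},
    {"risk_id": "0a1b2c3d4e5f", "title": "Supplier insolvency"},
]

print(duplicate_risk_ids(risks))
print(collision_upper_bound(1000))
```

For registers of a few thousand risks the chance-collision bound is tiny, so the more practical concern is likely identical inputs; adding a further discriminator to the hash input, or failing the step when `duplicate_risk_ids` is non-empty, are two possible mitigations rather than anything this PR prescribes.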


Development

Successfully merging this pull request may close these issues.

PDA Platform Worked Example

2 participants