Skip to content

feat(core): verification gate wrapper with quick fixes and escalation#434

Merged
frankbria merged 3 commits intomainfrom
feat/415-verification-gate-wrapper
Mar 10, 2026
Merged

feat(core): verification gate wrapper with quick fixes and escalation#434
frankbria merged 3 commits intomainfrom
feat/415-verification-gate-wrapper

Conversation

@frankbria
Copy link
Owner

@frankbria frankbria commented Mar 10, 2026

Summary

Closes #415

Enhances the VerificationWrapper (used by all external engines: claude-code, opencode, codex) to match ReactAgent's comprehensive self-correction capabilities:

  • Quick fix integration: tries pattern-based fixes (missing imports, lint errors) before re-invoking the expensive external adapter
  • Fix attempt tracking: FixAttemptTracker prevents repeating the same failed fix and detects escalation patterns
  • Escalation blocker: creates informative blocker when tracker recommends escalation or all correction rounds exhausted (returns blocked instead of failed)
  • Shared helper: build_escalation_question() used by both VerificationWrapper and ReactAgent for consistent blocker messages
  • Default retries bumped: max_correction_rounds from 3 → 5, matching ReactAgent

Files Changed

File Change
codeframe/core/adapters/verification_wrapper.py Enhanced with quick fixes, fix tracking, escalation, shared helper
codeframe/core/react_agent.py Refactored _create_escalation_blocker to use shared build_escalation_question
codeframe/core/runtime.py Updated max_correction_rounds=5 for external engines
tests/core/adapters/test_verification_wrapper.py 22 tests (was 11) covering all new behavior

Test plan

  • Quick fix applied before adapter re-invocation (3 tests)
  • Fix tracker records gate failures
  • Escalation creates blocker and returns blocked status
  • Max retries exhausted creates blocker
  • Default max_correction_rounds is 5
  • build_escalation_question shared helper (3 tests)
  • Integration scenario: engine succeeds → gates fail → retry → gates pass
  • All 97 existing tests (wrapper + react agent) pass with no regressions
  • Lint clean (ruff)

Summary by CodeRabbit

  • New Features
    • Verification now attempts automatic quick fixes before escalating failures.
    • New fix-tracking captures attempts and outcomes across correction rounds.
    • Improved escalation messages include concise error summaries and attempted fixes.
    • Default correction attempts increased from 3 to 5.
    • Optional verbose diagnostics for troubleshooting verification flows.

…nd escalation (#415)

Brings the VerificationWrapper (used by all external engines) to parity
with ReactAgent's comprehensive self-correction:

- Quick fix integration: try pattern-based fixes before re-invoking adapter
- Fix attempt tracking: FixAttemptTracker prevents repeated failed fixes
- Escalation blocker: creates blocker when tracker recommends or retries exhausted
- Default max_correction_rounds bumped from 3 to 5
- Shared build_escalation_question() used by both wrapper and ReactAgent
- 22 tests covering all new behavior + integration scenario
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 10, 2026

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 67.86% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly and concisely describes the main feature: adding quick fixes and escalation to the verification gate wrapper.
Linked Issues check ✅ Passed All core requirements from issue #415 are met: verification gates run uniformly across engines, max retries increased to 5, quick fixes applied before re-invocation, fix tracking implemented, escalation blockers created, and shared helper function added.
Out of Scope Changes check ✅ Passed All changes are directly scoped to issue #415: VerificationWrapper enhancements, quick fix integration, fix tracking, escalation logic, shared helper function, and test coverage expansion.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/415-verification-gate-wrapper

Comment @coderabbitai help to get the list of available commands and usage tips.

- Record fix_tracker outcome AFTER quick fix attempt (not before),
  preventing premature escalation from inflated failure counts
- Record SUCCESS outcome when quick fix works
- Add EscalationDecision type annotation to _create_escalation_blocker
- Replace __import__ hack with direct FixOutcome import in test
@claude
Copy link

claude bot commented Mar 10, 2026

PR 434 review - see next comment for details

@claude
Copy link

claude bot commented Mar 10, 2026

Code Review: Verification Gate Wrapper Enhancement

This is a solid PR that extends external engines with the same self-correction capabilities ReactAgent already has. Test coverage is substantially improved (11 to 22 tests). A few issues worth addressing before merge:

Bug: output field dropped from blocker results

Both _create_escalation_blocker and _create_exhaustion_blocker return AgentResult without preserving the adapter last output. The previous failed path had output=result.output and that output is now silently dropped when escalating to a blocker. Both methods would need to receive the last adapter result or output as a parameter to preserve it.

Layering concern: core depending on adapters

react_agent.py (core layer) now imports from codeframe.core.adapters.verification_wrapper at line 29. The adapters subdirectory is for external agent adapter implementations and having core reach into it inverts the expected dependency direction. build_escalation_question is a pure utility that only depends on FixAttemptTracker, so fix_tracker.py would be a more natural home for it.

Minor: off-by-one in escalation timing

Escalation is checked at the start of each round against the previous round recorded failures since record_outcome has not run yet for the current round. This means MAX_SAME_ERROR_ATTEMPTS=3 actually triggers escalation on round 4 not round 3. The test in TestEscalationIntegration pre-fills the tracker with 4 failures to compensate. Worth a comment in the loop clarifying the check-before-record ordering.

Minor: fix_tracker is a public attribute

self.fix_tracker is intentionally public for test inspection. A brief comment noting it is exposed for observability would help future readers.

Quick fix failures not individually tracked

When find_quick_fix returns a fix but apply_quick_fix returns False, the code falls through to record FixOutcome.FAILED with fix_description=verification_gate. The specific quick fix name is not recorded so the tracker cannot prevent retrying the same failing quick fix in a subsequent round.

What is working well

  • build_escalation_question extraction cleanly deduplicates logic between VerificationWrapper and ReactAgent
  • Escalation path correctly prefers blocked over failed which is informative for human-in-the-loop
  • _try_quick_fix as a named method with clear return semantics is easy to follow
  • Test structure with class-per-behavior grouping is much better than the flat original
  • Wrapping blockers.create() in try/except with warning log is the right defensive pattern

The output field issue is the most impactful item. The layering concern is worth addressing before merge. All other items are minor.

@claude
Copy link

claude bot commented Mar 10, 2026

Review summary for feat(core): verification gate wrapper with quick fixes and escalation - see full review in next comment

@claude
Copy link

claude bot commented Mar 10, 2026

Full review posted - please see the detailed review comment below.

@claude
Copy link

claude bot commented Mar 10, 2026

Review: feat(core): verification gate wrapper with quick fixes and escalation

This is a solid implementation that closes the parity gap between VerificationWrapper (used by external engines) and ReactAgent internal self-correction loop. The structure is clean, the test expansion is thorough, and the build_escalation_question refactor is a good DRY improvement. A few issues worth addressing before merging.

Bug - FixOutcome used via dynamic import hack in test: In the escalation test, FixOutcome is accessed via a dynamic import call instead of a proper top-level import. FixOutcome is already exported from codeframe.core.fix_tracker. Just add it to the top-level import alongside FixAttemptTracker: from codeframe.core.fix_tracker import FixAttemptTracker, FixOutcome. This is the single most important fix - the dynamic import approach is fragile and hard to read.

Architecture concern - coupling direction in react_agent.py: react_agent.py now imports from codeframe.core.adapters.verification_wrapper to get build_escalation_question. This creates a dependency from core/react_agent.py to core/adapters/verification_wrapper.py. If verification_wrapper.py ever needs to reference ReactAgent, it would create a circular import. A better home for build_escalation_question would be core/blockers.py or a small core/self_correction.py utility - both ReactAgent and VerificationWrapper would then import from a neutral location rather than one depending on the other.

Lost adapter output in blocker results: Both _create_escalation_blocker and _create_exhaustion_blocker return AgentResult without the output field from the most recent adapter run. The old failure path preserved result.output. The adapter last response before escalation is valuable for debugging. Consider passing last_output through to the blocker helpers.

Fix tracker records FAILED before quick fix is attempted: In the correction loop, the outcome is recorded as FAILED before the quick fix is tried. If a quick fix then succeeds on the next gate check, the tracker still counted this round as a failure. This is semantically odd and could cause premature escalation on long runs where quick fixes are intermittently successful.

Minor - get_error_summary() is a new call on gate results: The old code only called self._format_gate_errors(gate_result). The new code also calls gate_result.get_error_summary(). The or fallback only handles None/empty return, not a missing method. Worth confirming this method exists on all GateResult implementations, or guarding with hasattr.

What is working well: Test structure with separate test classes per concern (TestQuickFixIntegration, TestFixTrackerIntegration, TestEscalationIntegration, TestBuildEscalationQuestion) makes failures easy to diagnose. Using blocked instead of failed on exhaustion is better UX - tasks surface for human review rather than silently failing. The _verbose_print helper is consistent with ReactAgent pattern. The fallback error formatting with the or chain between get_error_summary() and _format_gate_errors() is good defensive coding. Max correction rounds at 5 matches ReactAgent and is appropriate for external engine round-trips.

Summary: The two items that should definitely be fixed before merge are the dynamic import hack in tests and verifying get_error_summary() compatibility on real GateResult objects. The architecture coupling concern and lost adapter output are worth addressing but are lower severity.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/core/adapters/test_verification_wrapper.py (1)

3-13: ⚠️ Potential issue | 🟠 Major

Mark this module as v2.

Please add pytestmark = pytest.mark.v2 below the import block. Without it, these new verification-wrapper tests can be excluded from v2-selective runs.

As per coding guidelines, "New v2 Python tests must be marked with @pytest.mark.v2 decorator or pytestmark = pytest.mark.v2".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/core/adapters/test_verification_wrapper.py` around lines 3 - 13, Add
the module-level pytest marker by inserting "pytestmark = pytest.mark.v2"
immediately below the import block in this test module
(tests/core/adapters/test_verification_wrapper.py); this ensures the new
verification-wrapper tests (which exercise VerificationWrapper and
build_escalation_question and related fixtures) are included in v2-selective
runs. Place the single line after the last import so the file-level marker
applies to all tests in the module.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@codeframe/core/adapters/verification_wrapper.py`:
- Around line 216-223: The wrapper currently creates escalation blockers itself
via the blockers.create call inside the try/except blocks (the two occurrences
that catch Exception and log "Failed to create escalation blocker"); remove
those local blockers.create calls (and their surrounding try/except) so the
runtime remains the single owner of blocker creation when a wrapped adapter
returns status="blocked" with blocker_question, leaving the wrapper to only
return the status/blocker_question to be handled by the runtime.
- Around line 147-159: The quick-fix, escalation-to-blocker, and
retries-exhausted branches in VerificationWrapper currently only print/return
and must emit events via the core/events.py API; update the branches around
_try_quick_fix, the call to _create_escalation_blocker, and the
retries-exhausted path to emit explicit events (e.g.,
VerificationQuickFixApplied, VerificationEscalatedToBlocker,
VerificationRetriesExhausted) including relevant metadata (task_id,
error_summary, escalation, round_num, outcome) and keep the existing calls to
fix_tracker.record_outcome and _verbose_print; ensure events are imported from
core/events.py and emitted before returning or continuing so the runtime bridge
can observe the state transitions.

---

Outside diff comments:
In `@tests/core/adapters/test_verification_wrapper.py`:
- Around line 3-13: Add the module-level pytest marker by inserting "pytestmark
= pytest.mark.v2" immediately below the import block in this test module
(tests/core/adapters/test_verification_wrapper.py); this ensures the new
verification-wrapper tests (which exercise VerificationWrapper and
build_escalation_question and related fixtures) are included in v2-selective
runs. Place the single line after the last import so the file-level marker
applies to all tests in the module.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e5f256cc-7f90-4811-91a9-c13575f48bfb

📥 Commits

Reviewing files that changed from the base of the PR and between 6138b22 and 860979f.

📒 Files selected for processing (4)
  • codeframe/core/adapters/verification_wrapper.py
  • codeframe/core/react_agent.py
  • codeframe/core/runtime.py
  • tests/core/adapters/test_verification_wrapper.py

Comment on lines +147 to +159
return self._create_escalation_blocker(
task_id, error_summary, escalation,
)

# 3. Try quick fix first (no adapter re-invocation needed)
if self._try_quick_fix(error_summary):
self.fix_tracker.record_outcome(
error_summary, "verification_gate", FixOutcome.SUCCESS,
)
self._verbose_print(
f"[VerificationWrapper] Quick fix applied (round {round_num + 1})"
)
continue # Re-run gates without re-invoking adapter
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Emit events for the new quick-fix and blocker transitions.

These branches only print() or return an AgentResult; they never emit a dedicated event for quick-fix success, escalation-to-blocker, or retries-exhausted. That means the runtime bridge has nothing to forward for the most important new verification state changes.

As per coding guidelines, "All core modules must emit events for state transitions via core/events.py for audit and observability".

Also applies to: 190-192, 208-259

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@codeframe/core/adapters/verification_wrapper.py` around lines 147 - 159, The
quick-fix, escalation-to-blocker, and retries-exhausted branches in
VerificationWrapper currently only print/return and must emit events via the
core/events.py API; update the branches around _try_quick_fix, the call to
_create_escalation_blocker, and the retries-exhausted path to emit explicit
events (e.g., VerificationQuickFixApplied, VerificationEscalatedToBlocker,
VerificationRetriesExhausted) including relevant metadata (task_id,
error_summary, escalation, round_num, outcome) and keep the existing calls to
fix_tracker.record_outcome and _verbose_print; ensure events are imported from
core/events.py and emitted before returning or continuing so the runtime bridge
can observe the state transitions.

…utput

- Move build_escalation_question to core/fix_tracker.py (neutral location),
  fixing the layering concern where core/react_agent imported from adapters
- Preserve adapter last output in blocker results (was silently dropped)
- Consolidate test imports
@claude
Copy link

claude bot commented Mar 10, 2026

Follow-up Review

Good progress since the last review — several issues from earlier rounds have been addressed:

  • Output preservation resolved — both _create_escalation_blocker and _create_exhaustion_blocker now accept last_output and pass it through to AgentResult.output.
  • Architecture coupling resolved — build_escalation_question lives in fix_tracker.py; react_agent.py imports from there, not from verification_wrapper.py.
  • get_error_summary() compatibility resolved — confirmed the method exists on the real GateResult class in gates.py.
  • Test import clarity resolved — FixOutcome is imported at the top of the test file cleanly, no dynamic imports.

Three items remain:

1. Missing pytestmark = pytest.mark.v2 in test file (actionable)

tests/core/adapters/test_verification_wrapper.py has no module-level marker. Per CLAUDE.md: when adding new v2 functionality, mark tests with pytestmark = pytest.mark.v2 at module level. Without this, all 22 new tests are excluded from uv run pytest -m v2 runs. This was flagged by CodeRabbit's inline comment and is still outstanding.

2. Failed quick-fix attempts not recorded with specific description (minor)

From the previous review — still present. When find_quick_fix returns a fix but apply_quick_fix returns (False, ...), _try_quick_fix returns False and the main loop records FixOutcome.FAILED for "verification_gate" without the specific fix description. The tracker cannot prevent retrying the same failing quick fix in a later round. Consider capturing the fix description from find_quick_fix and passing it through even on failure paths.

3. _create_exhaustion_blocker builds message inline instead of via shared helper (minor)

_create_escalation_blocker delegates to build_escalation_question to include attempted fixes, total failure count, and truncated error. _create_exhaustion_blocker builds its message manually and omits that tracker context. A human receiving an exhaustion blocker gets less diagnostic information than one from an escalation blocker. Consider using build_escalation_question here with a synthetic reason like f"Max correction rounds ({self._max_correction_rounds}) exhausted".


Overall the implementation is solid and earlier structural concerns are resolved. Items 1 and 2 are worth fixing before merge; item 3 is a quality improvement.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/core/adapters/test_verification_wrapper.py (1)

1-10: ⚠️ Potential issue | 🟡 Minor

Mark this test module as v2.

These new verification-wrapper tests are missing the required pytest.mark.v2 marker, so marker-based CI runs can skip them.

Suggested fix
 import pytest
 from pathlib import Path
 from unittest.mock import MagicMock, patch
 
+pytestmark = pytest.mark.v2
+
 from codeframe.core.adapters.verification_wrapper import VerificationWrapper
 from codeframe.core.adapters.agent_adapter import AgentAdapter, AgentEvent, AgentResult
 from codeframe.core.fix_tracker import FixAttemptTracker, FixOutcome, build_escalation_question
As per coding guidelines, "Test files must use the `@pytest.mark.v2` decorator or module-level `pytestmark = pytest.mark.v2` for v2 functionality tests".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/core/adapters/test_verification_wrapper.py` around lines 1 - 10, The
test module for VerificationWrapper is missing the v2 pytest marker; add the
marker by either setting a module-level pytestmark = pytest.mark.v2 or
decorating the tests/class with `@pytest.mark.v2` so CI will include them as v2
tests—update the test module (test_verification_wrapper.py) to import pytest if
needed and apply pytest.mark.v2 via pytestmark or decorators around the tests
that reference VerificationWrapper, AgentAdapter/AgentEvent/AgentResult,
FixAttemptTracker, FixOutcome and GateStatus.
♻️ Duplicate comments (2)
codeframe/core/adapters/verification_wrapper.py (2)

127-140: ⚠️ Potential issue | 🟠 Major

Emit explicit events for quick-fix, escalation, and exhaustion transitions.

The new branches change verification state in important ways, but they only continue or return an AgentResult. Without dedicated events here, observers cannot distinguish “quick fix applied”, “escalated to blocker”, and “retries exhausted”.

As per coding guidelines, "All core modules must emit events for state transitions via core/events.py for audit and observability".

Also applies to: 171-175, 203-255

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@codeframe/core/adapters/verification_wrapper.py` around lines 127 - 140, The
verification branches currently return or continue without emitting transition
events, so instrument the VerificationWrapper to publish explicit events (using
the event API in core/events.py) whenever a quick-fix is applied, an escalation
blocker is created, or retries are exhausted: call the event emitter in the code
paths that call _try_quick_fix (after fix_tracker.record_outcome and before
continue) to emit a "verification.quick_fix" event with task_id, error_summary
and round_num; in the branch that returns the result from
_create_escalation_blocker emit a "verification.escalation_blocker" event
including task_id, escalation and last_output; and in the final exhaustion path
(where retries are exceeded) emit a "verification.exhausted" event with task_id,
error_summary and total rounds—use the existing symbols (_try_quick_fix,
_create_escalation_blocker, fix_tracker.record_outcome, AgentResult/return path)
to locate insertion points and follow the event payload conventions in
core/events.py.

203-240: ⚠️ Potential issue | 🟠 Major

Let the runtime stay the single owner of blocker persistence.

Both blocker helpers call blockers.create(...) and then also return status="blocked" with a blocker_question. If the runtime already persists blockers for blocked adapter results, this creates duplicate blocker records and the locally created blocker ID is discarded anyway.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@codeframe/core/adapters/verification_wrapper.py` around lines 203 - 240, The
code currently calls blockers.create(...) inside the blocker helper methods and
also returns an AgentResult(status="blocked", blocker_question=...), which
duplicates persistence if the runtime already persists blockers; remove the
local persistence so the runtime remains the single owner. Concretely, delete
the try/except wrappers that call blockers.create in _create_exhaustion_blocker
and in the corresponding escalation helper (the block that calls blockers.create
then returns AgentResult with status="blocked" and blocker_question) and simply
return the AgentResult with blocker_question and error info; keep the question
construction and logging behavior but do not call blockers.create or swallow its
exceptions here.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@codeframe/core/adapters/verification_wrapper.py`:
- Around line 59-60: Reset the instance's FixAttemptTracker at the start of each
run() to avoid bleed-over from prior tasks: inside the VerificationWrapper.run
method (or whichever class defines run), reassign self.fix_tracker =
FixAttemptTracker() at the start of the method (and similarly before any
per-task loop that may call escalation logic) so each invocation begins with a
fresh tracker and previous failures won't trigger immediate escalation or
exhaustion; update references to self.fix_tracker accordingly.
- Around line 118-145: The quick-fix path in verification_wrapper.py currently
marks a fix as FixOutcome.SUCCESS as soon as _try_quick_fix(error_summary)
returns true; instead, leave only the initial attempt recorded (use
fix_tracker.record_attempt) and defer calling fix_tracker.record_outcome(...,
FixOutcome.SUCCESS) until after the subsequent gate re-run actually succeeds; if
the re-run still fails, record FixOutcome.FAILED for that same error_summary and
verification_gate so the fix_tracker can prevent repeating bad fixes or
escalate; apply the same change to the identical logic around the other
quick-fix block referenced (the block at lines ~177-189) so both quick-fix
branches only record SUCCESS after a verified successful re-run.

---

Outside diff comments:
In `@tests/core/adapters/test_verification_wrapper.py`:
- Around line 1-10: The test module for VerificationWrapper is missing the v2
pytest marker; add the marker by either setting a module-level pytestmark =
pytest.mark.v2 or decorating the tests/class with `@pytest.mark.v2` so CI will
include them as v2 tests—update the test module (test_verification_wrapper.py)
to import pytest if needed and apply pytest.mark.v2 via pytestmark or decorators
around the tests that reference VerificationWrapper,
AgentAdapter/AgentEvent/AgentResult, FixAttemptTracker, FixOutcome and
GateStatus.

---

Duplicate comments:
In `@codeframe/core/adapters/verification_wrapper.py`:
- Around line 127-140: The verification branches currently return or continue
without emitting transition events, so instrument the VerificationWrapper to
publish explicit events (using the event API in core/events.py) whenever a
quick-fix is applied, an escalation blocker is created, or retries are
exhausted: call the event emitter in the code paths that call _try_quick_fix
(after fix_tracker.record_outcome and before continue) to emit a
"verification.quick_fix" event with task_id, error_summary and round_num; in the
branch that returns the result from _create_escalation_blocker emit a
"verification.escalation_blocker" event including task_id, escalation and
last_output; and in the final exhaustion path (where retries are exceeded) emit
a "verification.exhausted" event with task_id, error_summary and total
rounds—use the existing symbols (_try_quick_fix, _create_escalation_blocker,
fix_tracker.record_outcome, AgentResult/return path) to locate insertion points
and follow the event payload conventions in core/events.py.
- Around line 203-240: The code currently calls blockers.create(...) inside the
blocker helper methods and also returns an AgentResult(status="blocked",
blocker_question=...), which duplicates persistence if the runtime already
persists blockers; remove the local persistence so the runtime remains the
single owner. Concretely, delete the try/except wrappers that call
blockers.create in _create_exhaustion_blocker and in the corresponding
escalation helper (the block that calls blockers.create then returns AgentResult
with status="blocked" and blocker_question) and simply return the AgentResult
with blocker_question and error info; keep the question construction and logging
behavior but do not call blockers.create or swallow its exceptions here.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2ff383c6-4649-413d-9437-a4033f1efdf9

📥 Commits

Reviewing files that changed from the base of the PR and between 860979f and 5299499.

📒 Files selected for processing (4)
  • codeframe/core/adapters/verification_wrapper.py
  • codeframe/core/fix_tracker.py
  • codeframe/core/react_agent.py
  • tests/core/adapters/test_verification_wrapper.py

Comment on lines +59 to 60
self.fix_tracker = FixAttemptTracker()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Reset the fix tracker at the start of each run().

FixAttemptTracker is stored on the wrapper instance, so failures from a previous task can bleed into the next run() and trigger an immediate escalation or early exhaustion on a fresh task.

Suggested fix
 def run(
     self,
     task_id: str,
     prompt: str,
     workspace_path: Path,
     on_event: Callable[[AgentEvent], None] | None = None,
 ) -> AgentResult:
     """Run the inner adapter, then verify with gates. Self-correct on failure."""
+    self.fix_tracker.reset()
 
     # Initial run
     result = self._inner.run(task_id, prompt, workspace_path, on_event)

Also applies to: 65-75

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@codeframe/core/adapters/verification_wrapper.py` around lines 59 - 60, Reset
the instance's FixAttemptTracker at the start of each run() to avoid bleed-over
from prior tasks: inside the VerificationWrapper.run method (or whichever class
defines run), reassign self.fix_tracker = FixAttemptTracker() at the start of
the method (and similarly before any per-task loop that may call escalation
logic) so each invocation begins with a fresh tracker and previous failures
won't trigger immediate escalation or exhaustion; update references to
self.fix_tracker accordingly.

Comment on lines +118 to +145
# 1. Record the attempt (outcome deferred until we know if fix works)
self.fix_tracker.record_attempt(error_summary, "verification_gate")

# 2. Check escalation based on prior history
escalation = self.fix_tracker.should_escalate(error_summary)
if escalation.should_escalate:
self.fix_tracker.record_outcome(
error_summary, "verification_gate", FixOutcome.FAILED,
)
return self._create_escalation_blocker(
task_id, error_summary, escalation,
last_output=result.output,
)

# 3. Try quick fix first (no adapter re-invocation needed)
if self._try_quick_fix(error_summary):
self.fix_tracker.record_outcome(
error_summary, "verification_gate", FixOutcome.SUCCESS,
)
self._verbose_print(
f"[VerificationWrapper] Quick fix applied (round {round_num + 1})"
)
continue # Re-run gates without re-invoking adapter

# 4. No quick fix — record failure and re-invoke adapter with error context
self.fix_tracker.record_outcome(
error_summary, "verification_gate", FixOutcome.FAILED,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't mark a quick-fix path successful before the rerun actually passes.

Here a quick fix is counted as SUCCESS as soon as it applies, even though the next gate run may still fail with the same error. That leaves no failed record for the attempted quick fix, so the tracker can't reliably prevent retrying the same bad fix or escalate on repeated failures.

Based on learnings, self-correction attempts must use core/fix_tracker.py to prevent repeating failed fixes.

Also applies to: 177-189

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@codeframe/core/adapters/verification_wrapper.py` around lines 118 - 145, The
quick-fix path in verification_wrapper.py currently marks a fix as
FixOutcome.SUCCESS as soon as _try_quick_fix(error_summary) returns true;
instead, leave only the initial attempt recorded (use
fix_tracker.record_attempt) and defer calling fix_tracker.record_outcome(...,
FixOutcome.SUCCESS) until after the subsequent gate re-run actually succeeds; if
the re-run still fails, record FixOutcome.FAILED for that same error_summary and
verification_gate so the fix_tracker can prevent repeating bad fixes or
escalate; apply the same change to the identical logic around the other
quick-fix block referenced (the block at lines ~177-189) so both quick-fix
branches only record SUCCESS after a verified successful re-run.

@frankbria frankbria merged commit 3bcf94b into main Mar 10, 2026
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Phase 4] Verification Gate Wrapper for External Agents

1 participant