add test-after-edit eval condition #36

Open
orban wants to merge 1 commit into main from eval/test-after-edit-v2

@orban (Owner) commented Apr 15, 2026

Summary

Adds a fifth experimental condition test_after_edit to the eval harness. The condition injects a preamble that forces the agent to run relevant tests after every source-file edit before making further changes, isolating the effect of tight test-feedback loops from context-injection effects (none/flat_llm/intent_layer).

Wired through:

  • Condition enum (task_runner.py)
  • preamble map in TaskRunner._build_prompt
  • YAML-tasks condition list in run() (cli.py)
  • reporter display label (reporter.py)
  • new preamble string (prompt_builder.py)
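
For readers unfamiliar with the harness, the wiring above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the enum lists only the condition names mentioned in this description (any other existing condition is omitted), and the preamble wording, the `PREAMBLES` map, and the free function `build_prompt` are hypothetical stand-ins for the real `prompt_builder.py` and `TaskRunner._build_prompt`.

```python
from enum import Enum


class Condition(Enum):
    # Only the conditions named in this PR description; the real enum
    # may contain others that are not shown here.
    NONE = "none"
    FLAT_LLM = "flat_llm"
    INTENT_LAYER = "intent_layer"
    TEST_AFTER_EDIT = "test_after_edit"  # the variant added by this PR


# Hypothetical preamble text; the actual string lives in prompt_builder.py.
TEST_AFTER_EDIT_PREAMBLE = (
    "After every edit to a source file, run the relevant tests "
    "before making any further changes.\n\n"
)

# Hypothetical preamble map: conditions without an entry get no preamble.
PREAMBLES = {
    Condition.TEST_AFTER_EDIT: TEST_AFTER_EDIT_PREAMBLE,
}


def build_prompt(task_prompt: str, condition: Condition) -> str:
    """Prepend the condition's preamble (if any) to the task prompt."""
    return PREAMBLES.get(condition, "") + task_prompt


if __name__ == "__main__":
    print(build_prompt("Fix the failing date parser.", Condition.TEST_AFTER_EDIT))
```

Keeping the preamble in a lookup keyed by the enum means a new condition only needs an enum member and one map entry, which matches the small set of touch points listed above.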

Test plan

  • pytest eval-harness/tests/test_task_runner.py::test_condition_enum
  • Full run against one repo to confirm reports render the new condition label correctly

Notes

  • Dropped an unrelated bash -lc → sh -c revert in docker_runner.py that had slipped into the same working tree. That change would have reintroduced exit 127 for repos using uv (see commit 9ab8632).
  • Replaces #35 (add test-after-edit eval condition), which picked up 17 unrelated commits from nightshift/bus-factor.
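
As background on the exit 127 mentioned in the notes: POSIX shells return status 127 when the command they are asked to run cannot be found. The PR does not spell out the mechanism, so the following is only a sketch under an assumption: that a plain `sh -c` does not read the login-profile files which put `uv` on `PATH`, whereas `bash -lc` does. The helper `run_in_shell` and the command name are hypothetical and not part of `docker_runner.py`.

```python
import subprocess


def run_in_shell(shell_argv, command):
    """Run `command` through the given shell prefix and return its exit status."""
    result = subprocess.run(
        [*shell_argv, command],
        capture_output=True,  # keep the shell's "not found" message off the console
        text=True,
    )
    return result.returncode


# Hypothetical command name, chosen so that it is not on PATH anywhere.
missing = "hypothetical-tool-not-on-path-12345"

# A command the shell cannot find yields exit status 127.
status = run_in_shell(["sh", "-c"], missing)
print(status)
```

If that assumption holds, a tool installed under a user-local directory would be found when the harness invokes `bash -lc` and reported as 127 under `sh -c`.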

adds a fifth experimental condition that constrains the agent to run
tests after every source-file edit before making further changes. isolates
the effect of tight test-driven feedback loops from context-injection
effects (none/flat_llm/intent_layer).

wired through Condition enum, prompt builder, reporter display labels,
and the run() CLI YAML_CONDITIONS list. test_condition_enum updated to
cover the new variant.