test: add synthesizer deterministic contract tests #47
kavishkartha05 wants to merge 1 commit into Codelab-Davis:main from
Conversation
📝 Walkthrough
A new test module is added for Synthesizer unit tests.
Changes: Synthesizer Unit Tests
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
Possibly related issues
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
🧹 Nitpick comments (2)
tests/unit/test_synthesizer.py (2)
8-14: ⚡ Quick win: Make _spy_model resilient to message ordering.

Capturing messages[0]["content"] assumes index 0 is always the user prompt. Filtering by role makes these tests less brittle to harmless internal message-structure changes.

Suggested diff:
def _spy_model(response: str = "synthesized answer"):
    """Returns (model_fn, calls). Each call appends the user-role prompt text."""
    calls: list[str] = []

    def model_fn(messages: list[dict]) -> str:
-       calls.append(messages[0]["content"])
+       user_msg = next((m for m in messages if m.get("role") == "user"), messages[-1])
+       calls.append(user_msg["content"])
        return response

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/test_synthesizer.py` around lines 8 - 14, The test helper _spy_model currently captures messages[0]["content"], which breaks if message order changes; update the model_fn inside _spy_model to locate the user prompt by filtering the messages list for entries where message["role"] == "user" (e.g., pick the first matching user message or join multiple user contents) and append that content to calls so the test records the user prompt regardless of message ordering.
37-50: ⚡ Quick win: Assert model-response passthrough and explicit call count in all multi-chunk tests.
For multi-chunk paths, asserting the returned value equals the model response (and consistently asserting len(calls) == 1) strengthens the contract and gives clearer failures.

Suggested diff:
def test_multi_chunk_with_questions_prompt_contains_chunk_content_and_sources() -> None:
    model_fn, calls = _spy_model()
@@
-   Synthesizer(model_fn).synthesize(chunks, questions=["What is X?"])
+   result = Synthesizer(model_fn).synthesize(chunks, questions=["What is X?"])
    assert len(calls) == 1
+   assert result == "synthesized answer"
    prompt = calls[0]
@@
def test_multi_chunk_with_single_question_marks_it_current() -> None:
@@
-   Synthesizer(model_fn).synthesize(chunks, questions=["Tell me about alpha"])
+   result = Synthesizer(model_fn).synthesize(chunks, questions=["Tell me about alpha"])
+   assert len(calls) == 1
+   assert result == "synthesized answer"
    assert "Tell me about alpha (current)" in calls[0]
@@
def test_multi_chunk_with_multiple_questions_only_last_is_marked_current() -> None:
@@
-   Synthesizer(model_fn).synthesize(chunks, questions=questions)
+   result = Synthesizer(model_fn).synthesize(chunks, questions=questions)
+   assert len(calls) == 1
+   assert result == "synthesized answer"
    prompt = calls[0]
@@
def test_multi_chunk_without_questions_prompt_contains_chunk_content_and_sources() -> None:
@@
-   Synthesizer(model_fn).synthesize(chunks)
+   result = Synthesizer(model_fn).synthesize(chunks)
    assert len(calls) == 1
+   assert result == "synthesized answer"
    prompt = calls[0]

Also applies to: 53-60, 64-78, 81-93
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/test_synthesizer.py` around lines 37 - 50, The multi-chunk tests should assert the Synthesizer.synthesize return value equals the fake model response and consistently assert a single model invocation; update each multi-chunk test (e.g., test_multi_chunk_with_questions_prompt_contains_chunk_content_and_sources and the other multi-chunk test functions that use _spy_model and calls) to capture the synthesizer result (result = Synthesizer(model_fn).synthesize(...)), add assert len(calls) == 1 immediately after calling synthesize, and add an assertion like assert result == calls[0] (or the expected model response) so the tests verify both single-call behavior and passthrough of the model response.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 390e0c01-f583-4c29-adbe-c9ed03032cb0
📒 Files selected for processing (1)
tests/unit/test_synthesizer.py
Pull Request
Use a Conventional Commit title:
feat: ..., fix: ..., docs: ..., chore: ..., refactor: ..., test: ..., build: ..., ci: ..., perf: ..., or revert: ....

Summary
Adds deterministic offline tests for all three branches of Synthesizer.synthesize(): empty chunks, single-chunk fast path, and multi-chunk prompt construction. Verifies the model is never called for the first two paths and that prompt content, source labels, and (current) question marking are correct for the multi-chunk path.

Linked context
Closes #18
PR Size
PR Type
Context / Motivation
Synthesizer had no unit tests despite having three distinct code paths with specific output contracts. This PR pins those contracts so future changes to synthesis logic have a regression baseline.

Changes
Added tests/unit/test_synthesizer.py with 6 tests covering:
- [Source: X] labels
- (current) question marking
- only the last question marked (current), others are plain

How to test
uv run pytest -m unit -v
test_synthesizer.py should pass with no network or Ollama dependency.

Checklist
Notes for reviewers
The _spy_model helper follows the same pattern as test_decomposer_prompts.py, in that it captures messages[0]["content"] so assertions read against real prompt text. The single-chunk fast path test asserts the model is never called, which essentially documents the existing short-circuit behaviour and catches any future accidental removal.

Summary by CodeRabbit