fix: migrate mentor topic extraction from gpt-4 to gpt-4.1-mini (#4671) #4675
Conversation
Code Review
This pull request is a great initiative to reduce costs by migrating the topic extraction from gpt-4 to the much cheaper gpt-4.1-mini model. The refactoring to use the shared llm_mini client is a clean approach that also brings the benefit of standardized usage tracking. The new unit tests are comprehensive and cover the changes well. I have one high-severity suggestion to ensure the reliability of the topic extraction does not regress due to a change in model parameters.
```diff
  # Parse the response text as JSON
- response_text = response.choices[0].message.content.strip()
+ response_text = llm_mini.invoke(prompt).content.strip()
```
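For context, a minimal sketch of what the refactored helper might look like after this change, assuming `llm_mini` is the shared langchain client and that the function parses a JSON `topics` array; the import path, prompt wording, and return shape below are illustrative assumptions, not the actual code:

```python
# Illustrative sketch only; import path, prompt, and return shape are assumptions.
import json
from typing import List

from utils.llm import llm_mini  # assumed location of the shared gpt-4.1-mini client


def extract_topics(conversation_text: str) -> List[str]:
    prompt = (
        "Extract the main topics from the conversation below. "
        'Respond with JSON only, e.g. {"topics": ["topic1", "topic2"]}.\n\n'
        + conversation_text
    )
    # invoke() returns an AIMessage; .content holds the model's raw text.
    response_text = llm_mini.invoke(prompt).content.strip()
    try:
        return json.loads(response_text).get("topics", [])
    except json.JSONDecodeError:
        # Malformed JSON from the model: fail soft with no topics.
        return []
```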
The previous implementation explicitly set `temperature=0.3` and `max_tokens=150`. The new implementation using the shared `llm_mini` client will use its default temperature (likely 0.7) and no token limit. For a structured data extraction task like this, a lower temperature is important for ensuring reliable and consistently parseable JSON output. The higher default temperature increases randomness, which could lead to malformed JSON and cause `json.loads()` to fail. To prevent this potential regression and maintain the deterministic nature of the output, I recommend explicitly passing the original parameters to the `invoke` call.
```diff
- response_text = llm_mini.invoke(prompt).content.strip()
+ response_text = llm_mini.invoke(prompt, temperature=0.3, max_tokens=150).content.strip()
```
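If the shared client's langchain version does not accept sampling parameters directly in `invoke()`, one alternative (a sketch assuming `llm_mini` is a langchain-openai `ChatOpenAI` instance whose invocation kwargs are forwarded to the request payload) is to bind the parameters once and reuse the bound runnable, which keeps the same underlying client and its usage callbacks:

```python
# Sketch, assuming llm_mini is a ChatOpenAI instance; bind() returns a
# RunnableBinding that forwards these kwargs on every invocation.
llm_mini_extraction = llm_mini.bind(temperature=0.3, max_tokens=150)

response_text = llm_mini_extraction.invoke(prompt).content.strip()
```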
lgtm
Summary
- Replace the raw `OpenAI(model="gpt-4")` client in `extract_topics()` with the shared `llm_mini` (gpt-4.1-mini) langchain client
- Remove the now-unused `os` import and `client` guard, aligning with codebase conventions
- `gpt-4.1-mini` is ~75x cheaper on input tokens ($0.40/M vs $30/M) for a trivial JSON topic extraction task

Impact
- Token usage is now tracked through `_usage_callback` on `llm_mini`

Quality Eval: gpt-5.1 Judge (10 samples)
gpt-4.1-mini wins or ties 7/10 head-to-head matchups and scores higher on all three quality dimensions. No quality regression.
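For reference, a minimal sketch of how a pairwise judge eval like this might be wired up (the judge prompt, dimension names, and helper below are illustrative assumptions; the actual eval lives in `backend/tests/integration/test_mentor_topics_eval.py`):

```python
# Illustrative sketch only; not the actual eval code from this PR.
import json

JUDGE_PROMPT = """You are comparing two topic-extraction outputs for the same conversation.
Score each output 1-5 on relevance, coverage, and conciseness, then pick a winner.
Respond with JSON: {{"scores_a": {{...}}, "scores_b": {{...}}, "winner": "A" | "B" | "tie"}}

Conversation:
{conversation}

Output A (gpt-4): {topics_a}

Output B (gpt-4.1-mini): {topics_b}
"""


def judge_sample(judge_llm, conversation, topics_a, topics_b):
    # judge_llm is assumed to be a langchain chat model wrapping the gpt-5.1 judge.
    prompt = JUDGE_PROMPT.format(
        conversation=conversation, topics_a=topics_a, topics_b=topics_b
    )
    return json.loads(judge_llm.invoke(prompt).content)
```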
Files Changed
- `backend/utils/mentor_notifications.py`: swap the gpt-4 raw client for `llm_mini.invoke()`
- `backend/tests/unit/test_mentor_notifications.py`: 7 unit tests (source-level + functional); a sketch of one such functional test follows the list
- `backend/tests/integration/test_mentor_topics_eval.py`: gpt-5.1 judge eval (10 samples + 3 edge cases)
- `backend/test.sh`: register the new test file
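As noted above, a rough sketch of what one of the functional unit tests with a mocked `llm_mini` might look like (the patch target, fixture text, and assertions are assumptions, not the actual tests from this PR):

```python
# Illustrative sketch; the real tests live in backend/tests/unit/test_mentor_notifications.py.
import json
from unittest.mock import MagicMock, patch


@patch("utils.mentor_notifications.llm_mini")  # assumed patch target
def test_extract_topics_parses_json_response(mock_llm_mini):
    # Simulate the langchain client returning a well-formed JSON payload.
    mock_llm_mini.invoke.return_value = MagicMock(
        content=json.dumps({"topics": ["sleep", "exercise"]})
    )

    from utils.mentor_notifications import extract_topics  # assumed import path

    topics = extract_topics("User discussed sleep habits and exercise routines.")

    assert topics == ["sleep", "exercise"]
    mock_llm_mini.invoke.assert_called_once()
```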
Test plan

- Unit tests pass (`backend/test.sh`)

Closes #4671
🤖 Generated with Claude Code