add benchmark.py module by mouayadkarimeh · Pull Request #56 · caretech-owl/gerd

mouayadkarimeh · 2026-02-26T23:12:41Z

Features
Model Comparison: Benchmarks reasoning models vs. non-reasoning models
No-Think Option Support: Integrates with the /no_think parameter to test different model behaviors
RAG Integration: Supports benchmarking with and without Retrieval-Augmented Generation (RAG)
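
The three features above combine into a matrix of benchmark runs (model × no-think × RAG). The following is only a minimal sketch of how such a matrix could be enumerated. `BenchmarkCase`, `build_cases`, and `build_prompt` are hypothetical names that do not come from benchmark.py, and the sketch assumes the `/no_think` switch is passed by appending it to the prompt text, which may differ from how this PR wires it up.

```python
from dataclasses import dataclass
from itertools import product
from typing import List


@dataclass(frozen=True)
class BenchmarkCase:
    """One benchmark configuration (hypothetical structure, not taken from the PR)."""

    model: str
    no_think: bool
    use_rag: bool


def build_cases(models: List[str]) -> List[BenchmarkCase]:
    """Enumerate every model with each no-think / RAG combination."""
    return [
        BenchmarkCase(model, no_think, use_rag)
        for model, no_think, use_rag in product(models, (False, True), (False, True))
    ]


def build_prompt(question: str, case: BenchmarkCase, context: str = "") -> str:
    """Assemble the prompt text for a single case.

    Assumption: retrieved context is prepended when RAG is enabled, and the
    /no_think switch is appended to the end of the prompt.
    """
    parts = []
    if case.use_rag and context:
        parts.append(f"Context:\n{context}")
    parts.append(question)
    prompt = "\n\n".join(parts)
    if case.no_think:
        prompt += " /no_think"
    return prompt
```

Each model then yields four cases, so results with and without reasoning and retrieval can be compared side by side.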

Work in Progress
Answer Comparison: Implementation for comparing generated answers and calculating accuracy percentages
Profile Function: Optimization of the profile() function for more reliable result reporting and output formatting
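
Since the answer comparison is still in progress, here is one possible shape for the accuracy calculation, offered as a sketch rather than the planned implementation. It assumes a reference answer exists for each question and uses normalized exact match as the comparison rule. `normalize` and `accuracy_percent` are hypothetical helpers, and a real benchmark might prefer fuzzy or semantic matching instead.

```python
import string
from typing import Sequence


def normalize(text: str) -> str:
    """Lowercase, strip ASCII punctuation, and collapse whitespace."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def accuracy_percent(generated: Sequence[str], expected: Sequence[str]) -> float:
    """Return the percentage of generated answers that match their reference.

    Assumption: two answers count as equal when their normalized forms are
    identical (exact match after normalization).
    """
    if len(generated) != len(expected):
        raise ValueError("generated and expected must have the same length")
    if not expected:
        return 0.0
    hits = sum(normalize(g) == normalize(e) for g, e in zip(generated, expected))
    return 100.0 * hits / len(expected)
```

For example, comparing `["Paris.", "berlin", "Rome"]` with `["paris", "Berlin", "Madrid"]` counts two of three answers as correct.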

mouayadkarimeh

The latest commit contains both the pre-commit fixes and the increased ruff complexity limit. Beyond the complexity increase, the entire benchmark.py file has been prepared and is ready for review.

mouayadkarimeh added 2 commits February 27, 2026 00:01

feat: add benchmark.py module

b982e63

fix: increase ruff complexity limit for complex functions

83f6160

mouayadkarimeh commented Mar 14, 2026

mouayadkarimeh wants to merge 2 commits into caretech-owl:main from mouayadkarimeh:gerd/benchmark.py

mouayadkarimeh commented Feb 26, 2026

1 participant
