add benchmark.py module #56

Open
mouayadkarimeh wants to merge 2 commits into caretech-owl:main from mouayadkarimeh:gerd/benchmark.py

@mouayadkarimeh
Contributor

Features
Model Comparison: Benchmarks reasoning models against non-reasoning models
No-Think Option Support: Integrates with the /no_think parameter to test different model behaviors
RAG Integration: Supports benchmarking with and without RAG (Retrieval-Augmented Generation)
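
For reviewers, the three features above can be read as a grid of benchmark configurations. The sketch below is illustrative only and is not taken from benchmark.py: the `BenchmarkCase` class, the `build_cases` and `build_prompt` helpers, and the model names are hypothetical, and it assumes the /no_think switch is applied by appending it to the prompt text.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class BenchmarkCase:
    """One benchmark configuration (hypothetical structure, not from the PR)."""

    model: str
    no_think: bool
    use_rag: bool


def build_prompt(question: str, case: BenchmarkCase) -> str:
    """Append the /no_think switch when the case asks for it (assumed mechanism)."""
    return f"{question} /no_think" if case.no_think else question


def build_cases(models: list[str]) -> list[BenchmarkCase]:
    """Enumerate every combination of model, /no_think setting, and RAG setting."""
    return [
        BenchmarkCase(model, no_think, use_rag)
        for model, no_think, use_rag in product(models, (False, True), (False, True))
    ]


if __name__ == "__main__":
    # Placeholder model names; the real module may load these from configuration.
    for case in build_cases(["reasoning-model", "plain-model"]):
        print(case, "->", build_prompt("What is RAG?", case))
```

Two models with two boolean switches give eight cases, so a results table would need one row per case.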

Work in Progress
Answer Comparison: Implementation for comparing generated answers and calculating accuracy percentages
Profile Function: Optimization of the profile() function for more reliable result reporting and output formatting
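
As a starting point for the answer comparison item, one possible approach is exact matching after light normalization. This is a minimal sketch under that assumption, not the implementation planned in this PR; `normalize` and `accuracy_percent` are hypothetical names, and free-form model answers may need a fuzzier comparison.

```python
def normalize(answer: str) -> str:
    """Lowercase the answer and collapse whitespace so trivial differences do not count."""
    return " ".join(answer.lower().split())


def accuracy_percent(generated: list[str], expected: list[str]) -> float:
    """Return the percentage of generated answers that match the expected ones."""
    if len(generated) != len(expected):
        raise ValueError("generated and expected must have the same length")
    if not expected:
        return 0.0  # No questions asked; report 0 rather than divide by zero.
    correct = sum(normalize(g) == normalize(e) for g, e in zip(generated, expected))
    return 100.0 * correct / len(expected)


if __name__ == "__main__":
    generated = ["Paris", " berlin ", "Rome"]
    expected = ["paris", "Berlin", "Madrid"]
    print(f"{accuracy_percent(generated, expected):.1f}%")
```

Exact matching is strict, so it will undercount answers that are correct but phrased differently.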

Contributor Author

@mouayadkarimeh mouayadkarimeh left a comment

The latest commit contains both the pre-commit fixes and the increased ruff complexity limit. Beyond the complexity increase, the entire benchmark.py file has been prepared and is ready for review.
