# Model-Benchmark-Suite

A user-friendly Streamlit UI for running various lm_eval-supported benchmarks on large language models and comparing the results with one another.

## Supported Benchmarks

- gpqa_diamond_zeroshot
- gsm8k
- winogrande
- arc_challenge
- hellaswag
- truthfulqa_mc2
- mmlu

## Quick Start

Clone the repo:

```shell
git clone https://github.com/TeichAI/Model-Benchmark-Suite.git
cd Model-Benchmark-Suite
```

Install the dependencies and start the app:

```shell
pip install -r requirements.txt
streamlit run app.py
```

Streamlit serves the app locally (by default at http://localhost:8501).
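The app compares models across shared benchmarks; the repo's actual comparison logic is not shown here, but a minimal stdlib sketch of a per-task score comparison (hypothetical model names and illustrative numbers, not real results) could look like:

```python
# Hypothetical per-task scores, as the comparison view might tabulate them.
scores = {
    "model-a": {"gsm8k": 0.57, "hellaswag": 0.79, "winogrande": 0.72},
    "model-b": {"gsm8k": 0.63, "hellaswag": 0.81, "winogrande": 0.70},
}

def compare(a: str, b: str, scores: dict) -> dict:
    """Return per-task score deltas (b minus a) for tasks both models ran."""
    shared = scores[a].keys() & scores[b].keys()
    return {task: round(scores[b][task] - scores[a][task], 4)
            for task in sorted(shared)}

deltas = compare("model-a", "model-b", scores)
for task, delta in deltas.items():
    print(f"{task}: {delta:+.4f}")
```

Restricting the comparison to tasks both models actually ran avoids misleading deltas when one model has an incomplete benchmark sweep.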