EvalTuitor

EvalTuitor is a terminal-native evaluation runner for LLM applications. It allows developers to define evaluation suites in TOML, execute them against local or remote OpenAI-compatible models, and browse and compare results inside the terminal.

Technical Features

Performance: Compiled Rust binary with asynchronous parallel execution support.
Suite Definition: Configuration-driven test suites defined using standard TOML.
Interactive Interface: Terminal workspace to inspect test runs, failure outputs, and logs.
Directory Explorer: Integrated directory tree browser to preview files and switch project paths.
Configuration Management: Interactive terminal configuration editor to adjust models, limits, and API endpoints.
Run Comparison: Side-by-side comparison interface to examine output changes between historical runs.
Git Integration: Installer for Git hooks (pre-commit, post-merge, pre-push) that run evaluations automatically.
Report Export: Capability to export detailed run summaries to Markdown files.

Installation

Prerequisites

A Rust toolchain (rustc and cargo) must be installed on your system.

Build Step

Clone the repository and compile the release binary:

cargo build --release

The binary is written to ./target/release/evaltuitor.

Project Initialization

Initialize a new project context in the current directory:

./target/release/evaltuitor --init

This creates:

evaltuitor.toml (Configuration file)
evals/example.toml (Sample evaluation suite)

Command Line Usage

Execute all evaluation suites in the evals/ directory and launch the interface:

./target/release/evaltuitor

Execute a specific suite:

./target/release/evaltuitor evals/suite.toml

Override the default model (given as provider/model):

./target/release/evaltuitor --model openai/gpt-4o

Run evaluations and output results to stdout without launching the interface:

./target/release/evaltuitor --no-tui

Configuration Schema (`evaltuitor.toml`)

Global defaults and provider settings are defined in evaltuitor.toml at the project root:

[defaults]
model = "ollama/llama3.1"
temperature = 0.0
max_tokens = 2048
timeout_secs = 30
parallelism = 4

[providers.ollama]
base_url = "http://localhost:11434"

[providers.openai]
api_key_env = "OPENAI_API_KEY"

[providers.vllm]
base_url = "http://localhost:8000"
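
The model values in this file follow a provider/model pattern, where the provider part matches a [providers.*] table. As a rough illustration of how such a reference could be resolved, here is a minimal sketch. The function name split_model_ref and the rule of splitting at the first slash are assumptions made for this example, not EvalTuitor's confirmed behavior.

```rust
/// Split a model reference such as "ollama/llama3.1" into (provider, model).
/// Hypothetical helper: assumes the first '/' separates the two parts.
fn split_model_ref(model_ref: &str) -> Option<(&str, &str)> {
    match model_ref.split_once('/') {
        // Reject references with an empty provider or an empty model name.
        Some((provider, model)) if !provider.is_empty() && !model.is_empty() => {
            Some((provider, model))
        }
        _ => None,
    }
}

fn main() {
    for reference in ["ollama/llama3.1", "openai/gpt-4o", "gpt-4o"] {
        match split_model_ref(reference) {
            Some((provider, model)) => {
                println!("{reference}: provider={provider}, model={model}")
            }
            None => println!("{reference}: missing provider prefix"),
        }
    }
}
```

Under this assumption, the provider part would select the matching [providers.*] table (for example its base_url or api_key_env), and the rest would be passed to the endpoint as the model name.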

Test Suite Schema

Suite files should be stored under the evals/ directory:

[suite]
name = "Summarization Suite"
description = "Verifies LLM summary behaviors"

[config]
model = "openai/gpt-4o"
temperature = 0.2

[[tests]]
id = "summary-length-check"
prompt = "Summarize this: {{input}}"
input = "Evaluating AI systems requires structured testing..."
assert.type = "contains-all"
assert.values = ["AI", "testing"]
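
The prompt field above contains an {{input}} placeholder that is filled from the test's input field. The sketch below shows one plausible way to do that substitution. The helper render_prompt is hypothetical, and the real renderer may handle escaping or additional variables differently.

```rust
/// Replace every `{{input}}` placeholder in a prompt template.
/// Hypothetical helper; EvalTuitor's actual renderer may differ.
fn render_prompt(template: &str, input: &str) -> String {
    template.replace("{{input}}", input)
}

fn main() {
    let prompt = render_prompt(
        "Summarize this: {{input}}",
        "Evaluating AI systems requires structured testing...",
    );
    // The rendered prompt is what would be sent to the model.
    println!("{prompt}");
}
```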

Supported Assertion Types

contains-all / contains-any / contains-none (substring checks)
exact-match (string equivalence)
regex (regular expression verification)
llm-judge (semantic evaluation scoring using a rubric prompt)
max-length / min-length (character length boundaries)
json-schema (structured output JSON validation)
custom (arbitrary external shell command execution)
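
To make the simpler assertion types concrete, the following sketch models a few of them as a Rust enum with a check method. It is an illustration under stated assumptions: the type and variant names are invented for this example, substring checks are assumed to be case-sensitive, and lengths are assumed to be counted in characters rather than bytes. The regex, llm-judge, json-schema, and custom types are left out because they need external crates or processes.

```rust
/// A subset of the assertion types, modeled as an enum.
/// Names and semantics are illustrative assumptions, not EvalTuitor's API.
#[allow(dead_code)]
enum Assertion {
    ContainsAll(Vec<String>),
    ContainsAny(Vec<String>),
    ContainsNone(Vec<String>),
    ExactMatch(String),
    MaxLength(usize),
    MinLength(usize),
}

impl Assertion {
    /// Returns true when `output` satisfies the assertion.
    fn check(&self, output: &str) -> bool {
        match self {
            Assertion::ContainsAll(vals) => vals.iter().all(|v| output.contains(v.as_str())),
            Assertion::ContainsAny(vals) => vals.iter().any(|v| output.contains(v.as_str())),
            Assertion::ContainsNone(vals) => !vals.iter().any(|v| output.contains(v.as_str())),
            Assertion::ExactMatch(expected) => output == expected.as_str(),
            // Lengths are counted in Unicode scalar values, not bytes.
            Assertion::MaxLength(n) => output.chars().count() <= *n,
            Assertion::MinLength(n) => output.chars().count() >= *n,
        }
    }
}

fn main() {
    // Mirrors the contains-all example from the suite schema.
    let output = "Structured testing keeps AI systems honest.";
    let assertion = Assertion::ContainsAll(vec!["AI".to_string(), "testing".to_string()]);
    println!("contains-all passed: {}", assertion.check(output));
}
```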

Keyboard Reference

Standard Interface

j / k or Arrow keys: Navigate suites and test cases.
Tab: Cycle focus between Suites list, Tests list, and Details pane.
f: Toggle the display of failed tests only.
/: Filter test cases by ID or output string.
s: Toggle the configuration editor.
o: Toggle the project and directory explorer.
C: Open the run comparison list.
E: Export current view details to Markdown.
R: Re-execute failed test cases.
?: Toggle the help overlay.
q / Esc: Close overlays or exit the application.

Directory Explorer & Project Opener

j / k or Arrow keys: Move selection up and down.
Space / Enter: Expand or collapse the selected directory tree node.
a: Open the selected folder (or its parent folder) as the active project.
Backspace / u: Navigate to the parent directory.
[ / ]: Scroll up and down in the preview content pane.
q / Esc: Exit the project explorer.

Git Hooks Setup

Install, list, and remove Git pre-commit, post-merge, and pre-push hooks from the command line:

# Install hooks in the current repository
./target/release/evaltuitor --install-hooks

# List installed hooks
./target/release/evaltuitor --list-hooks

# Uninstall hooks
./target/release/evaltuitor --uninstall-hooks
