Open-source test harness for voice agent workflows.
- Simulate conversations — LLM-powered users talk to your agent, LLM judges score the results
- Test any platform — Retell, VAPI, LiveKit, Bland, Telnyx, or custom agents
- Convert between formats — Import from one platform, export to another via a unified graph IR
- Diagnose failures — Auto-fix broken prompts with an LLM-powered repair loop
- Run anywhere — CLI, Web UI, REST API, CI/CD
uv tool install voicetest

Or add it to a project (uv run voicetest to run):

uv add voicetest

Or with pip:

pip install voicetest

Try voicetest with a sample healthcare receptionist agent and 8 test cases:
# Set up an API key (free, no credit card at https://console.groq.com)
export GROQ_API_KEY=gsk_...
# Load demo and start web UI
voicetest demo --serve

Tip: If you have Claude Code installed, skip API key setup and use claudecode/sonnet as your model. See Claude Code Passthrough.

voicetest serve

The web UI at http://localhost:8000 provides agent import, graph visualization, test execution with real-time streaming transcripts, run history, diagnosis, and more.
Import from any supported format, convert through the unified AgentGraph, and export to any other:
Retell CF ─────┐                  ┌───▶ Retell LLM
               │                  │
Retell LLM ────┼                  ├───▶ Retell CF
               │                  │
VAPI ──────────┼                  ├───▶ VAPI
               │                  │
Bland ─────────┼───▶ AgentGraph ──┼───▶ Bland
               │                  │
Telnyx ────────┤                  ├───▶ Telnyx
               │                  │
LiveKit ───────┤                  ├───▶ LiveKit
               │                  │
XLSForm ───────┤                  ├───▶ Mermaid
               │                  │
Custom ────────┘                  └───▶ Voicetest JSON
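
The hub-and-spoke layout in the diagram can be illustrated with a minimal sketch: each importer produces one shared graph type and each exporter consumes it, so adding a format means writing one function rather than one converter per pair. Everything below is hypothetical and for illustration only. The `Graph`, `Node`, and `Transition` classes, the `import_simple` input layout, and `export_mermaid` are assumptions, not voicetest's actual AgentGraph or any real platform format.

```python
from dataclasses import dataclass, field


@dataclass
class Transition:
    target: str     # id of the destination node
    condition: str  # natural-language condition that triggers the move


@dataclass
class Node:
    id: str
    prompt: str
    transitions: list[Transition] = field(default_factory=list)


@dataclass
class Graph:
    """Hypothetical stand-in for a unified graph IR."""
    entry: str
    nodes: dict[str, Node]


def import_simple(data: dict) -> Graph:
    """Importer for a made-up JSON layout (not a real platform format)."""
    nodes = {}
    for raw in data["states"]:
        edges = [Transition(e["to"], e["when"]) for e in raw.get("edges", [])]
        nodes[raw["name"]] = Node(raw["name"], raw["prompt"], edges)
    return Graph(entry=data["start"], nodes=nodes)


def export_mermaid(graph: Graph) -> str:
    """Exporter that renders the graph as a Mermaid flowchart."""
    lines = ["flowchart TD"]
    for node in graph.nodes.values():
        for t in node.transitions:
            lines.append(f'    {node.id} -->|"{t.condition}"| {t.target}')
    return "\n".join(lines)


# Illustrative input in the made-up layout.
source = {
    "start": "greeting",
    "states": [
        {
            "name": "greeting",
            "prompt": "Greet the caller.",
            "edges": [{"to": "booking", "when": "caller wants an appointment"}],
        },
        {"name": "booking", "prompt": "Collect a date and time."},
    ],
}

print(export_mermaid(import_simple(source)))
```

With this shape, supporting a new source format only requires another function that returns a `Graph`; the existing exporters work unchanged.
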
| Platform | Import | Push | Sync | API Key Env Var |
|---|---|---|---|---|
| Retell | ✓ | ✓ | ✓ | RETELL_API_KEY |
| VAPI | ✓ | ✓ | ✓ | VAPI_API_KEY |
| Bland | ✓ | ✓ | | BLAND_API_KEY |
| Telnyx | ✓ | ✓ | ✓ | TELNYX_API_KEY |
| LiveKit | ✓ | ✓ | ✓ | LIVEKIT_API_KEY + LIVEKIT_API_SECRET |
Run voice agent tests in GitHub Actions to catch regressions before production:
name: Voice Agent Tests
on:
  push:
    paths: ["agents/**"]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v5
      - run: uv tool install voicetest
      - run: voicetest run --agent agents/receptionist.json --tests agents/tests.json --all
        env:
          GROQ_API_KEY: ${{ secrets.GROQ_API_KEY }}

Settings are stored in .voicetest/settings.toml:
[models]
agent = "groq/llama-3.1-8b-instant"
simulator = "groq/llama-3.1-8b-instant"
judge = "groq/llama-3.1-8b-instant"
[run]
max_turns = 20
audio_eval = false
streaming = false

Any LiteLLM-compatible model works, including OpenAI, Anthropic, Google, Ollama, and more. See the full configuration reference.
Full documentation is at voicetest.dev/docs.
| Topic | Description |
|---|---|
| Getting Started | Install, demo, first test walkthrough |
| Core Concepts | Agent graphs, node types, test cases |
| CLI Reference | All commands and options |
| Features | Format conversion, diagnosis, audio eval, snippets, and more |
| Configuration | Models, settings, Claude Code, platform credentials |
| Architecture | Engine internals, DI, storage |
| Development | Contributing, Docker setup, code quality |
Questions, feedback, or partnerships: hello@voicetest.dev
Apache 2.0
