LangWatch (langwatch)

LangWatch is an open-source LLM observability, evaluation, and AI agent testing platform. OpenTelemetry-native tracing across LangChain, LangGraph, DSPy, OpenAI Agents, LiteLLM, Pydantic AI, CrewAI, AWS Bedrock, and more — paired with real-time and batch evaluators, prompt versioning, multi-turn agent simulations (open-source scenario framework with User Simulator and Judge Agents), datasets, an OpenAI/Anthropic-compatible AI Gateway with virtual keys and budgets, and a published REST API. Apache-2.0 core (with ee/ enterprise modules under commercial license); MIT-licensed SDKs; deployable as LangWatch Cloud or self-hosted via Docker Compose, Helm, Kind, or full Kubernetes.

URL: Visit APIs.json

Run: Capabilities Using Naftiko

Timestamps

Created: 2026-05-25
Modified: 2026-05-25

Plans

Plan	Currency	Price	Included Events / Month	Retention	Notes
Developer	EUR / USD	Free	50,000	14 days	2 users, 3 scenarios, community support
Growth	EUR	59 / core seat / month	200,000 (+ EUR 0.0005 / event)	30 days (+ EUR 3 / GB)	Unlimited lite-users, private support
Enterprise / Regulated	USD	Custom	Contract	Custom	On-prem, hybrid, SSO, SOC 2 / ISO 27001
Self-Hosted OSS	USD	Free (Apache-2.0)	Unlimited	Unlimited	Docker Compose / Helm / Kind

See plans/langwatch-plans-pricing.yml, rate-limits/langwatch-rate-limits.yml, and finops/langwatch-finops.yml.

Canonical OpenAPI

The full LangWatch REST surface lives in a single OpenAPI 3.1 document maintained inside the langwatch/langwatch monorepo. A mirrored copy is stored here at openapi/langwatch-openapi.json and is referenced by every API entry below.

APIs

LangWatch Traces API

Search, retrieve, and share LLM application traces ingested via OpenTelemetry.

Human URL: https://langwatch.ai/docs/api-reference/traces

LangWatch Evaluators API

Configure and manage scorer evaluators — RAGAS, safety, PII, semantic similarity, LLM-as-Judge variants.

LangWatch Monitors API

Online monitors that automatically score incoming production traces.

LangWatch Datasets API

Manage evaluation, regression, and fine-tuning datasets and their records.

LangWatch Prompts API

Version, tag, sync, and restore prompts across projects with feature-flag-style deployment.

LangWatch Scenarios API

Define multi-turn agent test scenarios used by the open-source scenario framework.

LangWatch Simulation Runs API

Query and retrieve completed agent simulation runs and batches.

LangWatch Suites API

Compose and execute batch test suites combining scenarios, datasets, and evaluators.

LangWatch Experiments API

Trigger and inspect batch experiment runs (including DSPy-driven optimization runs).

LangWatch Annotations API

Collaborative annotation and labeling workflows over traces.

LangWatch Analytics API

Time-series analytics over traces, tokens, cost, latency, and evaluator scores.

LangWatch Dashboards API

Create, reorder, and manage dashboards and their composed graphs.

LangWatch Projects API

Provision and manage projects (workspaces) — the top-level isolation boundary.

LangWatch API Keys API

Create, list, and revoke project API keys used by SDKs and automation.

LangWatch Secrets API

Encrypted credential storage for evaluator and integration secrets.

LangWatch Model Providers API

Configure model-provider credentials and per-project model defaults.

LangWatch AI Gateway API

OpenAI/Anthropic-compatible governance proxy — virtual keys, provider bindings, budgets, semantic cache rules.

LangWatch Workflows API

Compose, version, and run optimization-studio workflows.

LangWatch Agents API

Define and update agent records used by simulations and scenarios.

LangWatch Triggers API

Event-driven triggers that fire on trace conditions and monitor scores.

SDKs, MCP, and Companion Repos

Python SDK — pip install langwatch (instruments OpenAI, Azure, LiteLLM, DSPy, LangChain, plus any OTel client)
TypeScript SDK — npm i langwatch
MCP Server — @langwatch/mcp-server exposes Observability, Prompts, Datasets, Scenarios, and Evaluator tools to Claude / Cursor / other MCP clients
scenario — github.com/langwatch/scenario — open-source multi-turn agent testing with User Simulator and Judge Agents
better-agents — github.com/langwatch/better-agents — standards for building agents
langevals — github.com/langwatch/langevals — evaluator aggregation
cookbooks — github.com/langwatch/cookbooks — Jupyter example notebooks

Self-Hosting

Docker Compose, Helm chart, Kind, or full Kubernetes
Data layer: PostgreSQL + Redis + ClickHouse + OpenSearch
Apache-2.0 core; ee/ modules require commercial license

Common Properties

See apis.yml for the full common block — Portal, Documentation, API Reference, Self-Hosting docs, Pricing, OpenAPI, Application (Cloud Dashboard), SignUp, ChangeLog, Discord, LinkedIn, Twitter, YouTube, and the consolidated Features list.

Maintainer

Kin Lane — kin@apievangelist.com — @apievangelist

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
blogs		blogs
finops		finops
json-ld		json-ld
openapi		openapi
plans		plans
rate-limits		rate-limits
README.md		README.md
apis.yml		apis.yml

Folders and files

Latest commit

History

Repository files navigation

LangWatch (langwatch)

Tags

Timestamps

Plans

Canonical OpenAPI

APIs

LangWatch Traces API

LangWatch Evaluators API

LangWatch Monitors API

LangWatch Datasets API

LangWatch Prompts API

LangWatch Scenarios API

LangWatch Simulation Runs API

LangWatch Suites API

LangWatch Experiments API

LangWatch Annotations API

LangWatch Analytics API

LangWatch Dashboards API

LangWatch Projects API

LangWatch API Keys API

LangWatch Secrets API

LangWatch Model Providers API

LangWatch AI Gateway API

LangWatch Workflows API

LangWatch Agents API

LangWatch Triggers API

SDKs, MCP, and Companion Repos

Self-Hosting

Common Properties

Maintainer

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Packages