Standards for building agents, better
Updated Dec 31, 2025 - TypeScript
Agentic testing for agentic codebases
Ship agents you can audit.
Agent testing automation 🤖 by simulating users 👥 and agents 🤖 with judge ⚖️ (langwatch-scenario)
Qualitative benchmark suite for evaluating AI coding agents and orchestration paradigms on realistic, complex development tasks
A Multi-Agent System for Cross-Checking Phishing URLs.
The Logic Firewall for AI Agents. Prevent infinite loops, token bombing, critical vulnerabilities and more before deployment.
🧮 Solve mathematical problems and write proofs in natural language using this easy-to-use reasoning harness. Enhance your problem-solving skills effortlessly.