AI Development Workflow

A multi-agent AI development pipeline for Claude Code. Catches bugs that "all tests passing" does not.

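Built from research on 20+ academic papers and 50+ open-source repos, and validated on a 12-module production integration. Mutation testing showed that our TDD tests alone scored as low as 52% on the worst module (80% on average), meaning nearly half of the faults injected into that module went undetected. After adopting this pipeline, the worst module scores 95% and the average is 91%.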

What This Is

A 7-step pipeline (steps 0 to 6, followed by a commit) that every code module goes through:

0. SPECIFY       → AI drafts acceptance criteria + edge cases + constraints
1. architect     → Design doc + threat model (constrained by spec)
2. gatekeeper    → Validate design + spec completeness
3. builder       → TDD implementation (tests derived from spec)
4. VERIFY        → Quality gates → attacker + reviewer → fix → Stryker → fix (loop)
5. cross-check   → Different model family reviews (Gemini CLI)
6. gatekeeper    → Final go/no-go with completion checklist
7. commit
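
The ordering above, including the fix loop inside VERIFY, can be sketched as a small driver. This is only an illustration under stated assumptions, not the implementation shipped in this repo: `run_pipeline`, the `run_step` and `verify_once` callbacks, the parenthesised gatekeeper labels, and the `max_verify_rounds` cap are all hypothetical names introduced here.

```python
from typing import Callable, List

# Step names follow the list above. The split into "before" and "after"
# VERIFY, and the labels in parentheses, are hypothetical conveniences.
BEFORE_VERIFY = ["SPECIFY", "architect", "gatekeeper (design)", "builder"]
AFTER_VERIFY = ["cross-check", "gatekeeper (final)", "commit"]


def run_pipeline(
    run_step: Callable[[str], bool],
    verify_once: Callable[[], bool],
    max_verify_rounds: int = 3,
) -> List[str]:
    """Run the steps in order and return a log of what ran.

    run_step(name) returns True if the step passed.
    verify_once() returns True when a VERIFY round finds nothing left to fix.
    The pipeline stops at the first failed step, or when VERIFY does not
    come back clean within max_verify_rounds (an assumed cap).
    """
    log: List[str] = []

    for name in BEFORE_VERIFY:
        log.append(name)
        if not run_step(name):
            log.append(f"STOP at {name}")
            return log

    # VERIFY repeats: quality gates, attacker + reviewer, fix, Stryker, fix.
    clean = False
    for round_no in range(1, max_verify_rounds + 1):
        log.append(f"VERIFY round {round_no}")
        if verify_once():
            clean = True
            break
    if not clean:
        log.append("STOP at VERIFY")
        return log

    for name in AFTER_VERIFY:
        log.append(name)
        if not run_step(name):
            log.append(f"STOP at {name}")
            return log

    return log
```

The point of the sketch is that nothing after VERIFY runs until a round comes back clean, and that either gatekeeper step can halt the module before commit.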

5 specialized Claude Code agents, each with a specific role and specific tools:

| Agent | Role | Model |
|---|---|---|
| `architect` | Design + threat model. Read-only. | opus |
| `builder` | TDD implementation. Fingerprints neighbors. | sonnet |
| `attacker` | Adversarial chaos testing. Tries to break it. | opus |
| `reviewer` | Pattern compliance via neighbor-diff. Read-only. | sonnet |
| `gatekeeper` | Go/no-go decisions. Read-only. | opus |
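
The roster above can also be written down as data, which makes it easy to check which agents are allowed to modify code. This is a hypothetical sketch: the `Agent` dataclass and `names_where` helper are not part of the templates, and `builder` and `attacker` are assumed to have write access only because the table does not mark them read-only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Agent:
    name: str
    role: str
    model: str
    read_only: bool


# Names, roles, and models are taken from the table above.
AGENTS = [
    Agent("architect", "Design + threat model", "opus", read_only=True),
    # Assumption: not marked read-only in the table, so treated as writable.
    Agent("builder", "TDD implementation", "sonnet", read_only=False),
    # Assumption: same reasoning as builder.
    Agent("attacker", "Adversarial chaos testing", "opus", read_only=False),
    Agent("reviewer", "Pattern compliance via neighbor-diff", "sonnet", read_only=True),
    Agent("gatekeeper", "Go/no-go decisions", "opus", read_only=True),
]


def names_where(
    agents: List[Agent],
    model: Optional[str] = None,
    read_only: Optional[bool] = None,
) -> List[str]:
    """Return agent names matching the given filters (None means 'any')."""
    return [
        a.name
        for a in agents
        if (model is None or a.model == model)
        and (read_only is None or a.read_only == read_only)
    ]
```

For example, `names_where(AGENTS, read_only=False)` lists the agents that may change files, and `names_where(AGENTS, model="opus")` lists the ones running on the larger model.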

Quick Start

See SETUP-GUIDE.md for step-by-step installation.

Why This Exists

See presentation.html for the full evidence — research findings, mutation testing results, A/B experiments, and the specific incidents that justified each pipeline step.

What's In This Repo

├── README.md              # You're here
├── SETUP-GUIDE.md         # Engineer setup (15 min)
├── presentation.html      # Evidence + rationale (CTO pitch)
├── LEARNINGS.md           # Full research narrative
├── templates/
│   ├── CLAUDE.md          # Pipeline template — adapt for your project
│   ├── agents/            # 5 agent definitions — genericized
│   │   ├── architect.md
│   │   ├── builder.md
│   │   ├── attacker.md
│   │   ├── reviewer.md
│   │   └── gatekeeper.md
│   └── prompts/
│       └── cross-validator.md
├── stryker/
│   ├── stryker.config.mjs # Reference Stryker config
│   └── run-stryker.sh     # Helper script
└── research/              # Research artifacts
    └── PRESENTATION.md    # Markdown source of presentation

Adapting to Your Project

Templates have [ADAPT] markers where you plug in project-specific details. The pipeline structure stays the same — only the domain knowledge changes.

See SETUP-GUIDE.md for details on what to adapt vs keep as-is.
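
Before running the pipeline on a real project, it can help to confirm that no `[ADAPT]` marker was left unfilled. The following is a minimal sketch, assuming the templates are Markdown files under a single directory; `find_adapt_markers` is a hypothetical helper, not a script included in this repo.

```python
from pathlib import Path
from typing import List, Tuple

MARKER = "[ADAPT]"


def find_adapt_markers(root: Path) -> List[Tuple[str, int, str]]:
    """Return (relative path, line number, line text) for every line
    under root/**/*.md that still contains the [ADAPT] marker."""
    hits: List[Tuple[str, int, str]] = []
    for path in sorted(root.rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        for line_no, line in enumerate(text.splitlines(), start=1):
            if MARKER in line:
                rel = path.relative_to(root).as_posix()
                hits.append((rel, line_no, line.strip()))
    return hits
```

Calling `find_adapt_markers(Path("templates"))` on your adapted copy should return an empty list once every placeholder has been replaced.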

Evidence

| Metric | Before | After |
|---|---|---|
| Mutation score (worst module) | 52% | 95% |
| Mutation score (average) | 80% | 91% |
| Bugs caught by reviewer | 0 | 1 CRITICAL (rate limiter misuse) |
| Bugs caught by Stryker | 0 | 4 modules below 50% exposed |

All numbers from our own codebase, not benchmarks.
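
For readers new to the metric, a mutation score is the share of injected mutants that the test suite detects. The sketch below uses the simplified definition of killed mutants divided by total mutants; Stryker's own report distinguishes further mutant states, which are ignored here. The function names are hypothetical, and the counts in the usage note are invented to show the arithmetic, not taken from our codebase.

```python
from typing import Sequence


def mutation_score(killed: int, total: int) -> float:
    """Percentage of mutants detected by the tests (simplified definition)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= killed <= total:
        raise ValueError("killed must be between 0 and total")
    return 100.0 * killed / total


def average_score(scores: Sequence[float]) -> float:
    """Plain arithmetic mean of per-module scores."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(scores) / len(scores)
```

With made-up counts, `mutation_score(52, 100)` gives 52.0, and `average_score([52, 90, 98])` gives 80.0. The second call shows why the table reports the worst module separately: a respectable average can hide a module where nearly half of the mutants survive.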
