storehubai/ai-dev-workflow

AI Development Workflow

A multi-agent AI development pipeline for Claude Code. Catches bugs that "all tests passing" does not.

Built from research on 20+ academic papers and 50+ open-source repos, validated on a 12-module production integration. Mutation testing showed that our TDD tests caught only 52% of potential bugs in the worst module (80% on average). After implementing this pipeline, the scores rose to 95% for the worst module and 91% on average.

What This Is

A 7-step pipeline that every code module goes through:

0. SPECIFY       → AI drafts acceptance criteria + edge cases + constraints
1. architect     → Design doc + threat model (constrained by spec)
2. gatekeeper    → Validate design + spec completeness
3. builder       → TDD implementation (tests derived from spec)
4. VERIFY        → Quality gates → attacker + reviewer → fix → Stryker → fix (loop)
5. cross-check   → Different model family reviews (Gemini CLI)
6. gatekeeper    → Final go/no-go with completion checklist
7. commit
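
The control flow of the steps above can be sketched in a few lines of Python. This is only an illustration, not the repo's implementation: the real pipeline is driven by Claude Code agents. The step names come from the listing, while `run_pipeline`, `run_step`, `max_verify_rounds`, and the "(design)" / "(final)" labels on the two gatekeeper steps are hypothetical names introduced here.

```python
from typing import Callable

# Step names follow the pipeline listing; the "(design)" / "(final)" suffixes
# are added here only to tell the two gatekeeper steps apart.
PIPELINE = [
    "SPECIFY", "architect", "gatekeeper (design)", "builder",
    "VERIFY", "cross-check", "gatekeeper (final)", "commit",
]


def run_pipeline(run_step: Callable[[str], bool], max_verify_rounds: int = 3) -> list[str]:
    """Run the steps in order and return a log of what was executed.

    run_step(name) is a hypothetical callback that returns True when the
    step passes. VERIFY is retried (the "fix" loop) up to max_verify_rounds
    times; a failure anywhere else stops the run before commit.
    """
    log: list[str] = []
    for name in PIPELINE:
        attempts = max_verify_rounds if name == "VERIFY" else 1
        for _ in range(attempts):
            log.append(name)
            if run_step(name):
                break
        else:
            # No attempt passed: stop here so nothing gets committed.
            log.append(f"STOPPED at {name}")
            return log
    return log
```

The point of the sketch is the ordering: a module only reaches `commit` if every earlier gate passed, and VERIFY is the one step that is expected to loop.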

5 specialized Claude Code agents, each with a specific role and specific tools:

| Agent | Role | Model |
|---|---|---|
| architect | Design + threat model. Read-only. | opus |
| builder | TDD implementation. Fingerprints neighbors. | sonnet |
| attacker | Adversarial chaos testing. Tries to break it. | opus |
| reviewer | Pattern compliance via neighbor-diff. Read-only. | sonnet |
| gatekeeper | Go/no-go decisions. Read-only. | opus |
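
The "Read-only" constraint in the roster above is easy to check mechanically. The sketch below is an assumption-laden illustration: the roster and models are copied from the table, but `WRITE_TOOLS`, `violations`, and the tool names passed to it are hypothetical and should be replaced with whatever your agent definitions actually grant.

```python
# Roster copied from the table; read_only mirrors the "Read-only" note
# in each role description.
AGENTS = {
    "architect":  {"model": "opus",   "read_only": True},
    "builder":    {"model": "sonnet", "read_only": False},
    "attacker":   {"model": "opus",   "read_only": False},
    "reviewer":   {"model": "sonnet", "read_only": True},
    "gatekeeper": {"model": "opus",   "read_only": True},
}

# Assumption: tool names that can modify the working tree. Adjust to your setup.
WRITE_TOOLS = {"Write", "Edit", "Bash"}


def violations(tools_by_agent: dict[str, set[str]]) -> list[str]:
    """Return the read-only agents that were granted a write-capable tool."""
    return sorted(
        name
        for name, tools in tools_by_agent.items()
        if AGENTS[name]["read_only"] and tools & WRITE_TOOLS
    )
```

For example, `violations({"reviewer": {"Read", "Edit"}, "builder": {"Write"}})` flags only `reviewer`, since `builder` is allowed to write.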

Quick Start

See SETUP-GUIDE.md for step-by-step installation.

Why This Exists

See presentation.html for the full evidence — research findings, mutation testing results, A/B experiments, and the specific incidents that justified each pipeline step.

What's In This Repo

├── README.md              # You're here
├── SETUP-GUIDE.md         # Engineer setup (15 min)
├── presentation.html      # Evidence + rationale (CTO pitch)
├── LEARNINGS.md           # Full research narrative
├── templates/
│   ├── CLAUDE.md          # Pipeline template — adapt for your project
│   ├── agents/            # 5 agent definitions — genericized
│   │   ├── architect.md
│   │   ├── builder.md
│   │   ├── attacker.md
│   │   ├── reviewer.md
│   │   └── gatekeeper.md
│   └── prompts/
│       └── cross-validator.md
├── stryker/
│   ├── stryker.config.mjs # Reference Stryker config
│   └── run-stryker.sh     # Helper script
└── research/              # Research artifacts
    └── PRESENTATION.md    # Markdown source of presentation

Adapting to Your Project

Templates have [ADAPT] markers where you plug in project-specific details. The pipeline structure stays the same — only the domain knowledge changes.

See SETUP-GUIDE.md for details on what to adapt vs keep as-is.
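
After adapting the templates, it can help to confirm that no placeholders were missed. The following is a minimal sketch, assuming the markers appear as the literal text `[ADAPT]` and that the adapted files are Markdown; `unresolved_markers` and `scan` are hypothetical helpers, not part of this repo.

```python
from pathlib import Path


def unresolved_markers(text: str) -> list[int]:
    """Return 1-based line numbers that still contain an [ADAPT] marker."""
    return [
        lineno
        for lineno, line in enumerate(text.splitlines(), start=1)
        if "[ADAPT]" in line
    ]


def scan(root: str) -> dict[str, list[int]]:
    """Map each .md file under root to the lines that still need adapting."""
    hits: dict[str, list[int]] = {}
    for path in sorted(Path(root).rglob("*.md")):
        lines = unresolved_markers(path.read_text(encoding="utf-8"))
        if lines:
            hits[str(path)] = lines
    return hits
```

Calling `scan("templates")` from the repo root would list the files and line numbers that still carry a marker; an empty result means every placeholder has been filled in.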

Evidence

| Metric | Before | After |
|---|---|---|
| Mutation score (worst module) | 52% | 95% |
| Mutation score (average) | 80% | 91% |
| Bugs caught by reviewer | 0 | 1 CRITICAL (rate limiter misuse) |
| Bugs caught by Stryker | 0 | 4 modules below 50% exposed |

All numbers from our own codebase, not benchmarks.
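
For readers new to the metric, a mutation score is the share of injected code changes (mutants) that the test suite detects. The sketch below follows the definition Stryker uses as we understand it (detected = killed + timed out; valid = detected + survived + not covered). The function name and the sample counts are illustrative only and are not taken from this codebase.

```python
def mutation_score(killed: int, timeout: int, survived: int, no_coverage: int) -> float:
    """Percentage of valid mutants that the test suite detected."""
    detected = killed + timeout
    valid = detected + survived + no_coverage
    if valid == 0:
        raise ValueError("no valid mutants to score")
    return 100.0 * detected / valid


# Illustrative counts only: 52 of 100 valid mutants detected.
print(mutation_score(killed=52, timeout=0, survived=40, no_coverage=8))  # 52.0
```

A module can have every test passing and still score low here, because surviving mutants mean the tests ran the code without asserting on its behavior.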
