Cadence

Scans Git repositories to detect suspicious commits that may have been generated by AI.

Status: Ready to use | Tests: 70+ passing | Go: 1.23.0

Quick Start

Install

git clone https://github.com/codemeapixel/cadence.git
cd cadence
go build ./cmd/cadence

Analyze a Repository

# Generate default config
./cadence config > cadence.yaml

# Scan a repo for AI-generated commits
./cadence analyze /path/to/repo -o report.txt --config cadence.yaml

The report lists each flagged commit with its confidence score and the reasons it was flagged.

Usage

Analyze a Repository

# Generate default config
./cadence config > cadence.yml

# Scan a repo (auto-loads cadence.yml if in current directory)
./cadence analyze /path/to/repo -o report.txt

# Or specify config explicitly
./cadence analyze /path/to/repo -o report.txt --config cadence.yml

# With custom thresholds (overrides config)
./cadence analyze /path/to/repo \
  -o report.json \
  --suspicious-additions 500 \
  --max-additions-pm 100

# Analyze specific branch
./cadence analyze /path/to/repo \
  -o report.json \
  --branch main

# Exclude certain files (node_modules, lock files, etc)
./cadence analyze /path/to/repo \
  -o report.json \
  --exclude-files "*.min.js,package-lock.json"

Note: cadence.yml in the current directory is automatically loaded if no --config flag is specified.
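The --exclude-files value is a comma-separated list of glob patterns. How Cadence applies them internally is not shown in this README, so the sketch below makes an assumption: each pattern is tested against the file's base name with Go's path.Match. The function name isExcluded is hypothetical.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// isExcluded reports whether file matches any pattern in a comma-separated
// list such as "*.min.js,package-lock.json". Matching against the base name
// is an assumption, not confirmed Cadence behavior.
func isExcluded(file, patterns string) bool {
	base := path.Base(file)
	for _, p := range strings.Split(patterns, ",") {
		p = strings.TrimSpace(p)
		if p == "" {
			continue
		}
		// path.Match returns an error only for malformed patterns; skip those.
		if ok, err := path.Match(p, base); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	patterns := "*.min.js,package-lock.json"
	files := []string{"web/app.min.js", "package-lock.json", "cmd/cadence/main.go"}
	for _, f := range files {
		fmt.Printf("%-22s excluded=%v\n", f, isExcluded(f, patterns))
	}
}
```

Under this assumption, a pattern without a slash matches the file in any directory, which is usually what you want for lock files and minified bundles.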

Output Example (Text)

SUSPICIOUS COMMITS
Found 1 suspicious commit(s):

[1] Commit: a1b2c3d4
    Author:     John Doe <john@example.com>
    Date:       2024-01-27T10:30:00Z
    Confidence: 66.7%
    Additions:  1500 lines / 2000 total
    Deletions:  1200 lines / 1500 total
    Files:      45 files changed
    Time Delta: 0.50 minutes
    Velocity:   3000 additions/min | 2400 deletions/min
    
    Reasons:
    - Large commit: 1500 additions (threshold: 500)
    - Fast velocity: 3000 additions/min (threshold: 100)

Output Example (JSON)

{
  "suspicious_commits": [
    {
      "hash": "a1b2c3d4...",
      "author": "John Doe",
      "timestamp": "2024-01-27T10:30:00Z",
      "confidence_score": 0.667,
      "additions_filtered": 1500,
      "deletions_filtered": 1200,
      "addition_velocity_per_min": 3000.0,
      "reasons": [
        "Large commit: 1500 additions (threshold: 500)",
        "Fast velocity: 3000 additions/min (threshold: 100)"
      ]
    }
  ]
}
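A JSON report in the shape above can be consumed from Go with encoding/json. This is a minimal sketch: the struct and function names (Commit, Report, flagged) are hypothetical, and only the fields shown in the example are mapped.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Commit mirrors the fields shown in the JSON example above.
type Commit struct {
	Hash       string   `json:"hash"`
	Author     string   `json:"author"`
	Timestamp  string   `json:"timestamp"`
	Confidence float64  `json:"confidence_score"`
	Additions  int      `json:"additions_filtered"`
	Deletions  int      `json:"deletions_filtered"`
	Velocity   float64  `json:"addition_velocity_per_min"`
	Reasons    []string `json:"reasons"`
}

// Report is the top-level object of the JSON report.
type Report struct {
	SuspiciousCommits []Commit `json:"suspicious_commits"`
}

// flagged parses a report and returns the commits whose confidence
// score is at or above minConfidence.
func flagged(data []byte, minConfidence float64) ([]Commit, error) {
	var r Report
	if err := json.Unmarshal(data, &r); err != nil {
		return nil, err
	}
	var out []Commit
	for _, c := range r.SuspiciousCommits {
		if c.Confidence >= minConfidence {
			out = append(out, c)
		}
	}
	return out, nil
}

// sample is the JSON example from this README.
const sample = `{
  "suspicious_commits": [
    {
      "hash": "a1b2c3d4...",
      "author": "John Doe",
      "timestamp": "2024-01-27T10:30:00Z",
      "confidence_score": 0.667,
      "additions_filtered": 1500,
      "deletions_filtered": 1200,
      "addition_velocity_per_min": 3000.0,
      "reasons": [
        "Large commit: 1500 additions (threshold: 500)",
        "Fast velocity: 3000 additions/min (threshold: 100)"
      ]
    }
  ]
}`

func main() {
	commits, err := flagged([]byte(sample), 0.5)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, c := range commits {
		fmt.Printf("%s by %s: %.1f%% confidence, %d reason(s)\n",
			c.Hash, c.Author, c.Confidence*100, len(c.Reasons))
	}
}
```

In a pipeline, the same function could read report.json with os.ReadFile and exit non-zero when the returned slice is not empty.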

Detection Strategies

Cadence flags commits that are suspicious based on:

Strategy        What it looks for        Indicator
Velocity        Abnormally fast coding   >100 additions/min
Size            Huge commits             >500 additions
Timing          Rapid-fire commits       <60 sec apart
Additions Only  No deletions, all adds   >90% additions
Merge Pattern   Unusual merge behavior   Context-dependent

Confidence Score: Rises with each strategy a commit triggers, so a commit flagged by several strategies scores higher than one flagged by a single strategy.

AI-Powered Analysis (Optional)

Cadence can leverage OpenAI's GPT models to analyze flagged commits for additional AI-generation indicators. This is optional and requires an OpenAI API key.

Why Use AI Analysis?

  • Second opinion: AI provides independent assessment of suspicious commits
  • Token efficient: Only analyzes already-flagged commits (not all commits)
  • Lightweight: Uses GPT-4o mini for cost efficiency
  • Complementary: Works alongside statistical detection, not instead of it

Setup

  1. Get an OpenAI API key from https://platform.openai.com/api-keys
  2. Enable in config or environment:
# Via config file (cadence.yaml)
ai:
  enabled: true
  provider: "openai"
  api_key: "sk-..."  # or use env var below
  model: "gpt-4o-mini"

# OR via environment variable
export CADENCE_AI_KEY="sk-..."
  3. Run analysis as normal. AI analysis runs automatically on commits already flagged as suspicious.

Output

AI analysis appears in both text and JSON reports:

Text Report:

    AI Analysis:     likely AI-generated

JSON Report:

"ai_analysis": "likely AI-generated"

Cost Estimation

  • Average suspicious commit: ~200 tokens
  • GPT-4o mini: ~$0.00015 per 1K input tokens
  • Cost per analysis: ~$0.00003 (about 3 cents per 1000 analyzed commits)
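The estimate above is a single multiplication. A small sketch of the arithmetic, using the figures from this section (the function name estimateCostUSD is hypothetical, and actual token counts and prices will vary):

```go
package main

import "fmt"

// estimateCostUSD multiplies the total token count by a per-1K-token price.
func estimateCostUSD(commits, tokensPerCommit int, pricePer1K float64) float64 {
	totalTokens := float64(commits * tokensPerCommit)
	return totalTokens / 1000 * pricePer1K
}

func main() {
	// Figures from this section: ~200 tokens per commit, ~$0.00015 per 1K tokens.
	fmt.Printf("1 commit:     $%.5f\n", estimateCostUSD(1, 200, 0.00015))
	fmt.Printf("1000 commits: $%.2f\n", estimateCostUSD(1000, 200, 0.00015))
}
```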

Configuration

Config File (YAML)

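Create a cadence.yml (the name that is auto-loaded from the current directory), or use another file name and pass it with --config: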

thresholds:
  # Commit size limits
  suspicious_additions: 500      # additions per commit
  suspicious_deletions: 1000     # deletions per commit
  
  # Velocity limits
  max_additions_per_min: 100     # additions per minute
  max_deletions_per_min: 500     # deletions per minute
  
  # Timing
  min_time_delta_seconds: 60     # seconds between commits

# Files to ignore
exclude_files:
  - "*.min.js"
  - "package-lock.json"
  - "yarn.lock"

Command Line Flags

./cadence analyze <repo> [flags]

Flags:
  -o, --output string             Output file (required): .txt or .json
  --suspicious-additions int      Flag commits >N additions (default: 500)
  --suspicious-deletions int      Flag commits >N deletions (default: 1000)
  --max-additions-pm float        Max additions per minute (default: 100)
  --max-deletions-pm float        Max deletions per minute (default: 500)
  --min-time-delta int            Min seconds between commits (default: 60)
  --branch string                 Branch to analyze (default: all)
  --exclude-files strings         File patterns to exclude
  --config string                 Config file path

Environment Variables

# Set webhook server config
export CADENCE_WEBHOOK_PORT=3000
export CADENCE_WEBHOOK_SECRET="your-secret-key"
export CADENCE_WEBHOOK_MAX_WORKERS=4

Webhook Server

Start the Server

./cadence webhook --port 3000 --secret "webhook-secret-key"

Configure GitHub Webhook

  1. Repository Settings → Webhooks → Add webhook
  2. Payload URL: https://your-server:3000/webhooks/github
  3. Content type: application/json
  4. Secret: Use same value as --secret flag
  5. Events: Select "Push events"
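When a secret is configured, GitHub signs each delivery with HMAC-SHA256 over the raw request body and sends the result in the X-Hub-Signature-256 header as sha256=<hex digest>. The sketch below shows how a receiver can check that header. It illustrates the general technique and is not taken from Cadence's source; the function names sign and validSignature are hypothetical.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sign computes the value GitHub places in X-Hub-Signature-256
// for the given secret and raw request body.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return "sha256=" + hex.EncodeToString(mac.Sum(nil))
}

// validSignature compares the received header with the expected
// signature in constant time.
func validSignature(secret, body []byte, header string) bool {
	expected := sign(secret, body)
	return hmac.Equal([]byte(expected), []byte(header))
}

func main() {
	secret := []byte("webhook-secret-key")
	body := []byte(`{"ref":"refs/heads/main"}`)

	header := sign(secret, body) // what the sender would attach
	fmt.Println("genuine payload: ", validSignature(secret, body, header))
	fmt.Println("tampered payload:", validSignature(secret, []byte(`{"ref":"x"}`), header))
}
```

The comparison uses hmac.Equal rather than == so that the check does not leak timing information about the expected signature.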

API Endpoints

Receive webhook push event

POST /webhooks/github
POST /webhooks/gitlab

Returns:

{
  "job_id": "uuid",
  "status": "processing"
}

Check job status

GET /jobs/:id

Returns:

{
  "id": "job-uuid",
  "status": "completed|processing|pending|failed",
  "repo": "repo-name",
  "branch": "main",
  "timestamp": "2024-01-27T10:30:00Z",
  "result": {
    "suspicious_commits": [...]
  }
}

List recent jobs

GET /jobs?limit=50

Health check

GET /health

How It Works

  1. GitHub sends push webhook → HTTP POST to /webhooks/github
  2. Cadence returns immediately with a job ID
  3. Analysis happens in background (non-blocking)
  4. Poll /jobs/:id to check progress
  5. Results available when status is completed

Common Questions

Q: Can I use this in CI/CD?
A: Yes. Run cadence analyze in your pipeline, parse the JSON output, and fail the build if any suspicious commits are found.

Q: How accurate is it?
A: Depends on your thresholds. Aggressive settings catch more but have more false positives. Start with defaults and tune.

Q: What about non-AI code that looks suspicious?
A: The confidence score helps. A legitimate fast commit might trigger one strategy but is unlikely to trigger several. Check the reasons listed for each flagged commit.

Q: Does it work with GitHub/GitLab Enterprise?
A: Yes. The webhook server exposes GitHub and GitLab endpoints, and self-hosted instances work as long as they have network access to your Cadence server.

Q: Can I extend it?
A: Yes. Detection strategies are pluggable interfaces in internal/detector/. Add custom logic easily.

Development

Build

go build ./cmd/cadence

Run Tests

go test ./...
go test -cover ./...  # With coverage

Project Structure

cmd/cadence/          - CLI commands (analyze, webhook, config)
internal/
  analyzer/           - Repository analyzer orchestrator
  detector/           - Detection strategies
  git/                - Git operations
  metrics/            - Statistics and velocity calculations
  reporter/           - Output formatting (text, JSON)
  config/             - Configuration loading
  webhook/            - Webhook server (GitHub, GitLab)
  errors/             - Error types
test/                 - Integration tests

Adding Custom Detection Strategies

Create a new strategy in internal/detector/:

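package detector

// Import paths assume module github.com/codemeapixel/cadence (check go.mod).
import (
    "github.com/codemeapixel/cadence/internal/git"
    "github.com/codemeapixel/cadence/internal/metrics"
)

// CustomStrategy is an example detection strategy.
type CustomStrategy struct{}

// Name returns the identifier for this strategy.
func (s *CustomStrategy) Name() string {
    return "custom_detection"
}

// Detect reports whether the commit pair looks suspicious and, if so, why.
// isCustomSuspicious is a placeholder for a check you write yourself.
func (s *CustomStrategy) Detect(pair *git.CommitPair, stats *metrics.RepositoryStats) (bool, string) {
    if isCustomSuspicious(pair) {
        return true, "Your reason here"
    }
    return false, ""
}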

Register it in internal/detector/detector.go and it will automatically be used.
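To show how a set of registered strategies fits together, here is a self-contained sketch. It does not use Cadence's real types: CommitPair is a simplified stand-in for git.CommitPair with hypothetical fields, the Strategy interface drops the stats argument, and the run function is an assumed shape for the loop that applies each strategy.

```go
package main

import "fmt"

// CommitPair is a simplified stand-in for git.CommitPair.
// The fields are hypothetical.
type CommitPair struct {
	Additions int
	Deletions int
}

// Strategy mirrors the two methods shown above, minus the stats argument.
type Strategy interface {
	Name() string
	Detect(pair *CommitPair) (bool, string)
}

// AdditionsOnly flags commits where more than 90% of changed lines
// are additions, following the "Additions Only" row of the table.
type AdditionsOnly struct{}

func (AdditionsOnly) Name() string { return "additions_only" }

func (AdditionsOnly) Detect(p *CommitPair) (bool, string) {
	total := p.Additions + p.Deletions
	if total > 0 && float64(p.Additions)/float64(total) > 0.9 {
		return true, fmt.Sprintf("Additions only: %d of %d changed lines are additions",
			p.Additions, total)
	}
	return false, ""
}

// run applies every strategy to the pair and collects the reasons
// from those that trigger, prefixed with the strategy name.
func run(strategies []Strategy, p *CommitPair) []string {
	var reasons []string
	for _, s := range strategies {
		if hit, why := s.Detect(p); hit {
			reasons = append(reasons, s.Name()+": "+why)
		}
	}
	return reasons
}

func main() {
	registered := []Strategy{AdditionsOnly{}}
	for _, r := range run(registered, &CommitPair{Additions: 95, Deletions: 5}) {
		fmt.Println(r)
	}
}
```

Keeping each check behind a small interface means a new strategy only has to be appended to the registered slice; the loop itself does not change.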
