Version: 1.0
Last Updated: 2025-01-15
Status: Active Development
This document outlines the comprehensive testing strategy for CopyTree, including test types, coverage targets, infrastructure, and ongoing improvements. The strategy aligns with the project's event-driven pipeline architecture, transformer trait system, and graceful error recovery philosophy.
- Test Types & Organization
- Test Infrastructure
- Coverage Targets
- Implemented Test Suites
- Pending Test Suites
- CI/CD Integration
- Best Practices
- Rollout Timeline
tests/
├── unit/ # Unit tests for individual modules
│ ├── commands/ # CLI command tests
│ ├── config/ # Configuration system tests
│ ├── pipeline/ # Pipeline and stage tests
│ ├── services/ # Service layer tests
│ ├── transformers/ # Transformer tests
│ ├── transforms/ # Transform system tests
│ ├── ui/ # UI component tests
│ └── utils/ # Utility function tests
├── integration/ # Multi-module integration tests
├── e2e/ # End-to-end CLI tests
├── performance/ # Performance benchmarks
├── helpers/ # Test helper utilities
│ ├── determinism.js # Normalization utilities
│ ├── fixtures.js # Fixture management
│ └── pipeline.js # Pipeline test helpers
├── fixtures/ # Test data
│ ├── goldens/ # Golden file outputs
│ ├── simple-project/ # Small test project
│ ├── images/ # Image test files
│ └── pdfs/ # PDF test files
└── mocks/ # Mock implementations
Purpose: Test individual functions, classes, and modules in isolation.
Scope:
- Pipeline stage contracts and lifecycle hooks
- Transformer trait enforcement
- Configuration hierarchy and precedence
- Service retry/fallback logic
- Error categorization
- Utility functions
Characteristics:
- Run in < 10ms per test
- No external dependencies (network, filesystem where possible)
- Use mocks for external services
- Deterministic outputs
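One way to keep unit tests deterministic is to inject the clock instead of reading Date.now() directly. A minimal sketch (makeStopwatch and fakeNow are hypothetical names, not CopyTree APIs):

```javascript
// Sketch: accept a clock function so tests can substitute a fake one.
function makeStopwatch(now = Date.now) {
  const start = now();
  return { elapsed: () => now() - start };
}

// In a test, drive time manually for a fully deterministic result:
let t = 0;
const fakeNow = () => t;
const watch = makeStopwatch(fakeNow);
t = 250;
console.log(watch.elapsed()); // 250
```

The same pattern applies to random seeds and filesystem access: pass the dependency in, mock it in tests.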
Purpose: Test interaction between multiple modules without full system deployment.
Scope:
- Stage orchestration across multiple stages
- Event emission and propagation
- External source handling (with local git repos)
- Cache behavior (memory + file layers)
- Profile loading and inheritance
Characteristics:
- Run in < 100ms per test
- Use local resources (no network)
- Test real file I/O patterns
- Validate data flows between components
Purpose: Validate complete user workflows from CLI invocation to output.
Scope:
- All output formats (XML, JSON, Markdown, Tree)
- Profile usage and filtering
- Git integration (modified, branch comparison)
- Transformer pipeline execution
- Error handling and recovery
Characteristics:
- Run in < 1s per test
- Use golden file comparisons
- Test actual CLI entry points
- Cover happy paths and common errors
Purpose: Ensure performance remains within acceptable bounds.
Scope:
- Time budgets per project size (small/medium/large)
- Memory usage patterns
- Concurrency scaling
- Cache effectiveness
- Streaming behavior for large files
Characteristics:
- Run on schedule (nightly/weekly) in CI
- Track trends over time
- Fail on regressions > 10%
- Generate performance reports
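The "fail on regressions > 10%" gate reduces to a one-line comparison. A sketch with assumed parameter names (not CopyTree's benchmark API):

```javascript
// Sketch: flag a result as a regression when it is more than `threshold`
// slower than the recorded baseline (10% by default).
function isRegression(baselineMs, currentMs, threshold = 0.10) {
  return (currentMs - baselineMs) / baselineMs > threshold;
}

console.log(isRegression(1000, 1050)); // 5% slower → false
console.log(isRegression(1000, 1200)); // 20% slower → true
```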
Purpose: Test system behavior under unusual or adversarial conditions.
Scope:
- Path traversal attempts
- Symlink loops
- Malformed files (PDFs, CSVs, images)
- Non-UTF8 content
- Very large files (> 10MB)
- Case-insensitive path collisions
Provides normalization utilities for stable test outputs:
import { normalizeForGolden, normalizePaths, normalizeTimestamps } from './helpers/determinism.js';
// Normalize all nondeterministic data
const normalized = normalizeForGolden(output, { basePath: projectRoot });
// Selective normalization
const pathsNormalized = normalizePaths(content, { basePath: '/project' });
const idsNormalized = normalizeIds(content);
const metricsNormalized = normalizeMetrics(content);

Features:
- Path normalization (absolute → relative, OS-agnostic)
- Timestamp normalization (ISO 8601, Unix epochs)
- UUID and request ID normalization
- Performance metrics normalization (durations, memory, sizes)
- Token count normalization
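A minimal sketch of what these normalizers do; the regex and placeholder here are illustrative, and the real helpers in determinism.js cover many more cases:

```javascript
// Sketch: replace ISO 8601 timestamps with a stable placeholder.
function normalizeTimestamps(text) {
  return text.replace(
    /\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})?/g,
    '<TIMESTAMP>'
  );
}

// Sketch: strip the project root so goldens are machine-independent.
function normalizePaths(text, { basePath }) {
  return text.split(basePath + '/').join('');
}

const raw = 'built /project/src/index.js at 2025-01-15T10:30:00Z';
console.log(normalizePaths(normalizeTimestamps(raw), { basePath: '/project' }));
// → 'built src/index.js at <TIMESTAMP>'
```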
Fixture management for test data:
import { createSimpleProject, createLargeProject, createLocalGitRepo } from './helpers/fixtures.js';
// Create test projects programmatically
const projectPath = createSimpleProject('test', { withGit: true });
// Create performance test projects
const largeProject = createLargeProject('large', 500, { avgFileSize: 2048 });
// Create local git repos for external source testing
const repo = createLocalGitRepo('test-repo', {
'README.md': '# Test',
'src/index.js': 'console.log("hello");'
});

Features:
- Programmatic fixture creation
- Golden file management
- Local git repository creation
- Temporary directory management
- Cleanup utilities
Pipeline testing utilities:
import { PipelineEventCollector, createTestPipeline, runPipelineWithEvents } from './helpers/pipeline.js';
// Collect and validate pipeline events
const collector = new PipelineEventCollector(pipeline);
await pipeline.run(input);
const validation = collector.validateContract();
expect(validation.valid).toBe(true);
// Create mock pipelines
const pipeline = createTestPipeline([stage1, stage2, stage3]);

Features:
- Event collection and validation
- Pipeline contract enforcement
- Mock stage creation
- Performance snapshot capture
- Stage input/output contract validation
Golden files are stored in tests/fixtures/goldens/ and represent expected outputs for regression testing.
Usage:
import { readGolden, matchGolden } from './helpers/fixtures.js';
// Compare against golden file
const normalized = normalizeForGolden(output);
expect(normalized).toMatchGolden('simple.xml');
// Update mode (use sparingly)
// UPDATE_GOLDEN=true npm test

Conventions:
- One golden file per format per fixture
- Naming: <fixture-name>.<format> (e.g., simple.xml, simple.json)
- Always normalize before comparison
- Update only with explicit approval
Target: 80% across all metrics (branches, functions, lines, statements)
Current Status: Enforced via jest.config.js
coverageThreshold: {
  global: {
    branches: 80,
    functions: 80,
    lines: 80,
    statements: 80
  }
}

Critical paths (95% target):
- src/pipeline/Pipeline.js
- src/pipeline/Stage.js
- src/config/ConfigManager.js
- src/transforms/TransformerRegistry.js
- src/utils/GitUtils.js
- All pipeline stages (src/pipeline/stages/*.js)
- Core transformers (FileLoader, PDF, Image, CSV, Markdown)
- Service layer (src/services/*.js)
- UI components (src/ui/*.js)
- Utility functions (src/utils/*.js)
- Command implementations (src/commands/*.js)
All CLI commands should have comprehensive tests covering:
- Argument parsing
- Option validation
- Error handling
- Output formatting
- Exit codes
Location: tests/unit/pipeline/events.contract.test.js
Coverage:
- pipeline:start, pipeline:complete, pipeline:error events
- stage:start, stage:complete, stage:error events
- file:batch events with action types
- Event timing and ordering validation
- Performance metrics in events
- Error recovery event flow
Test Count: 20+ tests
Key Validations:
- All required event fields present
- Event ordering is correct
- Timing data is accurate
- Memory deltas are tracked
- Error events include stage context
Location: tests/unit/transforms/traits.enforcement.test.js
Coverage:
- Heavy transformer identification and scheduling
- Idempotent trait behavior
- Input/output type matching and chaining
- Dependency resolution and circular detection
- Priority-based ordering
- Real-world transformer trait validation
Test Count: 20+ tests
Key Validations:
- Heavy transformers respect budget constraints
- Idempotent transformers can be reapplied safely
- Type incompatibilities prevent invalid chains
- Dependencies are resolved in correct order
- Priority determines execution sequence
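Priority ordering with dependency resolution and circular detection can be sketched as a topological sort. The trait field names here (priority, dependsOn) are assumptions for illustration, not CopyTree's actual trait API:

```javascript
// Sketch: order transformers by descending priority, but always run a
// transformer's dependencies first; throw on circular dependencies.
function orderTransformers(transformers) {
  const byName = new Map(transformers.map((t) => [t.name, t]));
  const ordered = [];
  const visiting = new Set();
  const done = new Set();
  function visit(t) {
    if (done.has(t.name)) return;
    if (visiting.has(t.name)) throw new Error(`circular dependency at ${t.name}`);
    visiting.add(t.name);
    for (const dep of t.dependsOn ?? []) visit(byName.get(dep));
    visiting.delete(t.name);
    done.add(t.name);
    ordered.push(t);
  }
  for (const t of [...transformers].sort((a, b) => b.priority - a.priority)) visit(t);
  return ordered.map((t) => t.name);
}

console.log(orderTransformers([
  { name: 'Markdown', priority: 1 },
  { name: 'PDF', priority: 5, dependsOn: ['FileLoader'] },
  { name: 'FileLoader', priority: 10 },
]));
// → ['FileLoader', 'PDF', 'Markdown']
```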
Location: tests/unit/config/config.hierarchy.test.js
Coverage:
- Precedence order: default < user < project < env < CLI
- Nested object deep merging
- Environment variable mapping (GEMINI_API_KEY, etc.)
- Provenance tracking for all values
- Dot notation access
- File loading from search paths
Test Count: 28 tests
Key Validations:
- Highest precedence source wins for each key
- Nested values merge correctly across levels
- Environment variables map to config keys
- Provenance tracks which source set each value
- Missing values fall through to defaults
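The precedence rules can be sketched as a reduce over sources in ascending precedence, deep-merging nested objects. This is a simplified model of ConfigManager's behavior, without provenance tracking:

```javascript
// Sketch: later sources override earlier ones; nested objects deep-merge.
function isObject(v) {
  return v !== null && typeof v === 'object' && !Array.isArray(v);
}
function deepMerge(base, override) {
  const out = { ...base };
  for (const [key, value] of Object.entries(override)) {
    out[key] = isObject(out[key]) && isObject(value)
      ? deepMerge(out[key], value)
      : value;
  }
  return out;
}
// Precedence: default < user < project < env < CLI
function resolveConfig(...sources) {
  return sources.reduce((acc, src) => deepMerge(acc, src), {});
}

const config = resolveConfig(
  { output: { format: 'xml', color: true } }, // defaults
  { output: { format: 'json' } },             // user config
  { output: { format: 'tree' } }              // CLI flag
);
console.log(config); // { output: { format: 'tree', color: true } }
```

Note that the CLI's format wins while the default color survives the merge, which is the "nested values merge correctly across levels" behavior the tests validate.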
Location: tests/e2e/cli.outputs.test.js (planned)
Scope:
- All output formats (XML, JSON, Markdown, Tree)
- Profile application
- Include/exclude patterns
- Git filters (--modified, --changed)
- Transformer pipeline
- Streaming output
- Character limits
- Instructions injection
Approach:
it.each(['xml', 'json', 'tree', 'markdown'])('produces stable %s output', async (format) => {
const { stdout } = await runCLI(['--format', format, fixturePath('simple-project')]);
const normalized = normalizeForGolden(stdout);
expect(normalized).toMatchGolden(`simple.${format}`);
});

Location: tests/integration/external-sources.test.js (planned)
Scope:
- Local git repository cloning
- Cache hit/miss behavior
- Branch detection
- Subpath support
- Rule application (include/exclude)
- Optional source handling
- Update detection
Approach:
it('clones and caches local git repo', async () => {
const repo = createLocalGitRepo('test-repo', files);
const firstRun = await copy({ source: repo.path });
const secondRun = await copy({ source: repo.path });
expect(secondRun.stats.cloneFromCache).toBe(true);
expect(secondRun.duration).toBeLessThan(firstRun.duration);
});

Location: tests/integration/robustness.test.js (planned)
Scope:
- Path traversal attempts (../../../etc/passwd)
- Symlink loops
- Malformed files (corrupt PDFs, invalid CSV, broken images)
- Non-UTF8 content
- Files > 10MB (streaming threshold)
- Zero-byte files
- Case-insensitive collisions
- Deep nesting (> 100 levels)
Approach:
it('prevents path traversal', async () => {
const { files } = await runPipeline({
sourcePath: fixturesWith('../../../dangerous/path')
});
expect(files.every(f => isWithinSourcePath(f.path))).toBe(true);
});

Location: .github/workflows/performance.yml (planned)
Scope:
- Scheduled runs (nightly/weekly)
- Small project (10 files): < 1s
- Medium project (100 files): < 5s
- Large project (500+ files): < 30s
- Memory ceiling: < 500MB
- Cache speedup: > 50% faster
- Concurrency scaling: linear up to 10 workers
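The planned check-perf-budgets.js could be sketched as follows; the budget values come from the scope above, while the result field names are assumptions:

```javascript
// Sketch: compare measured results against the budgets listed above and
// report every violation; CI would exit non-zero if any exist.
const budgets = {
  smallProjectMs: 1_000,
  mediumProjectMs: 5_000,
  largeProjectMs: 30_000,
  maxMemoryMb: 500,
};

function checkBudgets(results) {
  return Object.entries(budgets)
    .filter(([key, limit]) => results[key] > limit)
    .map(([key, limit]) => `${key}: ${results[key]} exceeds budget ${limit}`);
}

const violations = checkBudgets({
  smallProjectMs: 800,
  mediumProjectMs: 6_200, // over the 5s budget
  largeProjectMs: 21_000,
  maxMemoryMb: 410,
});
console.log(violations);
// In CI: if (violations.length) process.exit(1);
```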
Approach:
- name: Run performance benchmarks
run: npm run benchmark -- --json > perf-results.json
- name: Check performance budgets
run: node scripts/check-perf-budgets.js perf-results.json
- name: Upload results
uses: actions/upload-artifact@v3
with:
name: perf-results
path: perf-results.json

Proposed Configuration:
name: Tests
on: [push, pull_request]
jobs:
test:
runs-on: ${{ matrix.os }}
strategy:
matrix:
node: [20, 22] # Node.js 20+ required
os: [ubuntu-latest, macos-latest, windows-latest]
steps:
- uses: actions/checkout@v3
- uses: actions/setup-node@v3
with:
node-version: ${{ matrix.node }}
- run: npm ci
- run: npm run lint
- run: npm run test:unit
- run: npm run test:integration
- run: npm run test:e2e
- run: npm run test:coverage
performance:
runs-on: ubuntu-latest
if: github.event_name == 'schedule'
steps:
- uses: actions/checkout@v3
- uses: actions/setup-node@v3
with:
node-version: 20
- run: npm ci
- run: npm run benchmark
- run: node scripts/check-perf-budgets.js

Pull Request Gates:
- ✅ Linting passes
- ✅ Unit tests pass (100%)
- ✅ Integration tests pass (100%)
- ✅ E2E tests pass (100%)
- ✅ Coverage thresholds met (80% global, 95% critical)
Scheduled Checks (Nightly/Weekly):
- Performance benchmarks within budgets
- No regressions > 10%
- Memory usage trends
- Follow AAA Pattern:
  // Arrange
  const input = createTestInput();
  // Act
  const result = await process(input);
  // Assert
  expect(result).toMatchExpected();
- Use Descriptive Test Names:
  it('retries retryable errors up to max attempts', async () => { ... });
  // NOT: it('works', () => { ... });
- Isolate Tests:
  - Clean up resources in afterEach
  - Don't rely on test execution order
  - Use separate fixtures per test when possible
- Mock External Dependencies:
  - APIs (Gemini, GitHub)
  - File system (when appropriate)
  - Time (Date.now(), setTimeout)
- Use Helpers:
  - Leverage tests/helpers/ utilities
  - Create reusable fixtures
  - Extract common setup to functions
- Fixtures:
  - Keep fixtures minimal and focused
  - One fixture per test scenario
  - Store in tests/fixtures/
- Golden Files:
  - Normalize before comparison
  - Review diffs carefully before updating
  - Use UPDATE_GOLDEN=true only when necessary
- Generated Data:
  - Seed random generators for determinism
  - Use factories for complex objects
  - Avoid hard-coded timestamps/IDs
- Keep Unit Tests Fast (< 10ms):
  - No network calls
  - Minimal file I/O
  - Mock heavy operations
- Limit Integration Test Scope:
  - Test specific subsystems
  - Use local resources
  - Avoid full pipeline runs
- Parallelize Where Possible:
  - Jest runs tests in parallel by default
  - Run serial suites with --runInBand if needed
- ✅ Test infrastructure and helpers
- ✅ Pipeline event contract tests
- ✅ Transformer trait enforcement tests
- ✅ Config hierarchy tests
Status: All core test infrastructure is in place and validated.
- 🔲 CLI golden output tests for all formats
- 🔲 External sources with local git tests
- 🔲 Robustness tests (path traversal, edge cases)
- 🔲 Complete pipeline integration tests
Target: 90% integration coverage
- 🔲 Performance benchmark CI integration
- 🔲 Regression detection and alerts
- 🔲 Memory profiling and optimization
- 🔲 Concurrency scaling tests
Target: Performance budgets enforced in CI
- 🔲 Mutation testing (Stryker)
- 🔲 Property-based testing (fast-check)
- 🔲 Expand coverage to 95% for critical paths
- 🔲 Test documentation and examples
Target: Comprehensive test suite with automated quality gates
Current (Phase 1 Complete):
- Total Tests: ~150+
- Unit Tests: ~120
- Integration Tests: ~20
- E2E Tests: ~10
- Coverage: 80% global (target met)
Target (All Phases Complete):
- Total Tests: 300+
- Unit Tests: 200+
- Integration Tests: 60+
- E2E Tests: 40+
- Coverage: 85% global, 95% critical paths
Targets:
- Small project (10 files): < 1s
- Medium project (100 files): < 5s
- Large project (500+ files): < 30s
- Memory usage: < 500MB
- Cache speedup: > 50%
Measured in CI: Nightly performance runs with trend tracking
- Choose the Right Category:
  - Unit: Single module/function
  - Integration: Multiple modules
  - E2E: Full CLI workflow
- Use Existing Helpers:
  - Check tests/helpers/ for utilities
  - Reuse fixtures when possible
- Follow Naming Conventions:
  - File: <module>.test.js
  - Suite: describe('<ModuleName>', () => { ... })
  - Test: it('<expected behavior>', () => { ... })
- Add to Coverage Targets:
  - Update thresholds in jest.config.js if needed
  - Aim for 95% on critical paths, 80% on standard paths
- Document Complex Tests:
  - Add comments explaining non-obvious logic
  - Reference related issues/PRs
  - Explain why the test is necessary
# All tests
npm test
# Unit tests only
npm run test:unit
# Integration tests
npm run test:integration
# E2E tests
npm run test:e2e
# Coverage report
npm run test:coverage
# Watch mode
npm test -- --watch
# Specific file
npm test -- tests/unit/config/config.hierarchy.test.js
# Update golden files (use with caution)
UPDATE_GOLDEN=true npm test

- Jest Documentation: https://jestjs.io/docs/getting-started
- Testing Best Practices: https://github.com/goldbergyoni/javascript-testing-best-practices
- Project Architecture: docs/technical/architecture.md
- CLI Reference: docs/cli/copytree-reference.md
This testing strategy provides a solid foundation for ensuring CopyTree's reliability, performance, and maintainability. With the core infrastructure in place (Phase 1 complete), we're well-positioned to expand coverage and enforce quality gates throughout the development process.
Key Principles:
- Comprehensive: Cover all layers (unit → integration → e2e → perf)
- Practical: Focus on high-value tests aligned with architecture
- Automated: Enforce quality gates in CI/CD
- Maintainable: Use helpers, fixtures, and clear conventions
- Iterative: Continuous improvement via phases and metrics
For questions or suggestions, please open an issue or discussion on GitHub.