Contributing to Decision Tree Library

Thank you for your interest in contributing to this machine learning library! This guide will help you contribute high-quality code that maintains the project's standards.

Getting Started
Development Workflow
Code Quality Standards
Testing Requirements
Pull Request Process
Code Style Guidelines
Project Structure
Common Issues

Getting Started

Prerequisites

Node.js 20+ or Bun 1.0+
npm, yarn, or bun package manager
Git
Basic understanding of TypeScript
Familiarity with machine learning concepts

Why Use Bun?

Bun is fully supported and offers several advantages for development:

Faster Installation: Package installation is significantly faster than npm
Built-in TypeScript: No need for ts-node or additional TypeScript tooling
Faster Test Execution: Bun's test runner is optimized for speed
Better Performance: Generally faster execution for JavaScript/TypeScript code
Compatible: Works with all existing npm packages and scripts

Setting Up Development Environment

Fork the repository on GitHub

Clone your fork locally:

git clone https://github.com/your-username/nodejs-decision-tree.git
cd nodejs-decision-tree

Install dependencies:

# Using npm
npm install

# Using Bun (recommended for faster installation)
bun install

Build the project:

# Using npm
npm run build

# Using Bun
bun run build

Run tests to ensure everything works:

# Using npm
npm test

# Using Bun (faster test execution)
bun test

Development Workflow

1. Create a Feature Branch

git checkout -b feature/your-feature-name
# or
git checkout -b fix/issue-description

Use descriptive branch names:

feature/add-new-algorithm
fix/performance-issue
docs/update-readme
test/add-edge-cases

2. Make Your Changes

Source Code: Make changes in the src/ directory
Tests: Add/update tests in the tst/ directory
Documentation: Update README.md and other docs as needed

3. Build and Test

Using npm

# Build the project
npm run build

# Run all tests
npm test

# Run specific test categories
npm test -- --grep "Decision Tree"
npm test -- --grep "Performance Tests"

# Run tests in watch mode (for development)
npm run test:watch

Using Bun (Recommended)

# Build the project
bun run build

# Run all tests (Bun has built-in TypeScript support)
bun test

# Run specific test categories
bun test --grep "Decision Tree"
bun test --grep "Performance Tests"

# Run tests in watch mode (for development)
bun test --watch

# Note: Bun can run TypeScript files directly without compilation
# For development, you can also run: bun test tst/*.ts

4. Commit Your Changes

Use conventional commit messages:

git add .
git commit -m "feat: add new feature description"
git commit -m "fix: resolve issue description"
git commit -m "test: add tests for new functionality"
git commit -m "docs: update documentation"

5. Push and Create Pull Request

git push origin feature/your-feature-name

Then create a pull request on GitHub.

Code Quality Standards

TypeScript Requirements

Full Type Safety: All code must be properly typed
No any Types: Avoid using any unless absolutely necessary
Interface Definitions: Define clear interfaces for all data structures
Generic Types: Use generics where appropriate for reusability

// ✅ Good
interface TrainingData {
  features: string[];
  target: string;
  data: Record<string, any>[];
}

// ❌ Bad
const data: any = [...];

Error Handling

Graceful Degradation: Handle edge cases gracefully
Clear Error Messages: Provide meaningful error messages
Input Validation: Validate inputs at appropriate boundaries

// ✅ Good
if (!Array.isArray(data)) {
  throw new Error('Training data must be an array');
}

// ❌ Bad
// Silent failure or unclear error

Performance Considerations

Efficient Algorithms: Use efficient algorithms and data structures
Memory Management: Avoid memory leaks and excessive memory usage
Scalability: Consider performance with large datasets
Benchmarking: Include performance tests for new features

Testing Requirements

Test Coverage

100% Pass Rate: All 421 tests must pass before merging
New Feature Tests: Add comprehensive tests for new functionality
Edge Case Coverage: Test boundary conditions and error scenarios
Performance Tests: Include performance benchmarks where applicable (basic performance tests only)

Test Categories

Add tests to appropriate files:

Test File	Purpose	When to Use
`decision-tree.ts`	Core Decision Tree functionality	New Decision Tree features
`random-forest.ts`	Random Forest functionality	New Random Forest features
`xgboost.ts`	XGBoost functionality	New XGBoost features
`performance-tests.ts`	Basic performance benchmarks	Performance-related changes
`edge-cases.ts`	Edge cases and error handling	Error handling improvements
`data-validation.ts`	Input validation	Input validation changes
`type-safety.ts`	TypeScript type safety	Type system changes

Test Structure

describe('Feature Name', function() {
  it('should handle normal case', () => {
    // Arrange
    const input = createTestData();
    
    // Act
    const result = functionUnderTest(input);
    
    // Assert
    assert.strictEqual(result, expectedValue);
  });

  it('should handle edge case', () => {
    // Test edge cases
  });

  it('should throw error for invalid input', () => {
    // Test error conditions
  });
});

Performance Testing

For performance-related changes, add basic benchmarks (avoid extensive performance tests):

it('should perform within reasonable time limits', () => {
  const startTime = Date.now();
  // Perform operation
  const endTime = Date.now();
  
  const executionTime = endTime - startTime;
  assert.ok(executionTime < MAX_ALLOWED_TIME, 
    `Operation took ${executionTime}ms, expected < ${MAX_ALLOWED_TIME}ms`);
});

Note: Extensive performance and memory usage tests have been removed to focus on core functionality. Only add basic performance tests when necessary.

Pull Request Process

Before Submitting

All 421 tests pass (npm test or bun test)
Code builds without errors (npm run build or bun run build)
New features have comprehensive tests
Documentation is updated
Code follows style guidelines
No breaking changes (or clearly documented)

Pull Request Template

## Description
Brief description of changes and motivation.

## Type of Change
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] Documentation update
- [ ] Performance improvement
- [ ] Test addition/improvement

## Testing
- [ ] All existing tests pass
- [ ] New tests added for new functionality
- [ ] Manual testing completed
- [ ] Performance impact assessed (if applicable - basic tests only)
- [ ] Edge cases tested

## Checklist
- [ ] Code follows existing style guidelines
- [ ] Self-review completed
- [ ] Documentation updated (if needed)
- [ ] No breaking changes (or clearly documented if intentional)
- [ ] TypeScript types are properly defined
- [ ] Error handling is appropriate
- [ ] Performance considerations addressed

Review Process

Automated Checks: CI will run tests and build checks
Code Review: Maintainers will review code quality and functionality
Feedback: Address any feedback or requested changes
Approval: Once approved, changes will be merged

Code Style Guidelines

General Guidelines

Consistent Formatting: Follow existing code style
Clear Naming: Use descriptive variable and function names
Comments: Add comments for complex logic
ES Modules: Use ES module syntax (no CommonJS)

TypeScript Specific

// ✅ Good - Clear interfaces
interface AlgorithmConfig {
  nEstimators: number;
  maxDepth?: number;
  randomState?: number;
}

// ✅ Good - Proper typing
function trainModel(data: TrainingData[], config: AlgorithmConfig): void {
  // Implementation
}

// ❌ Bad - Unclear types
function train(data: any, options: any): any {
  // Implementation
}

File Organization

One class per file: Keep classes in separate files
Shared utilities: Put shared code in src/shared/
Clear exports: Use named exports where appropriate
Import organization: Group imports logically

Error Messages

// ✅ Good - Clear and actionable
throw new Error('Training data must be an array of objects with at least one sample');

// ❌ Bad - Vague
throw new Error('Invalid data');

Project Structure

src/
├── decision-tree.ts          # Main Decision Tree class
├── random-forest.ts          # Random Forest implementation
├── xgboost.ts               # XGBoost implementation
└── shared/                  # Shared utilities
    ├── types.ts             # TypeScript type definitions
    ├── loss-functions.ts    # Loss function implementations
    └── gradient-boosting.ts # Gradient boosting utilities

tst/                         # Test files
├── decision-tree.ts         # Decision Tree tests
├── random-forest.ts         # Random Forest tests
├── xgboost.ts              # XGBoost tests
├── performance-tests.ts    # Performance benchmarks
└── ...                     # Other test files

examples/                    # Usage examples
├── javascript-usage.js     # JavaScript examples
├── typescript-usage.ts     # TypeScript examples
└── ...                     # Algorithm-specific examples

Common Issues

Build Errors

Issue: TypeScript compilation errors Solution:

# Using npm
npm run build
# Fix any TypeScript errors shown

# Using Bun
bun run build
# Fix any TypeScript errors shown

Test Failures

Issue: Tests failing Solution:

# Using npm
npm test
# Fix failing tests or update expectations

# Using Bun
bun run test
# Fix failing tests or update expectations

Import/Export Issues

Issue: Module import/export errors Solution: Ensure you're using ES module syntax:

// ✅ Correct
import DecisionTree from 'decision-tree';

// ❌ Incorrect (CommonJS)
const DecisionTree = require('decision-tree');

Performance Issues

Issue: New code is too slow Solution:

Add performance tests
Optimize algorithms
Consider memory usage
Use appropriate data structures

Getting Help

Issues: Open an issue for bugs or feature requests
Discussions: Use GitHub Discussions for questions
Code Review: Ask for help during code review
Documentation: Check existing documentation and examples

Recognition

Contributors will be recognized in:

README.md contributors section
Release notes
GitHub contributors list

Thank you for contributing to this project! Your efforts help make this library better for everyone.

FilesExpand file tree

CONTRIBUTING.md

Latest commit

History