@igerber igerber commented Jan 11, 2026

feat: Add optional Rust backend for improved performance
Add a PyO3-based Rust backend that provides optimized implementations of
performance-critical functions. The Rust code is pre-compiled into
platform-specific wheels so users don't need Rust installed.

Rust implementations:

  • Bootstrap weight generation (Rademacher, Mammen, Webb) with parallel execution
  • Synthetic control weight optimization via projected gradient descent
  • Simplex projection algorithm
  • OLS solving with LAPACK
  • HC1 and cluster-robust variance-covariance estimation
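As a rough illustration of the two synthetic-control items above, here is a minimal pure-Python sketch of a sort-based simplex projection and a projected gradient descent loop built on it. It is not the PR's Rust code: the function names, the step-size rule, the iteration cap, and the stopping tolerance are all assumptions made for this example, and the donor matrix is assumed to be non-zero.

```python
def project_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) = 1} (sort-based method)."""
    u = sorted(v, reverse=True)
    cumsum = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumsum += u_k
        t = (cumsum - 1.0) / k
        if u_k - t > 0:
            theta = t  # the last k that satisfies the condition sets the threshold
    return [max(x - theta, 0.0) for x in v]


def synthetic_weights(Y0, y1, max_iter=5000, tol=1e-8):
    """Minimise 0.5 * ||Y0 w - y1||^2 over the simplex by projected gradient descent.

    Y0 is a list of T rows with J donor columns; y1 has length T.
    """
    T, J = len(Y0), len(Y0[0])
    w = [1.0 / J] * J  # start from uniform weights
    # Conservative step: the squared Frobenius norm bounds the largest
    # eigenvalue of Y0'Y0, which is the gradient's Lipschitz constant.
    step = 1.0 / sum(x * x for row in Y0 for x in row)
    for _ in range(max_iter):
        resid = [sum(Y0[t][j] * w[j] for j in range(J)) - y1[t] for t in range(T)]
        grad = [sum(Y0[t][j] * resid[t] for t in range(T)) for j in range(J)]
        w_new = project_simplex([w[j] - step * grad[j] for j in range(J)])
        if max(abs(a - b) for a, b in zip(w_new, w)) < tol:
            return w_new
        w = w_new
    return w


# Treated unit built as a 0.3 / 0.7 mix of two donors.
donors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
treated = [0.3, 0.7, 1.0]
weights = synthetic_weights(donors, treated)
```

The projection keeps every iterate non-negative and summing to one, so the loop never needs a separate constraint check.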

Key features:

  • Pure Python fallback always available (HAS_RUST_BACKEND flag)
  • Automatic detection and use of Rust backend when available
  • Error message translation for consistent Python exceptions
  • 26 new tests for Rust backend functions

Build changes:

  • Switch from setuptools to maturin build backend
  • Multi-platform wheel builds (Linux, macOS x86_64/ARM64, Windows)
  • GitHub Actions workflow updated for cross-platform builds

igerber and others added 5 commits January 11, 2026 14:01
Add a PyO3-based Rust backend that provides optimized implementations of
performance-critical functions. The Rust code is pre-compiled into
platform-specific wheels so users don't need Rust installed.

Rust implementations:
- Bootstrap weight generation (Rademacher, Mammen, Webb) with parallel execution
- Synthetic control weight optimization via projected gradient descent
- Simplex projection algorithm
- OLS solving with LAPACK
- HC1 and cluster-robust variance-covariance estimation
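For readers unfamiliar with the three bootstrap weight schemes named above, the sketch below draws them with the standard library only. The distributions are the textbook definitions (Rademacher two-point, Mammen two-point, Webb six-point); the function names, the seeding scheme, and the list-of-lists layout are assumptions for illustration and say nothing about how the Rust or NumPy implementations in this PR are organised.

```python
import math
import random

SQRT5 = math.sqrt(5.0)
# Mammen two-point distribution: mean 0, variance 1.
MAMMEN_LOW = -(SQRT5 - 1.0) / 2.0
MAMMEN_HIGH = (SQRT5 + 1.0) / 2.0
MAMMEN_P_LOW = (SQRT5 + 1.0) / (2.0 * SQRT5)
# Webb six-point distribution: each point has probability 1/6.
WEBB_POINTS = [
    -math.sqrt(1.5), -1.0, -math.sqrt(0.5),
    math.sqrt(0.5), 1.0, math.sqrt(1.5),
]


def draw_weight(kind, rng):
    """Draw a single multiplier weight of the requested kind."""
    if kind == "rademacher":
        return 1.0 if rng.random() < 0.5 else -1.0
    if kind == "mammen":
        return MAMMEN_LOW if rng.random() < MAMMEN_P_LOW else MAMMEN_HIGH
    if kind == "webb":
        return rng.choice(WEBB_POINTS)
    raise ValueError(f"unknown weight type: {kind!r}")


def bootstrap_weights(n_bootstrap, n_units, kind="rademacher", seed=0):
    """Return n_bootstrap rows of n_units weights, reproducible for a given seed."""
    rng = random.Random(seed)
    return [[draw_weight(kind, rng) for _ in range(n_units)]
            for _ in range(n_bootstrap)]


batch = bootstrap_weights(n_bootstrap=200, n_units=50, kind="webb", seed=42)
```

All three schemes have mean zero and unit variance, which is what makes them interchangeable as multiplier weights; they differ in the number of support points and in skewness.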

Key features:
- Pure Python fallback always available (HAS_RUST_BACKEND flag)
- Automatic detection and use of Rust backend when available
- Error message translation for consistent Python exceptions
- 26 new tests for Rust backend functions

Performance (release mode):
- Synthetic weights: 5.2x faster than NumPy
- OLS: Comparable to SciPy's LAPACK
- CallawaySantAnna bootstrap: 200 iterations in ~5ms

Build changes:
- Switch from setuptools to maturin build backend
- Multi-platform wheel builds (Linux, macOS x86_64/ARM64, Windows)
- GitHub Actions workflow updated for cross-platform builds

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Major version bump for the addition of the optional Rust backend,
which represents a significant architectural change to the library.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add --backend argument to benchmark_basic.py and benchmark_callaway.py
  to select between pure Python and Rust backends
- Update run_benchmarks.py to run Python benchmarks twice (pure + rust)
  and display three-way timing comparison tables
- Add 20k scale configuration (20,000 units, 240k-360k observations)
- Update compare_results.py with three-way comparison report generation
- Update docs/benchmarks.rst with new benchmark results showing:
  - diff-diff is 2-22x faster than R across all scales
  - Rust backend shows minimal speedup for analytical SEs
  - Pure Python backend provides excellent performance without Rust

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add --backend argument to benchmark_synthdid.py for backend selection
- Update run_synthdid_benchmark() to run both pure Python and Rust backends
- Update docs/benchmarks.rst with SyntheticDiD results:
  - small: 2015x faster (7.5s R vs 0.004s Python)
  - 1k: 1082x faster (108s R vs 0.1s Python)
  - 5k: 725x faster (505s R vs 0.7s Python)
  - 10k: 429x faster (1106s R vs 2.6s Python)
- Update Dataset Sizes table to include SyntheticDiD configurations
- Update Key Observations to reflect massive SyntheticDiD speedups
- Note: Rust backend shows no additional speedup over pure Python

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The previous backend-disable mechanism was broken due to Python's
import semantics: `from module import NAME` binds a new name in the
importing module to the value at import time, so rebinding the original
attribute later is not seen. Setting `diff_diff.HAS_RUST_BACKEND = False`
after imports had no effect on already-imported modules.
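The behaviour described above can be reproduced without diff_diff at all. The sketch below uses two throwaway modules whose names (`demo_pkg`, `demo_consumer`) are hypothetical stand-ins for the package and one of its submodules.

```python
import sys
import types

# Stand-in for the package that owns the flag.
pkg = types.ModuleType("demo_pkg")
pkg.HAS_RUST_BACKEND = True
sys.modules["demo_pkg"] = pkg

# Stand-in for a submodule that imports the flag by name at import time.
consumer = types.ModuleType("demo_consumer")
exec("from demo_pkg import HAS_RUST_BACKEND", consumer.__dict__)

# Rebinding the attribute on the package afterwards...
pkg.HAS_RUST_BACKEND = False

# ...leaves the name already bound in the consumer untouched.
flag_seen_by_consumer = consumer.HAS_RUST_BACKEND
flag_on_package = pkg.HAS_RUST_BACKEND

del sys.modules["demo_pkg"]  # clean up the throwaway module
```

After this runs, the package attribute is `False` while the consumer still holds the value it imported, which is why a flag flipped after import cannot switch backends.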

Fix: Use DIFF_DIFF_BACKEND environment variable checked at import time.

Changes:
- diff_diff/__init__.py: Check DIFF_DIFF_BACKEND env var when setting
  HAS_RUST_BACKEND (supports 'auto', 'python', 'rust')
- benchmarks/python/benchmark_*.py: Parse --backend arg and set env var
  BEFORE importing diff_diff to ensure correct backend isolation
- docs/benchmarks.rst: Update with accurate benchmark results showing:
  - SyntheticDiD: Rust is 4-8x faster than pure Python
  - BasicDiD/CallawaySantAnna: Rust provides minimal benefit (~1x)

The fix enables proper measurement of pure Python vs Rust performance,
revealing that the Rust backend's benefit depends on the estimator.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
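A minimal sketch of how an import-time check of `DIFF_DIFF_BACKEND` could be resolved, using the three values named in this commit. The function name, its signature, and in particular the choice to raise when `rust` is requested but the extension is missing are assumptions for illustration, not the contents of `diff_diff/__init__.py`.

```python
import os

VALID_BACKENDS = ("auto", "python", "rust")


def resolve_backend(rust_importable, environ=None):
    """Return True if the Rust backend should be used (hypothetical helper).

    rust_importable: whether the compiled extension imported successfully.
    environ: mapping to read from; defaults to os.environ.
    """
    environ = os.environ if environ is None else environ
    choice = environ.get("DIFF_DIFF_BACKEND", "auto").strip().lower()
    if choice not in VALID_BACKENDS:
        raise ValueError(
            f"DIFF_DIFF_BACKEND must be one of {VALID_BACKENDS}, got {choice!r}"
        )
    if choice == "python":
        return False  # force the pure Python fallback
    if choice == "rust":
        if not rust_importable:
            # Assumed behaviour: fail loudly rather than fall back silently.
            raise ImportError("DIFF_DIFF_BACKEND=rust but the extension is unavailable")
        return True
    return rust_importable  # "auto": use Rust only when it is available


use_rust = resolve_backend(rust_importable=True, environ={"DIFF_DIFF_BACKEND": "python"})
```

Because the value is read once at import time, a caller such as a benchmark script has to set the variable before the first `import diff_diff`, as the commit describes.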
igerber pushed a commit that referenced this pull request Jan 11, 2026
Review covers methodology, code quality, performance, maintainability,
and technical debt analysis. Identifies critical issues with tolerance
constants and missing CI testing that should be addressed before merge.
igerber and others added 11 commits January 11, 2026 18:59
Critical fixes:
- Fix tolerance constant mismatch: Rust now uses 1e-8 to match Python
- Sync Cargo.toml version to 2.0.0 (matches pyproject.toml)

High priority fixes:
- Create diff_diff/_backend.py for backend detection to avoid circular
  imports. Modules now import from _backend.py instead of __init__.py
- Add comprehensive numerical equivalence tests comparing Rust and NumPy
  implementations (OLS, VCoV, bootstrap weights, synthetic weights)
- Update CLAUDE.md with Rust backend documentation and commands

CI improvements:
- Add .github/workflows/rust-test.yml for PR testing of Rust backend
- Tests Rust unit tests, Python tests with Rust, and pure Python fallback

Documentation:
- Add docstrings to _solve_ols_numpy, _compute_robust_vcov_numpy,
  and _generate_bootstrap_weights_batch_numpy

Deferred to post-merge:
- Rust code optimizations (matrix inversion, bootstrap allocation)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Change dtolnay/rust-action to dtolnay/rust-toolchain (correct action name)
- Add maturin to pip install in python-fallback job
- Add fallback install command for edge cases

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add OPENBLAS_DIR and PKG_CONFIG_PATH env vars for macOS builds
- Use PYTHONPATH for python-fallback job instead of pip install
  (maturin requires Rust toolchain which defeats the purpose of
  testing pure Python fallback)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add rlib to crate-type so cargo test can compile the library
- Replace maturin-action develop with maturin build + pip install
  (develop command requires virtualenv which isn't set up)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Move pyo3/extension-module to optional feature (not needed for tests)
- Update pyproject.toml to use the new feature name
- Use pip --find-links for wheel installation (glob wasn't expanding)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Without --no-index, pip was installing from PyPI instead of the
locally built wheel with Rust backend.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Maturin was building to rust/target/wheels/ but we were looking in
target/wheels/. Use -o dist to put wheel in a known location.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The source directory was shadowing the installed wheel. Running from
/tmp ensures Python imports the installed package with Rust backend.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Pytest adds the test directory's parent to sys.path, so imports resolved
to the source tree instead of the installed wheel. Copying the tests to
/tmp fully isolates the run from the source directory.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
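Shadowing problems like the two above are easier to catch with an explicit check of where a module would be loaded from. The helper below is a hypothetical sketch using only the standard library; it is not part of the workflow in this PR.

```python
import importlib.util
from pathlib import Path


def imported_from(module_name, directory):
    """Return True if module_name would be loaded from a file inside directory."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or not spec.has_location:
        return False  # module not found, or built-in with no file on disk
    origin = Path(spec.origin).resolve()
    return Path(directory).resolve() in origin.parents


# Example with a standard-library module: json lives one level below the stdlib dir.
import json

stdlib_dir = Path(json.__file__).resolve().parent.parent
json_is_in_stdlib = imported_from("json", stdlib_dir)
```

In a CI job, a check of this kind run against the repository checkout would fail fast if the source directory were shadowing the installed wheel.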
Documents post-merge optimization opportunities from PR #58 review:
- Matrix inversion efficiency (Cholesky)
- Reduce bootstrap allocations
- Consider static BLAS linking

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
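To make the Cholesky item concrete: for a symmetric positive-definite system such as the normal equations, factoring once and doing two triangular solves avoids forming an explicit inverse. The pure-Python sketch below shows the idea on a tiny system; the function names are assumptions, and a real implementation would call LAPACK rather than loop in Python.

```python
import math


def cholesky(A):
    """Return lower-triangular L with L L^T = A for symmetric positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


def cholesky_solve(A, b):
    """Solve A x = b via the Cholesky factor, without computing A's inverse."""
    L = cholesky(A)
    n = len(b)
    # Forward substitution: L z = b
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T x = z
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


# Small system standing in for X'X beta = X'y.
xtx = [[4.0, 2.0], [2.0, 3.0]]
xty = [2.0, 1.0]
beta = cholesky_solve(xtx, xty)
```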
igerber pushed a commit that referenced this pull request Jan 12, 2026
Re-reviewed after revisions. All critical and high-priority issues fixed:
- Tolerance constant now matches (1e-8)
- CI workflow added for Rust testing
- Numerical equivalence tests added
- Documentation updated
- Versions synchronized

Verdict: Approved for merge
@igerber igerber merged commit 9692b15 into main Jan 12, 2026
4 checks passed
@igerber igerber deleted the claude/add-rust-backend branch January 12, 2026 00:49
2 participants