Merged
32 changes: 32 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -5,6 +5,37 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [1.0.0] - 2026-01-04

### Added
- **Goodman-Bacon decomposition** for TWFE diagnostics
  - `BaconDecomposition` class for decomposing TWFE into weighted 2x2 comparisons
  - `Comparison2x2` dataclass for individual comparisons (treated_vs_never, earlier_vs_later, later_vs_earlier)
  - `BaconDecompositionResults` with weights and estimates by comparison type
  - `bacon_decompose()` convenience function
  - `plot_bacon()` visualization for decomposition results
  - Integration via `TwoWayFixedEffects.decompose()` method
- **Power analysis** for study design
  - `PowerAnalysis` class for analytical power calculations
  - `PowerResults` and `SimulationPowerResults` dataclasses
  - `compute_mde()`, `compute_power()`, `compute_sample_size()` convenience functions
  - `simulate_power()` for Monte Carlo simulation-based power analysis
  - `plot_power_curve()` visualization for power analysis
  - Tutorial notebook: `docs/tutorials/06_power_analysis.ipynb`
- **Callaway-Sant'Anna multiplier bootstrap** for inference
  - `CSBootstrapResults` with standard errors, confidence intervals, p-values
  - Rademacher, Mammen, and Webb weight distributions
  - Bootstrap inference for all aggregation methods
- **Troubleshooting guide** in documentation
- **Standard error computation guide** explaining SE differences across estimators

### Changed
- Updated package status to Production/Stable (was Alpha)
- SyntheticDiD bootstrap now warns when >5% of iterations fail

### Fixed
- Silent bootstrap failures in SyntheticDiD now produce warnings

## [0.6.0]

### Added
@@ -136,6 +167,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- `to_dict()` and `to_dataframe()` export methods
- `is_significant` and `significance_stars` properties

[1.0.0]: https://github.com/igerber/diff-diff/compare/v0.6.0...v1.0.0
[0.6.0]: https://github.com/igerber/diff-diff/compare/v0.5.0...v0.6.0
[0.5.0]: https://github.com/igerber/diff-diff/compare/v0.4.0...v0.5.0
[0.4.0]: https://github.com/igerber/diff-diff/compare/v0.3.0...v0.4.0
2 changes: 1 addition & 1 deletion diff_diff/__init__.py
@@ -85,7 +85,7 @@
plot_sensitivity,
)

-__version__ = "0.9.0"
+__version__ = "1.0.0"
__all__ = [
# Estimators
"DifferenceInDifferences",
14 changes: 14 additions & 0 deletions diff_diff/estimators.py
@@ -1774,6 +1774,20 @@ def _bootstrap_se(
continue

        bootstrap_estimates = np.array(bootstrap_estimates)

        # Warn if too many bootstrap iterations failed
        n_successful = len(bootstrap_estimates)
        failure_rate = 1 - (n_successful / self.n_bootstrap)
        if failure_rate > 0.05:
            warnings.warn(
                f"Only {n_successful}/{self.n_bootstrap} bootstrap iterations succeeded "
                f"({failure_rate:.1%} failure rate). Standard errors may be unreliable. "
                f"This can occur with small samples, near-singular weight matrices, "
                f"or insufficient pre-treatment periods.",
                UserWarning,
                stacklevel=2,
            )

        se = np.std(bootstrap_estimates, ddof=1) if len(bootstrap_estimates) > 1 else 0.0

        return se, bootstrap_estimates
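
Callers may want to detect this warning programmatically rather than rely on it appearing in logs. The following is a minimal sketch, not the library's code: `bootstrap_se` and its `max_failure_rate` parameter are hypothetical stand-ins that mirror the failure-rate check in the hunk above, so the example runs without `diff_diff` installed.

```python
import warnings

import numpy as np


def bootstrap_se(estimates, n_bootstrap, max_failure_rate=0.05):
    """Hypothetical stand-in mirroring the failure-rate check in the diff."""
    estimates = np.asarray(estimates, dtype=float)
    n_successful = len(estimates)
    failure_rate = 1 - (n_successful / n_bootstrap)
    if failure_rate > max_failure_rate:
        warnings.warn(
            f"Only {n_successful}/{n_bootstrap} bootstrap iterations succeeded "
            f"({failure_rate:.1%} failure rate). Standard errors may be unreliable.",
            UserWarning,
            stacklevel=2,
        )
    return float(np.std(estimates, ddof=1)) if n_successful > 1 else 0.0


# Simulate 200 requested iterations of which only 150 produced an estimate.
rng = np.random.default_rng(0)
successful = rng.normal(loc=1.0, scale=0.2, size=150)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not suppressed
    se = bootstrap_se(successful, n_bootstrap=200)

# True when at least one UserWarning was raised during the call.
unreliable = any(issubclass(w.category, UserWarning) for w in caught)
```

Recording warnings this way lets a pipeline flag or discard an estimate whose standard error rests on too few successful draws, instead of silently reporting it.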
51 changes: 51 additions & 0 deletions docs/choosing_estimator.rst
@@ -205,6 +205,57 @@ Common Pitfalls

*Solution*: Always specify ``cluster_col`` for panel data.

Standard Error Methods
----------------------

Different estimators compute standard errors differently. Understanding these
differences helps interpret results and choose appropriate inference.

.. list-table::
   :header-rows: 1
   :widths: 20 25 55

   * - Estimator
     - Default SE Method
     - Details
   * - ``DifferenceInDifferences``
     - HC1 (heteroskedasticity-robust)
     - Uses White's robust SEs by default. Specify ``cluster_col`` for cluster-robust SEs. Use ``inference='wild_bootstrap'`` for few clusters (<30).
   * - ``TwoWayFixedEffects``
     - Cluster-robust (unit level)
     - Clusters at the unit level by default, after the within-transformation. Specify ``cluster_col`` to cluster on a different variable. Use ``inference='wild_bootstrap'`` for few clusters.
   * - ``MultiPeriodDiD``
     - HC1 (heteroskedasticity-robust)
     - Same as basic DiD. Cluster-robust SEs are available via ``cluster_col``. Wild bootstrap is not yet supported for multi-coefficient inference.
   * - ``CallawaySantAnna``
     - Analytical (simple difference)
     - Uses the simple variance of group-time means. Use the ``bootstrap()`` method for multiplier bootstrap inference (SEs, CIs, and p-values).
   * - ``SyntheticDiD``
     - Bootstrap or placebo-based
     - Uses bootstrap resampling by default. Set ``n_bootstrap=0`` for placebo-based inference using pre-treatment residuals.

**Recommendations by sample size:**

- **Large samples (N > 1000, clusters > 50)**: Default analytical SEs are reliable
- **Medium samples (clusters 30-50)**: Cluster-robust SEs recommended
- **Small samples (clusters < 30)**: Use wild cluster bootstrap (``inference='wild_bootstrap'``)
- **Very few clusters (< 10)**: Use the wild cluster bootstrap with the Webb 6-point distribution (``weight_type='webb'``)
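
To make the weight distributions behind these recommendations concrete, here is a small sketch of the Rademacher, Mammen, and Webb six-point weights as they are commonly defined in the wild bootstrap literature. The names `WEIGHT_DISTRIBUTIONS`, `draw_weights`, and `moments` are illustrative assumptions and are not part of the ``diff_diff`` API; how the library draws its weights internally is not shown in this change.

```python
import numpy as np

SQRT5 = np.sqrt(5.0)

# Each entry maps a name to (support points, probabilities).
WEIGHT_DISTRIBUTIONS = {
    "rademacher": (np.array([-1.0, 1.0]), np.array([0.5, 0.5])),
    "mammen": (
        np.array([-(SQRT5 - 1) / 2, (SQRT5 + 1) / 2]),
        np.array([(SQRT5 + 1) / (2 * SQRT5), (SQRT5 - 1) / (2 * SQRT5)]),
    ),
    "webb": (
        np.array([-np.sqrt(1.5), -1.0, -np.sqrt(0.5),
                  np.sqrt(0.5), 1.0, np.sqrt(1.5)]),
        np.full(6, 1 / 6),
    ),
}


def draw_weights(kind, n_clusters, rng):
    """Draw one bootstrap weight per cluster from the named distribution."""
    support, probs = WEIGHT_DISTRIBUTIONS[kind]
    return rng.choice(support, size=n_clusters, p=probs)


def moments(kind):
    """Exact mean and variance of the named distribution."""
    support, probs = WEIGHT_DISTRIBUTIONS[kind]
    mean = float(np.sum(probs * support))
    var = float(np.sum(probs * (support - mean) ** 2))
    return mean, var


rng = np.random.default_rng(42)
weights = draw_weights("webb", n_clusters=8, rng=rng)
```

All three distributions have mean 0 and variance 1, which ``moments`` confirms. The practical difference with very few clusters is the number of distinct weight vectors: with 8 clusters, Rademacher weights allow only 2^8 = 256 sign patterns, while the six-point distribution allows 6^8, which gives a much finer bootstrap distribution.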

**Common pitfall:** Forgetting to cluster when units are observed multiple times.
For panel data, always cluster at the unit level unless you have a strong reason not to.

.. code-block:: python

   from diff_diff import DifferenceInDifferences

   # Good: Cluster at unit level for panel data
   did = DifferenceInDifferences()
   results = did.fit(data, outcome='y', treated='treated',
                     post='post', cluster_col='unit_id')

   # Better for few clusters: Wild bootstrap
   did = DifferenceInDifferences(inference='wild_bootstrap')
   results = did.fit(data, outcome='y', treated='treated',
                     post='post', cluster_col='state')
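
To illustrate why clustering matters for panel data, the sketch below computes HC1 and cluster-robust standard errors for a plain OLS fit using NumPy only. It is an illustration under stated assumptions, not the library's implementation: ``ols_standard_errors`` is a hypothetical helper, and the small-sample correction ``G/(G-1) * (n-1)/(n-k)`` is one common convention that ``diff_diff`` may or may not share.

```python
import numpy as np


def ols_standard_errors(X, y, clusters=None):
    """OLS coefficients with HC1 or (if clusters given) cluster-robust SEs."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    if clusters is None:
        # HC1: White's estimator scaled by n / (n - k).
        meat = (X * resid[:, None] ** 2).T @ X
        scale = n / (n - k)
    else:
        # Sum the scores x_i * e_i within each cluster before forming the meat.
        labels, idx = np.unique(clusters, return_inverse=True)
        scores = np.zeros((len(labels), k))
        np.add.at(scores, idx, X * resid[:, None])
        meat = scores.T @ scores
        g = len(labels)
        scale = (g / (g - 1)) * ((n - 1) / (n - k))

    vcov = scale * XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(vcov))


# Simulated panel: 40 units observed for 10 periods, treatment fixed by unit,
# and a unit-level shock that makes errors correlated within each unit.
rng = np.random.default_rng(0)
n_units, n_periods = 40, 10
unit = np.repeat(np.arange(n_units), n_periods)
treated = (unit < n_units // 2).astype(float)
unit_shock = rng.normal(size=n_units)[unit]
y = 1.0 + 0.5 * treated + unit_shock + rng.normal(size=unit.size)
X = np.column_stack([np.ones_like(treated), treated])

beta, se_hc1 = ols_standard_errors(X, y)
_, se_cluster = ols_standard_errors(X, y, clusters=unit)
```

Because the unit-level shock is shared across all ten observations of a unit, the HC1 standard error on the treatment coefficient treats 400 observations as independent and comes out smaller than the cluster-robust one, which is the overconfidence the pitfall above warns about.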

When in Doubt
-------------

2 changes: 2 additions & 0 deletions docs/index.rst
@@ -40,6 +40,7 @@ Quick Links

- :doc:`quickstart` - Get started with basic examples
- :doc:`choosing_estimator` - Which estimator should I use?
- :doc:`troubleshooting` - Common issues and solutions
- :doc:`r_comparison` - Comparison with R packages
- :doc:`python_comparison` - Comparison with Python packages
- :doc:`api/index` - Full API reference
@@ -51,6 +52,7 @@ Quick Links

quickstart
choosing_estimator
troubleshooting
r_comparison
python_comparison
