diff --git a/CHANGELOG.md b/CHANGELOG.md
index 3332a47..50c4bc1 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,37 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [1.0.0] - 2026-01-04
+
+### Added
+- **Goodman-Bacon decomposition** for TWFE diagnostics
+  - `BaconDecomposition` class for decomposing TWFE into weighted 2x2 comparisons
+  - `Comparison2x2` dataclass for individual comparisons (treated_vs_never, earlier_vs_later, later_vs_earlier)
+  - `BaconDecompositionResults` with weights and estimates by comparison type
+  - `bacon_decompose()` convenience function
+  - `plot_bacon()` visualization for decomposition results
+  - Integration via `TwoWayFixedEffects.decompose()` method
+- **Power analysis** for study design
+  - `PowerAnalysis` class for analytical power calculations
+  - `PowerResults` and `SimulationPowerResults` dataclasses
+  - `compute_mde()`, `compute_power()`, `compute_sample_size()` convenience functions
+  - `simulate_power()` for Monte Carlo simulation-based power analysis
+  - `plot_power_curve()` visualization for power analysis
+  - Tutorial notebook: `docs/tutorials/06_power_analysis.ipynb`
+- **Callaway-Sant'Anna multiplier bootstrap** for inference
+  - `CSBootstrapResults` with standard errors, confidence intervals, p-values
+  - Rademacher, Mammen, and Webb weight distributions
+  - Bootstrap inference for all aggregation methods
+- **Troubleshooting guide** in documentation
+- **Standard error computation guide** explaining SE differences across estimators
+
+### Changed
+- Updated package status to Production/Stable (was Alpha)
+- SyntheticDiD bootstrap now warns when >5% of iterations fail
+
+### Fixed
+- Silent bootstrap failures in SyntheticDiD now produce warnings
+
 ## [0.6.0]
 
 ### Added
@@ -136,6 +167,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `to_dict()` and `to_dataframe()` export methods
 - `is_significant` and `significance_stars` properties
 
+[1.0.0]: https://github.com/igerber/diff-diff/compare/v0.6.0...v1.0.0
 [0.6.0]: https://github.com/igerber/diff-diff/compare/v0.5.0...v0.6.0
 [0.5.0]: https://github.com/igerber/diff-diff/compare/v0.4.0...v0.5.0
 [0.4.0]: https://github.com/igerber/diff-diff/compare/v0.3.0...v0.4.0
diff --git a/diff_diff/__init__.py b/diff_diff/__init__.py
index f029093..47d3238 100644
--- a/diff_diff/__init__.py
+++ b/diff_diff/__init__.py
@@ -85,7 +85,7 @@
     plot_sensitivity,
 )
-__version__ = "0.9.0"
+__version__ = "1.0.0"
 
 __all__ = [
     # Estimators
     "DifferenceInDifferences",
diff --git a/diff_diff/estimators.py b/diff_diff/estimators.py
index a2df476..808dc3b 100644
--- a/diff_diff/estimators.py
+++ b/diff_diff/estimators.py
@@ -1774,6 +1774,20 @@ def _bootstrap_se(
                 continue
 
         bootstrap_estimates = np.array(bootstrap_estimates)
+
+        # Warn if too many bootstrap iterations failed
+        n_successful = len(bootstrap_estimates)
+        failure_rate = 1 - (n_successful / self.n_bootstrap)
+        if failure_rate > 0.05:
+            warnings.warn(
+                f"Only {n_successful}/{self.n_bootstrap} bootstrap iterations succeeded "
+                f"({failure_rate:.1%} failure rate). Standard errors may be unreliable. "
+                f"This can occur with small samples, near-singular weight matrices, "
+                f"or insufficient pre-treatment periods.",
+                UserWarning,
+                stacklevel=2,
+            )
+
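+        # SE is the sample standard deviation (ddof=1) over the successful
+        # draws only; the warning above fires when more than 5% of draws failed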
" + f"This can occur with small samples, near-singular weight matrices, " + f"or insufficient pre-treatment periods.", + UserWarning, + stacklevel=2, + ) + se = np.std(bootstrap_estimates, ddof=1) if len(bootstrap_estimates) > 1 else 0.0 return se, bootstrap_estimates diff --git a/docs/choosing_estimator.rst b/docs/choosing_estimator.rst index f0a1965..ceb48dd 100644 --- a/docs/choosing_estimator.rst +++ b/docs/choosing_estimator.rst @@ -205,6 +205,57 @@ Common Pitfalls *Solution*: Always specify ``cluster_col`` for panel data. +Standard Error Methods +---------------------- + +Different estimators compute standard errors differently. Understanding these +differences helps interpret results and choose appropriate inference. + +.. list-table:: + :header-rows: 1 + :widths: 20 25 55 + + * - Estimator + - Default SE Method + - Details + * - ``DifferenceInDifferences`` + - HC1 (heteroskedasticity-robust) + - Uses White's robust SEs by default. Specify ``cluster_col`` for cluster-robust SEs. Use ``inference='wild_bootstrap'`` for few clusters (<30). + * - ``TwoWayFixedEffects`` + - Cluster-robust (unit level) + - Always clusters at unit level after within-transformation. Specify ``cluster_col`` to override. Use ``inference='wild_bootstrap'`` for few clusters. + * - ``MultiPeriodDiD`` + - HC1 (heteroskedasticity-robust) + - Same as basic DiD. Cluster-robust available via ``cluster_col``. Wild bootstrap not yet supported for multi-coefficient inference. + * - ``CallawaySantAnna`` + - Analytical (simple difference) + - Uses simple variance of group-time means. Use ``bootstrap()`` method for multiplier bootstrap inference with proper SEs, CIs, and p-values. + * - ``SyntheticDiD`` + - Bootstrap or placebo-based + - Default uses bootstrap resampling. Set ``n_bootstrap=0`` for placebo-based inference using pre-treatment residuals. + +**Recommendations by sample size:** + +- **Large samples (N > 1000, clusters > 50)**: Default analytical SEs are reliable +- **Medium samples (clusters 30-50)**: Cluster-robust SEs recommended +- **Small samples (clusters < 30)**: Use wild cluster bootstrap (``inference='wild_bootstrap'``) +- **Very few clusters (< 10)**: Use Webb 6-point distribution (``weight_type='webb'``) + +**Common pitfall:** Forgetting to cluster when units are observed multiple times. +For panel data, always cluster at the unit level unless you have a strong reason not to. + +.. code-block:: python + + # Good: Cluster at unit level for panel data + did = DifferenceInDifferences() + results = did.fit(data, outcome='y', treated='treated', + post='post', cluster_col='unit_id') + + # Better for few clusters: Wild bootstrap + did = DifferenceInDifferences(inference='wild_bootstrap') + results = did.fit(data, outcome='y', treated='treated', + post='post', cluster_col='state') + When in Doubt ------------- diff --git a/docs/index.rst b/docs/index.rst index cad1203..b60a9d4 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -40,6 +40,7 @@ Quick Links - :doc:`quickstart` - Get started with basic examples - :doc:`choosing_estimator` - Which estimator should I use? 
+
 When in Doubt
 -------------
diff --git a/docs/index.rst b/docs/index.rst
index cad1203..b60a9d4 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -40,6 +40,7 @@ Quick Links
 
 - :doc:`quickstart` - Get started with basic examples
 - :doc:`choosing_estimator` - Which estimator should I use?
+- :doc:`troubleshooting` - Common issues and solutions
 - :doc:`r_comparison` - Comparison with R packages
 - :doc:`python_comparison` - Comparison with Python packages
 - :doc:`api/index` - Full API reference
@@ -51,6 +52,7 @@
 
    quickstart
    choosing_estimator
+   troubleshooting
    r_comparison
    python_comparison
diff --git a/docs/troubleshooting.rst b/docs/troubleshooting.rst
new file mode 100644
index 0000000..19b86b9
--- /dev/null
+++ b/docs/troubleshooting.rst
@@ -0,0 +1,317 @@
+Troubleshooting
+===============
+
+This guide covers common issues and their solutions when using diff-diff.
+
+Data Issues
+-----------
+
+"No treated observations found"
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+**Problem:** The estimator raises an error because no treated units were found.
+
+**Causes:**
+
+1. Treatment column contains wrong values (e.g., strings instead of 0/1)
+2. Treatment column has all zeros
+3. Column name is misspelled
+
+**Solutions:**
+
+.. code-block:: python
+
+    # Check your treatment column
+    print(data['treated'].value_counts())
+
+    # Ensure binary 0/1 values
+    data['treated'] = (data['group'] == 'treatment').astype(int)
+
+    # Or use make_treatment_indicator
+    from diff_diff import make_treatment_indicator
+    data['treated'] = make_treatment_indicator(data, 'group', treated_value='treatment')
+
+"Panel is unbalanced"
+~~~~~~~~~~~~~~~~~~~~~
+
+**Problem:** TwoWayFixedEffects or CallawaySantAnna fails with an unbalanced panel.
+
+**Causes:**
+
+1. Some units are missing observations for certain time periods
+2. Units have different numbers of observations
+
+**Solutions:**
+
+.. code-block:: python
+
+    from diff_diff import balance_panel
+
+    # Balance the panel (keeps only units with all periods)
+    balanced = balance_panel(data, unit='unit_id', time='period')
+    print(f"Dropped {len(data) - len(balanced)} observations")
+
+    # Alternative: check balance first
+    from diff_diff import validate_did_data
+    issues = validate_did_data(data, outcome='y', treated='treated',
+                               unit='unit_id', time='period')
+    print(issues)
+
+Estimation Errors
+-----------------
+
+"Singular matrix" or "Matrix is singular"
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+**Problem:** A linear algebra error occurs during estimation.
+
+**Causes:**
+
+1. Perfect collinearity in covariates
+2. Too few observations relative to parameters
+3. Fixed effects that absorb all variation
+
+**Solutions:**
+
+.. code-block:: python
+
+    # Check for collinearity
+    import numpy as np
+    X = data[['x1', 'x2', 'x3']].values
+    print(f"Matrix rank: {np.linalg.matrix_rank(X)} vs {X.shape[1]} columns")
+
+    # Remove redundant covariates
+    # Or use fewer fixed effects
+
+    # For SyntheticDiD, increase regularization
+    sdid = SyntheticDiD(lambda_reg=1e-4)  # default is 1e-6
+
+"Bootstrap iterations failed" warning
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+**Problem:** SyntheticDiD warns that many bootstrap iterations failed.
+
+**Causes:**
+
+1. Small sample size leads to singular matrices in resamples
+2. Insufficient pre-treatment periods for weight computation
+3. Near-singular weight matrices
+
+**Solutions:**
+
+.. code-block:: python
+
+    # Increase regularization
+    sdid = SyntheticDiD(lambda_reg=1e-4, n_bootstrap=500)
+
+    # Or use placebo-based inference instead
+    sdid = SyntheticDiD(n_bootstrap=0)  # Uses placebo inference
+
+    # Ensure sufficient pre-treatment periods (recommend >= 4)
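+
+To check for this failure programmatically, capture the warning with the
+standard library. A minimal sketch: the matched text comes from the warning
+message ``SyntheticDiD`` emits, and the ``fit`` arguments are elided as
+elsewhere in this guide.
+
+.. code-block:: python
+
+    import warnings
+
+    with warnings.catch_warnings(record=True) as caught:
+        warnings.simplefilter("always")
+        results = sdid.fit(data, ...)
+
+    if any("bootstrap iterations succeeded" in str(w.message) for w in caught):
+        print("Bootstrap SEs may be unreliable; consider n_bootstrap=0.")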
+
+Standard Error Issues
+---------------------
+
+"Standard errors seem too small/large"
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+**Problem:** SEs don't match expectations or other software.
+
+**Causes:**
+
+1. Wrong clustering level
+2. Not accounting for serial correlation
+3. Different SE formulas (HC0 vs HC1 vs cluster)
+
+**Solutions:**
+
+.. code-block:: python
+
+    # For panel data, always cluster at unit level
+    results = did.fit(data, outcome='y', treated='treated',
+                      post='post', cluster_col='unit_id')
+
+    # Compare SE methods
+    did_robust = DifferenceInDifferences()
+    did_cluster = DifferenceInDifferences()
+    did_wild = DifferenceInDifferences(inference='wild_bootstrap')
+
+    r1 = did_robust.fit(data, outcome='y', treated='treated', post='post')
+    r2 = did_cluster.fit(data, outcome='y', treated='treated',
+                         post='post', cluster_col='unit_id')
+    r3 = did_wild.fit(data, outcome='y', treated='treated',
+                      post='post', cluster_col='unit_id')
+
+    print(f"Robust SE: {r1.se:.4f}")
+    print(f"Cluster SE: {r2.se:.4f}")
+    print(f"Wild bootstrap SE: {r3.se:.4f}")
+
+"Wild bootstrap takes too long"
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+**Problem:** Bootstrap inference is slow.
+
+**Solutions:**
+
+.. code-block:: python
+
+    # Reduce the number of bootstrap iterations (default is 999)
+    did = DifferenceInDifferences(inference='wild_bootstrap', n_bootstrap=499)
+
+    # Note: Fewer iterations = less precise p-values
+    # 499 is the minimum recommended for publication
+
+Staggered Adoption Issues
+-------------------------
+
+"No never-treated units found"
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+**Problem:** CallawaySantAnna fails when using ``control_group='never_treated'``.
+
+**Causes:**
+
+1. All units are eventually treated
+2. ``first_treat`` column has no never-treated indicator (typically 0 or inf)
+
+**Solutions:**
+
+.. code-block:: python
+
+    # Check first_treat distribution
+    print(data['first_treat'].value_counts())
+
+    # Option 1: Use not-yet-treated as controls
+    cs = CallawaySantAnna(control_group='not_yet_treated')
+
+    # Option 2: Mark never-treated units correctly
+    # Never-treated should have first_treat = 0 or np.inf
+    data.loc[data['ever_treated'] == 0, 'first_treat'] = 0
+
+"Group-time effects have large standard errors"
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+**Problem:** ATT(g,t) estimates are imprecise.
+
+**Causes:**
+
+1. Small cohort sizes
+2. Few comparison periods
+3. High variance in outcomes
+
+**Solutions:**
+
+.. code-block:: python
+
+    # Check cohort sizes
+    print(data.groupby('first_treat')['unit_id'].nunique())
+
+    # Use bootstrap for better inference
+    results = cs.fit(data, ...)
+    bootstrap_results = results.bootstrap(n_bootstrap=999)
+
+    # Aggregate to get more precise estimates
+    event_study = results.aggregate('event_time')
+    overall_att = results.att  # Aggregated ATT
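+
+The multiplier bootstrap added in 1.0.0 supports Rademacher, Mammen, and Webb
+weight distributions; Webb weights may behave better when cohorts contain very
+few units. A sketch, assuming the distribution is chosen via a ``weight_type``
+argument that mirrors the wild-bootstrap option:
+
+.. code-block:: python
+
+    # Webb weights (assumed keyword; see the changelog's supported distributions)
+    bootstrap_results = results.bootstrap(n_bootstrap=999, weight_type='webb')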
+
+Visualization Issues
+--------------------
+
+"Event study plot looks wrong"
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+**Problem:** Plot has unexpected gaps, wrong reference period, or missing periods.
+
+**Solutions:**
+
+.. code-block:: python
+
+    from diff_diff import plot_event_study
+
+    # Check your results first
+    print(results.period_effects)  # or results.event_study_effects
+
+    # Specify reference period explicitly
+    plot_event_study(results, reference_period=-1)
+
+    # For CallawaySantAnna, aggregate first
+    event_study = results.aggregate('event_time')
+    plot_event_study(event_study)
+
+"Plot doesn't show in Jupyter"
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+**Problem:** Matplotlib figure doesn't display.
+
+**Solutions:**
+
+.. code-block:: python
+
+    import matplotlib.pyplot as plt
+
+    # Option 1: Use plt.show()
+    fig = plot_event_study(results)
+    plt.show()
+
+    # Option 2: Use inline magic (Jupyter)
+    %matplotlib inline
+
+    # Option 3: Return and display figure
+    fig = plot_event_study(results)
+    fig  # Display in Jupyter
+
+Performance Issues
+------------------
+
+"Estimation is slow"
+~~~~~~~~~~~~~~~~~~~~
+
+**Problem:** Fitting takes a long time.
+
+**Causes:**
+
+1. Large dataset with many fixed effects
+2. Bootstrap inference with many iterations
+3. CallawaySantAnna with many cohorts and time periods
+
+**Solutions:**
+
+.. code-block:: python
+
+    # Use absorb instead of fixed_effects for high-dimensional FE
+    twfe = TwoWayFixedEffects()
+    results = twfe.fit(data, outcome='y', treated='treated',
+                       unit='unit_id', time='period',
+                       absorb=['unit_id', 'period'])  # Faster than fixed_effects
+
+    # Reduce bootstrap iterations for initial exploration
+    did = DifferenceInDifferences(inference='wild_bootstrap', n_bootstrap=99)
+
+    # For CallawaySantAnna, start without bootstrap
+    cs = CallawaySantAnna()
+    results = cs.fit(data, ...)
+    # Only bootstrap for final results
+    bootstrap_results = results.bootstrap(n_bootstrap=999)
+
+Getting Help
+------------
+
+If you encounter issues not covered here:
+
+1. **Check the API documentation** for parameter details
+2. **Run validation** with ``validate_did_data()`` to catch data issues
+3. **Start simple** with basic DiD before adding complexity
+4. **Compare with known results** using ``generate_did_data()``
+
+.. code-block:: python
+
+    # Generate test data with known effect
+    from diff_diff import generate_did_data, DifferenceInDifferences
+
+    data = generate_did_data(n_units=100, n_periods=10, treatment_effect=2.0)
+    did = DifferenceInDifferences()
+    results = did.fit(data, outcome='y', treated='treated', post='post')
+    print(f"True effect: 2.0, Estimated: {results.att:.3f}")
+
+For bugs or feature requests, please open an issue on
+`GitHub <https://github.com/igerber/diff-diff/issues>`_.
diff --git a/pyproject.toml b/pyproject.toml
index 90187cb..d23b5b8 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "diff-diff"
-version = "0.6.0"
+version = "1.0.0"
 description = "A library for Difference-in-Differences causal inference analysis"
 readme = "README.md"
 license = "MIT"
@@ -20,7 +20,7 @@ keywords = [
     "treatment-effects",
 ]
 classifiers = [
-    "Development Status :: 3 - Alpha",
+    "Development Status :: 5 - Production/Stable",
     "Intended Audience :: Science/Research",
     "Operating System :: OS Independent",
     "Programming Language :: Python :: 3",