Structural break detection and estimation for time series and panel data in Python.
"Did the relationship change? When? By how much?" — The three questions every applied researcher faces when parameters might not be constant.
Economic and financial time series rarely stay stable forever. Policy shifts, crises, and regime changes cause the data-generating process to break: coefficients jump, variances shift, trends bend. Ignoring these breaks means your regressions are misspecified, your confidence intervals are wrong, and policy conclusions drawn from pooled samples are unreliable.
Structural break methods ask: did something change, when, and by how much? The Bai–Perron (1998, 2003) framework answers these questions with formal test statistics, break-date estimation via global SSR minimization, and asymptotic confidence intervals for each break location.
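For readers who want the objects behind those three questions, the block below writes out the pure structural change model, the break-date estimator, and a Chow-type F statistic. The notation ($m$ breaks, $T$ observations, $q$ coefficients per regime, $\mathrm{SSR}$ for the sum of squared residuals) is the usual textbook convention rather than anything taken from the package, and the F statistic is shown only in its simplest homoskedastic form.

```latex
% Model: coefficients are constant within each of the m+1 regimes
\[
  y_t = x_t'\beta_j + u_t, \qquad t = T_{j-1}+1, \dots, T_j, \quad j = 1, \dots, m+1,
  \qquad T_0 = 0, \; T_{m+1} = T .
\]

% Break dates: global minimizers of the total sum of squared residuals
\[
  (\hat T_1, \dots, \hat T_m)
  = \arg\min_{T_1 < \dots < T_m}
    \sum_{j=1}^{m+1} \; \sum_{t=T_{j-1}+1}^{T_j} \bigl(y_t - x_t'\hat\beta_j\bigr)^2 .
\]

% Chow-type F statistic for 0 versus k breaks (homoskedastic errors);
% the SupF statistic takes the supremum over all admissible partitions
\[
  F_T(k) = \frac{T - (k+1)q}{kq} \cdot
           \frac{\mathrm{SSR}_0 - \mathrm{SSR}_k}{\mathrm{SSR}_k},
  \qquad
  \sup F_T(k) = \sup_{(T_1, \dots, T_k) \,\text{admissible}} F_T(k) .
\]
```

Because $\mathrm{SSR}_0$ does not depend on the partition, the supremum of this particular statistic is reached at the SSR-minimizing break dates, which is why testing and estimation share the same search.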
Despite two decades of theoretical maturity, Python had no dedicated implementation that returns break dates, confidence intervals, regime coefficients, test statistics, and selection decisions as programmatic objects. xtbreak fills that gap.
- Test for structural breaks — SupF, UDmax, and sequential tests with known or unknown break counts
- Estimate break dates — global SSR minimization via the Bai–Perron dynamic programming algorithm
- Confidence intervals — asymptotic CIs for each estimated break location
- Break number selection — sequential testing and information criteria (BIC/LWZ)
- Regime statistics — per-regime coefficients, standard errors, and sample sizes in structured result objects
- Panel data support — partial structural breaks with cross-sectional dependence factors
- Serializable results — every result object is JSON-safe for logging, storage, and pipeline integration
- Pure Python + NumPy — no compiled extensions, no Stata dependency, installs anywhere Python runs
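To make the dynamic programming idea in the list above concrete, here is a minimal NumPy sketch of a global SSR search over multiple breaks. It is an illustration under stated assumptions, not xtbreak's implementation: the function names `segment_ssr` and `dp_breaks` are hypothetical, the minimum regime length is taken as `floor(trimming * T)`, and a break index `b` means that a new regime starts at observation `b` (0-based).

```python
import numpy as np

def segment_ssr(y, x, h):
    """ssr[i, j] = OLS sum of squared residuals on observations i..j-1.

    Segments shorter than h observations are left at +inf (inadmissible).
    """
    T = len(y)
    ssr = np.full((T + 1, T + 1), np.inf)
    for i in range(T - h + 1):
        for j in range(i + h, T + 1):
            beta, *_ = np.linalg.lstsq(x[i:j], y[i:j], rcond=None)
            resid = y[i:j] - x[i:j] @ beta
            ssr[i, j] = resid @ resid
    return ssr

def dp_breaks(y, x, m, trimming=0.15):
    """Return (break indices, minimal total SSR) for exactly m breaks."""
    T = len(y)
    h = max(int(np.floor(trimming * T)), x.shape[1])  # assumed rounding rule
    ssr = segment_ssr(y, x, h)
    # cost[k, j]: minimal SSR when the first j observations contain k breaks
    cost = np.full((m + 1, T + 1), np.inf)
    last = np.zeros((m + 1, T + 1), dtype=int)
    cost[0] = ssr[0]
    for k in range(1, m + 1):
        for j in range((k + 1) * h, T + 1):
            cands = np.arange(k * h, j - h + 1)  # start of the final segment
            totals = cost[k - 1, cands] + ssr[cands, j]
            best = int(np.argmin(totals))
            cost[k, j] = totals[best]
            last[k, j] = cands[best]
    # Walk back from the full sample to recover the break positions
    breaks, j = [], T
    for k in range(m, 0, -1):
        j = int(last[k, j])
        breaks.append(j)
    return sorted(breaks), float(cost[m, T])

# Three regimes with means 0, 4, -1 and true breaks at 40 and 80
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0, 0.5, 40), rng.normal(4, 0.5, 40), rng.normal(-1, 0.5, 40)])
x = np.ones((120, 1))
breaks, total = dp_breaks(y, x, m=2)
print(breaks, round(total, 2))
```

The sketch refits OLS for every admissible segment, which is simple but slow. Bai and Perron (2003) describe recursive updates that avoid refitting.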
```bash
pip install git+https://github.com/gorgeousfish/pyxtbreak.git#subdirectory=xtbreak-py
```

Optional dependencies for full functionality:

```bash
pip install scipy   # p-values for Chow tests
pip install pandas  # DataFrame input support
```

Detect and estimate a single mean shift in a simulated series:
```python
import numpy as np
from xtbreak import test, estimate

# Simulate: mean jumps from 0 to 3 at observation 50
np.random.seed(2024)
y = np.concatenate([np.random.normal(0, 1, 50), np.random.normal(3, 1, 50)])
x = np.ones((100, 1))

# Test for one structural break
result = test(y, x, breaks=1, trimming=0.15, vce='ssr')
print(f"SupF statistic: {result.statistic:.2f}")
# SupF statistic: 232.47

# Estimate the break location
est = estimate(y, x, breaks=1, trimming=0.15)
brk = est.break_estimates[0]
print(f"Estimated break at index: {brk.index}")
# Estimated break at index: 50
print(f"95% confidence interval: {brk.ci_index}")
# 95% confidence interval: (48, 52)

# Inspect regime-specific parameters
for rs in est.regime_statistics:
    print(f"Regime {rs.regime_index}: coef={rs.coefficients[0]:.4f} "
          f"se={rs.standard_errors[0]:.4f} n={rs.n_observations}")
# Regime 0: coef=-0.0002 se=0.1369 n=50
# Regime 1: coef=3.0600 se=0.1468 n=50
```

The estimated break at index 50 exactly recovers the true change point. The confidence interval is tight (48–52), and the regime coefficients cleanly separate the two means.
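The search behind a single-break result like this one can be imitated in a few lines of plain NumPy. The sketch below is not the package's code: `single_break_search` is a hypothetical helper, the trimming is rounded with `floor`, and the statistic is the textbook homoskedastic F, so its value need not match what `test(..., vce='ssr')` reports.

```python
import numpy as np

def single_break_search(y, x, trimming=0.15):
    """Brute-force search for one break; returns (break index, F statistic)."""
    T, q = x.shape
    h = int(np.floor(trimming * T))  # assumed minimum regime length

    def ssr(ys, xs):
        beta, *_ = np.linalg.lstsq(xs, ys, rcond=None)
        resid = ys - xs @ beta
        return float(resid @ resid)

    ssr_null = ssr(y, x)  # no-break model
    # Candidate b: regime 0 is y[:b], regime 1 is y[b:]
    ssr_alt = {b: ssr(y[:b], x[:b]) + ssr(y[b:], x[b:]) for b in range(h, T - h + 1)}
    best = min(ssr_alt, key=ssr_alt.get)
    f_stat = ((T - 2 * q) / q) * (ssr_null - ssr_alt[best]) / ssr_alt[best]
    return best, f_stat

# Same simulated series as in the example above
np.random.seed(2024)
y = np.concatenate([np.random.normal(0, 1, 50), np.random.normal(3, 1, 50)])
x = np.ones((100, 1))
best, f_stat = single_break_search(y, x)
print(f"break index: {best}, F statistic: {f_stat:.2f}")
```

Maximizing the F statistic over candidate dates is the same as minimizing the split-sample SSR, so one pass over the candidates yields both the break estimate and the SupF-type statistic.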
Every call returns a structured result object, not a print dump. You can:
```python
# Access break dates programmatically
est.break_estimates[0].index     # 50
est.break_estimates[0].ci_index  # (48, 52)

# Iterate over regimes
for rs in est.regime_statistics:
    rs.coefficients      # np.ndarray
    rs.standard_errors
    rs.n_observations

# Serialize for reports
import json
json.dumps(est.to_dict())  # JSON-safe dict
```

No screen-scraping. No regex on printed output. Just Python objects.
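"JSON-safe" matters because the standard `json` module cannot serialize NumPy arrays or NumPy scalar types directly. The sketch below shows the kind of conversion a `to_dict()` method has to perform. `RegimeSummary` is a hypothetical stand-in written for this illustration, not one of xtbreak's result classes, and the package's own conversion may differ.

```python
import json
from dataclasses import dataclass

import numpy as np

@dataclass
class RegimeSummary:  # hypothetical stand-in, not an xtbreak class
    regime_index: int
    coefficients: np.ndarray
    standard_errors: np.ndarray
    n_observations: int

    def to_dict(self):
        # Convert NumPy containers and scalars to built-in Python types
        return {
            "regime_index": int(self.regime_index),
            "coefficients": np.asarray(self.coefficients, dtype=float).tolist(),
            "standard_errors": np.asarray(self.standard_errors, dtype=float).tolist(),
            "n_observations": int(self.n_observations),
        }

summary = RegimeSummary(0, np.array([-0.0002]), np.array([0.1369]), np.int64(50))
payload = json.dumps(summary.to_dict())  # works: only lists, floats, and ints remain
restored = json.loads(payload)
print(restored["coefficients"], restored["n_observations"])
```

Passing the dataclass fields to `json.dumps` without the conversion would raise a `TypeError` for the arrays and for the `np.int64` count.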
| Situation | xtbreak fits? | Notes |
|---|---|---|
| Testing whether a regression relationship changed at unknown dates | Yes | Core use case — SupF and UDmax tests |
| Estimating when structural breaks occurred | Yes | Global SSR minimization with CIs |
| Choosing how many breaks best describe the data | Yes | Sequential tests + BIC/LWZ selection |
| Panel data with common break dates | Yes | Partial break model with factor CSD |
| Cointegration or unit-root break testing | No | Use dedicated cointegration packages |
| Real-time online change-point detection | No | Designed for retrospective full-sample analysis |
| Markov-switching or threshold models | No | Different modeling framework |
| Function | Purpose | Returns |
|---|---|---|
| `test()` | Test the null of no breaks against the alternative of k breaks | `TestResult` with statistic, critical values |
| `estimate()` | Estimate break locations and regime parameters | `EstimateResult` with breaks, CIs, regime stats |
| `select()` | Determine the number of breaks via sequential tests or information criteria | `SelectResult` with selected count, step details |
All three functions accept NumPy arrays directly. Panel structure is specified via entity and time parameters.
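As a rough illustration of what entity and time identifiers look like as NumPy arrays, the sketch below builds a small simulated panel in stacked (long) layout with a common break in the slope. Whether xtbreak expects this exact layout, and the keyword names `entity=` and `time=` in the commented call, are assumptions inferred from the sentence above. Check docs/api-reference.md for the actual signature.

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_periods, break_at = 5, 60, 30

# Stacked (long) layout: one row per entity-period pair
entity = np.repeat(np.arange(n_entities), n_periods)  # 0,0,...,1,1,...
time = np.tile(np.arange(n_periods), n_entities)      # 0,1,...,59,0,1,...

# One regressor whose slope changes from 1.0 to 2.5 for all entities at break_at
x = rng.normal(size=(n_entities * n_periods, 1))
slope = np.where(time < break_at, 1.0, 2.5)
y = slope * x[:, 0] + rng.normal(scale=0.5, size=n_entities * n_periods)

print(y.shape, x.shape, entity.shape, time.shape)
# (300,) (300, 1) (300,) (300,)

# Hypothetical call, keyword names assumed rather than confirmed:
# est = estimate(y, x, breaks=1, entity=entity, time=time)
```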
For complete API documentation, see docs/api-reference.md.
For extended usage patterns (selection, robust VCE, panel, serialization), see docs/quickstart.md.
For runnable example scripts, see examples/.
- Retrospective analysis only — requires the full sample; not designed for streaming/online detection
- Linear models — breaks in linear regression coefficients; nonlinear models are out of scope
- No unit-root/cointegration break tests — focuses on Bai–Perron structural change in stationary or trend-stationary settings
- Trimming constraint — minimum regime length is controlled by the trimming parameter (default 15%); very short regimes near endpoints cannot be detected
- scipy optional — without scipy, Chow test p-values are unavailable (test statistics still computed)
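The trimming constraint can be made concrete with a little arithmetic. The helpers below are hypothetical and assume the minimum regime length is `floor(trimming * T)`. The package's exact rounding rule may differ.

```python
import numpy as np

def min_regime_length(n_obs, trimming=0.15):
    # Small epsilon guards against floating-point products such as 14.999999
    return int(np.floor(trimming * n_obs + 1e-9))

def admissible_break_range(n_obs, trimming=0.15):
    """First and last index at which a new regime may start (0-based)."""
    h = min_regime_length(n_obs, trimming)
    return h, n_obs - h

def max_breaks(n_obs, trimming=0.15):
    """Largest number of breaks that still leaves every regime h observations."""
    h = min_regime_length(n_obs, trimming)
    return n_obs // h - 1 if h > 0 else 0

print(admissible_break_range(100))  # (15, 85)
print(max_breaks(100))              # 5
```

With 100 observations and the default 15% trimming, a break in the first or last 15 observations lies outside the search range, and at most five breaks can be fitted.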
This package implements the econometric framework from:
- Bai, J., & Perron, P. (1998). Estimating and testing linear models with multiple structural changes. Econometrica, 66(1), 47–78.
- Bai, J., & Perron, P. (2003). Computation and analysis of multiple structural change models. Journal of Applied Econometrics, 18(1), 1–22.
- Ditzen, J., Karavias, Y., & Westerlund, J. (2025). Testing and estimating structural breaks in time series and panel data in Stata. The Stata Journal, 25(3), 526–560.
- Ditzen, J., Karavias, Y., & Westerlund, J. (2022). Multiple structural breaks in interactive effects panel data and the impact of quantitative easing on bank lending. arXiv preprint arXiv:2211.06707.
```bibtex
@software{xtbreak_python2026,
  title   = {xtbreak: Structural Break Detection and Estimation in Python},
  author  = {Cai, Xuanyu and Xu, Wenli},
  year    = {2026},
  version = {0.1.0},
  url     = {https://github.com/gorgeousfish/xtbreak-py}
}

@article{ditzen2025xtbreak,
  title   = {Testing and estimating structural breaks in time series and panel data in {Stata}},
  author  = {Ditzen, Jan and Karavias, Yiannis and Westerlund, Joakim},
  journal = {The Stata Journal},
  volume  = {25},
  number  = {3},
  pages   = {526--560},
  year    = {2025}
}
```

Python Implementation:
- Xuanyu Cai, City University of Macau — xuanyuCAI@outlook.com
- Wenli Xu, City University of Macau — wlxu@cityu.edu.mo
Methodology:
- Jan Ditzen, Free University of Bozen-Bolzano
- Yiannis Karavias, University of Birmingham
- Joakim Westerlund, Lund University