Merged
27 commits
f3c99b5
feat(purple_alien): align config_sweep with operational hyperparameters
Polichinel Feb 24, 2026
554f61a
fix(purple_alien): add missing 'name' key to sweep_config
Polichinel Feb 24, 2026
32e9215
fix(baselines): rename config key months → window_months, add time_steps
Polichinel Mar 14, 2026
94aad18
feat(configs): declare prediction_format in all model configs
Polichinel Mar 15, 2026
b04a97b
feat: add test suite, base_docs governance, fix config gaps and archi…
Polichinel Mar 15, 2026
66cab0b
chore: remove obsolete debug scripts and archived logs
Polichinel Mar 15, 2026
773c9f2
fix(tests): resolve ruff lint errors in test suite
Polichinel Mar 15, 2026
654b56e
feat: add integration test runner script
Polichinel Mar 15, 2026
df2c6ee
fix(models): add missing requirements.txt for 33 models
Polichinel Mar 15, 2026
5a2fd2e
fix(integration-tests): use single conda env, exclude purple_alien
Polichinel Mar 15, 2026
be49655
fix(configs): rename targets→regression_targets, metrics→regression_p…
Polichinel Mar 16, 2026
b75097b
fix(darts): add missing ReproducibilityGate parameters to 6 models
Polichinel Mar 16, 2026
90e3753
fix(darts): add missing architecture-specific params to 4 models
Polichinel Mar 16, 2026
f2f7cbc
fix(configs): fix queryset typo, migrate ensemble target/metrics keys
Polichinel Mar 16, 2026
e2d53e7
refactor(docs): merge base_docs/ into docs/, add reports/, remove scr…
Polichinel Mar 16, 2026
e08f901
feat(tests): add regression-prevention tests for top 3 review gaps
Polichinel Mar 16, 2026
88eef88
refactor(tests): import ReproducibilityGate params from views_r2darts2
Polichinel Mar 16, 2026
b580043
fix(docs): remove stale common/ reference from ADR-001
Polichinel Mar 16, 2026
8a5641b
fix(docs): remove 8 stale common/ references from governance docs
Polichinel Mar 16, 2026
f246257
feat(models): commit remaining ranger model files
Polichinel Mar 16, 2026
be69ee9
feat(integration-tests): add --level flag to filter by cm/pgm
Polichinel Mar 16, 2026
3922297
docs(integration-tests): add full guide and link from README
Polichinel Mar 17, 2026
f27facf
feat(integration-tests): add --library flag to filter by architecture…
Polichinel Mar 17, 2026
d3e6805
fix(docs): add --library flag to README integration testing table
Polichinel Mar 17, 2026
c846ca2
fix(configs): remove duplicate import in blank_space config_queryset
Polichinel Mar 17, 2026
b8709d8
fix(docs): remove hardcoded model count from integration test guide
Polichinel Mar 17, 2026
231a3ae
feat(ci): add pytest workflow for push and PR gates
Polichinel Mar 17, 2026
22 changes: 22 additions & 0 deletions .github/workflows/run_tests.yml
@@ -0,0 +1,22 @@
name: Run Tests

on:
  push:
    branches: [main, development]
  pull_request:
    branches: [main, development]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: pip install views_pipeline_core pytest

      - name: Run tests
        run: pytest
6 changes: 5 additions & 1 deletion .gitignore
@@ -6,6 +6,9 @@
# But please, take a second to consult with the team before doing so anyways.


# Integration test logs
logs/

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
@@ -86,6 +89,7 @@ coverage.xml
*.py,cover
.hypothesis/
.pytest_cache/
.ruff_cache/
cover/

# Translations
@@ -245,7 +249,7 @@ cython_debug/
*.bak

# txt logs
# *.txt
*.txt

# logs
*.log
36 changes: 36 additions & 0 deletions README.md
@@ -39,6 +39,7 @@ APPWRITE_DATASTORE_PROJECT_ID=""
- [Ensemble scripts](#ensemble-scripts)
- [Ensemble filesystem](#ensemble-filesystem)
- [Running an ensemble](#running-an-ensemble)
- [Integration Testing](#integration-testing)
- [Implemented Models](#implemented-models)
- [Model Catalogs](#catalogs)
- [Country-Month Models](#country-month-model-catalog)
@@ -332,6 +333,41 @@ Consequently, in order to train a model and generate predictions, execute either
As of now, the only implemented model architecture is the [stepshifter model](https://github.com/views-platform/views-stepshifter/blob/main/README.md). Experienced users can develop their own model architectures, including custom model class managers. Head over to [views-pipeline-core](https://github.com/views-platform/views-pipeline-core) for further information on the model class manager and on how to develop new model architectures.


## Integration Testing
<a name="integration-testing"></a>

The repository includes an integration test runner that verifies that models haven't been broken by changes in this repo or in upstream/downstream packages. It trains and evaluates every model end-to-end on the calibration and validation partitions, running them sequentially in a single shared conda environment, and produces a summary table of PASS/FAIL/TIMEOUT results with per-model logs.

```bash
# Run all models (calibration + validation)
bash run_integration_tests.sh

# Run only country-month models
bash run_integration_tests.sh --level cm

# Run only baseline models
bash run_integration_tests.sh --library baseline

# Run specific models with a custom timeout
bash run_integration_tests.sh --models "counting_stars bad_blood" --timeout 3600
```

| Flag | Default | Description |
|------|---------|-------------|
| `--models "m1 m2"` | all models | Run only these models |
| `--level` `cm` or `pgm` | no filter | Run only models at this level of analysis |
| `--library NAME` | no filter | Run only models using this library (baseline/stepshifter/r2darts2/hydranet) |
| `--exclude "m1 m2"` | `"purple_alien"` | Skip these models (replaces the default, does not append) |
| `--partitions "p1 p2"` | `"calibration validation"` | Partitions to test |
| `--timeout SECONDS` | `1800` | Max wall-clock time per model run |
| `--env NAME` | `views_pipeline` | Conda environment to activate |

Logs are written to `logs/integration_test_<timestamp>/` with a `summary.log` and per-model logs under `{partition}/{model}.log`.

For the full guide — including how model discovery works, how to read failure logs, and important caveats — see [docs/run_integration_tests.md](docs/run_integration_tests.md).
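The summary log can also be post-processed programmatically, e.g. to fail a CI job when any model fails. A minimal sketch in Python — the exact line format of `summary.log` is an assumption here (model name followed by a PASS/FAIL/TIMEOUT verdict token); adjust the pattern to the real output:

```python
import re
from collections import Counter

def tally_results(summary_lines):
    """Count PASS/FAIL/TIMEOUT verdicts in integration-test summary lines.

    Assumes each result line contains one of the three verdict tokens,
    e.g. "counting_stars  calibration  PASS". Lines without a verdict
    (headers, separators) are skipped.
    """
    verdicts = Counter()
    for line in summary_lines:
        match = re.search(r"\b(PASS|FAIL|TIMEOUT)\b", line)
        if match:
            verdicts[match.group(1)] += 1
    return dict(verdicts)

# Demo with fabricated log lines:
sample = [
    "counting_stars  calibration  PASS",
    "bad_blood       calibration  FAIL",
    "bad_blood       validation   TIMEOUT",
]
print(tally_results(sample))  # {'PASS': 1, 'FAIL': 1, 'TIMEOUT': 1}
```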

---

## Implemented Models

In addition to making it easy to create new models and ensembles, the views-models repository maintains model catalogs that give an organized overview of all implemented models. The catalog information is collected from each model's metadata and includes:
85 changes: 0 additions & 85 deletions compare_configs.py

This file was deleted.

38 changes: 9 additions & 29 deletions create_catalogs.py
@@ -1,4 +1,5 @@
import os
import importlib.util
import logging
logging.basicConfig(
level=logging.ERROR, format="%(asctime)s %(name)s - %(levelname)s - %(message)s"
@@ -38,27 +39,26 @@ def extract_models(model_class):
"""

model_dict = {}
tmp_dict = {}
config_meta = os.path.join(model_class.configs, 'config_meta.py')
config_deployment = os.path.join(model_class.configs, 'config_deployment.py')
config_hyperparameters = os.path.join(model_class.configs, 'config_hyperparameters.py')


if os.path.exists(config_meta):
logging.info(f"Found meta config: {config_meta}")
with open(config_meta, 'r') as file:
code = file.read()
exec(code, {}, tmp_dict)
model_dict.update(tmp_dict['get_meta_config']())
spec = importlib.util.spec_from_file_location("config_meta", config_meta)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
model_dict.update(module.get_meta_config())
model_dict['queryset'] = create_link(model_dict['queryset'], model_class.queryset_path) if 'queryset' in model_dict else 'None'


if os.path.exists(config_deployment):
logging.info(f"Found deployment config: {config_deployment}")
with open(config_deployment, 'r') as file:
code = file.read()
exec(code, {}, tmp_dict)
model_dict.update(tmp_dict['get_deployment_config']())
spec = importlib.util.spec_from_file_location("config_deployment", config_deployment)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
model_dict.update(module.get_deployment_config())

if os.path.exists(config_hyperparameters):
logging.info(f"Found hyperparameters config: {config_hyperparameters}")
@@ -192,9 +192,6 @@ def replace_table_in_section(content, section_name, new_table):


if __name__ == "__main__":
#import time
#start_time = time.time()

models_list_cm = []
models_list_pgm = []
ensemble_list = []
@@ -224,20 +221,6 @@ def replace_table_in_section(content, section_name, new_table):



# markdown_table_pgm = generate_markdown_table(models_list_pgm)
# with open('pgm_model_catalog.md', 'w') as f:
# f.write(markdown_table_pgm)

# markdown_table_cm = generate_markdown_table(models_list_cm)
# with open('cm_model_catalog.md', 'w') as f:
# f.write(markdown_table_cm)

# markdown_table_ensembles = generate_markdown_table(ensemble_list)
# with open('ensembles_catalog.md', 'w') as f:
# f.write(markdown_table_ensembles)





markdown_table_cm = generate_markdown_table(models_list_cm)
@@ -252,6 +235,3 @@ def replace_table_in_section(content, section_name, new_table):
markdown_table_ensembles,
)


#print("--- %s seconds ---" % (time.time() - start_time))
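The diff above replaces `exec()` of raw config text with `importlib.util`-based module loading, which is safer and gives real tracebacks pointing at the config file. The pattern as a standalone sketch — the helper name `load_config_function` is illustrative, not part of the repo:

```python
import importlib.util
import os
import tempfile

def load_config_function(path, func_name):
    """Load a Python config module from an arbitrary file path and return
    one of its functions, mirroring the importlib-based loading now used
    in create_catalogs.py instead of exec()."""
    spec = importlib.util.spec_from_file_location("config_module", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the config file as a module
    return getattr(module, func_name)

# Self-contained demo with a throwaway config file:
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = os.path.join(tmp, "config_meta.py")
    with open(cfg_path, "w") as f:
        f.write("def get_meta_config():\n    return {'name': 'demo_model'}\n")
    get_meta_config = load_config_function(cfg_path, "get_meta_config")
    print(get_meta_config())  # {'name': 'demo_model'}
```

Unlike `exec(code, {}, tmp_dict)`, a failure inside the config raises an exception whose traceback names the actual file and line.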

79 changes: 79 additions & 0 deletions docs/ADRs/000_use_of_adrs.md
@@ -0,0 +1,79 @@
# ADR-000: Use of Architecture Decision Records (ADRs)

**Status:** Accepted
**Date:** 2026-03-15
**Deciders:** Simon (project maintainer)
**Informed:** All contributors

---

## Context

views-models is a monorepo containing ~66 forecasting models, 5 ensembles, data extractors, postprocessors, and tooling for the VIEWS conflict prediction platform. The repository has multiple contributors, evolving conventions, and a history of implicit decisions that have led to architectural drift (e.g., two CLI patterns, duplicated partition configs, inconsistent config keys).

Without a shared record of *why* decisions were made, the project risks:
- Re-litigating settled questions (e.g., why all models use the same partition boundaries)
- Accidental reversals of critical design choices
- Accumulating invisible technical debt
- Losing institutional memory as contributors change

---

## Decision

We will use **Architecture Decision Records (ADRs)** to document significant technical, architectural, and conceptual decisions in this project.

ADRs are:
- Written in Markdown
- Stored in the repository under `docs/ADRs/`
- Numbered sequentially
- Treated as first-class project artifacts

---

## When to Write an ADR

Write an ADR when making a decision that:
- Affects model configuration conventions or required config keys
- Changes partition boundaries, training windows, or evaluation methodology
- Introduces new shared infrastructure or conventions
- Changes the CLI API pattern or model launcher conventions
- Modifies ensemble reconciliation logic or CM/PGM ordering
- Affects the CI/CD pipeline or catalog generation

Do **not** write ADRs for:
- Adding a new model that follows existing conventions
- Routine hyperparameter changes within a single model
- Documentation-only changes

---

## Lifecycle

- **Proposed** — decision under consideration
- **Accepted** — decision is active and authoritative
- **Superseded** — replaced by a newer ADR
- **Deprecated** — decision remains but should no longer be used

Decisions are never deleted. If a decision changes, it is **superseded**, not erased.

---

## Consequences

### Positive
- Clearer decision-making across a multi-contributor forecasting platform
- Fewer repeated debates about config conventions
- Easier onboarding for new model developers
- Better long-term coherence as the model zoo grows

### Negative
- Small upfront cost in writing
- Requires discipline to maintain

---

## References

- `docs/ADRs/adr_template.md`
- `docs/ADRs/README.md`