Merged
Changes from all commits (14 commits)
3 changes: 2 additions & 1 deletion .gitignore
@@ -215,4 +215,5 @@ cython_debug/

# logs
*.log
-*.log.*
+*.log.*
+reports/
22 changes: 14 additions & 8 deletions README.md
@@ -59,10 +59,16 @@ The library is built on a **three-layer architecture** with a framework-agnostic
## 🚀 **Quick Start**

```python
-from views_evaluation import PandasAdapter, NativeEvaluator
+from views_evaluation import EvaluationFrame, NativeEvaluator
+import numpy as np

-# 1. Convert DataFrames → EvaluationFrame
-ef = PandasAdapter.from_dataframes(actual=actuals, predictions=predictions_list, target="ged_sb_best")
+# 1. Construct EvaluationFrame with NumPy arrays
+ef = EvaluationFrame(
+    y_true=y_true_array,
+    y_pred=y_pred_array,  # shape (N, S) where S >= 1
+    identifiers={'time': times, 'unit': units, 'origin': origins, 'step': steps},
+    metadata={'target': 'ged_sb_best'},
+)

# 2. Configure and evaluate
config = {
@@ -89,7 +95,7 @@ VIEWS Evaluation ensures **forecasting accuracy and model robustness** as the **

### **Pipeline Integration:**
1. **Model Predictions** →
-2. **PandasAdapter** (DataFrame → EvaluationFrame) →
+2. **EvaluationFrame** (validated NumPy container) →
3. **NativeEvaluator** (metrics computation) →
4. **EvaluationReport** (structured results)
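A toy sketch of this four-stage flow, using plain NumPy stand-ins for the library's classes (the `EvaluationFrame`, `NativeEvaluator`, and `EvaluationReport` roles are mimicked here, not their real APIs):

```python
import numpy as np

# 1. Model predictions: N observations, S samples each
rng = np.random.default_rng(0)
y_true = rng.poisson(2.0, size=100).astype(float)           # (N,)
y_pred = y_true[:, None] + rng.normal(0, 1, size=(100, 5))  # (N, S)

# 2. "EvaluationFrame" role: validate shape and finiteness up front
assert y_pred.ndim == 2 and y_pred.shape[0] == y_true.shape[0]
assert np.isfinite(y_true).all() and np.isfinite(y_pred).all()

# 3. "NativeEvaluator" role: metric over the per-row sample mean
point = y_pred.mean(axis=1)
mse = float(np.mean((y_true - point) ** 2))

# 4. "EvaluationReport" role: structured result
report = {"target": "ged_sb_best", "metrics": {"mse": mse}}
```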

@@ -195,7 +201,7 @@ config = {
---

* **Data Integrity Checks**: Validates input arrays for shape consistency, NaN/infinity, and required identifiers.
-* **Automatic Index Matching**: `PandasAdapter` aligns actual and predicted values based on MultiIndex structures.
+* **Framework-Agnostic Core**: All evaluation operates on pure NumPy arrays via `EvaluationFrame`.
* **Metric Catalog & Profiles**: Hyperparameters are managed through named evaluation profiles with a Chain of Responsibility resolver (model overrides → profile → fail loud).
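The Chain of Responsibility resolution can be sketched as follows; `resolve_param` and the `PROFILES` mapping are illustrative names, not the library's API:

```python
# Assumed profile data for illustration only
PROFILES = {"base": {"threshold": 0.5}, "hydranet_ucdp": {"threshold": 0.25}}

def resolve_param(name, model_overrides, profile):
    """Resolve a hyperparameter: model overrides → profile → fail loud."""
    if name in model_overrides:            # 1) model-level override wins
        return model_overrides[name]
    if name in PROFILES.get(profile, {}):  # 2) fall back to the named profile
        return PROFILES[profile][name]
    # 3) no silent default: fail loudly
    raise KeyError(f"No value for {name!r} in overrides or profile {profile!r}")
```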

---
@@ -223,11 +229,11 @@ Level 0 — Pure Core (NumPy + SciPy only, zero framework imports)
Profiles Named hyperparameter sets (base, hydranet_ucdp, ...)

Level 1 — Bridge / Adapter
-PandasAdapter DataFrame → EvaluationFrame conversion (PHASE-3-DELETE)
+EvaluationFrame Validated NumPy data container
EvaluationReport Results container with DataFrame/dict export

Level 2 — Legacy Orchestrator
-EvaluationManager Deprecated wrapper; delegates to Level 0
+MetricCatalog Genome registry and parameter resolver
```

**Key design decisions:**
@@ -244,7 +250,7 @@ views-evaluation/
├── views_evaluation/
│ ├── __init__.py # Public API exports
│ ├── adapters/
-│ │ └── pandas.py # PandasAdapter (PHASE-3-DELETE)
+│ │ └── __init__.py # Reserved for future framework bridges
│ ├── evaluation/
│ │ ├── config_schema.py # EvaluationConfig TypedDict
│ │ ├── evaluation_frame.py # Core data container
4 changes: 3 additions & 1 deletion documentation/ADRs/000_use_of_adrs.md
@@ -2,7 +2,9 @@

**Status:** Accepted
**Date:** 2026-02-25
-**Deciders:** Project maintainers, Gemini CLI
+**Deciders:** Project maintainers
+**Consulted:** —
+**Informed:** All contributors

---

4 changes: 3 additions & 1 deletion documentation/ADRs/001_silicon_based_agent_protocol.md
@@ -2,7 +2,9 @@

**Status:** Accepted
**Date:** 2026-02-25
-**Deciders:** Project maintainers, Gemini CLI
+**Deciders:** Project maintainers
+**Consulted:** —
+**Informed:** All contributors

---

4 changes: 3 additions & 1 deletion documentation/ADRs/010_ontology_of_evaluation.md
@@ -2,7 +2,9 @@

**Status:** Accepted
**Date:** 2026-02-25
-**Deciders:** Project maintainers, Gemini CLI
+**Deciders:** Project maintainers
+**Consulted:** —
+**Informed:** All contributors

---

12 changes: 7 additions & 5 deletions documentation/ADRs/011_topology_and_dependency_rules.md
@@ -2,15 +2,17 @@

**Status:** Accepted
**Date:** 2026-02-25
-**Deciders:** Project maintainers, Gemini CLI
+**Deciders:** Project maintainers
+**Consulted:** —
+**Informed:** All contributors

---

## Context

In complex evaluation systems, architectural fragility often emerges not from incorrect logic, but from uncontrolled dependencies between components.

-The Evaluation repository pre-Feb 2026 suffered from "Pandas-heavy" coupling. Higher-level logic (EvaluationManager) depended on Pandas `MultiIndex` internals for alignment, which constrained our ability to scale probabilistic forecasts (N, S) due to memory/performance limits of Pandas' "lists-in-cells."
+The Evaluation repository pre-Feb 2026 suffered from "Pandas-heavy" coupling. Higher-level logic (e.g., Pipeline Core) depended on Pandas `MultiIndex` internals for alignment, which constrained our ability to scale probabilistic forecasts (N, S) due to memory/performance limits of Pandas' "lists-in-cells."

Without explicit topology rules, we risk high-level math modules beginning to depend on implementation details (e.g., NumPy indexing vs Xarray coordinates).

@@ -29,16 +31,16 @@ Violations are architectural defects.
The Evaluation Core is the lowest-level layer (most stable).

- **Level 0: Evaluation Core** (Pure NumPy, `EvaluationFrame`, `NativeEvaluator`). No external imports except `numpy` and `scipy`.
-- **Level 1: Adapters** (Framework-specific bridges like `PandasAdapter`). May depend on Level 0.
-- **Level 2: Orchestration** (e.g., `EvaluationManager`, Pipeline Core). May depend on Level 1 and Level 0.
+- **Level 1: Adapters** (Framework-specific bridges, reserved for future use). May depend on Level 0.
+- **Level 2: Orchestration** (e.g., Pipeline Core — external to this repo). May depend on Level 1 and Level 0.

Dependency direction must always flow **toward the Core**.

## Forbidden Patterns

- Math kernels importing `pandas` or `polars`.
- `EvaluationFrame` containing anything other than NumPy arrays.
-- Higher-level modules (e.g., `EvaluationManager`) passing DataFrames directly into metric functions.
+- Higher-level modules (e.g., external orchestrators) passing DataFrames directly into metric functions.

If a dependency feels “convenient but wrong,” it probably is.
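As an illustration of enforcing these rules, a guard test might scan Level-0 sources for forbidden imports; this checker is a sketch, not part of the repo:

```python
import ast

FORBIDDEN = {"pandas", "polars"}

def forbidden_imports(source: str) -> set:
    """Return the forbidden top-level packages imported by a source string."""
    hits = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            hits |= {a.name.split(".")[0] for a in node.names} & FORBIDDEN
        elif isinstance(node, ast.ImportFrom) and node.module:
            root = node.module.split(".")[0]
            if root in FORBIDDEN:
                hits.add(root)
    return hits
```

A CI job could run this over every file in the Evaluation Core and fail the build on any hit.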

6 changes: 4 additions & 2 deletions documentation/ADRs/012_authority_over_inference.md
@@ -2,7 +2,9 @@

**Status:** Accepted
**Date:** 2026-02-25
-**Deciders:** Project maintainers, Gemini CLI
+**Deciders:** Project maintainers
+**Consulted:** —
+**Informed:** All contributors

---

@@ -58,5 +60,5 @@ the system **must fail loudly and immediately**.
- Improves debuggability: we can inspect the `EvaluationFrame` and see exactly what the system *thinks* it is evaluating.

### Negative
-- Requires more metadata in the `EvaluationFrame` and `PandasAdapter`.
+- Requires more metadata in the `EvaluationFrame` and external adapters.
- Some "convenient" hacks are disallowed.
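A minimal sketch of the fail-loud rule; the helper name `require_meta` is assumed for illustration:

```python
def require_meta(metadata: dict, key: str):
    """Return a required metadata value, refusing to infer a missing one."""
    if key not in metadata:
        raise ValueError(
            f"Missing required metadata {key!r}; refusing to infer it "
            "from column names or data shape (ADR-012)."
        )
    return metadata[key]
```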
4 changes: 3 additions & 1 deletion documentation/ADRs/013_observability_and_explicit_failure.md
@@ -2,7 +2,9 @@

**Status:** Accepted
**Date:** 2026-02-25
-**Deciders:** Project maintainers, Gemini CLI
+**Deciders:** Project maintainers
+**Consulted:** —
+**Informed:** All contributors

---

6 changes: 4 additions & 2 deletions documentation/ADRs/014_boundary_contracts_and_validation.md
@@ -2,7 +2,9 @@

**Status:** Accepted
**Date:** 2026-02-25
-**Deciders:** Project maintainers, Gemini CLI
+**Deciders:** Project maintainers
+**Consulted:** —
+**Informed:** All contributors

---

@@ -25,7 +27,7 @@ Every boundary between components (e.g., Adapter → Core) must define:
- Declared invariants.

### 2. Validation at Entry
-All configuration and external inputs must be validated at the system boundary (e.g., in `EvaluationManager` or `Adapters`).
+All configuration and external inputs must be validated at the system boundary (e.g., in the `EvaluationFrame` constructor or `NativeEvaluator`).
- Before execution begins.
- Before orchestration proceeds.
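A sketch of what entry validation could look like; the real checks live in the `EvaluationFrame` constructor, and this standalone signature is an assumption:

```python
import numpy as np

REQUIRED_IDS = {"time", "unit", "origin", "step"}

def validate_inputs(y_true, y_pred, identifiers):
    """Check shapes, finiteness, and required identifiers at the boundary."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_pred.ndim != 2 or y_pred.shape[0] != y_true.shape[0]:
        raise ValueError(f"y_pred must be (N, S) with N={y_true.shape[0]}")
    if not (np.isfinite(y_true).all() and np.isfinite(y_pred).all()):
        raise ValueError("NaN/inf detected at the boundary")
    missing = REQUIRED_IDS - identifiers.keys()
    if missing:
        raise ValueError(f"Missing identifiers: {sorted(missing)}")
    return y_true, y_pred
```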

4 changes: 3 additions & 1 deletion documentation/ADRs/020_multi_perspective_testing.md
@@ -2,7 +2,9 @@

**Status:** Accepted
**Date:** 2026-02-25
-**Deciders:** Project maintainers, Gemini CLI
+**Deciders:** Project maintainers
+**Consulted:** —
+**Informed:** All contributors

---

6 changes: 4 additions & 2 deletions documentation/ADRs/021_intent_contracts_for_classes.md
@@ -2,7 +2,9 @@

**Status:** Accepted
**Date:** 2026-02-25
-**Deciders:** Project maintainers, Gemini CLI
+**Deciders:** Project maintainers
+**Consulted:** —
+**Informed:** All contributors

---

@@ -14,7 +16,7 @@ To prevent semantic drift, non-trivial classes require an explicit declaration of

## Decision

-All **non-trivial and substantial classes** (e.g., `EvaluationFrame`, `NativeEvaluator`, `PandasAdapter`) must have an explicit **intent contract**.
+All **non-trivial and substantial classes** (e.g., `EvaluationFrame`, `NativeEvaluator`, `EvaluationReport`) must have an explicit **intent contract**.

An intent contract is a short, human-readable description of:
- **Purpose**: what the class is for.
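A hypothetical intent contract rendered as a class docstring; the layout is an assumption, since the ADR mandates the content rather than a specific format:

```python
class EvaluationFrame:
    """Validated NumPy container for evaluation inputs.

    Intent contract:
      Purpose: hold aligned y_true/y_pred arrays plus identifiers.
      Responsibilities: validate shape, finiteness, required identifiers.
      Non-responsibilities: no metric computation, no DataFrame I/O.
    """
```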
2 changes: 2 additions & 0 deletions documentation/ADRs/022_evolution_and_stability.md
@@ -3,6 +3,8 @@
**Status:** Proposed (Deferred)
**Date:** 2026-02-25
**Deciders:** —
+**Consulted:** —
+**Informed:** All contributors

---

69 changes: 69 additions & 0 deletions documentation/ADRs/023_technical_risk_register.md
@@ -0,0 +1,69 @@
# ADR-023: Technical Risk Register

**Status:** Accepted
**Date:** 2026-03-31
**Deciders:** Project maintainers
**Consulted:** —
**Informed:** All contributors

---

## Context

As the views-evaluation codebase matures through its EvaluationFrame refactor and metric catalog implementation, structural risks have been identified through repo-assimilation and expert review. Without a centralized, living register of these risks, concerns are scattered across reports, post-mortems, and tribal knowledge.

A formalized risk register ensures that architectural concerns are:
- tracked with consistent metadata,
- prioritized by severity,
- linked to their source of discovery,
- and revisited systematically.

---

## Decision

This repository maintains a **Technical Risk Register** at `reports/technical_risk_register.md` as a first-class governance artifact.

### Concern Format

Each entry uses:
- **ID:** `C-xx` for concerns, `D-xx` for disagreements
- **Tier:** 1 (critical) through 4 (informational)
- **Trigger:** The specific circumstance under which the risk becomes actionable
- **Source:** How the concern was identified (e.g. repo-assimilation, expert review, falsification audit)
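A hypothetical entry in this format (the concern itself is invented for illustration):

```markdown
### C-07 (example): sample-dimension broadcasting in metric kernels
- **ID:** C-07
- **Tier:** 2
- **Trigger:** Adding a metric that reduces over S before alignment checks run
- **Source:** expert review
```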

### Tier Definitions

| Tier | Severity | Response |
|------|----------|----------|
| 1 | Critical — blocks release or causes data corruption | Must be resolved before next release |
| 2 | High — significant architectural risk | Must have a mitigation plan within one sprint |
| 3 | Medium — known weakness, bounded impact | Track and address opportunistically |
| 4 | Low/Informational — minor or cosmetic | Document and revisit during tech debt cleanup |

### Lifecycle

- Concerns are opened during expert reviews, tech debt audits, repo-assimilation, and falsification audits.
- Concerns are closed when the risk is resolved, mitigated, or explicitly accepted with rationale.
- The register header tracks the total count for quick reference.

---

## Consequences

### Positive
- Centralized visibility of all known risks
- Consistent prioritization and tracking
- Prevents risks from being forgotten between conversations

### Negative
- Requires discipline to keep updated
- Risk of register staleness if not reviewed regularly

---

## References

- `reports/technical_risk_register.md`
- Repo-assimilation output (2026-03-31)
- `reports/technical_debt_backlog.md` (related but focuses on actionable debt, not structural risks)
12 changes: 5 additions & 7 deletions documentation/ADRs/030_evaluation_strategy.md
@@ -1,12 +1,10 @@
# ADR-030: Evaluation Strategy

-| ADR Info | Details |
-|---------------------|-------------------|
-| Subject | Evaluation Strategy |
-| ADR Number | 030 |
-| Status | Accepted |
-| Author | Xiaolong, Mihai|
-| Date | 16.07.2025 |
+**Status:** Accepted
+**Date:** 2025-07-16
+**Deciders:** Xiaolong, Mihai
+**Consulted:** —
+**Informed:** All contributors

## Context
To ensure reliable and realistic model performance assessment, our forecasting framework supports both **offline** and **online** evaluation strategies. These strategies serve complementary purposes: offline evaluation simulates the forecasting process retrospectively, while online evaluation assesses actual deployed forecasts against observed data.
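The offline strategy can be sketched as a rolling-origin backtest; this naive last-value forecaster is illustrative only, not the framework's model:

```python
def rolling_origin_mae(series, horizon=2, min_train=3):
    """Simulate retrospective forecasting: for each origin, forecast
    `horizon` steps ahead with the last observed value, then score."""
    errs = []
    for origin in range(min_train, len(series) - horizon + 1):
        forecast = series[origin - 1]          # naive last-value forecast
        actual = series[origin + horizon - 1]  # observed value at the horizon
        errs.append(abs(actual - forecast))
    return sum(errs) / len(errs)
```

Online evaluation would apply the same scoring, but to forecasts that were actually issued by the deployed system, once the corresponding observations arrive.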
12 changes: 5 additions & 7 deletions documentation/ADRs/031_evaluation_metrics.md
@@ -1,12 +1,10 @@
# ADR-031: Evaluation Metrics

-| ADR Info | Details |
-|---------------------|--------------------|
-| Subject | Evaluation Metrics |
-| ADR Number | 031 |
-| Status | Accepted |
-| Author | Xiaolong |
-| Date | 12.09.2024 |
+**Status:** Accepted
+**Date:** 2024-09-12
+**Deciders:** Xiaolong
+**Consulted:** —
+**Informed:** All contributors

## Context
In the context of the VIEWS pipeline, it is necessary to evaluate the models using a robust set of metrics that account for the characteristics of conflict data, such as right-skewness and zero-inflation in the outcome variable.
12 changes: 5 additions & 7 deletions documentation/ADRs/032_metric_calculation_schemas.md
@@ -1,12 +1,10 @@
# ADR-032: Metric Calculation Schemas

-| ADR Info | Details |
-|---------------------|-------------------|
-| Subject | Metric Calculation |
-| ADR Number | 032 |
-| Status | Accepted|
-| Author | Mihai, Xiaolong|
-| Date | 31.10.2024 |
+**Status:** Accepted
+**Date:** 2024-10-31
+**Deciders:** Mihai, Xiaolong
+**Consulted:** —
+**Informed:** All contributors

## Context
Traditional machine learning metrics do not directly translate to time-series forecasting across multiple horizons. A standardized approach to regrouping data is necessary.
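A minimal sketch of horizon-wise regrouping: score each forecast step separately, rather than pooling all horizons into one number (illustrative; the actual schemas are defined in this ADR):

```python
import numpy as np

def per_step_mae(y_true, y_pred, steps):
    """Mean absolute error computed separately for each forecast step."""
    y_true, y_pred, steps = map(np.asarray, (y_true, y_pred, steps))
    out = {}
    for s in np.unique(steps):
        mask = steps == s
        out[int(s)] = float(np.mean(np.abs(y_true[mask] - y_pred[mask])))
    return out
```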
20 changes: 9 additions & 11 deletions documentation/ADRs/040_evaluation_input_schema.md
@@ -1,18 +1,16 @@
# ADR-040: Evaluation Input Schema

-| ADR Info | Details |
-|---------------------|-------------------------|
-| Subject | Evaluation Input Schema |
-| ADR Number | 040 |
-| Status | Accepted |
-| Author | Xiaolong |
-| Date | 16.06.2025 |
+**Status:** Accepted
+**Date:** 2025-06-16
+**Deciders:** Xiaolong
+**Consulted:** —
+**Informed:** All contributors

## Context

A consistent input format is required to compare model performance across the VIEWS pipeline.
-Two integration paths exist: the native path (primary) and the legacy path (`EvaluationManager`,
-deprecated per ADR-011).
+The native path via `EvaluationFrame` is the sole integration path. The legacy
+`EvaluationManager` path was removed in Phase 3.

## Decision

@@ -42,9 +40,9 @@ Prediction type (point vs. sample) is determined structurally from the number of
No name-based inference occurs (ADR-012). Callers must ensure all cells in a prediction column
have the same number of values.
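The structural determination can be sketched as follows (assumed helper, not the library's function): a single prediction column is a point forecast, more than one is a sample forecast.

```python
import numpy as np

def prediction_type(y_pred) -> str:
    """Classify predictions by shape alone; no column-name inference."""
    y_pred = np.asarray(y_pred)
    if y_pred.ndim == 1 or y_pred.shape[1] == 1:
        return "point"
    return "sample"
```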

-### Native Path Invariants (PandasAdapter)
+### Native Path Invariants

-When using `PandasAdapter`, the following identifiers are synthesised automatically:
+When constructing an `EvaluationFrame`, the following identifiers must be provided:

| Identifier | Source |
|------------|--------------------------------------------------|