diff --git a/reporting-toolkit/CLAUDE.md b/reporting-toolkit/CLAUDE.md new file mode 100644 index 0000000..249ae90 --- /dev/null +++ b/reporting-toolkit/CLAUDE.md @@ -0,0 +1,545 @@ +# OpenStack CI Analysis Reporting Toolkit + +This directory contains a complete toolkit for analyzing OpenStack CI job health, performance, and configuration. Use these scripts to generate comprehensive assessment reports. + +## Overview + +The toolkit provides data-driven analysis of OpenStack CI jobs by: +1. Extracting job inventory from CI configuration files +2. Fetching runtime metrics from Sippy API +3. Analyzing job health, coverage, and optimization opportunities +4. Comparing OpenStack against other cloud platforms +5. Categorizing failures by root cause + +## Prerequisites + +Before running the scripts, ensure: + +```bash +# Python 3.6+ with pyyaml +python3 -m pip install pyyaml +``` + +## Running from Any Path + +All scripts support the `--output-dir` parameter, allowing you to run them from anywhere in the filesystem: + +```bash +# Run from any directory, specify output location +python3 /path/to/reporting-toolkit/extract_openstack_jobs.py \ + --config-dir /path/to/release/ci-operator/config \ + --output-dir /tmp/my-analysis \ + --summary + +# Scripts will read input files from and write output files to --output-dir +python3 /path/to/reporting-toolkit/fetch_job_metrics.py --output-dir /tmp/my-analysis +``` + +### Using the Shell Script + +The easiest way to run all analysis is with the shell script: + +```bash +# From repo root - outputs to current directory +./hack/openstack-ci-analysis/reporting-toolkit/run_analysis.sh + +# From anywhere - specify both directories +/path/to/reporting-toolkit/run_analysis.sh \ + --config-dir /path/to/release/ci-operator/config \ + --output-dir /tmp/my-analysis + +# View help +./run_analysis.sh --help +``` + +### Common Options + +All scripts support: +- `--output-dir DIR`: Directory for input/output files (default: script directory or current directory) +- `--help`: Show usage information + +Additional script-specific options: +- `extract_openstack_jobs.py`: `--config-dir` to specify CI config location +- `fetch_job_metrics.py`: `--force` to refetch cached data + +## Script Execution Order + +**IMPORTANT:** Scripts have dependencies and must be run in the correct order. + +### Phase 1: Data Collection + +Run these scripts first to gather raw data: + +```bash +# Set your output directory +OUTPUT_DIR=/tmp/openstack-analysis + +# 1. Extract job inventory from CI configuration +python3 hack/openstack-ci-analysis/reporting-toolkit/extract_openstack_jobs.py \ + --config-dir ci-operator/config \ + --output-dir $OUTPUT_DIR \ + --summary + +# 2. Fetch job metrics from Sippy API +python3 hack/openstack-ci-analysis/reporting-toolkit/fetch_job_metrics.py \ + --output-dir $OUTPUT_DIR + +# 3. Calculate extended metrics (requires step 2) +python3 hack/openstack-ci-analysis/reporting-toolkit/fetch_extended_metrics.py \ + --output-dir $OUTPUT_DIR + +# 4. 
Fetch platform comparison data +python3 hack/openstack-ci-analysis/reporting-toolkit/fetch_comparison_data.py \ + --output-dir $OUTPUT_DIR +``` + +### Phase 2: Configuration Analysis + +These scripts analyze the job configuration (from Phase 1, step 1): + +```bash +# Analyze potential redundancy +python3 hack/openstack-ci-analysis/reporting-toolkit/analyze_redundancy.py \ + --output-dir $OUTPUT_DIR + +# Analyze coverage gaps across releases +python3 hack/openstack-ci-analysis/reporting-toolkit/analyze_coverage.py \ + --output-dir $OUTPUT_DIR + +# Analyze trigger optimization opportunities +python3 hack/openstack-ci-analysis/reporting-toolkit/analyze_triggers.py \ + --output-dir $OUTPUT_DIR +``` + +### Phase 3: Runtime Analysis + +These scripts analyze runtime metrics (requires Phase 1): + +```bash +# Analyze platform comparison +python3 hack/openstack-ci-analysis/reporting-toolkit/analyze_platform_comparison.py \ + --output-dir $OUTPUT_DIR + +# Analyze workflow pass rates +python3 hack/openstack-ci-analysis/reporting-toolkit/analyze_workflow_passrate.py \ + --output-dir $OUTPUT_DIR + +# Categorize failures by root cause +python3 hack/openstack-ci-analysis/reporting-toolkit/categorize_failures.py \ + --output-dir $OUTPUT_DIR +``` + +## Script Descriptions + +### Data Collection Scripts + +#### extract_openstack_jobs.py +Extracts all OpenStack CI jobs from `ci-operator/config/` files. + +**Input:** CI configuration YAML files +**Output:** +- `openstack_jobs_inventory.csv` - Complete job inventory +- `openstack_jobs_inventory.json` - Job inventory (JSON format) + +**Key fields extracted:** +- job_name, cluster_profile, job_type (presubmit/periodic) +- workflow, schedule, minimum_interval +- optional, always_run, skip_if_only_changed, run_if_changed +- org, repo, branch, variant, config_file + +**Options:** +- `--config-dir`: Path to config directory (default: ci-operator/config) +- `--output-csv`: Output CSV file path +- `--output-json`: Output JSON file path +- `--summary`: Print summary statistics + +#### fetch_job_metrics.py +Fetches job pass rate metrics from Sippy API. + +**Input:** None (fetches from Sippy API) +**Output:** +- `sippy_jobs_raw.json` - Raw Sippy API data (cached) +- `job_metrics_report.md` - Pass rate metrics report +- `job_metrics_summary.json` - Metrics summary + +**Data collected per job:** +- current_pass_percentage, current_runs, current_passes +- previous_pass_percentage, previous_runs, previous_passes +- open_bugs, last_pass date + +**Options:** +- `--force`: Refetch data even if cache exists + +#### fetch_extended_metrics.py +Calculates extended metrics combining current + previous periods (~14 days). + +**Requires:** `sippy_jobs_raw.json` from fetch_job_metrics.py + +**Output:** +- `extended_metrics.json` - Extended metrics data +- `extended_metrics_jobs.json` - Per-job extended metrics +- `extended_metrics_report.md` - Extended metrics report + +**Calculations:** +- Combined pass rates across 14-day window +- Trend analysis (improving/degrading/stable) +- Problem job identification (<80% pass rate) +- Estimated job durations by cluster profile + +#### fetch_comparison_data.py +Fetches platform comparison data from Sippy API. 
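+
+For reference, the request pattern looks roughly like the sketch below. It is
+illustrative only (the variable names and response handling are assumptions);
+it uses the documented `/api/variants` endpoint and the 1-second delay the
+scripts apply between requests.
+
+```python
+import json
+import time
+import urllib.request
+
+BASE_URL = "https://sippy.dptools.openshift.org/api"
+releases = ["4.20", "4.21", "4.22"]  # adjust to the releases you analyze
+
+data = {}
+for release in releases:
+    # One request per release against the documented variants endpoint
+    with urllib.request.urlopen(f"{BASE_URL}/variants?release={release}") as resp:
+        data[release] = json.load(resp)
+    time.sleep(1)  # rate limiting, matching the scripts' 1-second delay
+
+with open("platform_comparison_raw.json", "w") as f:
+    json.dump(data, f, indent=2)
+```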
+ +**Input:** None (fetches from Sippy API) +**Output:** +- `platform_comparison_raw.json` - Raw platform data + +**Platforms compared:** +- OpenStack, AWS, GCP, Azure, vSphere, Metal + +**Data collected:** +- Job counts per platform per release +- Total runs and passes +- Pass rates by platform + +### Configuration Analysis Scripts + +#### analyze_redundancy.py +Identifies redundant jobs and consolidation opportunities. + +**Requires:** `openstack_jobs_inventory.json` + +**Output:** +- `redundant_jobs_report.md` - Redundancy analysis report +- `redundant_jobs_report_data.json` - Raw analysis data + +**Analyzes:** +- Jobs duplicated between openshift/ and openshift-priv/ +- Multiple jobs using same workflow + cluster in one repo +- Presubmit trigger patterns + +#### analyze_coverage.py +Analyzes test coverage across releases. + +**Requires:** `openstack_jobs_inventory.json` + +**Output:** +- `coverage_gaps_report.md` - Coverage analysis report +- `coverage_gaps_report_data.json` - Raw analysis data + +**Analyzes:** +- Jobs per release +- Cluster profile usage by release +- Coverage gaps (tests missing from some releases) + +#### analyze_triggers.py +Identifies trigger optimization opportunities. + +**Requires:** `openstack_jobs_inventory.json` + +**Output:** +- `trigger_optimization_report.md` - Trigger optimization report +- `trigger_optimization_report_data.json` - Raw analysis data + +**Analyzes:** +- Jobs missing skip_if_only_changed patterns +- Jobs missing run_if_changed patterns +- Repos that could benefit from smarter triggering + +### Runtime Analysis Scripts + +#### analyze_platform_comparison.py +Analyzes platform comparison data. + +**Requires:** `platform_comparison_raw.json` + +**Output:** +- `platform_comparison_analysis.json` - Analysis results +- `platform_comparison_report.md` - Platform comparison report + +**Provides:** +- Platform ranking by pass rate +- OpenStack vs other platforms comparison +- Per-release platform breakdown +- Gap analysis + +#### analyze_workflow_passrate.py +Analyzes pass rates grouped by workflow/test scenario. + +**Requires:** +- `openstack_jobs_inventory.json` +- `sippy_jobs_raw.json` +- `extended_metrics_jobs.json` (optional, enhances analysis) + +**Output:** +- `workflow_passrate_analysis.json` - Analysis results +- `workflow_passrate_report.md` - Workflow pass rate report + +**Workflow classification:** +- Extracts workflow type from job names +- Groups jobs by scenario (fips, dualstack, serial, etc.) +- Categorizes as Critical (<50%), Warning (50-70%), OK (>70%) + +#### categorize_failures.py +Categorizes job failures using heuristic classification. 
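+
+In spirit, the heuristic maps pass-rate bands and bug counts to a category,
+along the lines of the sketch below. The function name and exact cutoffs are
+illustrative assumptions; the real rules live in the script (see the criteria
+table below).
+
+```python
+def categorize(job_name: str, pass_rate: float, open_bugs: int) -> str:
+    # Install/provision jobs that fail consistently point at infrastructure
+    if pass_rate < 30 and ("install" in job_name or "provision" in job_name):
+        return "infrastructure"
+    # Inconsistent results in the middle band look flaky
+    if 30 <= pass_rate <= 70:
+        return "flaky"
+    # Persistently failing jobs with bugs already filed
+    if pass_rate < 30 and open_bugs > 0:
+        return "product_bug"
+    # Everything else needs a human look
+    return "needs_triage"
+```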
+ +**Requires:** +- `extended_metrics_jobs.json` +- `sippy_jobs_raw.json` (optional, for bug counts) + +**Output:** +- `failure_categories.json` - Categorized failures +- `failure_categories_report.md` - Failure categorization report + +**Categories:** +| Category | Criteria | +|----------|----------| +| Infrastructure | Low pass rate on install/provision jobs | +| Flaky | 30-70% pass rate (inconsistent) | +| Product Bug | 0% or low pass rate with bugs filed | +| Needs Triage | Unknown cause, requires investigation | + +## Output Files Summary + +| Category | File | Description | +|----------|------|-------------| +| **Inventory** | openstack_jobs_inventory.json | Complete job inventory | +| | openstack_jobs_inventory.csv | Job inventory (CSV) | +| **Config Analysis** | redundant_jobs_report.md | Workflow duplication analysis | +| | coverage_gaps_report.md | Cross-release coverage gaps | +| | trigger_optimization_report.md | Trigger pattern analysis | +| **Sippy Metrics** | sippy_jobs_raw.json | Cached Sippy API data | +| | job_metrics_report.md | Pass rate metrics | +| | extended_metrics_report.md | 14-day combined metrics | +| **Platform Comparison** | platform_comparison_raw.json | Raw platform data | +| | platform_comparison_report.md | Platform comparison report | +| **Workflow Analysis** | workflow_passrate_report.md | Workflow pass rate report | +| **Failure Categories** | failure_categories_report.md | Categorized failures | + +## Creating a Complete Assessment Report + +To create a comprehensive assessment report like `TEAM_REVIEW_OpenStack_CI_Assessment.md`, follow this process: + +### Step 1: Run All Scripts + +```bash +# Set up environment +cd /path/to/release + +# Phase 1: Data Collection +python3 hack/openstack-ci-analysis/reporting-toolkit/extract_openstack_jobs.py --summary +python3 hack/openstack-ci-analysis/reporting-toolkit/fetch_job_metrics.py +python3 hack/openstack-ci-analysis/reporting-toolkit/fetch_extended_metrics.py +python3 hack/openstack-ci-analysis/reporting-toolkit/fetch_comparison_data.py + +# Phase 2: Configuration Analysis +python3 hack/openstack-ci-analysis/reporting-toolkit/analyze_redundancy.py +python3 hack/openstack-ci-analysis/reporting-toolkit/analyze_coverage.py +python3 hack/openstack-ci-analysis/reporting-toolkit/analyze_triggers.py + +# Phase 3: Runtime Analysis +python3 hack/openstack-ci-analysis/reporting-toolkit/analyze_platform_comparison.py +python3 hack/openstack-ci-analysis/reporting-toolkit/analyze_workflow_passrate.py +python3 hack/openstack-ci-analysis/reporting-toolkit/categorize_failures.py +``` + +### Step 2: Review Generated Reports + +Read all generated `.md` reports to understand the findings: + +1. `job_metrics_report.md` - Overall pass rates by release +2. `extended_metrics_report.md` - 14-day trends and problem jobs +3. `platform_comparison_report.md` - OpenStack vs other platforms +4. `workflow_passrate_report.md` - Which workflows are problematic +5. `failure_categories_report.md` - Root cause categorization +6. `coverage_gaps_report.md` - Missing test coverage +7. `trigger_optimization_report.md` - Quick win optimizations +8. `redundant_jobs_report.md` - Potential consolidation + +### Step 3: Create Executive Summary + +Structure the report with these sections: + +1. **Executive Summary** + - Total jobs analyzed + - Overall pass rate + - Number of problem jobs + - Key priorities + +2. **Job Inventory Overview** + - Distribution by cluster profile + - Distribution by organization + - Jobs by type (presubmit/periodic) + +3. 
**Periodic Job Health Analysis**
+   - Overall health metrics
+   - Pass rate by release
+   - Critical failures (0% pass rate)
+   - Degrading jobs
+   - Platform comparison
+   - Workflow analysis
+   - Failure categorization
+
+4. **Trigger Optimization**
+   - Jobs missing filters
+   - Recommended patterns
+
+5. **Coverage Gaps**
+   - Missing tests across releases
+   - CAPI/other notable gaps
+
+6. **Action Items**
+   - Immediate actions
+   - Short-term improvements
+   - Medium-term investigations
+
+### Step 4: Key Data Points to Include
+
+From the JSON data files, extract these key metrics:
+
+**From extended_metrics.json:**
+```python
+import json
+data = json.load(open('extended_metrics.json'))
+print(f"Total jobs: {data['overall']['total_jobs']}")
+print(f"Pass rate: {data['overall']['combined_pass_rate']:.1f}%")
+print(f"Problem jobs: {data['overall']['problem_job_count']}")
+```
+
+**From platform_comparison_analysis.json:**
+```python
+data = json.load(open('platform_comparison_analysis.json'))
+for p in data['overall']['platforms']:
+    print(f"{p['platform']}: {p['pass_rate']:.1f}%")
+```
+
+**From workflow_passrate_analysis.json:**
+```python
+data = json.load(open('workflow_passrate_analysis.json'))
+critical = [w for w in data['workflows'] if w['severity'] == 'critical']
+for w in critical:
+    print(f"{w['workflow']}: {w['pass_rate']:.1f}%")
+```
+
+**From failure_categories.json:**
+```python
+data = json.load(open('failure_categories.json'))
+for cat, count in data['summary']['by_category'].items():
+    pct = data['summary']['percentages'][cat]
+    print(f"{cat}: {count} ({pct}%)")
+```
+
+## Customization
+
+### Adding New Cluster Profiles
+
+Edit `extract_openstack_jobs.py` to add new profiles:
+
+```python
+OPENSTACK_CLUSTER_PROFILES = [
+    "openstack-vexxhost",
+    "openstack-vh-mecha-central",
+    # Add new profiles here
+]
+```
+
+### Adjusting Pass Rate Thresholds
+
+Edit `categorize_failures.py` to change thresholds:
+
+```python
+# Current thresholds
+CRITICAL_THRESHOLD = 50  # Below this = critical
+WARNING_THRESHOLD = 70   # Below this = warning
+PROBLEM_THRESHOLD = 80   # Below this = problem job
+```
+
+### Adding New Workflow Patterns
+
+Edit `extract_workflow_from_name` in `analyze_workflow_passrate.py` to recognize new patterns (abridged sketch of the existing function):
+
+```python
+def extract_workflow_from_name(job_name):
+    name_lower = job_name.lower()
+    characteristics = []
+    # Add new patterns here
+    if "newpattern" in name_lower:
+        characteristics.append("newpattern")
+```
+
+## Sippy API Reference
+
+The scripts use these Sippy API endpoints:
+
+| Endpoint | Description |
+|----------|-------------|
+| `/api/jobs?release=X` | All jobs for a release |
+| `/api/variants?release=X` | Variant (platform) data |
+
+**Base URL:** https://sippy.dptools.openshift.org/api
+
+**Rate limiting:** Scripts include 1-second delays between requests.
+
+## Troubleshooting
+
+### "No Sippy data found"
+Run `fetch_job_metrics.py` before running analysis scripts that require Sippy data.
+
+### "No job inventory found"
+Run `extract_openstack_jobs.py` before running configuration analysis scripts.
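+
+Both errors mean a Phase 1 input file is missing from the output directory. A
+quick pre-flight check (a sketch assuming the `$OUTPUT_DIR` variable from the
+examples above) makes the gap obvious:
+
+```bash
+for f in openstack_jobs_inventory.csv sippy_jobs_raw.json; do
+    [ -f "$OUTPUT_DIR/$f" ] || echo "missing: $f (run the matching Phase 1 script)"
+done
+```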
+ +### Script fails with import error +Ensure pyyaml is installed: `python3 -m pip install pyyaml` + +### Old cached data +Use `--force` flag with fetch scripts to refresh cached data: +```bash +python3 fetch_job_metrics.py --force +``` + +## Example Analysis Session + +Here's a complete example of running an analysis and interpreting results: + +```bash +# Run all scripts +cd /path/to/release +for script in extract_openstack_jobs fetch_job_metrics fetch_extended_metrics \ + fetch_comparison_data analyze_redundancy analyze_coverage \ + analyze_triggers analyze_platform_comparison \ + analyze_workflow_passrate categorize_failures; do + echo "Running $script..." + python3 hack/openstack-ci-analysis/reporting-toolkit/${script}.py +done + +# Check key findings +echo "=== Key Findings ===" +python3 -c " +import json +ext = json.load(open('extended_metrics.json')) +plat = json.load(open('platform_comparison_analysis.json')) +fail = json.load(open('failure_categories.json')) + +print(f'Overall pass rate: {ext[\"overall\"][\"combined_pass_rate\"]:.1f}%') +print(f'Problem jobs: {ext[\"overall\"][\"problem_job_count\"]}') +print(f'OpenStack rank: #{plat[\"openstack_position\"][\"rank\"]} of {plat[\"openstack_position\"][\"total\"]}') +print(f'Flaky jobs: {fail[\"summary\"][\"by_category\"][\"flaky\"]}') +print(f'Needs triage: {fail[\"summary\"][\"by_category\"][\"needs_triage\"]}') +" +``` + +## Maintenance + +### Updating for New Releases + +When new OpenShift releases are added, update the RELEASES list in each script: + +```python +RELEASES = ["4.17", "4.18", "4.19", "4.20", "4.21", "4.22", "4.23"] +``` + +### Refreshing Data + +For fresh analysis, delete cached JSON files and re-run: + +```bash +rm -f *_raw.json *_jobs.json +# Then run all fetch scripts again +``` diff --git a/reporting-toolkit/README.md b/reporting-toolkit/README.md new file mode 100644 index 0000000..09b0d0c --- /dev/null +++ b/reporting-toolkit/README.md @@ -0,0 +1,224 @@ +# OpenStack CI Analysis Reporting Toolkit + +A portable toolkit for analyzing OpenStack CI job health, performance, and configuration in OpenShift CI infrastructure. 
+ +## Overview + +This toolkit provides comprehensive analysis of OpenStack CI jobs by: + +- **Extracting** job inventory from CI configuration files +- **Fetching** runtime metrics from the [Sippy API](https://sippy.dptools.openshift.org/) +- **Analyzing** job health, coverage gaps, and optimization opportunities +- **Comparing** OpenStack pass rates against other cloud platforms (AWS, GCP, Azure, vSphere) +- **Categorizing** failures by root cause (flaky, product bug, infrastructure, needs triage) + +## Prerequisites + +- Python 3.6+ +- PyYAML library + +```bash +pip install pyyaml +``` + +## Quick Start + +### Option 1: Using the Shell Script (Recommended) + +```bash +# Clone the release repository (or use existing clone) +git clone https://github.com/openshift/release.git +cd release + +# Run complete analysis - outputs to current directory +./path/to/reporting-toolkit/run_analysis.sh + +# Or specify custom paths +./path/to/reporting-toolkit/run_analysis.sh \ + --config-dir /path/to/release/ci-operator/config \ + --output-dir /tmp/my-analysis +``` + +### Option 2: Running Scripts Individually + +```bash +# Set your paths +TOOLKIT=/path/to/reporting-toolkit +CONFIG_DIR=/path/to/release/ci-operator/config +OUTPUT_DIR=/tmp/my-analysis + +# Phase 1: Data Collection +python3 $TOOLKIT/extract_openstack_jobs.py --config-dir $CONFIG_DIR --output-dir $OUTPUT_DIR --summary +python3 $TOOLKIT/fetch_job_metrics.py --output-dir $OUTPUT_DIR +python3 $TOOLKIT/fetch_extended_metrics.py --output-dir $OUTPUT_DIR +python3 $TOOLKIT/fetch_comparison_data.py --output-dir $OUTPUT_DIR + +# Phase 2: Configuration Analysis +python3 $TOOLKIT/analyze_redundancy.py --output-dir $OUTPUT_DIR +python3 $TOOLKIT/analyze_coverage.py --output-dir $OUTPUT_DIR +python3 $TOOLKIT/analyze_triggers.py --output-dir $OUTPUT_DIR + +# Phase 3: Runtime Analysis +python3 $TOOLKIT/analyze_platform_comparison.py --output-dir $OUTPUT_DIR +python3 $TOOLKIT/analyze_workflow_passrate.py --output-dir $OUTPUT_DIR +python3 $TOOLKIT/categorize_failures.py --output-dir $OUTPUT_DIR +``` + +## Scripts + +### Data Collection + +| Script | Description | +|--------|-------------| +| `extract_openstack_jobs.py` | Extracts job inventory from `ci-operator/config/` YAML files | +| `fetch_job_metrics.py` | Fetches pass rates and run counts from Sippy API | +| `fetch_extended_metrics.py` | Calculates 14-day combined metrics and trends | +| `fetch_comparison_data.py` | Fetches platform comparison data from Sippy | + +### Configuration Analysis + +| Script | Description | +|--------|-------------| +| `analyze_redundancy.py` | Identifies duplicate/overlapping jobs | +| `analyze_coverage.py` | Finds test coverage gaps across releases | +| `analyze_triggers.py` | Identifies trigger optimization opportunities | + +### Runtime Analysis + +| Script | Description | +|--------|-------------| +| `analyze_platform_comparison.py` | Compares OpenStack vs AWS/GCP/Azure/vSphere | +| `analyze_workflow_passrate.py` | Analyzes pass rates by workflow/test type | +| `categorize_failures.py` | Classifies failures by root cause | + +## Command Line Options + +All scripts support: +- `--output-dir DIR` - Directory for input/output files (default: script directory) +- `--help` - Show usage information + +Additional options: +- `extract_openstack_jobs.py`: `--config-dir` for CI config location +- `fetch_job_metrics.py`: `--force` to refresh cached data + +## Output Files + +### Reports (Markdown) + +| File | Description | +|------|-------------| +| `job_metrics_report.md` | Pass 
rate metrics by release | +| `extended_metrics_report.md` | 14-day trends and problem jobs | +| `platform_comparison_report.md` | OpenStack vs other platforms | +| `workflow_passrate_report.md` | Pass rates by workflow type | +| `failure_categories_report.md` | Failures by root cause | +| `coverage_gaps_report.md` | Missing test coverage | +| `trigger_optimization_report.md` | Trigger pattern improvements | +| `redundant_jobs_report.md` | Potential job consolidation | + +### Data (JSON) + +| File | Description | +|------|-------------| +| `openstack_jobs_inventory.json` | Complete job inventory | +| `sippy_jobs_raw.json` | Cached Sippy API data | +| `extended_metrics.json` | Extended metrics data | +| `platform_comparison_raw.json` | Platform comparison data | +| `workflow_passrate_analysis.json` | Workflow analysis data | +| `failure_categories.json` | Categorized failures | + +## Example Output + +After running the analysis, you'll see key findings like: + +``` +Platform Comparison: + 1. vSphere: 80.7% + 2. AWS: 73.9% + 3. GCP: 71.2% + 4. Metal: 69.8% + 5. Azure: 68.2% + 6. OpenStack: 50.4% <-- Gap to address + +Failure Categories: + - Flaky: 41.6% + - Needs Triage: 36.0% + - Product Bug: 22.5% + +Critical Workflows (0% pass rate): + - ccpmso + - upgrade + - singlestackv6 +``` + +## Portability + +This toolkit is designed to be portable: + +1. **No hardcoded paths** - All paths are configurable via command-line options +2. **Self-contained** - All scripts are in a single directory +3. **Minimal dependencies** - Only requires Python 3.6+ and PyYAML + +To use in another project: +```bash +# Copy the toolkit +cp -r reporting-toolkit /path/to/your/project/ + +# Run from anywhere +/path/to/your/project/reporting-toolkit/run_analysis.sh \ + --config-dir /path/to/release/ci-operator/config \ + --output-dir /path/to/output +``` + +## Data Sources + +- **CI Configuration**: `ci-operator/config/` in the [openshift/release](https://github.com/openshift/release) repository +- **Runtime Metrics**: [Sippy API](https://sippy.dptools.openshift.org/) - OpenShift CI analytics platform + +## Cluster Profiles Analyzed + +The toolkit analyzes jobs using these OpenStack cluster profiles: +- `openstack-vexxhost` +- `openstack-vh-mecha-central` +- `openstack-vh-mecha-az0` +- `openstack-vh-bm-rhos` +- `openstack-hwoffload` +- `openstack-nfv` + +## Refreshing Data + +Sippy data is cached to avoid repeated API calls. To refresh: + +```bash +# Refresh all data +./run_analysis.sh --force + +# Or refresh just job metrics +python3 fetch_job_metrics.py --output-dir $OUTPUT_DIR --force +``` + +## Troubleshooting + +### "No Sippy data found" +Run `fetch_job_metrics.py` before analysis scripts that require Sippy data. + +### "No job inventory found" +Run `extract_openstack_jobs.py` before configuration analysis scripts. + +### Import error for yaml +Install PyYAML: `pip install pyyaml` + +### Config directory not found +Ensure the path to `ci-operator/config` is correct. This should point to the config directory in the openshift/release repository. + +## For Claude Code Users + +See `CLAUDE.md` for detailed instructions on using this toolkit with Claude Code, including: +- Step-by-step execution guide +- Creating comprehensive assessment reports +- Customization options +- API reference + +## License + +This toolkit is part of the OpenShift CI infrastructure. See the main repository for license information. 
diff --git a/reporting-toolkit/analyze_coverage.py b/reporting-toolkit/analyze_coverage.py new file mode 100644 index 0000000..1dbedcc --- /dev/null +++ b/reporting-toolkit/analyze_coverage.py @@ -0,0 +1,403 @@ +#!/usr/bin/env python3 +""" +Analyze OpenStack CI test coverage across releases. + +This script identifies: +1. Coverage matrix (which tests run on which releases) +2. Coverage gaps (tests missing from certain releases) +3. Release-to-release differences +4. Workflow usage across releases +""" + +import argparse +import csv +import json +import sys +from collections import defaultdict +from pathlib import Path + + +# Current and recent releases to focus on +ACTIVE_RELEASES = [ + "release-4.17", + "release-4.18", + "release-4.19", + "release-4.20", + "release-4.21", + "release-4.22", + "release-4.23", +] + +MAIN_BRANCHES = ["main", "master"] + + +def load_inventory(csv_path): + """Load job inventory from CSV.""" + jobs = [] + with open(csv_path, 'r', encoding='utf-8') as f: + reader = csv.DictReader(f) + for row in reader: + row['optional'] = row['optional'].lower() == 'true' + row['always_run'] = row['always_run'].lower() == 'true' + jobs.append(row) + return jobs + + +def normalize_branch(branch): + """Normalize branch name for comparison.""" + if branch in MAIN_BRANCHES: + return "main/master" + return branch + + +def get_release_version(branch): + """Extract version number from release branch.""" + if branch.startswith("release-"): + return branch.replace("release-", "") + return None + + +def build_test_matrix(jobs): + """ + Build a matrix of tests by (workflow, cluster_profile) across releases. + """ + # Group by (org, repo) to see coverage per repo + repo_coverage = defaultdict(lambda: defaultdict(set)) + + # Group by (workflow, cluster_profile) to see overall test coverage + test_coverage = defaultdict(set) + + for job in jobs: + org = job['org'] + repo = job['repo'] + branch = job['branch'] + workflow = job['workflow'] + cluster = job['cluster_profile'] + job_name = job['job_name'] + + # Normalize branch + normalized = normalize_branch(branch) + + # Track per-repo coverage + if workflow: + key = (org, repo, job_name, workflow, cluster) + repo_coverage[key][normalized].add(job['config_file']) + + # Track workflow coverage + if workflow: + test_key = (workflow, cluster, job_name) + test_coverage[test_key].add(normalized) + + return repo_coverage, test_coverage + + +def analyze_release_coverage(jobs): + """ + Analyze which releases have what coverage. + """ + # Count jobs per release + release_counts = defaultdict(int) + for job in jobs: + branch = normalize_branch(job['branch']) + release_counts[branch] += 1 + + # Count unique test types per release + release_tests = defaultdict(set) + for job in jobs: + branch = normalize_branch(job['branch']) + key = (job['workflow'], job['cluster_profile'], job['job_name']) + release_tests[branch].add(key) + + return release_counts, release_tests + + +def find_coverage_gaps(jobs): + """ + Find tests that exist in some releases but not others. 
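+
+    For example, if a repo has jobs on release-4.21 through release-4.23 but a
+    given job exists only on release-4.21 and release-4.22, that job is
+    reported as a gap for release-4.23.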
+ """ + # Group jobs by (org, repo, job_name) + job_releases = defaultdict(set) + for job in jobs: + key = (job['org'], job['repo'], job['job_name']) + job_releases[key].add(job['branch']) + + # Get all releases present per repo + repo_releases = defaultdict(set) + for job in jobs: + key = (job['org'], job['repo']) + repo_releases[key].add(job['branch']) + + # Find gaps + gaps = [] + for (org, repo, job_name), present_releases in job_releases.items(): + all_releases = repo_releases[(org, repo)] + + # Focus on active releases + active_present = set(r for r in present_releases if r in ACTIVE_RELEASES) + active_all = set(r for r in all_releases if r in ACTIVE_RELEASES) + + # If job exists in some active releases but not all, it's a gap + missing = active_all - active_present + if missing and active_present: # Has some active releases but missing others + gaps.append({ + 'org': org, + 'repo': repo, + 'job_name': job_name, + 'present': sorted(active_present), + 'missing': sorted(missing), + }) + + return gaps + + +def analyze_workflow_coverage(jobs): + """ + Analyze which workflows are used in which releases. + """ + workflow_releases = defaultdict(lambda: defaultdict(int)) + + for job in jobs: + workflow = job['workflow'] + if not workflow: + continue + branch = job['branch'] + if branch in ACTIVE_RELEASES or branch in MAIN_BRANCHES: + normalized = normalize_branch(branch) + workflow_releases[workflow][normalized] += 1 + + return workflow_releases + + +def analyze_cluster_profile_usage(jobs): + """ + Analyze cluster profile usage by release. + """ + profile_releases = defaultdict(lambda: defaultdict(int)) + + for job in jobs: + profile = job['cluster_profile'] + branch = job['branch'] + if branch in ACTIVE_RELEASES or branch in MAIN_BRANCHES: + normalized = normalize_branch(branch) + profile_releases[profile][normalized] += 1 + + return profile_releases + + +def generate_report(jobs, output_file): + """Generate comprehensive coverage report.""" + report = [] + report.append("# OpenStack CI Test Coverage Analysis Report\n") + report.append(f"Total jobs analyzed: {len(jobs)}\n") + + # Release Coverage Summary + report.append("\n## 1. Jobs by Release\n\n") + release_counts, release_tests = analyze_release_coverage(jobs) + + # Sort releases naturally + def sort_key(r): + if r == "main/master": + return (1, "zzz") + if r.startswith("release-"): + ver = r.replace("release-", "") + parts = ver.split(".") + return (0, tuple(int(p) for p in parts if p.isdigit())) + return (2, r) + + sorted_releases = sorted(release_counts.keys(), key=sort_key) + + report.append("| Release | Total Jobs | Unique Tests |\n") + report.append("|---------|------------|---------------|\n") + for release in sorted_releases[-15:]: # Last 15 releases + count = release_counts[release] + unique = len(release_tests[release]) + report.append(f"| {release} | {count} | {unique} |\n") + + # Cluster Profile Usage + report.append("\n## 2. 
Cluster Profile Usage by Release\n\n")
+    profile_usage = analyze_cluster_profile_usage(jobs)
+
+    # Header (use a set: "main" and "master" both normalize to "main/master")
+    active_releases_sorted = sorted(
+        {normalize_branch(r) for r in ACTIVE_RELEASES + MAIN_BRANCHES},
+        key=sort_key
+    )[-6:]  # Last 6
+
+    report.append("| Cluster Profile | " +
+                  " | ".join(active_releases_sorted) + " |\n")
+    report.append("|" + "-" * 17 + "|" +
+                  "|".join(["-" * 8 for _ in active_releases_sorted]) + "|\n")
+
+    for profile in sorted(profile_usage.keys()):
+        counts = [str(profile_usage[profile].get(r, 0))
+                  for r in active_releases_sorted]
+        report.append(f"| {profile} | " + " | ".join(counts) + " |\n")
+
+    # Workflow Usage
+    report.append("\n## 3. Workflow Usage by Release\n\n")
+    workflow_usage = analyze_workflow_coverage(jobs)
+
+    report.append("| Workflow | " +
+                  " | ".join(active_releases_sorted) + " |\n")
+    report.append("|" + "-" * 40 + "|" +
+                  "|".join(["-" * 8 for _ in active_releases_sorted]) + "|\n")
+
+    for workflow in sorted(workflow_usage.keys()):
+        counts = [str(workflow_usage[workflow].get(r, 0))
+                  for r in active_releases_sorted]
+        report.append(f"| {workflow} | " + " | ".join(counts) + " |\n")
+
+    # Coverage Gaps
+    report.append("\n## 4. Coverage Gaps\n")
+    report.append("Tests present in some active releases but missing from others.\n\n")
+
+    gaps = find_coverage_gaps(jobs)
+
+    if gaps:
+        # Group by repo
+        repo_gaps = defaultdict(list)
+        for gap in gaps:
+            repo_gaps[(gap['org'], gap['repo'])].append(gap)
+
+        report.append(f"Found {len(gaps)} coverage gaps across "
+                      f"{len(repo_gaps)} repositories.\n\n")
+
+        report.append("### By Repository\n\n")
+        for (org, repo), repo_gap_list in sorted(repo_gaps.items()):
+            report.append(f"#### {org}/{repo}\n\n")
+            report.append("| Job | Present | Missing |\n")
+            report.append("|-----|---------|----------|\n")
+            for gap in repo_gap_list[:10]:
+                present = ', '.join(gap['present'][:3])
+                if len(gap['present']) > 3:
+                    present += f" (+{len(gap['present'])-3})"
+                missing = ', '.join(gap['missing'][:3])
+                if len(gap['missing']) > 3:
+                    missing += f" (+{len(gap['missing'])-3})"
+                report.append(f"| {gap['job_name']} | {present} | {missing} |\n")
+            if len(repo_gap_list) > 10:
+                report.append(f"\n... and {len(repo_gap_list)-10} more gaps\n")
+            report.append("\n")
+    else:
+        report.append("No coverage gaps found in active releases.\n")
+
+    # Test Type Analysis
+    report.append("\n## 5. 
Test Type Coverage\n") + report.append("Summary of test types and their coverage.\n\n") + + # Categorize by test name patterns + test_categories = { + 'e2e-basic': [], + 'e2e-conformance': [], + 'e2e-csi': [], + 'e2e-nfv': [], + 'e2e-upgrade': [], + 'e2e-other': [], + } + + for job in jobs: + name = job['job_name'].lower() + if 'csi' in name or 'manila' in name or 'cinder' in name: + test_categories['e2e-csi'].append(job) + elif 'nfv' in name or 'sriov' in name or 'hwoffload' in name: + test_categories['e2e-nfv'].append(job) + elif 'upgrade' in name: + test_categories['e2e-upgrade'].append(job) + elif 'parallel' in name or 'serial' in name or 'conformance' in name: + test_categories['e2e-conformance'].append(job) + elif name.endswith('e2e-openstack') or name == 'e2e-openstack-ovn': + test_categories['e2e-basic'].append(job) + else: + test_categories['e2e-other'].append(job) + + report.append("| Category | Total Jobs | Unique Tests |\n") + report.append("|----------|------------|---------------|\n") + for category, cat_jobs in sorted(test_categories.items()): + unique = len(set((j['job_name'], j['workflow']) for j in cat_jobs)) + report.append(f"| {category} | {len(cat_jobs)} | {unique} |\n") + + # Recommendations + report.append("\n## 6. Coverage Recommendations\n\n") + + report.append("### Missing Coverage Areas\n\n") + + # Check for releases with low coverage + active_counts = {r: release_counts.get(r, 0) + for r in ACTIVE_RELEASES} + avg_count = sum(active_counts.values()) / len(active_counts) if active_counts else 0 + + low_coverage = [r for r, c in active_counts.items() if c < avg_count * 0.7] + if low_coverage: + report.append(f"1. **Low coverage releases**: {', '.join(sorted(low_coverage))} " + f"have fewer jobs than average.\n\n") + + # Check for profile gaps + for profile in sorted(profile_usage.keys()): + releases_with_profile = [r for r, c in profile_usage[profile].items() if c > 0] + if len(releases_with_profile) < len(active_releases_sorted) - 1: + missing = set(active_releases_sorted) - set(releases_with_profile) + if missing: + report.append(f"2. 
**{profile}**: Missing from {', '.join(sorted(missing))}\n\n") + + report.append("### Consolidation Opportunities\n\n") + report.append("- Jobs that appear in all releases with same config could use shared workflows\n") + report.append("- Consider periodic-only coverage for older releases (4.17, 4.18)\n") + report.append("- Evaluate if all cluster profiles need coverage in all releases\n") + + # Write report + with open(output_file, 'w', encoding='utf-8') as f: + f.write(''.join(report)) + + print(f"Report written to {output_file}", file=sys.stderr) + + # Also write machine-readable data + json_output = output_file.replace('.md', '_data.json') + with open(json_output, 'w', encoding='utf-8') as f: + json.dump({ + 'release_counts': dict(release_counts), + 'workflow_usage': {k: dict(v) for k, v in workflow_usage.items()}, + 'profile_usage': {k: dict(v) for k, v in profile_usage.items()}, + 'coverage_gaps': gaps[:100], + 'test_categories': {k: len(v) for k, v in test_categories.items()}, + }, f, indent=2) + + print(f"Data written to {json_output}", file=sys.stderr) + + +def main(): + import os + script_dir = os.path.dirname(os.path.abspath(__file__)) + + parser = argparse.ArgumentParser( + description="Analyze OpenStack CI test coverage" + ) + parser.add_argument( + "--output-dir", + default=script_dir, + help="Directory for input/output files (default: script directory)" + ) + parser.add_argument( + "--inventory", + default="openstack_jobs_inventory.csv", + help="Inventory CSV filename (default: openstack_jobs_inventory.csv)" + ) + + args = parser.parse_args() + + output_dir = os.path.abspath(args.output_dir) + + print("=" * 60) + print("OpenStack CI Coverage Analysis") + print("=" * 60) + print(f"Output directory: {output_dir}") + print() + + inventory_path = os.path.join(output_dir, args.inventory) + output_path = os.path.join(output_dir, "coverage_gaps_report.md") + + jobs = load_inventory(inventory_path) + generate_report(jobs, output_path) + + +if __name__ == "__main__": + main() diff --git a/reporting-toolkit/analyze_platform_comparison.py b/reporting-toolkit/analyze_platform_comparison.py new file mode 100644 index 0000000..3cc5658 --- /dev/null +++ b/reporting-toolkit/analyze_platform_comparison.py @@ -0,0 +1,293 @@ +#!/usr/bin/env python3 +""" +Analyze platform comparison data and generate report. +Compares OpenStack CI pass rates against AWS, GCP, Azure, vSphere. 
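+Reads platform_comparison_raw.json (produced by fetch_comparison_data.py) and,
+optionally, extended_metrics.json as a fallback for the OpenStack baseline.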
+""" + +import argparse +import json +import os +import sys +from datetime import datetime + +RELEASES = ["4.17", "4.18", "4.19", "4.20", "4.21", "4.22"] +TARGET_PLATFORMS = ["OpenStack", "AWS", "GCP", "Azure", "vSphere", "Metal"] + +# Will be set by parse_args() +OUTPUT_DIR = None + + +def parse_args(): + """Parse command line arguments.""" + parser = argparse.ArgumentParser( + description="Analyze platform comparison data and generate report" + ) + parser.add_argument( + "--output-dir", + default=os.path.dirname(os.path.abspath(__file__)), + help="Directory for input/output files (default: script directory)" + ) + return parser.parse_args() + + +def load_comparison_data(): + """Load platform comparison raw data.""" + filepath = os.path.join(OUTPUT_DIR, "platform_comparison_raw.json") + if os.path.exists(filepath): + with open(filepath) as f: + return json.load(f) + return None + + +def load_extended_metrics(): + """Load extended metrics for OpenStack-specific data.""" + filepath = os.path.join(OUTPUT_DIR, "extended_metrics.json") + if os.path.exists(filepath): + with open(filepath) as f: + return json.load(f) + return None + + +def analyze_platforms(data, openstack_metrics): + """Analyze platform comparison data.""" + results = { + "generated": datetime.now().isoformat(), + "overall": {}, + "by_release": {}, + "openstack_position": {}, + } + + # Overall platform comparison + overall = data.get("overall_by_platform", {}) + + # Calculate OpenStack baseline + openstack_rate = 0 + if "OpenStack" in overall: + openstack_rate = overall["OpenStack"].get("pass_rate", 0) + elif openstack_metrics: + openstack_rate = openstack_metrics.get("overall", {}).get("combined_pass_rate", 0) + + # Build comparison table + platforms = [] + for platform in TARGET_PLATFORMS: + if platform in overall: + pdata = overall[platform] + rate = pdata.get("pass_rate", 0) + delta = rate - openstack_rate if platform != "OpenStack" else 0 + platforms.append({ + "platform": platform, + "job_count": pdata.get("job_count", 0), + "total_runs": pdata.get("total_runs", 0), + "total_passes": pdata.get("total_passes", 0), + "pass_rate": rate, + "vs_openstack": delta, + }) + + # Sort by pass rate descending + platforms.sort(key=lambda x: -x["pass_rate"]) + results["overall"]["platforms"] = platforms + + # Find OpenStack position + for i, p in enumerate(platforms): + if p["platform"] == "OpenStack": + results["openstack_position"]["rank"] = i + 1 + results["openstack_position"]["total"] = len(platforms) + break + + # Per-release comparison + for release in RELEASES: + release_data = data.get("releases", {}).get(release, {}) + job_metrics = release_data.get("job_metrics", {}) + + release_platforms = [] + for platform in TARGET_PLATFORMS: + if platform in job_metrics: + pdata = job_metrics[platform] + release_platforms.append({ + "platform": platform, + "job_count": pdata.get("job_count", 0), + "total_runs": pdata.get("total_runs", 0), + "pass_rate": pdata.get("pass_rate", 0), + }) + + release_platforms.sort(key=lambda x: -x["pass_rate"]) + results["by_release"][release] = release_platforms + + return results + + +def generate_report(analysis): + """Generate markdown report for platform comparison.""" + report = [] + report.append("# Platform Comparison Report") + report.append("") + report.append(f"**Generated:** {analysis['generated']}") + report.append("") + report.append("This report compares OpenStack CI job pass rates against other cloud platforms.") + report.append("") + + # Executive summary + report.append("## Executive 
Summary") + report.append("") + + platforms = analysis.get("overall", {}).get("platforms", []) + pos = analysis.get("openstack_position", {}) + + if pos: + report.append(f"OpenStack ranks **#{pos.get('rank', '?')} of {pos.get('total', '?')}** platforms by pass rate.") + report.append("") + + # Find best performer for comparison + if platforms: + best = platforms[0] + openstack = next((p for p in platforms if p["platform"] == "OpenStack"), None) + if openstack and best["platform"] != "OpenStack": + gap = best["pass_rate"] - openstack["pass_rate"] + report.append(f"- **Gap to best ({best['platform']}):** {gap:+.1f}%") + if openstack: + report.append(f"- **OpenStack pass rate:** {openstack['pass_rate']:.1f}%") + report.append(f"- **OpenStack job volume:** {openstack['total_runs']:,} runs across {openstack['job_count']} jobs") + report.append("") + + # Overall comparison table + report.append("## Overall Platform Comparison") + report.append("") + report.append("| Rank | Platform | Jobs | Runs | Pass Rate | vs OpenStack |") + report.append("|------|----------|------|------|-----------|--------------|") + + for i, p in enumerate(platforms, 1): + delta = p.get("vs_openstack", 0) + delta_str = f"+{delta:.1f}%" if delta > 0 else (f"{delta:.1f}%" if delta < 0 else "baseline") + runs_str = f"{p['total_runs']:,}" if p['total_runs'] >= 1000 else str(p['total_runs']) + report.append( + f"| {i} | {p['platform']} | {p['job_count']} | {runs_str} | " + f"{p['pass_rate']:.1f}% | {delta_str} |" + ) + report.append("") + + # Key observations + report.append("## Key Observations") + report.append("") + + if platforms: + openstack = next((p for p in platforms if p["platform"] == "OpenStack"), None) + if openstack: + # Calculate how many platforms are better + better = [p for p in platforms if p["pass_rate"] > openstack["pass_rate"]] + worse = [p for p in platforms if p["pass_rate"] < openstack["pass_rate"]] + + if better: + report.append(f"### Platforms with Better Pass Rates ({len(better)})") + report.append("") + for p in better: + gap = p["pass_rate"] - openstack["pass_rate"] + report.append(f"- **{p['platform']}:** {p['pass_rate']:.1f}% (+{gap:.1f}% vs OpenStack)") + report.append("") + + if worse: + report.append(f"### Platforms with Lower Pass Rates ({len(worse)})") + report.append("") + for p in worse: + gap = openstack["pass_rate"] - p["pass_rate"] + report.append(f"- **{p['platform']}:** {p['pass_rate']:.1f}% (-{gap:.1f}% vs OpenStack)") + report.append("") + + # Per-release breakdown + report.append("## Pass Rate by Release") + report.append("") + report.append("| Release | " + " | ".join(TARGET_PLATFORMS) + " |") + report.append("|---------|" + "|".join(["-------"] * len(TARGET_PLATFORMS)) + "|") + + for release in RELEASES: + release_data = analysis.get("by_release", {}).get(release, []) + rates = {} + for p in release_data: + rates[p["platform"]] = p["pass_rate"] + + row = f"| {release} |" + for platform in TARGET_PLATFORMS: + if platform in rates: + row += f" {rates[platform]:.1f}% |" + else: + row += " - |" + report.append(row) + report.append("") + + # Analysis + report.append("## Analysis") + report.append("") + report.append("### Potential Causes for Pass Rate Differences") + report.append("") + report.append("1. **Infrastructure maturity**: Platforms with longer CI history may have more stable infrastructure") + report.append("2. **Test suite differences**: Each platform runs different test subsets") + report.append("3. 
**Job volume**: Higher volume platforms may have more resources/attention") + report.append("4. **Platform complexity**: Some platforms have inherent complexity differences") + report.append("") + + report.append("### Recommendations") + report.append("") + report.append("1. Investigate top-performing platform configurations for applicable improvements") + report.append("2. Compare test failure patterns across platforms") + report.append("3. Review infrastructure provisioning reliability") + report.append("") + + report.append("---") + report.append("") + report.append("*Data Source: [Sippy](https://sippy.dptools.openshift.org/)*") + report.append("") + + return "\n".join(report) + + +def main(): + global OUTPUT_DIR + args = parse_args() + OUTPUT_DIR = os.path.abspath(args.output_dir) + + print("=" * 60) + print("OpenStack CI Platform Comparison Analysis") + print("=" * 60) + print(f"Output directory: {OUTPUT_DIR}") + print() + + # Load data + data = load_comparison_data() + if not data: + print("Error: No platform comparison data found.") + print("Run fetch_comparison_data.py first.") + sys.exit(1) + + openstack_metrics = load_extended_metrics() + + print(f"Loaded data from: {data.get('fetched_at')}") + print() + + # Analyze + analysis = analyze_platforms(data, openstack_metrics) + + # Save analysis + analysis_path = os.path.join(OUTPUT_DIR, "platform_comparison_analysis.json") + with open(analysis_path, 'w') as f: + json.dump(analysis, f, indent=2) + print(f"Saved: {analysis_path}") + + # Generate report + report = generate_report(analysis) + report_path = os.path.join(OUTPUT_DIR, "platform_comparison_report.md") + with open(report_path, 'w') as f: + f.write(report) + print(f"Saved: {report_path}") + + # Print summary + print() + print("=" * 60) + print("Summary:") + platforms = analysis.get("overall", {}).get("platforms", []) + for i, p in enumerate(platforms, 1): + marker = " <-- OpenStack" if p["platform"] == "OpenStack" else "" + print(f" {i}. {p['platform']}: {p['pass_rate']:.1f}%{marker}") + print("=" * 60) + + +if __name__ == "__main__": + main() diff --git a/reporting-toolkit/analyze_redundancy.py b/reporting-toolkit/analyze_redundancy.py new file mode 100644 index 0000000..fb1c92f --- /dev/null +++ b/reporting-toolkit/analyze_redundancy.py @@ -0,0 +1,309 @@ +#!/usr/bin/env python3 +""" +Analyze OpenStack CI jobs for redundancy and consolidation opportunities. + +This script identifies: +1. Duplicate jobs between openshift and openshift-priv organizations +2. Similar tests running on the same code paths +3. Jobs with overlapping functionality +4. Presubmit jobs that could be consolidated +""" + +import argparse +import csv +import json +import sys +from collections import defaultdict +from pathlib import Path + + +def load_inventory(csv_path): + """Load job inventory from CSV.""" + jobs = [] + with open(csv_path, 'r', encoding='utf-8') as f: + reader = csv.DictReader(f) + for row in reader: + # Convert boolean strings to actual booleans + row['optional'] = row['optional'].lower() == 'true' + row['always_run'] = row['always_run'].lower() == 'true' + jobs.append(row) + return jobs + + +def analyze_same_workflow_same_branch(jobs): + """ + Find cases where multiple jobs in the SAME repo/branch use identical + workflow + cluster_profile combinations. + + These might be testing overlapping functionality and could potentially + be consolidated, though they may have different env vars or test suites. + + NOTE: Jobs existing across different branches is EXPECTED, not redundant. 
+ NOTE: Jobs in openshift/ vs openshift-priv/ are separate GitHub gates, not redundant. + """ + duplicates = [] + + # Group jobs by (org, repo, branch, workflow, cluster_profile) + job_groups = defaultdict(list) + for job in jobs: + if not job['workflow']: + continue + key = ( + job['org'], + job['repo'], + job['branch'], + job['workflow'], + job['cluster_profile'] + ) + job_groups[key].append(job) + + for key, group in job_groups.items(): + if len(group) > 1: + duplicates.append({ + 'org': key[0], + 'repo': key[1], + 'branch': key[2], + 'workflow': key[3], + 'cluster_profile': key[4], + 'job_count': len(group), + 'jobs': [j['job_name'] for j in group], + 'files': list(set(j['config_file'] for j in group)) + }) + + return duplicates + + +def analyze_presubmit_triggers(jobs): + """ + Analyze presubmit job trigger patterns. + Identify jobs that are always_run=true without throttling. + """ + presubmit_jobs = [j for j in jobs if j['job_type'] == 'presubmit'] + + # Group by trigger pattern + always_run_no_throttle = [] + always_run_with_throttle = [] + optional_jobs = [] + conditional_jobs = [] + + for job in presubmit_jobs: + if job['always_run']: + if job['minimum_interval']: + always_run_with_throttle.append(job) + else: + always_run_no_throttle.append(job) + elif job['optional']: + optional_jobs.append(job) + elif job['run_if_changed'] or job['skip_if_only_changed']: + conditional_jobs.append(job) + else: + # Default presubmit (runs on PR but not always) + optional_jobs.append(job) + + return { + 'always_run_no_throttle': always_run_no_throttle, + 'always_run_with_throttle': always_run_with_throttle, + 'optional': optional_jobs, + 'conditional': conditional_jobs, + } + + +def analyze_branch_consistency(jobs): + """ + Find jobs that exist on some branches but not others. + Helps identify inconsistent coverage across releases. + """ + # Group by (org, repo, job_name) + job_groups = defaultdict(set) + for job in jobs: + key = (job['org'], job['repo'], job['job_name']) + job_groups[key].add(job['branch']) + + # Find jobs that have inconsistent branch coverage + inconsistencies = [] + repo_branches = defaultdict(set) + + for job in jobs: + repo_branches[(job['org'], job['repo'])].add(job['branch']) + + for (org, repo, job_name), branches in job_groups.items(): + all_branches = repo_branches[(org, repo)] + missing = all_branches - branches + if missing and len(branches) > 1: + inconsistencies.append({ + 'org': org, + 'repo': repo, + 'job_name': job_name, + 'present_branches': sorted(branches), + 'missing_branches': sorted(missing), + }) + + return inconsistencies + + +def generate_report(jobs, output_file): + """Generate comprehensive redundancy report.""" + report = [] + report.append("# OpenStack CI Job Redundancy Analysis Report\n") + report.append(f"Total jobs analyzed: {len(jobs)}\n") + + report.append("\n## Understanding This Report\n\n") + report.append("**What is NOT redundant:**\n") + report.append("- Jobs existing across different branches (release-4.20, release-4.21, etc.)\n") + report.append("- Jobs in both openshift/ and openshift-priv/ (separate GitHub gates)\n\n") + report.append("**What MAY be redundant:**\n") + report.append("- Multiple jobs in the SAME repo/branch using identical workflow+cluster\n") + report.append("- Jobs with overlapping test coverage\n\n") + + # Same Workflow/Cluster in Same Repo/Branch + report.append("\n## 1. 
Multiple Jobs with Same Workflow+Cluster\n") + report.append("Cases where multiple jobs in the SAME repo/branch use identical\n") + report.append("workflow + cluster_profile combinations.\n\n") + report.append("These MAY be intentional (different test suites, env vars) or\n") + report.append("could potentially be consolidated.\n\n") + + workflow_dups = analyze_same_workflow_same_branch(jobs) + if workflow_dups: + report.append(f"Found {len(workflow_dups)} cases of workflow duplication.\n\n") + report.append("| Org/Repo | Branch | Workflow | Jobs |\n") + report.append("|----------|--------|----------|------|\n") + for dup in sorted(workflow_dups, + key=lambda x: x['job_count'], reverse=True)[:20]: + jobs_str = ', '.join(dup['jobs'][:3]) + if len(dup['jobs']) > 3: + jobs_str += f" (+{len(dup['jobs'])-3} more)" + report.append( + f"| {dup['org']}/{dup['repo']} | {dup['branch']} | " + f"{dup['workflow']} | {jobs_str} |\n" + ) + else: + report.append("No workflow duplications found.\n") + + # Presubmit Trigger Analysis + report.append("\n## 2. Presubmit Trigger Analysis\n") + triggers = analyze_presubmit_triggers(jobs) + + report.append(f"\n### Trigger Pattern Summary\n\n") + report.append("| Pattern | Count | % of Presubmits |\n") + report.append("|---------|-------|------------------|\n") + + total_presubmit = sum(len(v) for v in triggers.values()) + for pattern, jobs_list in triggers.items(): + pct = len(jobs_list) / total_presubmit * 100 if total_presubmit else 0 + report.append(f"| {pattern} | {len(jobs_list)} | {pct:.1f}% |\n") + + # Always run without throttle is concerning + if triggers['always_run_no_throttle']: + report.append("\n### Always Run Jobs Without Throttling\n") + report.append("These run on every PR without minimum_interval.\n\n") + by_repo = defaultdict(list) + for job in triggers['always_run_no_throttle']: + by_repo[(job['org'], job['repo'])].append(job) + + report.append("| Org/Repo | Jobs |\n") + report.append("|----------|------|\n") + for (org, repo), jobs_list in sorted(by_repo.items()): + job_names = ', '.join(set(j['job_name'] for j in jobs_list))[:60] + report.append(f"| {org}/{repo} | {job_names} |\n") + + # Branch Consistency + report.append("\n## 3. Branch Coverage Inconsistencies\n") + report.append("Jobs present on some branches but missing from others.\n\n") + + inconsistencies = analyze_branch_consistency(jobs) + if inconsistencies: + # Filter to significant inconsistencies (missing recent releases) + significant = [i for i in inconsistencies + if any('release-4.2' in b or 'main' in b or 'master' in b + for b in i['missing_branches'])] + + report.append(f"Found {len(significant)} significant inconsistencies.\n\n") + + if significant: + report.append("| Org/Repo | Job | Missing Branches |\n") + report.append("|----------|-----|------------------|\n") + for inc in significant[:30]: + missing = ', '.join(inc['missing_branches'][:3]) + if len(inc['missing_branches']) > 3: + missing += f" (+{len(inc['missing_branches'])-3})" + report.append( + f"| {inc['org']}/{inc['repo']} | {inc['job_name']} | {missing} |\n" + ) + else: + report.append("No significant inconsistencies found.\n") + + # Consolidation Opportunities + report.append("\n## 4. Recommendations\n\n") + + report.append("### Review Items\n\n") + + if workflow_dups: + report.append("1. **Same workflow+cluster jobs**: Review jobs using identical\n") + report.append(" workflow+cluster in the same repo/branch. 
These may have\n") + report.append(" different env vars or test suites, but could potentially\n") + report.append(" be consolidated if testing overlapping functionality.\n") + report.append(f" - Cases to review: {len(workflow_dups)}\n\n") + + report.append("2. **Always-run jobs**: Review jobs marked `always_run: true` " + "without `minimum_interval` throttling.\n") + report.append(f" - Jobs to review: {len(triggers['always_run_no_throttle'])}\n\n") + + report.append("3. **Branch inconsistencies**: Consider adding missing jobs " + "to recent release branches for consistent coverage.\n") + report.append(f" - Inconsistencies found: {len(inconsistencies)}\n") + + # Write report + with open(output_file, 'w', encoding='utf-8') as f: + f.write(''.join(report)) + + print(f"Report written to {output_file}", file=sys.stderr) + + # Also write machine-readable data + json_output = output_file.replace('.md', '_data.json') + with open(json_output, 'w', encoding='utf-8') as f: + json.dump({ + 'same_workflow_same_branch': workflow_dups, + 'trigger_analysis': {k: len(v) for k, v in triggers.items()}, + 'branch_inconsistencies': inconsistencies[:100], + }, f, indent=2) + + print(f"Data written to {json_output}", file=sys.stderr) + + +def main(): + import os + script_dir = os.path.dirname(os.path.abspath(__file__)) + + parser = argparse.ArgumentParser( + description="Analyze OpenStack CI jobs for redundancy" + ) + parser.add_argument( + "--output-dir", + default=script_dir, + help="Directory for input/output files (default: script directory)" + ) + parser.add_argument( + "--inventory", + default="openstack_jobs_inventory.csv", + help="Inventory CSV filename (default: openstack_jobs_inventory.csv)" + ) + + args = parser.parse_args() + + output_dir = os.path.abspath(args.output_dir) + + print("=" * 60) + print("OpenStack CI Redundancy Analysis") + print("=" * 60) + print(f"Output directory: {output_dir}") + print() + + inventory_path = os.path.join(output_dir, args.inventory) + output_path = os.path.join(output_dir, "redundant_jobs_report.md") + + jobs = load_inventory(inventory_path) + generate_report(jobs, output_path) + + +if __name__ == "__main__": + main() diff --git a/reporting-toolkit/analyze_triggers.py b/reporting-toolkit/analyze_triggers.py new file mode 100644 index 0000000..02e2799 --- /dev/null +++ b/reporting-toolkit/analyze_triggers.py @@ -0,0 +1,403 @@ +#!/usr/bin/env python3 +""" +Analyze OpenStack CI job triggers for optimization opportunities. + +This script identifies: +1. Jobs missing file-change filters (skip_if_only_changed, run_if_changed) +2. Always-run jobs without throttling +3. Repos that could benefit from smarter triggering +4. 
Recommended patterns for skip_if_only_changed
+"""
+
+import argparse
+import csv
+import json
+import sys
+from collections import defaultdict
+from pathlib import Path
+
+
+# Common patterns for files that typically don't need E2E tests
+SKIP_PATTERNS = {
+    'documentation': [
+        r'^docs/',
+        r'\.md$',
+        r'^README',
+    ],
+    'ownership': [
+        r'(^|/)OWNERS(_ALIASES)?$',
+    ],
+    'github_config': [
+        r'^\.github/',
+    ],
+    'general': [
+        r'^CHANGELOG',
+        r'^LICENSE',
+        r'^DCO',
+        r'^SECURITY\.md$',
+    ],
+}
+
+# Suggested skip pattern for E2E tests
+SUGGESTED_SKIP_PATTERN = r'(^docs/)|(\.md$)|((^|/)OWNERS(_ALIASES)?$)'
+
+
+def load_inventory(csv_path):
+    """Load job inventory from CSV."""
+    jobs = []
+    with open(csv_path, 'r', encoding='utf-8') as f:
+        reader = csv.DictReader(f)
+        for row in reader:
+            row['optional'] = row['optional'].lower() == 'true'
+            row['always_run'] = row['always_run'].lower() == 'true'
+            jobs.append(row)
+    return jobs
+
+
+def analyze_trigger_patterns(jobs):
+    """
+    Analyze the current trigger patterns used across jobs.
+    """
+    patterns = {
+        'has_skip_if_only_changed': [],
+        'has_run_if_changed': [],
+        'has_minimum_interval': [],
+        'always_run_true': [],
+        'optional_true': [],
+        'no_filters': [],  # Jobs with no trigger optimization
+    }
+
+    for job in jobs:
+        if job['skip_if_only_changed']:
+            patterns['has_skip_if_only_changed'].append(job)
+        if job['run_if_changed']:
+            patterns['has_run_if_changed'].append(job)
+        if job['minimum_interval']:
+            patterns['has_minimum_interval'].append(job)
+        if job['always_run']:
+            patterns['always_run_true'].append(job)
+        if job['optional']:
+            patterns['optional_true'].append(job)
+
+        # Jobs that could benefit from trigger optimization
+        if (job['job_type'] == 'presubmit' and
+                not job['skip_if_only_changed'] and
+                not job['run_if_changed'] and
+                not job['optional']):
+            patterns['no_filters'].append(job)
+
+    return patterns
+
+
+def group_jobs_by_repo(jobs):
+    """Group jobs by org/repo for analysis."""
+    repos = defaultdict(list)
+    for job in jobs:
+        key = (job['org'], job['repo'])
+        repos[key].append(job)
+    return repos
+
+
+def analyze_repo_trigger_status(repos):
+    """
+    For each repo, determine if it would benefit from skip_if_only_changed.
+    """
+    repo_analysis = []
+
+    for (org, repo), jobs in repos.items():
+        presubmit_jobs = [j for j in jobs if j['job_type'] == 'presubmit']
+
+        if not presubmit_jobs:
+            continue
+
+        # Count jobs with/without filters
+        with_skip = len([j for j in presubmit_jobs if j['skip_if_only_changed']])
+        with_run_if = len([j for j in presubmit_jobs if j['run_if_changed']])
+        optional = len([j for j in presubmit_jobs if j['optional']])
+        always_run = len([j for j in presubmit_jobs if j['always_run']])
+        no_filter = len([j for j in presubmit_jobs
+                         if not j['skip_if_only_changed']
+                         and not j['run_if_changed']
+                         and not j['optional']])
+
+        # Determine if repo could benefit
+        could_benefit = no_filter > 0 and with_skip == 0
+
+        repo_analysis.append({
+            'org': org,
+            'repo': repo,
+            'total_presubmit': len(presubmit_jobs),
+            'with_skip_pattern': with_skip,
+            'with_run_if_changed': with_run_if,
+            'optional': optional,
+            'always_run': always_run,
+            'no_filter': no_filter,
+            'could_benefit': could_benefit,
+            'job_names': sorted(set(j['job_name'] for j in presubmit_jobs)),
+        })
+
+    return repo_analysis
+
+
+def analyze_always_run_jobs(jobs):
+    """
+    Find jobs that are always_run=true without throttling.
+    These run on every PR and should be reviewed.
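+
+    A sketch of the return shape, using hypothetical, abridged job rows
+    (the split key is minimum_interval):
+
+        {"with_throttle":    [{"job_name": "e2e-openstack", "minimum_interval": "24h"}],
+         "without_throttle": [{"job_name": "e2e-openstack-serial", "minimum_interval": ""}]}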
+ """ + always_run_jobs = [j for j in jobs + if j['always_run'] and j['job_type'] == 'presubmit'] + + # Group by whether they have minimum_interval + with_throttle = [j for j in always_run_jobs if j['minimum_interval']] + without_throttle = [j for j in always_run_jobs if not j['minimum_interval']] + + return { + 'with_throttle': with_throttle, + 'without_throttle': without_throttle, + } + + +def analyze_periodic_schedules(jobs): + """ + Analyze periodic job schedules for optimization. + """ + periodic_jobs = [j for j in jobs if j['job_type'] == 'periodic'] + + # Group by schedule pattern + schedules = defaultdict(list) + for job in periodic_jobs: + schedules[job['schedule']].append(job) + + return schedules + + +def generate_report(jobs, output_file): + """Generate comprehensive trigger optimization report.""" + report = [] + report.append("# OpenStack CI Trigger Optimization Report\n") + report.append(f"Total jobs analyzed: {len(jobs)}\n") + + presubmit_jobs = [j for j in jobs if j['job_type'] == 'presubmit'] + periodic_jobs = [j for j in jobs if j['job_type'] == 'periodic'] + + report.append(f"- Presubmit jobs: {len(presubmit_jobs)}\n") + report.append(f"- Periodic jobs: {len(periodic_jobs)}\n") + + # Trigger Pattern Analysis + report.append("\n## 1. Current Trigger Pattern Usage\n\n") + patterns = analyze_trigger_patterns(jobs) + + report.append("| Pattern | Count | % of Presubmits |\n") + report.append("|---------|-------|------------------|\n") + + total_pre = len(presubmit_jobs) + for pattern, pattern_jobs in patterns.items(): + count = len([j for j in pattern_jobs if j['job_type'] == 'presubmit']) + pct = count / total_pre * 100 if total_pre else 0 + report.append(f"| {pattern} | {count} | {pct:.1f}% |\n") + + # Jobs Missing Filters + report.append("\n## 2. Jobs Without Trigger Optimization\n") + report.append("Presubmit jobs without skip_if_only_changed, " + "run_if_changed, or optional flags.\n\n") + + no_filter_jobs = patterns['no_filters'] + if no_filter_jobs: + # Group by repo + by_repo = defaultdict(list) + for job in no_filter_jobs: + by_repo[(job['org'], job['repo'])].append(job) + + report.append(f"Found {len(no_filter_jobs)} jobs across " + f"{len(by_repo)} repositories that could benefit from " + f"trigger optimization.\n\n") + + report.append("| Org/Repo | Jobs Without Filters | Job Names |\n") + report.append("|----------|----------------------|-----------|\n") + + for (org, repo), repo_jobs in sorted(by_repo.items(), + key=lambda x: len(x[1]), + reverse=True)[:20]: + names = ', '.join(set(j['job_name'] for j in repo_jobs))[:50] + if len(names) >= 50: + names += "..." + report.append(f"| {org}/{repo} | {len(repo_jobs)} | {names} |\n") + else: + report.append("All presubmit jobs have some form of trigger optimization.\n") + + # Repository Analysis + report.append("\n## 3. 
Repository Trigger Analysis\n") + report.append("Repositories that could benefit from adding " + "`skip_if_only_changed` patterns.\n\n") + + repos = group_jobs_by_repo(jobs) + repo_analysis = analyze_repo_trigger_status(repos) + + # Filter to repos that could benefit + could_benefit = [r for r in repo_analysis if r['could_benefit']] + + if could_benefit: + report.append(f"Found {len(could_benefit)} repositories that could " + f"add skip patterns.\n\n") + + report.append("| Org/Repo | Presubmits | No Filter | Suggested Action |\n") + report.append("|----------|------------|-----------|------------------|\n") + + for repo in sorted(could_benefit, + key=lambda x: x['no_filter'], reverse=True)[:25]: + action = f"Add skip_if_only_changed to {repo['no_filter']} jobs" + report.append( + f"| {repo['org']}/{repo['repo']} | {repo['total_presubmit']} | " + f"{repo['no_filter']} | {action} |\n" + ) + else: + report.append("All repositories have adequate trigger patterns.\n") + + # Suggested Skip Pattern + report.append("\n## 4. Recommended skip_if_only_changed Patterns\n\n") + report.append("For OpenStack E2E tests, we recommend:\n\n") + report.append("```yaml\n") + report.append("skip_if_only_changed: ") + report.append(f"{SUGGESTED_SKIP_PATTERN}\n") + report.append("```\n\n") + + report.append("This pattern skips the job when changes only affect:\n") + report.append("- Documentation files (`docs/` directory)\n") + report.append("- Markdown files (`*.md`)\n") + report.append("- OWNERS files\n\n") + + report.append("### Individual Component Patterns\n\n") + for category, patterns_list in SKIP_PATTERNS.items(): + report.append(f"**{category}:**\n") + for p in patterns_list: + report.append(f"- `{p}`\n") + report.append("\n") + + # Periodic Schedule Analysis + report.append("\n## 5. Periodic Job Schedule Analysis\n\n") + schedules = analyze_periodic_schedules(jobs) + + if schedules: + report.append("| Schedule | Jobs | Examples |\n") + report.append("|----------|------|----------|\n") + + for schedule, sched_jobs in sorted(schedules.items()): + examples = ', '.join(set(j['job_name'] for j in sched_jobs))[:40] + if len(examples) >= 40: + examples += "..." + report.append(f"| {schedule} | {len(sched_jobs)} | {examples} |\n") + else: + report.append("No periodic jobs found.\n") + + # Optimization Recommendations + report.append("\n## 6. Optimization Recommendations\n\n") + + report.append("### High Impact\n\n") + + if could_benefit: + report.append(f"1. **Add skip_if_only_changed to {len(could_benefit)} repos**: " + f"Approximately {sum(r['no_filter'] for r in could_benefit)} jobs " + f"could skip runs on docs-only PRs.\n\n") + + # Calculate potential savings + total_no_filter = len(patterns['no_filters']) + report.append(f"2. **Total presubmit jobs without filters**: {total_no_filter}\n") + report.append(" These jobs run on every non-optional PR regardless of " + "which files changed.\n\n") + + report.append("### Medium Impact\n\n") + report.append("3. **Review always_run jobs**: Ensure jobs marked `always_run: true` " + "are truly required for every PR.\n\n") + + report.append("4. **Add minimum_interval to high-frequency jobs**: " + "Throttle jobs that don't need to run on every commit.\n\n") + + report.append("### Implementation Steps\n\n") + report.append("1. 
For each repo without skip patterns:\n")
+    report.append("   - Identify which test jobs are full E2E (vs unit tests)\n")
+    report.append("   - Add `skip_if_only_changed` to E2E tests\n")
+    report.append("   - Keep unit tests running on all changes\n\n")
+
+    report.append("2. Example config change:\n")
+    report.append("```yaml\n")
+    report.append("tests:\n")
+    report.append("- as: e2e-openstack\n")
+    report.append("  skip_if_only_changed: (^docs/)|(\\.md$)|((^|/)OWNERS$)\n")
+    report.append("  steps:\n")
+    report.append("    cluster_profile: openstack-vexxhost\n")
+    report.append("    workflow: openshift-e2e-openstack-ipi\n")
+    report.append("```\n")
+
+    # Write report
+    with open(output_file, 'w', encoding='utf-8') as f:
+        f.write(''.join(report))
+
+    print(f"Report written to {output_file}", file=sys.stderr)
+
+    # Also write machine-readable data
+    json_output = output_file.replace('.md', '_data.json')
+    with open(json_output, 'w', encoding='utf-8') as f:
+        json.dump({
+            'trigger_patterns': {k: len(v) for k, v in patterns.items()},
+            'repos_without_skip': [
+                {
+                    'org': r['org'],
+                    'repo': r['repo'],
+                    'jobs_without_filter': r['no_filter'],
+                    'job_names': r['job_names'],
+                }
+                for r in could_benefit
+            ],
+            'jobs_without_filter': [
+                {
+                    'org': j['org'],
+                    'repo': j['repo'],
+                    'branch': j['branch'],
+                    'job_name': j['job_name'],
+                }
+                for j in patterns['no_filters']
+            ],
+            'periodic_schedules': {k: len(v) for k, v in schedules.items()},
+            'suggested_pattern': SUGGESTED_SKIP_PATTERN,
+        }, f, indent=2)
+
+    print(f"Data written to {json_output}", file=sys.stderr)
+
+
+def main():
+    import os
+    script_dir = os.path.dirname(os.path.abspath(__file__))
+
+    parser = argparse.ArgumentParser(
+        description="Analyze OpenStack CI job triggers for optimization"
+    )
+    parser.add_argument(
+        "--output-dir",
+        default=script_dir,
+        help="Directory for input/output files (default: script directory)"
+    )
+    parser.add_argument(
+        "--inventory",
+        default="openstack_jobs_inventory.csv",
+        help="Inventory CSV filename (default: openstack_jobs_inventory.csv)"
+    )
+
+    args = parser.parse_args()
+
+    output_dir = os.path.abspath(args.output_dir)
+
+    print("=" * 60)
+    print("OpenStack CI Trigger Optimization Analysis")
+    print("=" * 60)
+    print(f"Output directory: {output_dir}")
+    print()
+
+    inventory_path = os.path.join(output_dir, args.inventory)
+    output_path = os.path.join(output_dir, "trigger_optimization_report.md")
+
+    jobs = load_inventory(inventory_path)
+    generate_report(jobs, output_path)
+
+
+if __name__ == "__main__":
+    main()
diff --git a/reporting-toolkit/analyze_workflow_passrate.py b/reporting-toolkit/analyze_workflow_passrate.py
new file mode 100644
index 0000000..5d26a60
--- /dev/null
+++ b/reporting-toolkit/analyze_workflow_passrate.py
@@ -0,0 +1,435 @@
+#!/usr/bin/env python3
+"""
+Analyze workflow pass rates by correlating job inventory with Sippy metrics.
+Maps inventory job names to Sippy data using substring matching.
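+
+Matching sketch (names are illustrative, not real jobs): an inventory entry
+"e2e-openstack-serial" is matched to the Sippy job
+"periodic-ci-openshift-release-master-nightly-4.20-e2e-openstack-serial"
+because the inventory name occurs as a substring of the full prow job name;
+the first Sippy job that matches wins.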
+""" + +import argparse +import json +import os +import sys +from datetime import datetime +from collections import defaultdict + +RELEASES = ["4.17", "4.18", "4.19", "4.20", "4.21", "4.22"] + +# Will be set by parse_args() +OUTPUT_DIR = None + + +def parse_args(): + """Parse command line arguments.""" + parser = argparse.ArgumentParser( + description="Analyze workflow pass rates" + ) + parser.add_argument( + "--output-dir", + default=os.path.dirname(os.path.abspath(__file__)), + help="Directory for input/output files (default: script directory)" + ) + return parser.parse_args() + + +def load_job_inventory(): + """Load job inventory.""" + filepath = os.path.join(OUTPUT_DIR, "openstack_jobs_inventory.json") + if os.path.exists(filepath): + with open(filepath) as f: + return json.load(f) + return None + + +def load_sippy_data(): + """Load Sippy job data.""" + filepath = os.path.join(OUTPUT_DIR, "sippy_jobs_raw.json") + if os.path.exists(filepath): + with open(filepath) as f: + return json.load(f) + return None + + +def load_extended_metrics_jobs(): + """Load extended metrics per job.""" + filepath = os.path.join(OUTPUT_DIR, "extended_metrics_jobs.json") + if os.path.exists(filepath): + with open(filepath) as f: + return json.load(f) + return None + + +def extract_workflow_from_name(job_name): + """Extract workflow pattern from job name.""" + # Common workflow patterns in OpenStack job names + patterns = [ + "openshift-e2e-openstack-ipi", + "openshift-e2e-openstack-upi", + "openshift-upgrade-openstack", + "openshift-e2e-openstack", + "openshift-installer-openstack", + ] + + # Check for specific test scenarios in name + name_lower = job_name.lower() + + # Extract key test characteristics + characteristics = [] + if "serial" in name_lower: + characteristics.append("serial") + if "parallel" in name_lower: + characteristics.append("parallel") + if "fips" in name_lower: + characteristics.append("fips") + if "proxy" in name_lower: + characteristics.append("proxy") + if "dualstack" in name_lower: + characteristics.append("dualstack") + if "singlestackv6" in name_lower or "single-stack-v6" in name_lower: + characteristics.append("singlestackv6") + if "upgrade" in name_lower: + characteristics.append("upgrade") + if "nfv" in name_lower: + characteristics.append("nfv") + if "hwoffload" in name_lower: + characteristics.append("hwoffload") + if "ccpmso" in name_lower: + characteristics.append("ccpmso") + if "csi" in name_lower: + characteristics.append("csi") + if "manila" in name_lower: + characteristics.append("manila") + if "cinder" in name_lower: + characteristics.append("cinder") + if "externallb" in name_lower: + characteristics.append("externallb") + if "kuryr" in name_lower: + characteristics.append("kuryr") + if "hypershift" in name_lower: + characteristics.append("hypershift") + if "techpreview" in name_lower: + characteristics.append("techpreview") + if "etcd" in name_lower: + characteristics.append("etcd") + + if characteristics: + return "-".join(sorted(characteristics)) + return "e2e-default" + + +def correlate_jobs(inventory, sippy_data, extended_jobs): + """Correlate inventory jobs with Sippy data.""" + # Build Sippy job lookup by name + sippy_lookup = {} + for release, jobs in sippy_data.get("jobs_by_release", {}).items(): + for job in jobs: + name = job.get("name", "") + sippy_lookup[name] = { + "release": release, + "current_runs": job.get("current_runs", 0), + "current_passes": job.get("current_passes", 0), + "previous_runs": job.get("previous_runs", 0), + "previous_passes": 
job.get("previous_passes", 0), + "pass_rate": job.get("current_pass_percentage", 0), + } + + # Build extended metrics lookup + extended_lookup = {} + if extended_jobs: + for job in extended_jobs: + name = job.get("name", "") + extended_lookup[name] = job + + # Group inventory jobs by workflow + workflow_jobs = defaultdict(list) + + for inv_job in inventory: + job_name = inv_job.get("job_name", "") + workflow = inv_job.get("workflow", "") or extract_workflow_from_name(job_name) + job_type = inv_job.get("job_type", "") + + # Only analyze periodic jobs (which have Sippy data) + if job_type != "periodic": + continue + + # Try to find matching Sippy job + sippy_match = None + extended_match = None + + # Look for exact or partial match + for sippy_name, sippy_job in sippy_lookup.items(): + # Check if inventory job name is in Sippy job name or vice versa + if job_name in sippy_name or sippy_name.endswith(job_name): + sippy_match = sippy_job + extended_match = extended_lookup.get(sippy_name) + break + + job_info = { + "job_name": job_name, + "workflow": workflow, + "cluster_profile": inv_job.get("cluster_profile", ""), + "org": inv_job.get("org", ""), + "repo": inv_job.get("repo", ""), + "branch": inv_job.get("branch", ""), + "has_sippy_data": sippy_match is not None, + } + + if sippy_match: + job_info.update({ + "release": sippy_match.get("release", ""), + "current_runs": sippy_match.get("current_runs", 0), + "current_passes": sippy_match.get("current_passes", 0), + "previous_runs": sippy_match.get("previous_runs", 0), + "previous_passes": sippy_match.get("previous_passes", 0), + "pass_rate": sippy_match.get("pass_rate", 0), + }) + if extended_match: + job_info["combined_runs"] = extended_match.get("combined_runs", 0) + job_info["combined_pass_rate"] = extended_match.get("combined_pass_rate", 0) + job_info["trend"] = extended_match.get("trend", "") + + # Extract scenario from job name + scenario = extract_workflow_from_name(job_name) + workflow_jobs[scenario].append(job_info) + + return workflow_jobs + + +def analyze_workflows(workflow_jobs): + """Analyze pass rates by workflow.""" + results = { + "generated": datetime.now().isoformat(), + "workflows": [], + "summary": {}, + } + + workflow_stats = [] + + for workflow, jobs in workflow_jobs.items(): + jobs_with_data = [j for j in jobs if j.get("has_sippy_data")] + + if not jobs_with_data: + continue + + total_runs = sum(j.get("current_runs", 0) + j.get("previous_runs", 0) for j in jobs_with_data) + total_passes = sum(j.get("current_passes", 0) + j.get("previous_passes", 0) for j in jobs_with_data) + pass_rate = (total_passes / total_runs * 100) if total_runs > 0 else 0 + + # Count problem jobs + problem_jobs = [j for j in jobs_with_data if j.get("pass_rate", 100) < 80] + + # Calculate trend + improving = sum(1 for j in jobs_with_data if j.get("trend") == "improving") + degrading = sum(1 for j in jobs_with_data if j.get("trend") == "degrading") + + trend = "stable" + if improving > degrading and improving > 0: + trend = "improving" + elif degrading > improving and degrading > 0: + trend = "degrading" + + # Determine severity + severity = "ok" + if pass_rate < 50: + severity = "critical" + elif pass_rate < 70: + severity = "warning" + elif pass_rate < 80: + severity = "needs_attention" + + workflow_stats.append({ + "workflow": workflow, + "job_count": len(jobs_with_data), + "total_runs": total_runs, + "total_passes": total_passes, + "pass_rate": pass_rate, + "problem_job_count": len(problem_jobs), + "trend": trend, + "severity": severity, + "jobs": 
jobs_with_data, + }) + + # Sort by pass rate (lowest first = most problematic) + workflow_stats.sort(key=lambda x: x["pass_rate"]) + + results["workflows"] = workflow_stats + + # Summary + total_workflows = len(workflow_stats) + critical = sum(1 for w in workflow_stats if w["severity"] == "critical") + warning = sum(1 for w in workflow_stats if w["severity"] == "warning") + + results["summary"] = { + "total_workflows_analyzed": total_workflows, + "critical_workflows": critical, + "warning_workflows": warning, + "ok_workflows": total_workflows - critical - warning, + } + + return results + + +def generate_report(analysis): + """Generate markdown report for workflow analysis.""" + report = [] + report.append("# Workflow Pass Rate Analysis") + report.append("") + report.append(f"**Generated:** {analysis['generated']}") + report.append("") + report.append("This report analyzes pass rates grouped by test workflow/scenario type.") + report.append("") + + # Summary + summary = analysis.get("summary", {}) + report.append("## Summary") + report.append("") + report.append(f"| Metric | Count |") + report.append(f"|--------|-------|") + report.append(f"| Total Workflows Analyzed | {summary.get('total_workflows_analyzed', 0)} |") + report.append(f"| Critical (<50% pass rate) | {summary.get('critical_workflows', 0)} |") + report.append(f"| Warning (50-70% pass rate) | {summary.get('warning_workflows', 0)} |") + report.append(f"| OK (>70% pass rate) | {summary.get('ok_workflows', 0)} |") + report.append("") + + workflows = analysis.get("workflows", []) + + # Critical workflows + critical = [w for w in workflows if w["severity"] == "critical"] + if critical: + report.append("## Critical Workflows (Pass Rate < 50%)") + report.append("") + report.append("These workflows require immediate attention:") + report.append("") + report.append("| Workflow | Jobs | Runs | Pass Rate | Trend |") + report.append("|----------|------|------|-----------|-------|") + for w in critical: + trend_icon = {"improving": "↑", "degrading": "↓", "stable": "→"}.get(w["trend"], "") + report.append( + f"| {w['workflow']} | {w['job_count']} | {w['total_runs']} | " + f"**{w['pass_rate']:.1f}%** | {trend_icon} |" + ) + report.append("") + + # Warning workflows + warning = [w for w in workflows if w["severity"] == "warning"] + if warning: + report.append("## Warning Workflows (Pass Rate 50-70%)") + report.append("") + report.append("| Workflow | Jobs | Runs | Pass Rate | Trend |") + report.append("|----------|------|------|-----------|-------|") + for w in warning: + trend_icon = {"improving": "↑", "degrading": "↓", "stable": "→"}.get(w["trend"], "") + report.append( + f"| {w['workflow']} | {w['job_count']} | {w['total_runs']} | " + f"{w['pass_rate']:.1f}% | {trend_icon} |" + ) + report.append("") + + # All workflows table + report.append("## All Workflows by Pass Rate") + report.append("") + report.append("| Rank | Workflow | Jobs | Runs | Pass Rate | Problems | Trend |") + report.append("|------|----------|------|------|-----------|----------|-------|") + for i, w in enumerate(workflows, 1): + trend_icon = {"improving": "↑", "degrading": "↓", "stable": "→"}.get(w["trend"], "") + severity_marker = "" + if w["severity"] == "critical": + severity_marker = " ⚠️" + elif w["severity"] == "warning": + severity_marker = " ⚡" + report.append( + f"| {i} | {w['workflow']}{severity_marker} | {w['job_count']} | " + f"{w['total_runs']} | {w['pass_rate']:.1f}% | {w['problem_job_count']} | {trend_icon} |" + ) + report.append("") + + # Recommendations + 
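+    # Severity labels used below mirror analyze_workflows(): "critical" is a
+    # combined pass rate under 50%, "warning" under 70%; only the five worst
+    # workflows in each bucket get an explicit action item.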
report.append("## Recommendations") + report.append("") + if critical: + report.append("### Immediate Actions") + report.append("") + for w in critical[:5]: + report.append(f"- **{w['workflow']}**: {w['pass_rate']:.1f}% pass rate with {w['total_runs']} runs - investigate root cause") + report.append("") + + if warning: + report.append("### Short-term Improvements") + report.append("") + for w in warning[:5]: + report.append(f"- **{w['workflow']}**: {w['pass_rate']:.1f}% pass rate - monitor and triage failures") + report.append("") + + report.append("---") + report.append("") + report.append("*Data Sources: Job inventory + [Sippy](https://sippy.dptools.openshift.org/)*") + report.append("") + + return "\n".join(report) + + +def main(): + global OUTPUT_DIR + args = parse_args() + OUTPUT_DIR = os.path.abspath(args.output_dir) + + print("=" * 60) + print("OpenStack CI Workflow Pass Rate Analysis") + print("=" * 60) + print(f"Output directory: {OUTPUT_DIR}") + print() + + # Load data + inventory = load_job_inventory() + if not inventory: + print("Error: No job inventory found. Run extract_openstack_jobs.py first.") + sys.exit(1) + print(f"Loaded inventory: {len(inventory)} jobs") + + sippy_data = load_sippy_data() + if not sippy_data: + print("Error: No Sippy data found. Run fetch_job_metrics.py first.") + sys.exit(1) + print(f"Loaded Sippy data from: {sippy_data.get('fetched_at')}") + + extended_jobs = load_extended_metrics_jobs() + print(f"Extended metrics loaded: {extended_jobs is not None}") + print() + + # Correlate and analyze + workflow_jobs = correlate_jobs(inventory, sippy_data, extended_jobs) + print(f"Found {len(workflow_jobs)} workflow types") + + analysis = analyze_workflows(workflow_jobs) + + # Save results + analysis_path = os.path.join(OUTPUT_DIR, "workflow_passrate_analysis.json") + with open(analysis_path, 'w') as f: + # Remove job details for smaller output + save_analysis = dict(analysis) + save_analysis["workflows"] = [ + {k: v for k, v in w.items() if k != "jobs"} + for w in analysis["workflows"] + ] + json.dump(save_analysis, f, indent=2) + print(f"Saved: {analysis_path}") + + # Generate report + report = generate_report(analysis) + report_path = os.path.join(OUTPUT_DIR, "workflow_passrate_report.md") + with open(report_path, 'w') as f: + f.write(report) + print(f"Saved: {report_path}") + + # Print summary + print() + print("=" * 60) + print("Summary:") + summary = analysis.get("summary", {}) + print(f" Workflows analyzed: {summary.get('total_workflows_analyzed', 0)}") + print(f" Critical (<50%): {summary.get('critical_workflows', 0)}") + print(f" Warning (50-70%): {summary.get('warning_workflows', 0)}") + print(f" OK (>70%): {summary.get('ok_workflows', 0)}") + print("=" * 60) + + +if __name__ == "__main__": + main() diff --git a/reporting-toolkit/categorize_failures.py b/reporting-toolkit/categorize_failures.py new file mode 100644 index 0000000..569c3f8 --- /dev/null +++ b/reporting-toolkit/categorize_failures.py @@ -0,0 +1,417 @@ +#!/usr/bin/env python3 +""" +Categorize job failures using heuristic classification. 
+Categories: Infrastructure, Flaky, Product Bug, Unknown/Needs Triage +""" + +import argparse +import json +import os +import sys +from datetime import datetime +from collections import defaultdict + +RELEASES = ["4.17", "4.18", "4.19", "4.20", "4.21", "4.22"] + +# Will be set by parse_args() +OUTPUT_DIR = None + + +def parse_args(): + """Parse command line arguments.""" + parser = argparse.ArgumentParser( + description="Categorize job failures using heuristic classification" + ) + parser.add_argument( + "--output-dir", + default=os.path.dirname(os.path.abspath(__file__)), + help="Directory for input/output files (default: script directory)" + ) + return parser.parse_args() + + +def load_extended_metrics(): + """Load extended metrics data.""" + filepath = os.path.join(OUTPUT_DIR, "extended_metrics.json") + if os.path.exists(filepath): + with open(filepath) as f: + return json.load(f) + return None + + +def load_extended_jobs(): + """Load extended metrics per job.""" + filepath = os.path.join(OUTPUT_DIR, "extended_metrics_jobs.json") + if os.path.exists(filepath): + with open(filepath) as f: + return json.load(f) + return None + + +def load_sippy_data(): + """Load raw Sippy data for additional context.""" + filepath = os.path.join(OUTPUT_DIR, "sippy_jobs_raw.json") + if os.path.exists(filepath): + with open(filepath) as f: + return json.load(f) + return None + + +def categorize_job(job): + """ + Categorize a job failure based on heuristics. + + Categories: + - infrastructure: Likely infrastructure/provisioning issues + - flaky: Inconsistent pass rates (30-70%) + - product_bug: Consistent failures with bugs filed + - needs_triage: Unknown, requires investigation + """ + name = job.get("name", "").lower() + brief_name = job.get("brief_name", "").lower() + combined_rate = job.get("combined_pass_rate") + current_rate = job.get("current_pass_rate") + open_bugs = job.get("open_bugs", 0) + combined_runs = job.get("combined_runs", 0) + trend = job.get("trend", "") + + # Skip jobs with no data + if combined_rate is None or combined_runs < 2: + return None, "insufficient_data" + + # Jobs at or above 80% are not problem jobs + if combined_rate >= 80: + return None, "passing" + + # Category determination heuristics + + # 1. Product Bug: 0% pass rate or very low with bugs filed + if combined_rate == 0: + if open_bugs > 0: + return "product_bug", "0% pass rate with filed bugs" + else: + return "needs_triage", "0% pass rate, no bugs filed" + + # 2. Infrastructure indicators + infra_keywords = [ + "install", "provision", "bootstrap", "create", + "vpc", "network", "dns", "loadbalancer", "lb", + ] + is_infra_job = any(kw in name or kw in brief_name for kw in infra_keywords) + + if combined_rate < 30 and is_infra_job: + return "infrastructure", "Low pass rate on infrastructure-related job" + + # 3. Flaky: 30-70% pass rate (inconsistent) + if 30 <= combined_rate < 70: + if trend == "degrading": + return "flaky", "Inconsistent pass rate, trending worse" + elif trend == "improving": + return "flaky", "Inconsistent pass rate, trending better" + else: + return "flaky", "Inconsistent pass rate (30-70%)" + + # 4. Product Bug: Low pass rate with bugs + if combined_rate < 50 and open_bugs > 0: + return "product_bug", f"Low pass rate with {open_bugs} open bug(s)" + + # 5. 
Check for specific failure patterns in job name + if "etcd" in name or "scaling" in name: + return "product_bug", "Known problematic component" + + if "techpreview" in name: + return "needs_triage", "Tech preview feature - expected instability" + + # 6. Very low rate without bugs = needs investigation + if combined_rate < 30: + return "needs_triage", "Very low pass rate, needs investigation" + + # 7. Moderate failures (70-80%) + if 70 <= combined_rate < 80: + if trend == "degrading": + return "needs_triage", "Recently degraded, needs investigation" + else: + return "flaky", "Borderline pass rate" + + # Default + return "needs_triage", "Uncategorized failure" + + +def categorize_all_jobs(extended_jobs, sippy_data): + """Categorize all problem jobs.""" + results = { + "generated": datetime.now().isoformat(), + "categories": { + "infrastructure": [], + "flaky": [], + "product_bug": [], + "needs_triage": [], + }, + "summary": {}, + "by_release": defaultdict(lambda: defaultdict(list)), + } + + # Build Sippy lookup for additional context + sippy_bugs = {} + if sippy_data: + for release, jobs in sippy_data.get("jobs_by_release", {}).items(): + for job in jobs: + sippy_bugs[job.get("name", "")] = job.get("open_bugs", 0) + + # Categorize each job + for job in extended_jobs: + # Ensure we have bug info + if job.get("open_bugs") is None and job.get("name") in sippy_bugs: + job["open_bugs"] = sippy_bugs[job.get("name")] + + category, reason = categorize_job(job) + + if category is None: + continue + + job_info = { + "release": job.get("release", ""), + "name": job.get("name", ""), + "brief_name": job.get("brief_name", ""), + "combined_runs": job.get("combined_runs", 0), + "combined_pass_rate": job.get("combined_pass_rate"), + "current_pass_rate": job.get("current_pass_rate"), + "open_bugs": job.get("open_bugs", 0), + "trend": job.get("trend", ""), + "reason": reason, + } + + results["categories"][category].append(job_info) + results["by_release"][job.get("release", "")][category].append(job_info) + + # Sort each category by pass rate + for category in results["categories"]: + results["categories"][category].sort( + key=lambda x: x.get("combined_pass_rate") or 0 + ) + + # Summary statistics + total_problems = sum(len(jobs) for jobs in results["categories"].values()) + results["summary"] = { + "total_problem_jobs": total_problems, + "by_category": { + cat: len(jobs) for cat, jobs in results["categories"].items() + }, + "percentages": {}, + } + + if total_problems > 0: + for cat, count in results["summary"]["by_category"].items(): + results["summary"]["percentages"][cat] = round(count / total_problems * 100, 1) + + return results + + +def generate_report(analysis): + """Generate markdown report for failure categorization.""" + report = [] + report.append("# Failure Categorization Report") + report.append("") + report.append(f"**Generated:** {analysis['generated']}") + report.append("") + report.append("Jobs with pass rate below 80% are categorized by likely root cause.") + report.append("") + + # Summary + summary = analysis.get("summary", {}) + report.append("## Summary") + report.append("") + report.append(f"**Total Problem Jobs:** {summary.get('total_problem_jobs', 0)}") + report.append("") + report.append("| Category | Count | Percentage | Description |") + report.append("|----------|-------|------------|-------------|") + + category_descriptions = { + "infrastructure": "Provisioning/infra failures", + "flaky": "Inconsistent (30-70% pass rate)", + "product_bug": "Known bugs filed", + "needs_triage": 
"Requires investigation", + } + + by_cat = summary.get("by_category", {}) + percentages = summary.get("percentages", {}) + for cat in ["infrastructure", "flaky", "product_bug", "needs_triage"]: + count = by_cat.get(cat, 0) + pct = percentages.get(cat, 0) + desc = category_descriptions.get(cat, "") + report.append(f"| {cat.replace('_', ' ').title()} | {count} | {pct}% | {desc} |") + report.append("") + + # Category breakdowns + categories = analysis.get("categories", {}) + + # Infrastructure issues + infra = categories.get("infrastructure", []) + if infra: + report.append("## Infrastructure Issues") + report.append("") + report.append("Jobs likely failing due to OpenStack provisioning or infrastructure problems:") + report.append("") + report.append("| Release | Job | Pass Rate | Runs | Reason |") + report.append("|---------|-----|-----------|------|--------|") + for job in infra[:15]: + rate = job.get("combined_pass_rate") + rate_str = f"{rate:.1f}%" if rate is not None else "N/A" + report.append( + f"| {job['release']} | {job['brief_name'][:40]} | {rate_str} | " + f"{job['combined_runs']} | {job['reason'][:30]} |" + ) + if len(infra) > 15: + report.append(f"| ... | *{len(infra) - 15} more* | | | |") + report.append("") + + # Flaky jobs + flaky = categories.get("flaky", []) + if flaky: + report.append("## Flaky Jobs") + report.append("") + report.append("Jobs with inconsistent pass rates (30-70%) indicating test or timing issues:") + report.append("") + report.append("| Release | Job | Pass Rate | Trend | Runs |") + report.append("|---------|-----|-----------|-------|------|") + for job in flaky[:15]: + rate = job.get("combined_pass_rate") + rate_str = f"{rate:.1f}%" if rate is not None else "N/A" + trend_icon = {"improving": "↑", "degrading": "↓", "stable": "→"}.get(job["trend"], "") + report.append( + f"| {job['release']} | {job['brief_name'][:40]} | {rate_str} | " + f"{trend_icon} | {job['combined_runs']} |" + ) + if len(flaky) > 15: + report.append(f"| ... | *{len(flaky) - 15} more* | | | |") + report.append("") + + # Product bugs + bugs = categories.get("product_bug", []) + if bugs: + report.append("## Product Bugs") + report.append("") + report.append("Jobs with known bugs filed - track via bug system:") + report.append("") + report.append("| Release | Job | Pass Rate | Open Bugs | Runs |") + report.append("|---------|-----|-----------|-----------|------|") + for job in bugs[:15]: + rate = job.get("combined_pass_rate") + rate_str = f"{rate:.1f}%" if rate is not None else "N/A" + report.append( + f"| {job['release']} | {job['brief_name'][:40]} | {rate_str} | " + f"{job['open_bugs']} | {job['combined_runs']} |" + ) + if len(bugs) > 15: + report.append(f"| ... | *{len(bugs) - 15} more* | | | |") + report.append("") + + # Needs triage + triage = categories.get("needs_triage", []) + if triage: + report.append("## Needs Triage") + report.append("") + report.append("Jobs requiring investigation to determine root cause:") + report.append("") + report.append("| Release | Job | Pass Rate | Runs | Reason |") + report.append("|---------|-----|-----------|------|--------|") + for job in triage[:15]: + rate = job.get("combined_pass_rate") + rate_str = f"{rate:.1f}%" if rate is not None else "N/A" + report.append( + f"| {job['release']} | {job['brief_name'][:40]} | {rate_str} | " + f"{job['combined_runs']} | {job['reason'][:30]} |" + ) + if len(triage) > 15: + report.append(f"| ... 
| *{len(triage) - 15} more* | | | |") + report.append("") + + # Recommendations + report.append("## Recommended Actions by Category") + report.append("") + report.append("### Infrastructure") + report.append("- Review OpenStack cloud health and quotas") + report.append("- Check for recurring provisioning failures") + report.append("- Validate network and DNS configuration") + report.append("") + report.append("### Flaky") + report.append("- Analyze test logs for timing-related failures") + report.append("- Consider adding retries for known flaky operations") + report.append("- Investigate environmental dependencies") + report.append("") + report.append("### Product Bug") + report.append("- Track existing bugs to resolution") + report.append("- Prioritize bugs blocking multiple jobs") + report.append("- Consider disabling jobs until bug is fixed") + report.append("") + report.append("### Needs Triage") + report.append("- Review recent job logs to identify patterns") + report.append("- File bugs with failure details") + report.append("- Categorize after investigation") + report.append("") + + report.append("---") + report.append("") + report.append("*Classification based on heuristics - manual review recommended*") + report.append("*Data Source: [Sippy](https://sippy.dptools.openshift.org/)*") + report.append("") + + return "\n".join(report) + + +def main(): + global OUTPUT_DIR + args = parse_args() + OUTPUT_DIR = os.path.abspath(args.output_dir) + + print("=" * 60) + print("OpenStack CI Failure Categorization") + print("=" * 60) + print(f"Output directory: {OUTPUT_DIR}") + print() + + # Load data + extended_jobs = load_extended_jobs() + if not extended_jobs: + print("Error: No extended metrics jobs data found.") + print("Run fetch_extended_metrics.py first.") + sys.exit(1) + print(f"Loaded {len(extended_jobs)} jobs") + + sippy_data = load_sippy_data() + print(f"Sippy data loaded: {sippy_data is not None}") + print() + + # Categorize + analysis = categorize_all_jobs(extended_jobs, sippy_data) + + # Convert defaultdict to regular dict for JSON serialization + analysis["by_release"] = {k: dict(v) for k, v in analysis["by_release"].items()} + + # Save results + analysis_path = os.path.join(OUTPUT_DIR, "failure_categories.json") + with open(analysis_path, 'w') as f: + json.dump(analysis, f, indent=2) + print(f"Saved: {analysis_path}") + + # Generate report + report = generate_report(analysis) + report_path = os.path.join(OUTPUT_DIR, "failure_categories_report.md") + with open(report_path, 'w') as f: + f.write(report) + print(f"Saved: {report_path}") + + # Print summary + print() + print("=" * 60) + print("Summary:") + summary = analysis.get("summary", {}) + print(f" Total problem jobs: {summary.get('total_problem_jobs', 0)}") + for cat, count in summary.get("by_category", {}).items(): + pct = summary.get("percentages", {}).get(cat, 0) + print(f" {cat}: {count} ({pct}%)") + print("=" * 60) + + +if __name__ == "__main__": + main() diff --git a/reporting-toolkit/extract_openstack_jobs.py b/reporting-toolkit/extract_openstack_jobs.py new file mode 100644 index 0000000..8dbc201 --- /dev/null +++ b/reporting-toolkit/extract_openstack_jobs.py @@ -0,0 +1,345 @@ +#!/usr/bin/env python3 +""" +Extract all OpenStack CI jobs from ci-operator/config files. + +This script parses CI configuration files and extracts job information +for tests using OpenStack cluster profiles. 
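+
+A matching test entry in a config file looks roughly like this (abridged,
+values illustrative):
+
+    tests:
+    - as: e2e-openstack
+      steps:
+        cluster_profile: openstack-vexxhost
+        workflow: openshift-e2e-openstack-ipi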
+ +Target cluster profiles: +- openstack-vexxhost +- openstack-vh-mecha-central +- openstack-vh-mecha-az0 +- openstack-vh-bm-rhos +- openstack-hwoffload +- openstack-nfv +""" + +import argparse +import csv +import json +import os +import re +import sys +from pathlib import Path + +try: + import yaml +except ImportError: + print("Error: PyYAML is required. Install with: pip install pyyaml", file=sys.stderr) + sys.exit(1) + + +# Target OpenStack cluster profiles +OPENSTACK_PROFILES = [ + "openstack-vexxhost", + "openstack-vh-mecha-central", + "openstack-vh-mecha-az0", + "openstack-vh-bm-rhos", + "openstack-hwoffload", + "openstack-nfv", +] + + +def get_cluster_profile(test): + """Extract cluster_profile from a test definition.""" + if "steps" in test: + steps = test["steps"] + if isinstance(steps, dict): + return steps.get("cluster_profile") + return None + + +def get_workflow(test): + """Extract workflow from a test definition.""" + if "steps" in test: + steps = test["steps"] + if isinstance(steps, dict): + return steps.get("workflow") + return None + + +def get_job_type(test): + """Determine job type based on scheduling fields. + + Jobs are classified as: + - periodic: if they have cron/interval, OR if they have minimum_interval + but no presubmit triggers (always_run, run_if_changed, optional) + - postsubmit: if explicitly marked as postsubmit + - presubmit: otherwise + + Note: Jobs with minimum_interval but no presubmit triggers are periodic jobs + that run on a schedule. They're generated into *-periodics.yaml files. + """ + # Explicit periodic scheduling + if test.get("interval") or test.get("cron"): + return "periodic" + + if test.get("postsubmit"): + return "postsubmit" + + # Implicit periodic: minimum_interval without presubmit triggers + # These jobs run periodically, not on PRs + if test.get("minimum_interval"): + has_presubmit_trigger = ( + test.get("always_run") or + test.get("run_if_changed") or + test.get("optional") is True or + test.get("skip_if_only_changed") + ) + if not has_presubmit_trigger: + return "periodic" + + return "presubmit" + + +def get_schedule(test): + """Extract schedule (interval or cron) from a test. + + For implicit periodic jobs (those with minimum_interval but no presubmit + triggers), the minimum_interval acts as the schedule. 
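+
+    A sketch of the mapping, using hypothetical test dicts:
+
+        {"interval": "168h"}                              -> "interval: 168h"
+        {"cron": "0 6 * * 1"}                             -> "cron: 0 6 * * 1"
+        {"minimum_interval": "168h"}                      -> "minimum_interval: 168h"
+        {"minimum_interval": "168h", "always_run": True}  -> ""  (presubmit)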
+ """ + if test.get("interval"): + return f"interval: {test['interval']}" + if test.get("cron"): + return f"cron: {test['cron']}" + # For implicit periodic jobs, minimum_interval is the effective schedule + if test.get("minimum_interval"): + has_presubmit_trigger = ( + test.get("always_run") or + test.get("run_if_changed") or + test.get("optional") is True or + test.get("skip_if_only_changed") + ) + if not has_presubmit_trigger: + return f"minimum_interval: {test['minimum_interval']}" + return "" + + +def parse_config_file(file_path): + """Parse a single CI config file and extract OpenStack jobs.""" + jobs = [] + + try: + with open(file_path, 'r', encoding='utf-8') as f: + config = yaml.safe_load(f) + except Exception as e: + print(f"Warning: Failed to parse {file_path}: {e}", file=sys.stderr) + return jobs + + if not config or "tests" not in config: + return jobs + + # Extract metadata + metadata = config.get("zz_generated_metadata", {}) + org = metadata.get("org", "") + repo = metadata.get("repo", "") + branch = metadata.get("branch", "") + variant = metadata.get("variant", "") + + # Parse each test + for test in config.get("tests", []): + if not isinstance(test, dict): + continue + + cluster_profile = get_cluster_profile(test) + + # Check if this is an OpenStack job + if cluster_profile and any(profile in cluster_profile for profile in OPENSTACK_PROFILES): + job_name = test.get("as", "") + + job_info = { + "job_name": job_name, + "cluster_profile": cluster_profile, + "job_type": get_job_type(test), + "schedule": get_schedule(test), + "workflow": get_workflow(test) or "", + "optional": test.get("optional", False), + "always_run": test.get("always_run", False), + "minimum_interval": test.get("minimum_interval", ""), + "skip_if_only_changed": test.get("skip_if_only_changed", ""), + "run_if_changed": test.get("run_if_changed", ""), + "org": org, + "repo": repo, + "branch": branch, + "variant": variant, + "config_file": str(file_path), + } + + jobs.append(job_info) + + return jobs + + +def find_config_files(config_dir): + """Find all CI config YAML files.""" + config_path = Path(config_dir) + + yaml_files = [] + for pattern in ["**/*.yaml", "**/*.yml"]: + yaml_files.extend(config_path.glob(pattern)) + + return sorted(set(yaml_files)) + + +def extract_jobs(config_dir): + """Extract all OpenStack jobs from config directory.""" + all_jobs = [] + + config_files = find_config_files(config_dir) + print(f"Found {len(config_files)} config files to scan", file=sys.stderr) + + for file_path in config_files: + jobs = parse_config_file(file_path) + all_jobs.extend(jobs) + + print(f"Extracted {len(all_jobs)} OpenStack jobs", file=sys.stderr) + return all_jobs + + +def output_csv(jobs, output_file): + """Output jobs to CSV format.""" + if not jobs: + print("No jobs to output", file=sys.stderr) + return + + fieldnames = [ + "job_name", "cluster_profile", "job_type", "schedule", "workflow", + "optional", "always_run", "minimum_interval", "skip_if_only_changed", + "run_if_changed", "org", "repo", "branch", "variant", "config_file" + ] + + with open(output_file, 'w', newline='', encoding='utf-8') as f: + writer = csv.DictWriter(f, fieldnames=fieldnames) + writer.writeheader() + writer.writerows(jobs) + + print(f"Wrote {len(jobs)} jobs to {output_file}", file=sys.stderr) + + +def output_json(jobs, output_file): + """Output jobs to JSON format.""" + with open(output_file, 'w', encoding='utf-8') as f: + json.dump(jobs, f, indent=2) + + print(f"Wrote {len(jobs)} jobs to {output_file}", file=sys.stderr) + + +def 
print_summary(jobs): + """Print summary statistics.""" + print("\n=== OpenStack CI Job Summary ===\n") + + # By cluster profile + profile_counts = {} + for job in jobs: + profile = job["cluster_profile"] + profile_counts[profile] = profile_counts.get(profile, 0) + 1 + + print("Jobs by Cluster Profile:") + for profile in sorted(profile_counts.keys()): + print(f" {profile}: {profile_counts[profile]}") + + # By job type + type_counts = {} + for job in jobs: + job_type = job["job_type"] + type_counts[job_type] = type_counts.get(job_type, 0) + 1 + + print("\nJobs by Type:") + for job_type in sorted(type_counts.keys()): + print(f" {job_type}: {type_counts[job_type]}") + + # By org + org_counts = {} + for job in jobs: + org = job["org"] or "unknown" + org_counts[org] = org_counts.get(org, 0) + 1 + + print("\nJobs by Organization:") + for org in sorted(org_counts.keys(), key=lambda x: org_counts[x], reverse=True)[:10]: + print(f" {org}: {org_counts[org]}") + + # Unique workflows + workflows = set(job["workflow"] for job in jobs if job["workflow"]) + print(f"\nUnique Workflows: {len(workflows)}") + + # Unique repos + repos = set(f"{job['org']}/{job['repo']}" for job in jobs if job['org'] and job['repo']) + print(f"Unique Repositories: {len(repos)}") + + # Release branches + branches = set(job["branch"] for job in jobs if job["branch"]) + release_branches = sorted([b for b in branches if "release-" in b or b in ["main", "master"]]) + print(f"\nRelease Branches:") + for branch in release_branches[-10:]: + count = len([j for j in jobs if j["branch"] == branch]) + print(f" {branch}: {count}") + + print() + + +def main(): + parser = argparse.ArgumentParser( + description="Extract OpenStack CI jobs from ci-operator config files" + ) + parser.add_argument( + "--config-dir", + default="ci-operator/config", + help="Path to ci-operator/config directory (default: ci-operator/config)" + ) + parser.add_argument( + "--output-dir", + default=os.path.dirname(os.path.abspath(__file__)), + help="Directory for output files (default: script directory)" + ) + parser.add_argument( + "--output-csv", + default="openstack_jobs_inventory.csv", + help="Output CSV filename (default: openstack_jobs_inventory.csv)" + ) + parser.add_argument( + "--output-json", + default="openstack_jobs_inventory.json", + help="Output JSON filename (default: openstack_jobs_inventory.json)" + ) + parser.add_argument( + "--summary", + action="store_true", + help="Print summary statistics" + ) + + args = parser.parse_args() + + # Resolve output directory + output_dir = os.path.abspath(args.output_dir) + os.makedirs(output_dir, exist_ok=True) + + print("=" * 60) + print("OpenStack CI Job Extractor") + print("=" * 60) + print(f"Config directory: {args.config_dir}") + print(f"Output directory: {output_dir}") + print() + + # Ensure config directory exists + if not os.path.isdir(args.config_dir): + print(f"Error: Config directory not found: {args.config_dir}", file=sys.stderr) + sys.exit(1) + + # Extract jobs + jobs = extract_jobs(args.config_dir) + + # Output CSV + csv_path = os.path.join(output_dir, args.output_csv) + output_csv(jobs, csv_path) + + # Output JSON + json_path = os.path.join(output_dir, args.output_json) + output_json(jobs, json_path) + + # Print summary if requested + if args.summary: + print_summary(jobs) + + +if __name__ == "__main__": + main() diff --git a/reporting-toolkit/fetch_comparison_data.py b/reporting-toolkit/fetch_comparison_data.py new file mode 100644 index 0000000..dbb723a --- /dev/null +++ 
b/reporting-toolkit/fetch_comparison_data.py @@ -0,0 +1,224 @@ +#!/usr/bin/env python3 +""" +Fetch platform comparison data from Sippy API. +Fetches variant data for all platforms to compare OpenStack against AWS, GCP, Azure, vSphere. +""" + +import argparse +import json +import os +import sys +import time +from urllib.request import urlopen, Request +from urllib.error import URLError, HTTPError +from datetime import datetime + +SIPPY_BASE = "https://sippy.dptools.openshift.org/api" +RELEASES = ["4.17", "4.18", "4.19", "4.20", "4.21", "4.22"] + +# Will be set by parse_args() +OUTPUT_DIR = None + +# Platform variants to compare +PLATFORMS = ["OpenStack", "AWS", "GCP", "Azure", "vSphere", "Metal"] + + +def parse_args(): + """Parse command line arguments.""" + parser = argparse.ArgumentParser( + description="Fetch platform comparison data from Sippy API" + ) + parser.add_argument( + "--output-dir", + default=os.path.dirname(os.path.abspath(__file__)), + help="Directory for output files (default: script directory)" + ) + return parser.parse_args() + + +def fetch_json(url, retries=3, delay=2): + """Fetch JSON from URL with retries.""" + for attempt in range(retries): + try: + req = Request(url, headers={"User-Agent": "OpenStack-CI-Analysis/1.0"}) + with urlopen(req, timeout=60) as response: + return json.loads(response.read().decode()) + except (URLError, HTTPError) as e: + print(f" Attempt {attempt + 1} failed: {e}") + if attempt < retries - 1: + time.sleep(delay) + return None + + +def fetch_variants_for_release(release): + """Fetch variant data for a specific release.""" + url = f"{SIPPY_BASE}/variants?release={release}" + print(f"Fetching variants for release {release}...") + + data = fetch_json(url) + if data is None: + print(f" Failed to fetch variants for {release}") + return [] + + print(f" Retrieved {len(data)} variants") + return data + + +def extract_platform_variants(variants): + """Extract Platform:* variants from variant data.""" + platform_data = {} + + for variant in variants: + name = variant.get("name", "") + if name.startswith("Platform:"): + platform = name.replace("Platform:", "") + platform_data[platform] = { + "name": platform, + "variant_full_name": name, + "current_pass_percentage": variant.get("current_pass_percentage", 0), + "current_runs": variant.get("current_runs", 0), + "current_passes": variant.get("current_passes", 0), + "previous_pass_percentage": variant.get("previous_pass_percentage", 0), + "previous_runs": variant.get("previous_runs", 0), + "previous_passes": variant.get("previous_passes", 0), + "job_count": variant.get("job_count", 0), + } + + return platform_data + + +def fetch_jobs_for_release(release): + """Fetch all jobs for a release to get platform job counts.""" + url = f"{SIPPY_BASE}/jobs?release={release}" + print(f" Fetching jobs for platform counts...") + + data = fetch_json(url) + if data is None: + return {} + + # Count jobs by platform + platform_counts = {} + platform_runs = {} + platform_passes = {} + + for job in data: + name = job.get("name", "").lower() + runs = job.get("current_runs", 0) + job.get("previous_runs", 0) + passes = job.get("current_passes", 0) + job.get("previous_passes", 0) + + # Determine platform from job name + platform = None + if "openstack" in name: + platform = "OpenStack" + elif "aws" in name: + platform = "AWS" + elif "gcp" in name: + platform = "GCP" + elif "azure" in name: + platform = "Azure" + elif "vsphere" in name: + platform = "vSphere" + elif "metal" in name or "baremetal" in name: + platform = "Metal" + + 
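+        # The checks above form an ordered if/elif chain, so a job name
+        # containing both "openstack" and "metal" counts as OpenStack;
+        # names matching no platform keyword are skipped entirely.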
if platform: + platform_counts[platform] = platform_counts.get(platform, 0) + 1 + platform_runs[platform] = platform_runs.get(platform, 0) + runs + platform_passes[platform] = platform_passes.get(platform, 0) + passes + + result = {} + for platform in platform_counts: + runs = platform_runs.get(platform, 0) + passes = platform_passes.get(platform, 0) + result[platform] = { + "job_count": platform_counts[platform], + "total_runs": runs, + "total_passes": passes, + "pass_rate": (passes / runs * 100) if runs > 0 else 0, + } + + return result + + +def main(): + global OUTPUT_DIR + args = parse_args() + OUTPUT_DIR = os.path.abspath(args.output_dir) + + # Create output directory if needed + os.makedirs(OUTPUT_DIR, exist_ok=True) + + print("=" * 60) + print("OpenStack CI Platform Comparison Data Fetcher") + print("=" * 60) + print(f"Output directory: {OUTPUT_DIR}") + print() + + results = { + "fetched_at": datetime.now().isoformat(), + "releases": {}, + "overall_by_platform": {}, + } + + # Fetch data for each release + for release in RELEASES: + print(f"\n--- Release {release} ---") + + # Fetch variants + variants = fetch_variants_for_release(release) + platform_variants = extract_platform_variants(variants) if variants else {} + + # Fetch job counts + platform_jobs = fetch_jobs_for_release(release) + + # Combine data + release_data = { + "variants": platform_variants, + "job_metrics": platform_jobs, + } + results["releases"][release] = release_data + + time.sleep(1) # Be nice to the API + + # Calculate overall metrics by platform + overall = {} + for release, data in results["releases"].items(): + for platform, metrics in data.get("job_metrics", {}).items(): + if platform not in overall: + overall[platform] = { + "job_count": 0, + "total_runs": 0, + "total_passes": 0, + } + overall[platform]["job_count"] += metrics.get("job_count", 0) + overall[platform]["total_runs"] += metrics.get("total_runs", 0) + overall[platform]["total_passes"] += metrics.get("total_passes", 0) + + # Calculate pass rates + for platform, data in overall.items(): + runs = data["total_runs"] + passes = data["total_passes"] + data["pass_rate"] = (passes / runs * 100) if runs > 0 else 0 + + results["overall_by_platform"] = overall + + # Save results + output_path = os.path.join(OUTPUT_DIR, "platform_comparison_raw.json") + with open(output_path, 'w') as f: + json.dump(results, f, indent=2) + print(f"\nSaved: {output_path}") + + # Print summary + print("\n" + "=" * 60) + print("Summary by Platform (all releases):") + print("-" * 60) + print(f"{'Platform':<15} {'Jobs':>8} {'Runs':>10} {'Pass Rate':>10}") + print("-" * 60) + for platform in sorted(overall.keys(), key=lambda x: -overall[x]["pass_rate"]): + data = overall[platform] + print(f"{platform:<15} {data['job_count']:>8} {data['total_runs']:>10} {data['pass_rate']:>9.1f}%") + print("=" * 60) + + +if __name__ == "__main__": + main() diff --git a/reporting-toolkit/fetch_extended_metrics.py b/reporting-toolkit/fetch_extended_metrics.py new file mode 100644 index 0000000..6b28834 --- /dev/null +++ b/reporting-toolkit/fetch_extended_metrics.py @@ -0,0 +1,383 @@ +#!/usr/bin/env python3 +""" +Fetch extended job metrics from Sippy API for OpenStack CI jobs. +Combines current + previous periods for ~14 day coverage. +Estimates job duration based on workflow/cluster profile. 
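+
+Despite the name, this script does not call the Sippy API itself: it reads the
+cached sippy_jobs_raw.json written by fetch_job_metrics.py, then sums the two
+Sippy windows (current + previous, roughly seven days each) per job before
+computing pass rates.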
+""" + +import argparse +import json +import os +import sys +from datetime import datetime, timedelta +from urllib.request import urlopen, Request +from urllib.error import URLError, HTTPError +import time + +SIPPY_BASE = "https://sippy.dptools.openshift.org/api" +RELEASES = ["4.17", "4.18", "4.19", "4.20", "4.21", "4.22"] + +# Will be set by parse_args() +OUTPUT_DIR = None + + +def parse_args(): + """Parse command line arguments.""" + parser = argparse.ArgumentParser( + description="Calculate extended job metrics from Sippy data" + ) + parser.add_argument( + "--output-dir", + default=os.path.dirname(os.path.abspath(__file__)), + help="Directory for input/output files (default: script directory)" + ) + return parser.parse_args() + +# Estimated durations by cluster profile (based on typical run times) +DURATION_ESTIMATES = { + "openstack-vexxhost": {"min": 60, "typical": 90, "max": 150}, + "openstack-vh-mecha-central": {"min": 60, "typical": 90, "max": 150}, + "openstack-vh-mecha-az0": {"min": 60, "typical": 100, "max": 180}, + "openstack-nfv": {"min": 90, "typical": 120, "max": 200}, + "openstack-hwoffload": {"min": 90, "typical": 120, "max": 200}, + "openstack-vh-bm-rhos": {"min": 120, "typical": 180, "max": 300}, +} + + +def load_collected_data(): + """Load previously collected Sippy data.""" + filepath = os.path.join(OUTPUT_DIR, "sippy_jobs_raw.json") + if os.path.exists(filepath): + with open(filepath) as f: + return json.load(f) + return None + + +def load_job_inventory(): + """Load job inventory for cluster profile info.""" + filepath = os.path.join(OUTPUT_DIR, "openstack_jobs_inventory.json") + if os.path.exists(filepath): + with open(filepath) as f: + return json.load(f) + return None + + +def calculate_extended_metrics(sippy_data, inventory): + """Calculate extended metrics combining current + previous periods.""" + + results = { + "generated": datetime.now().isoformat(), + "period": "~14 days (current + previous Sippy windows)", + "releases": {}, + "overall": {}, + "problem_jobs": [], + "duration_estimates": {}, + } + + # Build a lookup for cluster profiles from inventory + cluster_profiles = {} + if inventory: + for job in inventory: + cluster_profiles[job.get("job_name", "")] = job.get("cluster_profile", "") + + all_jobs = [] + + for release, jobs in sippy_data.get("jobs_by_release", {}).items(): + release_stats = { + "total_jobs": len(jobs), + "current_runs": 0, + "current_passes": 0, + "previous_runs": 0, + "previous_passes": 0, + "combined_runs": 0, + "combined_passes": 0, + "pass_rate_current": 0, + "pass_rate_combined": 0, + "trend": "", + } + + for job in jobs: + name = job.get("name", "") + current_runs = job.get("current_runs", 0) + current_passes = job.get("current_passes", 0) + previous_runs = job.get("previous_runs", 0) + previous_passes = job.get("previous_passes", 0) + + combined_runs = current_runs + previous_runs + combined_passes = current_passes + previous_passes + + release_stats["current_runs"] += current_runs + release_stats["current_passes"] += current_passes + release_stats["previous_runs"] += previous_runs + release_stats["previous_passes"] += previous_passes + release_stats["combined_runs"] += combined_runs + release_stats["combined_passes"] += combined_passes + + # Calculate pass rates + current_rate = (current_passes / current_runs * 100) if current_runs > 0 else None + previous_rate = (previous_passes / previous_runs * 100) if previous_runs > 0 else None + combined_rate = (combined_passes / combined_runs * 100) if combined_runs > 0 else None + + # 
Determine trend + trend = "stable" + if current_rate is not None and previous_rate is not None: + diff = current_rate - previous_rate + if diff > 10: + trend = "improving" + elif diff < -10: + trend = "degrading" + + # Get cluster profile for duration estimate + cluster = cluster_profiles.get(name, "unknown") + duration_est = DURATION_ESTIMATES.get(cluster, {"min": 60, "typical": 90, "max": 180}) + + job_info = { + "release": release, + "name": name, + "brief_name": job.get("brief_name", name), + "cluster_profile": cluster, + "current_runs": current_runs, + "current_passes": current_passes, + "current_pass_rate": current_rate, + "previous_runs": previous_runs, + "previous_passes": previous_passes, + "previous_pass_rate": previous_rate, + "combined_runs": combined_runs, + "combined_passes": combined_passes, + "combined_pass_rate": combined_rate, + "trend": trend, + "last_pass": job.get("last_pass", ""), + "open_bugs": job.get("open_bugs", 0), + "estimated_duration_min": duration_est["typical"], + } + all_jobs.append(job_info) + + # Track problem jobs (< 80% and has runs) + if combined_rate is not None and combined_rate < 80 and combined_runs >= 2: + results["problem_jobs"].append(job_info) + + # Calculate release-level rates + if release_stats["current_runs"] > 0: + release_stats["pass_rate_current"] = ( + release_stats["current_passes"] / release_stats["current_runs"] * 100 + ) + if release_stats["combined_runs"] > 0: + release_stats["pass_rate_combined"] = ( + release_stats["combined_passes"] / release_stats["combined_runs"] * 100 + ) + + # Determine release trend + if release_stats["current_runs"] > 0 and release_stats["previous_runs"] > 0: + curr_rate = release_stats["current_passes"] / release_stats["current_runs"] + prev_rate = release_stats["previous_passes"] / release_stats["previous_runs"] + diff = (curr_rate - prev_rate) * 100 + if diff > 5: + release_stats["trend"] = "improving" + elif diff < -5: + release_stats["trend"] = "degrading" + else: + release_stats["trend"] = "stable" + + results["releases"][release] = release_stats + + # Overall statistics + total_current_runs = sum(r["current_runs"] for r in results["releases"].values()) + total_current_passes = sum(r["current_passes"] for r in results["releases"].values()) + total_combined_runs = sum(r["combined_runs"] for r in results["releases"].values()) + total_combined_passes = sum(r["combined_passes"] for r in results["releases"].values()) + + results["overall"] = { + "total_jobs": len(all_jobs), + "current_runs": total_current_runs, + "current_passes": total_current_passes, + "current_pass_rate": (total_current_passes / total_current_runs * 100) if total_current_runs > 0 else 0, + "combined_runs": total_combined_runs, + "combined_passes": total_combined_passes, + "combined_pass_rate": (total_combined_passes / total_combined_runs * 100) if total_combined_runs > 0 else 0, + "problem_job_count": len(results["problem_jobs"]), + } + + # Sort problem jobs by pass rate + results["problem_jobs"].sort(key=lambda x: x.get("combined_pass_rate", 0) or 0) + + # Duration estimates summary + jobs_by_profile = {} + for job in all_jobs: + profile = job.get("cluster_profile", "unknown") + if profile not in jobs_by_profile: + jobs_by_profile[profile] = [] + jobs_by_profile[profile].append(job) + + for profile, jobs in jobs_by_profile.items(): + est = DURATION_ESTIMATES.get(profile, {"min": 60, "typical": 90, "max": 180}) + total_runs = sum(j["combined_runs"] for j in jobs) + results["duration_estimates"][profile] = { + "job_count": len(jobs), + 
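+            # total_runs spans the combined ~14-day window; estimated_total_hours
+            # below is total_runs x typical duration, so it is a rough estimate,
+            # not a measured runtime.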
"total_runs": total_runs, + "typical_duration_min": est["typical"], + "estimated_total_hours": round(total_runs * est["typical"] / 60, 1), + } + + return results, all_jobs + + +def generate_extended_report(results, all_jobs): + """Generate markdown report with extended metrics.""" + report = [] + report.append("# OpenStack CI Extended Metrics Report") + report.append("") + report.append(f"**Generated:** {results['generated']}") + report.append(f"**Period:** {results['period']}") + report.append("") + + # Overall summary + report.append("## Executive Summary") + report.append("") + overall = results["overall"] + report.append(f"| Metric | Current (~7d) | Combined (~14d) |") + report.append(f"|--------|---------------|-----------------|") + report.append(f"| Total Jobs | {overall['total_jobs']} | {overall['total_jobs']} |") + report.append(f"| Total Runs | {overall['current_runs']} | {overall['combined_runs']} |") + report.append(f"| Pass Rate | {overall['current_pass_rate']:.1f}% | {overall['combined_pass_rate']:.1f}% |") + report.append(f"| Problem Jobs (<80%) | - | {overall['problem_job_count']} |") + report.append("") + + # Per-release breakdown + report.append("## Metrics by Release") + report.append("") + report.append("| Release | Jobs | Runs (14d) | Pass Rate | Trend |") + report.append("|---------|------|------------|-----------|-------|") + for release in RELEASES: + rel = results["releases"].get(release, {}) + trend_icon = {"improving": "↑", "degrading": "↓", "stable": "→"}.get(rel.get("trend", ""), "") + report.append( + f"| {release} | {rel.get('total_jobs', 0)} | " + f"{rel.get('combined_runs', 0)} | " + f"{rel.get('pass_rate_combined', 0):.1f}% | {trend_icon} {rel.get('trend', '')} |" + ) + report.append("") + + # Problem jobs + report.append("## Problem Jobs (Pass Rate < 80%)") + report.append("") + problem_jobs = results.get("problem_jobs", []) + if problem_jobs: + report.append(f"**{len(problem_jobs)} jobs** need attention:") + report.append("") + report.append("| Release | Job | Runs | Pass Rate | Trend | Bugs |") + report.append("|---------|-----|------|-----------|-------|------|") + for job in problem_jobs[:25]: + trend_icon = {"improving": "↑", "degrading": "↓", "stable": "→"}.get(job.get("trend", ""), "") + rate = job.get("combined_pass_rate") + rate_str = f"{rate:.1f}%" if rate is not None else "N/A" + report.append( + f"| {job['release']} | {job['brief_name'][:50]} | " + f"{job['combined_runs']} | {rate_str} | {trend_icon} | {job.get('open_bugs', 0)} |" + ) + if len(problem_jobs) > 25: + report.append(f"| ... | *{len(problem_jobs) - 25} more jobs* | | | | |") + else: + report.append("All jobs with sufficient runs have pass rate >= 80%.") + report.append("") + + # Duration estimates + report.append("## Estimated Job Durations by Cluster Profile") + report.append("") + report.append("*Note: Durations are estimates based on typical run times.*") + report.append("") + report.append("| Cluster Profile | Jobs | Runs (14d) | Typical Duration | Est. 
Total Hours |") + report.append("|-----------------|------|------------|------------------|------------------|") + for profile, est in sorted(results.get("duration_estimates", {}).items(), + key=lambda x: -x[1]["total_runs"]): + report.append( + f"| {profile} | {est['job_count']} | {est['total_runs']} | " + f"~{est['typical_duration_min']}min | {est['estimated_total_hours']}h |" + ) + report.append("") + + # Trend analysis + report.append("## Trend Analysis") + report.append("") + improving = [j for j in all_jobs if j.get("trend") == "improving" and j["combined_runs"] >= 2] + degrading = [j for j in all_jobs if j.get("trend") == "degrading" and j["combined_runs"] >= 2] + report.append(f"- **Improving jobs:** {len(improving)}") + report.append(f"- **Degrading jobs:** {len(degrading)}") + report.append(f"- **Stable jobs:** {len(all_jobs) - len(improving) - len(degrading)}") + report.append("") + + if degrading: + report.append("### Degrading Jobs (investigate)") + report.append("") + for job in sorted(degrading, key=lambda x: (x.get("current_pass_rate") or 100))[:10]: + curr = job.get("current_pass_rate") + prev = job.get("previous_pass_rate") + curr_str = f"{curr:.0f}%" if curr is not None else "N/A" + prev_str = f"{prev:.0f}%" if prev is not None else "N/A" + report.append(f"- **{job['brief_name'][:50]}** ({job['release']}): {prev_str} → {curr_str}") + report.append("") + + report.append("---") + report.append("") + report.append("*Data Source: [Sippy](https://sippy.dptools.openshift.org/)*") + report.append("") + + return "\n".join(report) + + +def main(): + global OUTPUT_DIR + args = parse_args() + OUTPUT_DIR = os.path.abspath(args.output_dir) + + print("=" * 60) + print("OpenStack CI Extended Metrics") + print("=" * 60) + print(f"Output directory: {OUTPUT_DIR}") + print() + + # Load existing data + sippy_data = load_collected_data() + if not sippy_data: + print("Error: No Sippy data found. 
Run fetch_job_metrics.py first.") + sys.exit(1) + + inventory = load_job_inventory() + print(f"Loaded Sippy data from: {sippy_data.get('fetched_at')}") + print(f"Job inventory loaded: {inventory is not None}") + print() + + # Calculate extended metrics + results, all_jobs = calculate_extended_metrics(sippy_data, inventory) + + # Save results + results_path = os.path.join(OUTPUT_DIR, "extended_metrics.json") + with open(results_path, 'w') as f: + json.dump(results, f, indent=2) + print(f"Saved: {results_path}") + + all_jobs_path = os.path.join(OUTPUT_DIR, "extended_metrics_jobs.json") + with open(all_jobs_path, 'w') as f: + json.dump(all_jobs, f, indent=2) + print(f"Saved: {all_jobs_path}") + + # Generate report + report = generate_extended_report(results, all_jobs) + report_path = os.path.join(OUTPUT_DIR, "extended_metrics_report.md") + with open(report_path, 'w') as f: + f.write(report) + print(f"Saved: {report_path}") + + # Summary + print() + print("=" * 60) + print("Summary:") + overall = results["overall"] + print(f" Total jobs: {overall['total_jobs']}") + print(f" Combined runs (14d): {overall['combined_runs']}") + print(f" Combined pass rate: {overall['combined_pass_rate']:.1f}%") + print(f" Problem jobs: {overall['problem_job_count']}") + print("=" * 60) + + +if __name__ == "__main__": + main() diff --git a/reporting-toolkit/fetch_job_metrics.py b/reporting-toolkit/fetch_job_metrics.py new file mode 100644 index 0000000..2df28c8 --- /dev/null +++ b/reporting-toolkit/fetch_job_metrics.py @@ -0,0 +1,319 @@ +#!/usr/bin/env python3 +""" +Fetch job metrics (pass rates, run counts) from Sippy API for OpenStack CI jobs. +Saves progress to files to allow resumption if interrupted. +""" + +import argparse +import json +import os +import sys +import time +from urllib.request import urlopen, Request +from urllib.error import URLError, HTTPError +from datetime import datetime + +SIPPY_BASE = "https://sippy.dptools.openshift.org/api" +RELEASES = ["4.17", "4.18", "4.19", "4.20", "4.21", "4.22"] + +# Will be set by parse_args() +OUTPUT_DIR = None + + +def parse_args(): + """Parse command line arguments.""" + parser = argparse.ArgumentParser( + description="Fetch job metrics from Sippy API for OpenStack CI jobs" + ) + parser.add_argument( + "--output-dir", + default=os.path.dirname(os.path.abspath(__file__)), + help="Directory for output files (default: script directory)" + ) + parser.add_argument( + "--force", + action="store_true", + help="Refetch data even if cache exists" + ) + return parser.parse_args() + +def fetch_json(url, retries=3, delay=2): + """Fetch JSON from URL with retries.""" + for attempt in range(retries): + try: + req = Request(url, headers={"User-Agent": "OpenStack-CI-Analysis/1.0"}) + with urlopen(req, timeout=60) as response: + return json.loads(response.read().decode()) + except (URLError, HTTPError) as e: + print(f" Attempt {attempt + 1} failed: {e}") + if attempt < retries - 1: + time.sleep(delay) + return None + +def fetch_openstack_jobs_for_release(release): + """Fetch all OpenStack jobs for a specific release.""" + url = f"{SIPPY_BASE}/jobs?release={release}" + print(f"Fetching jobs for release {release}...") + + data = fetch_json(url) + if data is None: + print(f" Failed to fetch data for {release}") + return [] + + # Filter for OpenStack jobs + openstack_jobs = [j for j in data if "openstack" in j.get("name", "").lower()] + print(f" Found {len(openstack_jobs)} OpenStack jobs out of {len(data)} total") + + return openstack_jobs + +def save_progress(data, filename): + 
"""Save data to file.""" + filepath = os.path.join(OUTPUT_DIR, filename) + with open(filepath, 'w') as f: + json.dump(data, f, indent=2) + print(f"Saved: {filepath}") + +def load_progress(filename): + """Load data from file if exists.""" + filepath = os.path.join(OUTPUT_DIR, filename) + if os.path.exists(filepath): + with open(filepath, 'r') as f: + return json.load(f) + return None + +def analyze_job_metrics(all_jobs_by_release): + """Analyze and summarize job metrics.""" + summary = { + "generated": datetime.now().isoformat(), + "releases": {}, + "overall_stats": {}, + "worst_jobs": [], + "best_jobs": [], + "jobs_by_pass_rate": {} + } + + all_jobs_flat = [] + + for release, jobs in all_jobs_by_release.items(): + if not jobs: + continue + + release_stats = { + "total_jobs": len(jobs), + "total_runs": sum(j.get("current_runs", 0) for j in jobs), + "total_passes": sum(j.get("current_passes", 0) for j in jobs), + "avg_pass_rate": 0, + "jobs_below_90": 0, + "jobs_below_80": 0, + "jobs_below_50": 0, + } + + pass_rates = [] + for job in jobs: + rate = job.get("current_pass_percentage", 0) + pass_rates.append(rate) + if rate < 90: + release_stats["jobs_below_90"] += 1 + if rate < 80: + release_stats["jobs_below_80"] += 1 + if rate < 50: + release_stats["jobs_below_50"] += 1 + + # Add to flat list for overall analysis + all_jobs_flat.append({ + "release": release, + "name": job.get("name", ""), + "brief_name": job.get("brief_name", ""), + "pass_rate": rate, + "runs": job.get("current_runs", 0), + "passes": job.get("current_passes", 0), + "previous_pass_rate": job.get("previous_pass_percentage", 0), + "improvement": job.get("net_improvement", 0), + "last_pass": job.get("last_pass", ""), + "open_bugs": job.get("open_bugs", 0), + }) + + if pass_rates: + release_stats["avg_pass_rate"] = sum(pass_rates) / len(pass_rates) + + summary["releases"][release] = release_stats + + # Find worst and best performing jobs + jobs_with_runs = [j for j in all_jobs_flat if j["runs"] > 0] + if jobs_with_runs: + # Worst jobs (lowest pass rate with at least 2 runs) + jobs_with_sufficient_runs = [j for j in jobs_with_runs if j["runs"] >= 2] + summary["worst_jobs"] = sorted(jobs_with_sufficient_runs, key=lambda x: x["pass_rate"])[:20] + + # Best jobs (100% pass rate with most runs) + perfect_jobs = [j for j in jobs_with_runs if j["pass_rate"] == 100] + summary["best_jobs"] = sorted(perfect_jobs, key=lambda x: -x["runs"])[:20] + + # Group by pass rate ranges + ranges = { + "100%": [j for j in jobs_with_runs if j["pass_rate"] == 100], + "90-99%": [j for j in jobs_with_runs if 90 <= j["pass_rate"] < 100], + "80-89%": [j for j in jobs_with_runs if 80 <= j["pass_rate"] < 90], + "50-79%": [j for j in jobs_with_runs if 50 <= j["pass_rate"] < 80], + "below_50%": [j for j in jobs_with_runs if j["pass_rate"] < 50], + } + summary["jobs_by_pass_rate"] = {k: len(v) for k, v in ranges.items()} + + # Overall stats + if all_jobs_flat: + all_runs = sum(j["runs"] for j in all_jobs_flat) + all_passes = sum(j["passes"] for j in all_jobs_flat) + summary["overall_stats"] = { + "total_jobs": len(all_jobs_flat), + "total_runs": all_runs, + "total_passes": all_passes, + "overall_pass_rate": (all_passes / all_runs * 100) if all_runs > 0 else 0, + } + + return summary, all_jobs_flat + +def generate_metrics_report(summary, all_jobs): + """Generate a markdown report of job metrics.""" + report = [] + report.append("# OpenStack CI Job Metrics Report") + report.append("") + report.append(f"**Generated:** {summary['generated']}") + report.append("") + + 
# Overall stats + report.append("## Overall Statistics") + report.append("") + stats = summary.get("overall_stats", {}) + report.append(f"| Metric | Value |") + report.append(f"|--------|-------|") + report.append(f"| Total OpenStack Jobs Tracked | {stats.get('total_jobs', 0)} |") + report.append(f"| Total Job Runs (current period) | {stats.get('total_runs', 0)} |") + report.append(f"| Total Passes | {stats.get('total_passes', 0)} |") + report.append(f"| Overall Pass Rate | {stats.get('overall_pass_rate', 0):.1f}% |") + report.append("") + + # Pass rate distribution + report.append("## Pass Rate Distribution") + report.append("") + report.append("| Pass Rate Range | Job Count |") + report.append("|-----------------|-----------|") + for range_name, count in summary.get("jobs_by_pass_rate", {}).items(): + report.append(f"| {range_name} | {count} |") + report.append("") + + # By release + report.append("## Metrics by Release") + report.append("") + report.append("| Release | Jobs | Total Runs | Avg Pass Rate | <90% | <80% | <50% |") + report.append("|---------|------|------------|---------------|------|------|------|") + for release in RELEASES: + rel_stats = summary.get("releases", {}).get(release, {}) + if rel_stats: + report.append(f"| {release} | {rel_stats.get('total_jobs', 0)} | {rel_stats.get('total_runs', 0)} | {rel_stats.get('avg_pass_rate', 0):.1f}% | {rel_stats.get('jobs_below_90', 0)} | {rel_stats.get('jobs_below_80', 0)} | {rel_stats.get('jobs_below_50', 0)} |") + report.append("") + + # Worst performing jobs + report.append("## Worst Performing Jobs (by pass rate)") + report.append("") + report.append("Jobs with at least 2 runs, sorted by lowest pass rate:") + report.append("") + report.append("| Release | Job Name | Pass Rate | Runs | Passes |") + report.append("|---------|----------|-----------|------|--------|") + for job in summary.get("worst_jobs", [])[:15]: + report.append(f"| {job['release']} | {job['brief_name'][:60]} | {job['pass_rate']:.1f}% | {job['runs']} | {job['passes']} |") + report.append("") + + # Best performing jobs with high volume + report.append("## Best Performing Jobs (100% pass rate, most runs)") + report.append("") + report.append("| Release | Job Name | Runs | Last Pass |") + report.append("|---------|----------|------|-----------|") + for job in summary.get("best_jobs", [])[:10]: + last_pass = job['last_pass'][:10] if job['last_pass'] else "N/A" + report.append(f"| {job['release']} | {job['brief_name'][:60]} | {job['runs']} | {last_pass} |") + report.append("") + + # Jobs needing attention + report.append("## Jobs Needing Attention") + report.append("") + attention_jobs = [j for j in all_jobs if j["pass_rate"] < 80 and j["runs"] >= 2] + if attention_jobs: + report.append(f"**{len(attention_jobs)} jobs** have pass rate below 80%:") + report.append("") + for job in sorted(attention_jobs, key=lambda x: x["pass_rate"]): + report.append(f"- **{job['brief_name']}** ({job['release']}): {job['pass_rate']:.1f}% ({job['passes']}/{job['runs']} runs)") + else: + report.append("All jobs with sufficient runs have pass rate >= 80%.") + report.append("") + + # Data source + report.append("---") + report.append("") + report.append("*Data Source: Sippy (https://sippy.dptools.openshift.org/)*") + report.append("") + + return "\n".join(report) + +def main(): + global OUTPUT_DIR + args = parse_args() + OUTPUT_DIR = os.path.abspath(args.output_dir) + + # Create output directory if needed + os.makedirs(OUTPUT_DIR, exist_ok=True) + + print("=" * 60) + print("OpenStack CI Job 
Metrics Collector") + print("=" * 60) + print(f"Output directory: {OUTPUT_DIR}") + print() + + # Check for existing progress + progress_file = "sippy_jobs_raw.json" + existing_data = load_progress(progress_file) + + if existing_data and not args.force: + print(f"Found existing data from {existing_data.get('fetched_at', 'unknown')}") + print("Use --force to refetch") + all_jobs_by_release = existing_data.get("jobs_by_release", {}) + else: + all_jobs_by_release = {} + + for release in RELEASES: + jobs = fetch_openstack_jobs_for_release(release) + all_jobs_by_release[release] = jobs + + # Save progress after each release + save_progress({ + "fetched_at": datetime.now().isoformat(), + "releases_fetched": list(all_jobs_by_release.keys()), + "jobs_by_release": all_jobs_by_release + }, progress_file) + + time.sleep(1) # Be nice to the API + + print() + print("Analyzing metrics...") + summary, all_jobs = analyze_job_metrics(all_jobs_by_release) + + # Save summary + save_progress(summary, "job_metrics_summary.json") + save_progress(all_jobs, "job_metrics_all_jobs.json") + + # Generate report + report = generate_metrics_report(summary, all_jobs) + report_path = os.path.join(OUTPUT_DIR, "job_metrics_report.md") + with open(report_path, 'w') as f: + f.write(report) + print(f"Report saved: {report_path}") + + print() + print("=" * 60) + print("Summary:") + print(f" Total jobs: {summary['overall_stats'].get('total_jobs', 0)}") + print(f" Total runs: {summary['overall_stats'].get('total_runs', 0)}") + print(f" Overall pass rate: {summary['overall_stats'].get('overall_pass_rate', 0):.1f}%") + print("=" * 60) + +if __name__ == "__main__": + main() diff --git a/reporting-toolkit/run_analysis.sh b/reporting-toolkit/run_analysis.sh new file mode 100755 index 0000000..b3b934d --- /dev/null +++ b/reporting-toolkit/run_analysis.sh @@ -0,0 +1,163 @@ +#!/bin/bash +# +# Run all OpenStack CI analysis scripts in the correct order. 
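+# The scripts run in three phases: data collection, configuration analysis,
+# and runtime analysis. Later phases consume files written by Phase 1, so
+# the order below matters.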
+# +# Usage: +# ./run_analysis.sh [--config-dir /path/to/ci-operator/config] [--output-dir /path/to/output] +# +# If --config-dir is not specified, defaults to ../../../ci-operator/config +# (relative to script location, assuming standard repo layout) +# +# If --output-dir is not specified, outputs to current working directory +# This allows running from any location in the filesystem + +set -e + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" + +# Default config directory (relative to script location) +CONFIG_DIR="${SCRIPT_DIR}/../../../ci-operator/config" + +# Default output directory is current working directory +OUTPUT_DIR="$(pwd)" + +# Parse arguments +while [[ $# -gt 0 ]]; do + case $1 in + --config-dir) + CONFIG_DIR="$2" + shift 2 + ;; + --output-dir) + OUTPUT_DIR="$2" + shift 2 + ;; + --force) + FORCE="--force" + shift + ;; + --help) + echo "Usage: $0 [OPTIONS]" + echo "" + echo "Options:" + echo " --config-dir DIR Path to ci-operator/config directory" + echo " --output-dir DIR Directory for output files (default: current directory)" + echo " --force Refetch data from Sippy API" + echo "" + echo "Examples:" + echo " # Run from repo root, output to current directory" + echo " ./hack/openstack-ci-analysis/reporting-toolkit/run_analysis.sh" + echo "" + echo " # Run from anywhere, specify both directories" + echo " ./run_analysis.sh --config-dir /path/to/release/ci-operator/config --output-dir /tmp/analysis" + exit 0 + ;; + *) + echo "Unknown option: $1" + exit 1 + ;; + esac +done + +# Resolve to absolute paths +CONFIG_DIR="$(cd "$CONFIG_DIR" 2>/dev/null && pwd)" || { + echo "Error: Config directory not found: $CONFIG_DIR" + echo "Use --config-dir to specify the path to ci-operator/config" + exit 1 +} + +OUTPUT_DIR="$(mkdir -p "$OUTPUT_DIR" && cd "$OUTPUT_DIR" && pwd)" + +echo "============================================================" +echo "OpenStack CI Analysis Toolkit" +echo "============================================================" +echo "Script directory: $SCRIPT_DIR" +echo "Config directory: $CONFIG_DIR" +echo "Output directory: $OUTPUT_DIR" +echo "============================================================" +echo "" + +# Phase 1: Data Collection +echo "=== Phase 1: Data Collection ===" +echo "" + +echo "[1/4] Extracting job inventory..." +python3 "$SCRIPT_DIR/extract_openstack_jobs.py" \ + --config-dir "$CONFIG_DIR" \ + --output-dir "$OUTPUT_DIR" \ + --summary + +echo "" +echo "[2/4] Fetching job metrics from Sippy..." +python3 "$SCRIPT_DIR/fetch_job_metrics.py" \ + --output-dir "$OUTPUT_DIR" \ + ${FORCE:+"$FORCE"} + +echo "" +echo "[3/4] Calculating extended metrics..." +python3 "$SCRIPT_DIR/fetch_extended_metrics.py" \ + --output-dir "$OUTPUT_DIR" + +echo "" +echo "[4/4] Fetching platform comparison data..." +python3 "$SCRIPT_DIR/fetch_comparison_data.py" \ + --output-dir "$OUTPUT_DIR" + +# Phase 2: Configuration Analysis +echo "" +echo "=== Phase 2: Configuration Analysis ===" +echo "" + +echo "[1/3] Analyzing redundancy..." +python3 "$SCRIPT_DIR/analyze_redundancy.py" \ + --output-dir "$OUTPUT_DIR" + +echo "" +echo "[2/3] Analyzing coverage gaps..." +python3 "$SCRIPT_DIR/analyze_coverage.py" \ + --output-dir "$OUTPUT_DIR" + +echo "" +echo "[3/3] Analyzing trigger patterns..." +python3 "$SCRIPT_DIR/analyze_triggers.py" \ + --output-dir "$OUTPUT_DIR" + +# Phase 3: Runtime Analysis +echo "" +echo "=== Phase 3: Runtime Analysis ===" +echo "" + +echo "[1/3] Analyzing platform comparison..." 
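+# NOTE: the analyze_* steps below are assumed to read their Phase 1 inputs
+# (the JSON files above) from $OUTPUT_DIR; they take no other arguments here.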
+python3 "$SCRIPT_DIR/analyze_platform_comparison.py" \ + --output-dir "$OUTPUT_DIR" + +echo "" +echo "[2/3] Analyzing workflow pass rates..." +python3 "$SCRIPT_DIR/analyze_workflow_passrate.py" \ + --output-dir "$OUTPUT_DIR" + +echo "" +echo "[3/3] Categorizing failures..." +python3 "$SCRIPT_DIR/categorize_failures.py" \ + --output-dir "$OUTPUT_DIR" + +# Summary +echo "" +echo "============================================================" +echo "Analysis Complete!" +echo "============================================================" +echo "" +echo "Output directory: $OUTPUT_DIR" +echo "" +echo "Generated Reports:" +find "$OUTPUT_DIR" -maxdepth 1 -name "*.md" -type f 2>/dev/null | while read -r f; do + echo " - $(basename "$f")" +done +echo "" +echo "Data Files:" +find "$OUTPUT_DIR" -maxdepth 1 -name "*.json" -type f 2>/dev/null | wc -l | xargs -I {} echo " {} JSON files generated" +echo "" +echo "To view key findings, run:" +echo " cd $OUTPUT_DIR" +echo " python3 -c \"import json; d=json.load(open('extended_metrics.json')); print(f'Pass rate: {d[\\\"overall\\\"][\\\"combined_pass_rate\\\"]:.1f}%')\"" +echo ""
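+# Optional: if jq is installed (not required by the toolkit), the same value
+# can be read with a one-liner.
+echo "Or, if jq is installed:"
+echo "  jq '.overall.combined_pass_rate' $OUTPUT_DIR/extended_metrics.json"
+echo ""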