A production-style financial data analysis project designed to work offline using cached datasets, reproducible pipelines, and modular architecture.
This project is an offline-first FTSE 100 financial analysis system. It uses cached data snapshots to ensure reproducible and reliable analysis. It includes stub/demo reference feeds for development and testing without relying on live APIs.
This repository is built as a flagship portfolio project for FTSE 100 financial market analysis. It is delivered in two tracks:
- V1 — Dissertation Baseline (Replay / Snapshot): faithful replication of a dissertation-style workflow (intraday visuals + ARIMA + LSTM; 10-minute horizon).
- V2 — Modernised UK Market Terminal (Platform / Monitoring): a production-style redesign with medallion tables, monitoring, and premium neon dashboards.
Not financial advice. This repo is for analytics / reporting / modelling demonstration only.
Exports are generated to docs/dashboards/V1/exports/:
- Market overview
- Candles + volume
- Moving averages
- ARIMA forecast (10-min)
- LSTM forecast (10-min)
- Model comparison
- Data quality snapshot
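The ARIMA export in this list targets a 10-minute horizon. Below is a minimal statsmodels sketch of producing that horizon from a 1-minute close series; the input/output file names are placeholders and this is not the repo's actual pipeline code:

```python
# Minimal ARIMA sketch for a 10-minute-ahead forecast (illustrative, not the repo's exact code).
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Assumed input: a cleaned 1-minute close series indexed by timestamp (file name is hypothetical).
closes = (
    pd.read_parquet("v1_dissertation_baseline/data/processed/ftse_intraday.parquet")
    .set_index("timestamp")["close"]
    .asfreq("1min")
    .ffill()
)

# Fit a small ARIMA baseline and forecast the next 10 one-minute steps.
fitted = ARIMA(closes, order=(2, 1, 1)).fit()
forecast = fitted.forecast(steps=10)

# Illustrative output location, following the outputs/forecasts/ convention.
forecast.rename("forecast_close").to_csv(
    "v1_dissertation_baseline/outputs/forecasts/arima_10min_forecast.csv"
)
```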
Exports are generated to docs/dashboards/V2/exports/:
- Overview, intraday terminal, volatility regimes, drawdown
- Correlation heatmap, sector rotation, top movers
- Technical indicators, forecasting cockpit, backtesting report
- Monitoring: model drift, DQ coverage, pipeline health, latency SLA
- Governance: KPI dictionary, measure catalogue, data inventory, lineage
- Operations: incident timeline, release notes
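Every page above ships as a static 4K PNG (3840×2160). A minimal matplotlib sketch of writing one export at that resolution follows; the chart content, input table, and file name are placeholders, not the repo's actual visuals:

```python
# Sketch of writing a 4K (3840x2160) dashboard export with matplotlib.
import matplotlib
matplotlib.use("Agg")  # headless/offline rendering
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical gold table feeding the overview page.
prices = pd.read_parquet("v2_modernisation_realtime/data/gold/market_overview.parquet")

fig, ax = plt.subplots(figsize=(19.2, 10.8))  # 19.2in x 10.8in at 200 dpi -> exactly 3840x2160 px
ax.plot(prices["timestamp"], prices["close"], linewidth=1.2)
ax.set_title("FTSE 100 - Market Overview")
ax.set_xlabel("Time")
ax.set_ylabel("Index level")

fig.savefig("docs/dashboards/V2/exports/overview.png", dpi=200)
```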
```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Build V1 (dissertation baseline)
python scripts/v1_build_all.py

# Build V2 (modernised terminal)
python scripts/v2_build_all.py
```

To make the V2 experience feel like a UK market terminal, the repo ships reference datasets
under `data/reference/` and uses them during `scripts/v2_build_all.py`:
- FTSE 100 constituents universe (tickers + sectors + weights): `data/reference/ftse100_constituents_universe_snapshot.csv`
- UK macro calendar (stub): `data/reference/uk_macro_calendar_stub.csv`
- Earnings calendar (stub, top-25): `data/reference/ftse100_earnings_calendar_stub_top25.csv`
- Market news headlines (stub): `data/reference/market_news_headlines_stub.csv`
The build produces a unified table:
v2_modernisation_realtime/data/gold/events_calendar.(parquet|csv)
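As a rough illustration of that step (not the actual build code), the stub feeds can be stacked into one long events table with pandas; the `event_type` / `event_date` column names are assumptions:

```python
# Sketch: combine the stub reference feeds into one unified events table.
# Column names ("event_date", "event_type") are assumptions for illustration.
import pandas as pd

REF = "data/reference"

frames = []
for path, event_type in [
    (f"{REF}/uk_macro_calendar_stub.csv", "macro"),
    (f"{REF}/ftse100_earnings_calendar_stub_top25.csv", "earnings"),
    (f"{REF}/market_news_headlines_stub.csv", "news"),
]:
    df = pd.read_csv(path)
    df["event_type"] = event_type
    frames.append(df)

events = pd.concat(frames, ignore_index=True).sort_values("event_date")
events.to_parquet("v2_modernisation_realtime/data/gold/events_calendar.parquet", index=False)
events.to_csv("v2_modernisation_realtime/data/gold/events_calendar.csv", index=False)
```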
You can control the universe source in V2:
```bash
# Offline-first (default)
python scripts/v2_build_all.py --constituents-source snapshot --weights-method snapshot

# Best-effort refresh from Wikipedia (requires internet + lxml/bs4)
python scripts/v2_build_all.py --constituents-source wikipedia --weights-method equal

# Include the stub headline feed inside the events calendar
python scripts/v2_build_all.py --include-news
```

Important: weights and calendars here are designed for portfolio realism (demo/teaching), not as official FTSE Russell index weights or an authoritative economic calendar.
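For reference, a minimal argparse sketch of how these flags could be declared inside `scripts/v2_build_all.py`; the defaults mirror the offline-first behaviour described above, everything else is illustrative:

```python
# Sketch of wiring up the V2 build flags shown above with argparse.
import argparse

parser = argparse.ArgumentParser(description="V2 build runner (sketch)")
parser.add_argument(
    "--constituents-source",
    choices=["snapshot", "wikipedia"],
    default="snapshot",  # offline-first default
    help="Where to read the FTSE 100 universe from.",
)
parser.add_argument(
    "--weights-method",
    choices=["snapshot", "equal"],
    default="snapshot",
    help="Use cached weights or fall back to equal weighting.",
)
parser.add_argument(
    "--include-news",
    action="store_true",
    help="Fold the stub headline feed into the events calendar.",
)

args = parser.parse_args()
print(args.constituents_source, args.weights_method, args.include_news)
```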
The repo is offline-first (ships with cached snapshots) so it runs anywhere. If you want the “real-time” feel, you can enable live pulls and cache them.
See: DATA_PROVIDERS.md
Example (V1, Yahoo intraday):
```bash
python scripts/v1_build_all.py --data-source yahoo --symbol "^FTSE" --cache-dir ./data_cache
```

Example (V2, Stooq daily anchor → synthetic intraday bridge):

```bash
python scripts/v2_build_all.py --data-source stooq --symbol "^UKX" --cache-dir ./data_cache
```
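A minimal sketch of the cache-then-read pattern behind `--cache-dir`, assuming yfinance as the Yahoo wrapper; the helper name and cache layout are illustrative, see DATA_PROVIDERS.md for the real hooks:

```python
# Sketch: pull intraday bars once, cache them to parquet, and read from cache afterwards.
# The function name and cache layout are assumptions, not the repo's actual implementation.
from pathlib import Path

import pandas as pd
import yfinance as yf


def load_intraday(symbol: str = "^FTSE", cache_dir: str = "./data_cache") -> pd.DataFrame:
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    snapshot = cache / f"{symbol.strip('^')}_5m.parquet"

    if snapshot.exists():  # offline-first: prefer the cached snapshot
        return pd.read_parquet(snapshot)

    bars = yf.download(symbol, period="5d", interval="5m", auto_adjust=False)
    bars.to_parquet(snapshot)
    return bars


if __name__ == "__main__":
    print(load_intraday().tail())
```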
```text
FTSE-100-Financial-Analysis/
  v1_dissertation_baseline/
    data/raw/                  # cached raw snapshot
    data/processed/            # cleaned parquet snapshot
    outputs/figures/           # figures referenced by specs
    outputs/forecasts/         # ARIMA + LSTM forecast CSVs
    outputs/metrics/           # KPI + model metrics
    logs/                      # JSONL build logs
    notebooks/                 # V1 walkthrough notebooks
  v2_modernisation_realtime/
    data/bronze|silver|gold/   # platform tables (parquet)
    monitoring/reports/        # dq/drift/ops reports
    db/                        # duckdb placeholder
    outputs/                   # V2 analysis artifacts
    logs/                      # JSONL build logs
    notebooks/                 # V2 walkthrough notebooks
  docs/
    dashboards/                # page specs + exports for V1 + V2
    dissertation/              # dissertation document (V1 reference)
    mermaid/                   # architecture + lineage diagrams
    logs/                      # issue registers / ops docs
  src/ftse100/                 # reusable python package
  scripts/                     # build runners (V1 + V2)
```
```mermaid
%% FTSE-100-Financial-Analysis - End-to-End Architecture (V2 Platform)
flowchart LR
subgraph S["Data Sources"]
YF["Yahoo Finance / Provider API"]
CAL["Trading Calendar (UK)"]
EVT["Events Feed (optional)"]
end
subgraph I["Ingestion Layer"]
ING["Ingest Job (schedule: 5m in session)"]
RAW["data/raw/provider_snapshots"]
LOG["docs/logs/refresh_run_register.csv"]
end
subgraph P["Processing Layer"]
STG["data/staged (cleaned + tz-normalised)"]
CUR["data/curated (features + indicators)"]
DQ["DQ Checks (dupes, gaps, OHLC sanity)"]
DQREP["dq reports + issue register"]
end
subgraph M["Modelling Layer"]
ARIMA["ARIMA baseline"]
LSTM["LSTM sequence model"]
FCAST["outputs/forecasts"]
METR["outputs/metrics"]
end
subgraph V["Serving Layer"]
MART["data/marts (mart.* tables)"]
DASH["Dashboards (Power BI / Streamlit / Plotly)"]
EXP["docs/dashboards exports (4K PNGs)"]
end
subgraph O["Ops & Monitoring"]
MON["Monitoring (freshness, drift, SLA)"]
ALERT["Alert Rules (OK/WARN/FAIL)"]
INC["Incident Register"]
end
YF --> ING --> RAW
CAL --> ING
EVT --> ING
RAW --> STG --> CUR --> MART
CUR --> DQ --> DQREP
DQREP --> DASH
CUR --> ARIMA --> FCAST
CUR --> LSTM --> FCAST
FCAST --> METR --> MART
MART --> DASH --> EXP
ING --> LOG
LOG --> MON --> ALERT --> INC
DQREP --> MON
```
```mermaid
%% FTSE-100-Financial-Analysis - Data Lineage (Medallion)
flowchart TB
subgraph RAW["Raw"]
R1["raw: provider snapshots"]
R2["raw: events optional"]
end
subgraph STG["Staged"]
S1["staged: ohlcv_clean"]
S2["staged: calendar_joined"]
end
subgraph CUR["Curated"]
C1["curated: returns + vol"]
C2["curated: indicators (RSI/MACD/BB)"]
C3["curated: model_features"]
end
subgraph MART["Marts"]
M1["mart.market_overview"]
M2["mart.intraday_terminal"]
M3["mart.volatility_regimes"]
M4["mart.drawdown_risk"]
M5["mart.forecasting_cockpit"]
M6["mart.model_monitoring"]
M7["mart.pipeline_health"]
M8["mart.data_quality_health"]
end
subgraph DASH["Dashboards"]
D1["V2-P01 Overview"]
D2["V2-P02 Intraday"]
D3["V2-P03 Vol Regimes"]
D4["V2-P04 Risk"]
D5["V2-P09 Forecasting"]
D6["V2-P11 Monitoring"]
D7["V2-P13 Pipeline Health"]
D8["V2-P12 DQ"]
end
R1 --> S1 --> C1 --> M1 --> D1
R1 --> S1 --> C1 --> M2 --> D2
R1 --> S1 --> C1 --> M3 --> D3
R1 --> S1 --> C1 --> M4 --> D4
C3 --> M5 --> D5
C1 --> M6 --> D6
M7 --> D7
M8 --> D8
R2 --> S2 --> C1
```
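The staged → curated hop in the lineage above (returns + vol, indicators) can be pictured with a short pandas sketch; the paths, column names, and window lengths are illustrative, not the repo's exact feature spec:

```python
# Sketch of the staged -> curated step: log returns, rolling volatility, and one indicator.
# Paths, column names, and window sizes are illustrative.
import numpy as np
import pandas as pd

staged = pd.read_parquet("v2_modernisation_realtime/data/silver/ohlcv_clean.parquet")  # hypothetical path

curated = staged.sort_values("timestamp").copy()
curated["log_return"] = np.log(curated["close"]).diff()
curated["rolling_vol_30"] = curated["log_return"].rolling(window=30).std()

# Simple RSI(14) as one example indicator.
delta = curated["close"].diff()
gain = delta.clip(lower=0).rolling(14).mean()
loss = (-delta.clip(upper=0)).rolling(14).mean()
curated["rsi_14"] = 100 - 100 / (1 + gain / loss)

curated.to_parquet("v2_modernisation_realtime/data/gold/model_features.parquet", index=False)
```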
```mermaid
%% Forecasting Lifecycle (V2)
stateDiagram-v2
[*] --> DataReady: DQ pass + freshness ok
DataReady --> FeatureBuild: returns/vol/indicators
FeatureBuild --> TrainOrLoad: (re)train OR load latest model
TrainOrLoad --> Forecast: generate horizon predictions
Forecast --> Evaluate: compute MAE/RMSE + coverage
Evaluate --> Publish: write to mart.forecasting_* tables
Publish --> Monitor: drift + error tracking
Monitor --> RetrainDecision: breach thresholds?
RetrainDecision --> TrainOrLoad: yes
RetrainDecision --> DataReady: no
```
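The RetrainDecision state boils down to a threshold check on monitored error and drift. A tiny illustrative sketch, with made-up threshold values and metric names:

```python
# Sketch of the RetrainDecision state: breach thresholds -> retrain, otherwise keep serving.
# Threshold values and metric names are illustrative.
from dataclasses import dataclass


@dataclass
class MonitoringSnapshot:
    rmse: float          # latest rolling forecast error
    drift_score: float   # e.g. a PSI-style feature drift score


def needs_retrain(snapshot: MonitoringSnapshot,
                  rmse_limit: float = 25.0,
                  drift_limit: float = 0.2) -> bool:
    """Return True when error or drift breaches its threshold."""
    return snapshot.rmse > rmse_limit or snapshot.drift_score > drift_limit


if __name__ == "__main__":
    latest = MonitoringSnapshot(rmse=31.4, drift_score=0.08)
    print("retrain" if needs_retrain(latest) else "keep current model")
```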
```mermaid
%% Monitoring & Alerts Sequence (V2)
sequenceDiagram
autonumber
participant Scheduler as Scheduler
participant Ingest as Ingestion Job
participant DQ as DQ Engine
participant Models as Forecast Runner
participant Mart as Mart Writer
participant Dash as Dashboard
participant Monitor as Monitor/Alert
participant Incident as Incident Register
Scheduler->>Ingest: Trigger (5m in session)
Ingest->>DQ: Send staged batch
DQ-->>Ingest: DQ status (pass/warn/fail)
alt DQ FAIL
DQ->>Monitor: Raise DQ_FAIL
Monitor->>Incident: Create incident + notify
Monitor->>Dash: Show RED banner (block forecasts)
else DQ PASS/WARN
Ingest->>Models: Run forecasting pipeline
Models->>Mart: Write forecasts + metrics
Mart->>Dash: Dash reads updated marts
Dash->>Monitor: Report freshness + usage telemetry
Monitor->>Incident: Log issues if SLA breached
end
```
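The DQ gate in this sequence checks the staged batch for duplicates, gaps, and OHLC sanity, and only lets PASS/WARN batches through to forecasting. A minimal sketch with illustrative rules:

```python
# Sketch of the DQ gate from the sequence above: dupes, gaps, and OHLC sanity checks.
# The rules are illustrative; a FAIL status blocks the forecasting run.
import pandas as pd


def dq_status(batch: pd.DataFrame, expected_freq: str = "5min") -> str:
    issues = []

    if batch["timestamp"].duplicated().any():
        issues.append("duplicate timestamps")

    expected = pd.date_range(batch["timestamp"].min(), batch["timestamp"].max(), freq=expected_freq)
    if batch["timestamp"].nunique() < len(expected):
        issues.append("gaps in the bar sequence")

    bad_ohlc = (batch["high"] < batch[["open", "close", "low"]].max(axis=1)) | (
        batch["low"] > batch[["open", "close", "high"]].min(axis=1)
    )
    if bad_ohlc.any():
        return "FAIL"  # raise DQ_FAIL, block forecasts

    return "WARN" if issues else "PASS"


def run_pipeline(batch: pd.DataFrame) -> None:
    status = dq_status(batch)
    if status == "FAIL":
        print("DQ_FAIL raised -> incident created, forecasts blocked")
    else:
        print(f"DQ {status} -> running forecasting pipeline")
```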
- This repo ships with offline synthetic snapshots so everything runs deterministically in portfolio environments.
- The synthetic FTSE session is constrained to match a real daily OHLC print (see the V1 metadata JSON), so results look credible; a sketch of the idea follows this list.
- Hooks for live refresh can be added outside this sandbox (Yahoo/Stooq/paid data providers).
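A minimal sketch of that OHLC constraint: build a random intraday bridge from the daily open to the daily close, clip it to the daily range, and pin the exact high/low. The daily values and bar count below are placeholders standing in for the V1 metadata JSON:

```python
# Sketch: constrain a synthetic intraday session so its OHLC matches a real daily print.
# The daily values and bar count are placeholders for the V1 metadata JSON.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)  # fixed seed -> deterministic replay
daily = {"open": 8200.0, "high": 8265.0, "low": 8170.0, "close": 8240.0}
n = 510                          # roughly one-minute bars in an LSE session (placeholder)

# Brownian bridge from the daily open to the daily close (endpoints are exact).
walk = np.cumsum(rng.normal(0.0, 8.0, n))
bridge = (
    daily["open"]
    + walk
    - np.linspace(walk[0], walk[-1], n)
    + np.linspace(0.0, daily["close"] - daily["open"], n)
)

# Keep every bar inside the daily range, then pin interior bars to the exact high and low.
path = np.clip(bridge, daily["low"], daily["high"])
path[1 + np.argmax(path[1:-1])] = daily["high"]
path[1 + np.argmin(path[1:-1])] = daily["low"]

session = pd.DataFrame({"minute": np.arange(n), "close": path})
print(session["close"].iloc[0], session["close"].iloc[-1], path.max(), path.min())
```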
MIT (see LICENSE).
This repo version adds portfolio-grade governance realism and a Power BI export evidence pack:
- Governance logs (root): `LOGBOOK.md`, `PROGRESS_LOG_DAILY.md`, `RISK_LOG.md`
- Time tracking (root): `WEEKLY_HOURS.csv`, `HOURS_BREAKDOWN.csv`
- Deep registers: `docs/logs/DECISIONS_LOG.md`, `ASSUMPTIONS_LOG.md`, `issue_register.csv`, `model_experiment_log.csv`, `dashboard_review_log.md`
- V1 documentation packs: `v1_dissertation_baseline/docs/diagrams/` and `v1_dissertation_baseline/docs/Dashboards/`
- Power BI export pack (4K): `v2_modernisation_realtime/bi_powerbi/exports/` (includes donut/pie + gauge visuals)
Key review docs: `docs/00_REPO_NAVIGATION.md` and `docs/01_PORTFOLIO_WALKTHROUGH.md`.
This repo version adds terminal realism and removes dangling spec references:
- Curated `mart.*` layer (`v2_modernisation_realtime/data/mart/`): every V2 dashboard page now has a physical input table.
- Populated DuckDB warehouse (`v2_modernisation_realtime/db/warehouse.duckdb`) with `bronze.*`, `silver.*`, `gold.*`, and `mart.*`.
- Repo-wide run register (`docs/logs/refresh_run_register.csv`) for auditability.
- 4K dashboard exports (all `docs/dashboards/**/exports/*.png` are 3840×2160).
- UK market terminal app (optional): `streamlit run apps/uk_market_terminal.py`.
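A minimal sketch of poking at the populated warehouse with the duckdb Python client; the query itself is illustrative, and the mart table name follows the lineage diagram above:

```python
# Sketch: open the populated DuckDB warehouse read-only and inspect a mart table.
import duckdb

con = duckdb.connect("v2_modernisation_realtime/db/warehouse.duckdb", read_only=True)

# List every table across the bronze/silver/gold/mart schemas.
print(con.sql("SELECT table_schema, table_name FROM information_schema.tables ORDER BY 1, 2"))

# Pull a few rows feeding the V2 overview page into a pandas DataFrame.
overview = con.sql("SELECT * FROM mart.market_overview LIMIT 10").df()
print(overview)

con.close()
```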