
🛡️ EPG-Master: Engineering Strategic Certainty

A Cybernetic Operating System for Forensic AI Governance


"Current AI provides answers. EPG-Master builds evidence chains for high-stakes decisions."


1. The Vision: From Plausibility to Truth

In an era of ubiquitous AI, we are faced with a Trust Dilemma. Large Language Models (LLMs) are "People Pleasers"—they deliver plausible, eloquent stories that often lack accountability and forensic grounding. You cannot bet a €15 billion infrastructure project on a "maybe."

EPG-Master (Epistemic Paradigm Governor) is a multi-agent framework designed to restore human sovereignty. It transforms "AI chatter" (plausible noise) into Forensic Truth.

Why EPG-Master?

  • Virtual Strike Team: Instead of one lonely chatbot, you hire a crew of 5 specialized agents that audit, challenge, and verify each other in a closed cybernetic loop.
  • Epistemic Metabolism: The system doesn't just "talk"; it processes knowledge. It measures the density of truth ($\eta_e$), the maturity of logic (SMI), and the progress of the consensus (EPI).
  • Audit-Ready Output: The result is not a chat history, but a cryptographically sealed 7-Pillar Dossier ready for the boardroom.

2. The Cybernetic Crew: A Hierarchy of Truth

The EPG-Master does not rely on a single AI response. It orchestrates a Virtual Strike Team of five specialized agents that operate in a closed cybernetic loop, overseen by the human leader.

👤 0. The Human Sovereign (The User)

  • Role: The Ultimate Authority.
  • Task: Defines the Mission Objective and grants the Final Approval based on the forensic dossier.
  • Relationship: The alpha and omega of the loop. Provides the initial steering impulse and holds the power of the final "YES" or "NO."

🧭 1. Strategic Generator (The Project Lead)

  • Role: The Captain.
  • Task: Translates human intent into three distinct strategic pathways (A, B, C). Focuses on ROI and Value Creation.
  • Relationship: Directs the Architect and integrates Risk Agent alerts to refine the overall mission strategy.

🏗️ 2. Architect (The Technical Master)

  • Role: Chief Engineer.
  • Task: Constructs the Formal Logical Skeleton using operational calculus (:=, ->). Transmutes abstract strategy into rigid mathematical structures.
  • Relationship: Works for the Project Lead while ensuring the design is robust enough to pass the Auditor's forensic scrutiny.

🔍 3. Risk Agent (The Epistemic Risk)

  • Role: Safety Officer / "Devil’s Advocate."
  • Task: Specifically hunts for Blind Spots and falsifies weak assumptions. Asks: "What happens if we fail?"
  • Relationship: Critically challenges the Architect's logic to protect the system from catastrophic errors.

⚖️ 4. Auditor (The Forensic Sovereign)

  • Role: Independent Judge.
  • Task: The ultimate Gatekeeper. Verifies every claim against hard evidence (Technical Documents). Holds Absolute Veto Power over the process.
  • Relationship: Stands outside the hierarchy to audit the integrity of the Project Lead and Architect before the decision reaches the board level.

🏛️ 5. Governor (The Executive Arbiter)

  • Role: The Chief Synthesizer / The Bridge.
  • Task: Translates complex machine logic and forensic evidence into a Management-Level Narrative.
  • Relationship: Acts as the Executive Interface. It is the machine’s "last word," designed solely to serve the Human Sovereign with a verified recommendation.

The Cybernetic Interaction Logic

The team operates in Sprints. If the Auditor triggers a VETO, the system enters a self-correction cycle. The Architect must harden the logic and the Project Lead must provide better evidence until the forensic threshold is met.
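The self-correction cycle can be pictured as a simple control loop. The following is a minimal sketch, not the project's orchestrator: `run_sprints`, the `run_sprint` callback, and the `max_sprints` limit are hypothetical names introduced only for illustration.

```python
from typing import Callable

# A sprint callback receives the sprint number and the veto reasons collected
# so far, and returns (vetoed, reason). This interface is an assumption.
SprintFn = Callable[[int, list[str]], tuple[bool, str]]


def run_sprints(run_sprint: SprintFn, max_sprints: int = 5) -> tuple[bool, int, list[str]]:
    """Repeat sprints until the Auditor stops vetoing or the limit is reached."""
    corrections: list[str] = []
    for sprint in range(1, max_sprints + 1):
        vetoed, reason = run_sprint(sprint, corrections)
        if not vetoed:
            return True, sprint, corrections
        # A VETO feeds its reason back into the next sprint as corrective memory.
        corrections.append(reason)
    return False, max_sprints, corrections


def demo_sprint(sprint: int, corrections: list[str]) -> tuple[bool, str]:
    # Toy Auditor: vetoes the first two sprints, then accepts.
    if sprint < 3:
        return True, f"missing source in sprint {sprint}"
    return False, ""


accepted, sprints_used, history = run_sprints(demo_sprint)
print(accepted, sprints_used, history)
```

The loop makes the stopping condition explicit: either the Auditor accepts, or the run ends without approval after a fixed number of attempts.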


3. The Engine: Epistemic Metrics & Governance KPI

The EPG-Master does not measure "quality" through sentiment. It uses a cybernetic feedback loop driven by four distinct metrics that determine the system's state.

🟠 SMI – Structural Maturity Index

  • Logic: Calculated in core/operator_core.py.
  • Definition: Measures the semantic and structural alignment between the Strategic Generator (Vision) and the Architect (Technical Blueprint).
  • The Code Reality: It uses signature analysis to ensure that every strategic path defined by the Project Lead is mirrored in the Architect's formal calculus (:=, ->). If the Architect invents components or ignores strategic directives, the SMI drops significantly.

🔴 EPI – Epistemic Progress Index

  • Logic: Calculated in epistemic_governance/epistemic_delta.py.

  • Definition: Measures the Learning Delta of the Governor between two sprints.

  • The Code Reality: It tracks the evolution of the Governor's state vector $S = \{C, R, P, D\}$. It rewards the system when the consensus sharpens and penalizes stagnation. A high EPI indicates that the team is actively resolving contradictions.

  • To track if the team is actually making progress (and not just repeating phrases), the system calculates the State Vector $S$ of the Governor after every sprint. Think of this as a "Cognitive Snapshot" consisting of four coordinates:

    • C (Confidence): The subjective certainty score provided by the Governor.
    • R (Risks): The quantitative weight of identified risks in the current proposal.
    • P (Position): The semantic location of the strategy in high-dimensional space (calculated via E5-Large Embeddings).
    • D (Depth): The density of inherited evidence tags ([TD]) from the specialists.

The Calculation: The EPI is the Euclidean distance between the snapshot of Sprint $n$ and Sprint $n-1$.

  • High EPI: The Governor significantly changed their mind or sharpened the logic due to new evidence.
  • Low EPI: The system is converging toward a final, stable consensus.
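The distance calculation described above can be sketched in a few lines. This is an illustration under stated assumptions, not the code in epistemic_delta.py: the `StateVector` class, the 0..1 scaling of C, R and D, and the simple concatenation of scalars with the embedding are all hypothetical choices.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class StateVector:
    """Hypothetical container for the Governor's snapshot S = {C, R, P, D}."""
    confidence: float          # C, assumed scaled to 0..1
    risk: float                # R, assumed scaled to 0..1
    position: Sequence[float]  # P, an embedding vector (toy 2-D values here)
    depth: float               # D, assumed scaled to 0..1

    def flatten(self) -> list[float]:
        # Concatenate all coordinates into one flat vector.
        return [self.confidence, self.risk, *self.position, self.depth]


def epi(previous: StateVector, current: StateVector) -> float:
    """Euclidean distance between the snapshots of sprint n-1 and sprint n."""
    return math.dist(previous.flatten(), current.flatten())


sprint_1 = StateVector(confidence=0.9, risk=0.4, position=[0.0, 1.0], depth=0.2)
sprint_2 = StateVector(confidence=0.6, risk=0.4, position=[0.0, 1.0], depth=0.6)
print(round(epi(sprint_1, sprint_2), 3))  # 0.5
```

In this toy run only C and D change, so the distance is sqrt(0.3² + 0.4²) = 0.5. With a real embedding of about a thousand dimensions, P would dominate the distance unless the components are weighted, which this sketch deliberately leaves out.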

🟣 ηₑ – Epistemic Efficiency

  • Logic: Calculated in core/orchestrator.py (_calculate_epistemic_efficiency).
  • Definition: Measures the Truth Density of the process.
  • The Code Reality: A thermodynamic formula that sets the number of forensic grounding points ([TD], Art., ISO) in relation to the total information mass (Tokens read from RAG + Tokens written in Chat).
  • Formula: $\eta_e = (KPI_{current} \times (1 + GroundingPoints^{1.8}) \times 100) / (Mass_{TD} + Mass_{STM} + 1)$.
  • Purpose: It punishes "AI waffle" (excessive text without evidence) and rewards high-density forensic citations.
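The formula above translates directly into code. The sketch below is not the project's `_calculate_epistemic_efficiency`; the function and parameter names are hypothetical, and reading $Mass_{TD}$ as tokens read from RAG and $Mass_{STM}$ as tokens written in chat is an assumption based on the description above.

```python
def epistemic_efficiency(kpi_current: float, grounding_points: int,
                         mass_td: int, mass_stm: int) -> float:
    """eta_e = KPI * (1 + G^1.8) * 100 / (Mass_TD + Mass_STM + 1)."""
    numerator = kpi_current * (1 + grounding_points ** 1.8) * 100
    denominator = mass_td + mass_stm + 1  # the +1 avoids division by zero
    return numerator / denominator


# Same token mass, different numbers of grounding points.
no_evidence = epistemic_efficiency(0.5, grounding_points=0, mass_td=49, mass_stm=50)
one_citation = epistemic_efficiency(0.5, grounding_points=1, mass_td=49, mass_stm=50)
print(no_evidence, one_citation)  # 0.5 1.0
```

Because the exponent 1.8 is greater than 1, each additional grounding point raises the score more than the previous one, while extra tokens without evidence only grow the denominator.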

🔵 Confidence (Stress Resilience)

  • Logic: Calculated in epistemic_governance/confidence_update.py.

  • Definition: The reliability of the current consensus under pressure.

  • The Code Reality: It starts with the Governor's self-assessment and is then aggressively recalibrated by the Auditor's Veto malus and the Risk Agent's falsification score.

  • In epistemic_governance/confidence_update.py, confidence is not a "feeling" but a Stress-Resilience Metric. It follows the principle of Falsification:

    1. Initial Input: The Governor starts with a self-assessed confidence level (e.g., 90%).
    2. The Auditor's Attack: If the Auditor triggers a VETO (e.g., due to a missing source), the confidence is multiplied by the VETO_CONFIDENCE_PENALTY (defined in config/settings.py, e.g., 0.6).
    3. The Risk Impact: The score is further adjusted based on the "Criticality" of the Risk Agent's report.

The Result: A high Confidence score at the end of a run means the proposal was "attacked" by the Auditor and the Risk Agent and survived without its logic being broken.
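The three steps can be sketched as follows. Only the VETO multiplication is taken from the text above; the linear risk adjustment, the `RISK_WEIGHT` constant, and the 0..1 scale for criticality are assumptions made for illustration, since the exact adjustment is not specified here.

```python
VETO_CONFIDENCE_PENALTY = 0.6  # example value from the text above
RISK_WEIGHT = 0.5              # hypothetical: how strongly criticality counts


def update_confidence(self_assessed: float, vetoed: bool, criticality: float) -> float:
    """Recalibrate a 0..100 confidence score after audit and risk review.

    criticality is assumed to be normalised to the range 0..1.
    """
    if not 0.0 <= criticality <= 1.0:
        raise ValueError("criticality must be in [0, 1]")
    confidence = self_assessed
    if vetoed:
        # Step 2: the Auditor's VETO scales the score down.
        confidence *= VETO_CONFIDENCE_PENALTY
    # Step 3 (assumed form): a linear reduction by the Risk Agent's criticality.
    confidence *= 1.0 - RISK_WEIGHT * criticality
    return max(0.0, min(100.0, confidence))


print(update_confidence(90, vetoed=True, criticality=0.0))
print(update_confidence(90, vetoed=False, criticality=0.0))
```

With these example values, a self-assessed 90% that draws a VETO falls to about 54%, while a proposal that passes the audit with no critical risks keeps its 90%.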


🟢 The Final Master Equation: Governance KPI

The EPG-Master enforces a Zero-Error-Tolerance logic through multiplication. Unlike an additive score, where a strong pillar can mask a weak one, a multiplicative score collapses as soon as one pillar fails, which protects the human sovereign from false certainty.

$$KPI = (\frac{Conf}{100}) \times (SMI^{1.2}) \times EPI_{weight}$$

  • Multiplicative Integrity: If SMI is 0 (total logic failure), the KPI is 0.
  • Hardening Threshold: A run is only considered HARDENED if the KPI exceeds the threshold defined in config/settings.py (default: 85%).
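A short sketch shows how the multiplicative behaviour plays out. The function names are hypothetical, and the sketch assumes that SMI and the EPI weight lie in 0..1 and that the 85% default corresponds to a KPI of 0.85.

```python
HARDENING_THRESHOLD = 0.85  # assumed to mirror the 85% default in the settings


def governance_kpi(confidence_pct: float, smi: float, epi_weight: float) -> float:
    """KPI = (Conf / 100) * SMI^1.2 * EPI_weight."""
    return (confidence_pct / 100) * (smi ** 1.2) * epi_weight


def is_hardened(kpi: float, threshold: float = HARDENING_THRESHOLD) -> bool:
    """A run counts as HARDENED only if the KPI exceeds the threshold."""
    return kpi > threshold


strong = governance_kpi(confidence_pct=95, smi=0.98, epi_weight=1.0)
broken_logic = governance_kpi(confidence_pct=95, smi=0.0, epi_weight=1.0)
print(is_hardened(strong), is_hardened(broken_logic))  # True False
```

The second call illustrates Multiplicative Integrity: a confidence of 95% cannot rescue a run whose SMI is 0.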

4. Installation & Setup: Hardening your Environment

EPG-Master is a high-performance framework. To ensure forensic precision and handle the multi-agent orchestration, your system must meet specific criteria.

💻 System Requirements

  • Operating System: Windows 10/11 Pro (Tested on Windows 11 Pro).
  • CPU: Minimum 8 Cores (AMD Ryzen 7 / Intel i7 or better).
  • RAM: 32 GB RAM minimum (64 GB recommended for large document ingestion).
  • GPU (The VRAM Reality):
    • Recommended: NVIDIA RTX 3090 / 4090 (24 GB VRAM) for maximum speed.
    • Minimum Baseline: NVIDIA RTX 4060 Ti (16 GB VRAM).
    • Note: Using quantized GGUF models (like the Q4 versions specified below) allows the system to run on 16 GB VRAM by utilizing shared system memory.
  • Storage: 50 GB free space (Models + Vector Database).

💡 Hardware Optimization Tips

If you are running on 16 GB VRAM (e.g., RTX 4060 Ti):

  1. Dual GPU Setup: Connect your monitors to the iGPU (onboard graphics) to free up the full 16 GB of your dedicated NVIDIA GPU for the AI models.
  2. Ollama Memory Management: Ollama will automatically manage the spillover into system RAM, but inference for the 30B Governor will be slower than on a 24 GB card.

🛠️ Prerequisites

Before cloning the repository, ensure the following tools are installed:

  1. Python 3.11+: Download here.
  2. Ollama: Required for running the Agent-Models. Download here.
  3. Tesseract OCR: Required for multimodal document parsing (Image-to-Text).
    • Install version 5.5.0 or higher.
    • Important: Add the Tesseract installation path to your Windows Environment Variables (PATH).
  4. Qdrant (Local): EPG-Master uses the Qdrant local storage mode. No separate server is required, but the directory vector_stores/ must be writeable.
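Before cloning, the prerequisites above can be checked with a small script. This is an optional helper sketch that is not part of the repository; `preflight` is a hypothetical name, and the check only looks for the `ollama` and `tesseract` executables on PATH without verifying their versions.

```python
import os
import shutil
import sys
from pathlib import Path


def preflight(vector_dir: str = "vector_stores") -> dict[str, bool]:
    """Report which prerequisites appear to be satisfied on this machine."""
    store = Path(vector_dir)
    return {
        "python_3_11_plus": sys.version_info >= (3, 11),
        "ollama_on_path": shutil.which("ollama") is not None,
        "tesseract_on_path": shutil.which("tesseract") is not None,
        # The directory only exists after cloning, so False is expected before that.
        "vector_store_writable": store.is_dir() and os.access(store, os.W_OK),
    }


for check, passed in preflight().items():
    print(f"{'OK     ' if passed else 'MISSING'} {check}")
```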

🚀 Step-by-Step Setup

1. Clone the Repository

git clone https://github.com/OR-AI/or-epg-master.git
cd or-epg-master

2. Create a Virtual Environment

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

3. Install Dependencies

pip install -r requirements.txt

4. Prepare the AI Models (Ollama)

EPG-Master v1.2 relies on specific high-reasoning models. Run these commands in your terminal:

ollama pull qwen2.5:14b-instruct
ollama pull qwen2.5-coder:14b
ollama pull phi4:14b
ollama pull mistral-nemo:12b
ollama pull SimonPu/Qwen3-Coder:30B-Instruct_Q4_K_XL

5. Configure Hardware Offloading

If you encounter OutOfMemory (OOM) errors, open config/settings.py and switch to models with a lower parameter count. If responses merely time out because the model spills over into system RAM, increase LLM_RESPONSE_TIMEOUT instead.

6. Maintenance

🧹 System Reset

To perform a complete cybernetic reset (clearing all agent memories and exports while keeping technical documents), run:

python reset_epg_memory.py

📂 Data Ingestion (Your Forensic Base)

To allow the EPG-Master to find evidence, you must populate the "Source of Truth":

  1. Place your PDFs, DOCX, or XLSX files into data/raw/governance_docs/ (or other subfolders in data/raw/).
  2. On the first start, the Governance Watcher will automatically parse and index these documents into the Qdrant vector store.
  3. The multilingual-e5-large embedding model will be downloaded automatically (approx. 2.3 GB).

🚦 Verification Run

Run the main application to verify your setup:

streamlit run main_app.py

If the Dashboard loads and the "Evidence Explorer" shows your documents, the system is Hardened and Ready.


5. Usage Manual: Operating the Governance Engine

Running the EPG-Master is a process of Human-Led Strategic Steering. You provide the intent, and the system provides the forensic validation.

🎯 Step 1: Defining the Mission Objective

The EPG-Master starts with the Human Steering Impulse. Unlike a standard chatbot, you should provide a complex, multi-layered objective.

  • Bad Prompt: "Tell me about AI risks." (Too vague, no strategic depth).
  • Good Prompt: "Evaluate the strategic viability of establishing a group-wide AI Governance Framework (CAGF) to ensure compliance with the EU AI Act Art. 6, focusing on a 15-billion-euro budget and a 2032 deadline."

Why this matters: The system performs an initial Sense-Context Extraction. It creates a "Mission DNA" (e.g., Chemicals, ROI, Law) which acts as a semantic filter for all retrieved evidence.


🔄 Step 2: The Consensus Cycle & Sovereignty Mechanisms

A Consensus Cycle (or Sprint) is a complete metabolic rotation of the agent crew. EPG-Master v1.2 uses three specific mechanisms to ensure the AI stays focused on your goals:

A) The Domain Fidelity Guard (Sovereign Exit)

Before the first sprint starts, the system performs a "Sense-Context Extraction."

  • The Guard: If the input (e.g., "Blue Smurfs") does not match the forensic database, the system triggers a Domain Mismatch and halts execution.
  • The Benefit: This prevents the system from generating nonsensical compliance reports for out-of-scope topics.
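One plausible way to implement such a guard is a similarity threshold on the retrieved evidence. The sketch below is an assumption, not the documented mechanism: `DomainMismatch`, `domain_fidelity_guard`, and the `min_similarity` value are hypothetical, and the vectors are toy values standing in for real embeddings.

```python
import math
from typing import Sequence


class DomainMismatch(Exception):
    """Raised when an objective has no sufficiently similar evidence."""


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def domain_fidelity_guard(objective_vec: Sequence[float],
                          evidence_vecs: Sequence[Sequence[float]],
                          min_similarity: float = 0.5) -> float:
    """Return the best similarity, or halt if nothing in the store is close."""
    best = max((cosine_similarity(objective_vec, v) for v in evidence_vecs), default=0.0)
    if best < min_similarity:
        raise DomainMismatch(f"best similarity {best:.2f} is below {min_similarity:.2f}")
    return best


try:
    domain_fidelity_guard([1.0, 0.0], [[0.0, 1.0]])  # unrelated toy vectors
except DomainMismatch as exc:
    print("halted:", exc)
```

Raising an exception before the first sprint keeps the halt explicit, so no agent is ever called with an out-of-scope objective.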

B) ASR – Active Strategic Refresh (The Heartbeat)

v1.2 addresses "Attention Drift": in every single agent call, your original Objective is re-injected at the very end of the prompt.

  • Recency Bias Utilization: By placing your mission after the technical evidence, we ensure that your intent remains the "Commanding Signal."
  • Impact: Agents treat technical documents (ISO/Law) only as tools to solve your specific problem, not as the primary subject.
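The ordering described above can be sketched as a prompt builder. The section labels and the `build_agent_prompt` function are hypothetical; the only property taken from the text is that the objective comes last, after the evidence.

```python
def build_agent_prompt(role_instructions: str, evidence: list[str], objective: str) -> str:
    """Assemble a prompt with the mission objective re-injected at the very end."""
    evidence_block = "\n".join(f"[TD] {snippet}" for snippet in evidence)
    return (
        f"{role_instructions}\n\n"
        f"EVIDENCE:\n{evidence_block}\n\n"
        f"MISSION OBJECTIVE (refresh):\n{objective}"
    )


prompt = build_agent_prompt(
    role_instructions="You are the Risk Agent. Falsify weak assumptions.",
    evidence=["EU AI Act Art. 6 classification rules", "ISO control catalogue excerpt"],
    objective="Evaluate the viability of a group-wide AI Governance Framework.",
)
print(prompt)
```

Building the prompt in one function means every agent call gets the same ordering, so the objective cannot accidentally end up buried above a long evidence block.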

C) Metabolic Flush (Short-Term Memory Management)

To prevent "Data Drowning," the system performs a flush after Sprint 2:

  • History Pruning: It removes old, used evidence snippets from the agents' immediate memory while preserving their Strategic Decisions.
  • Result: This keeps the reasoning sharp and prevents the agents from getting lost in redundant regulatory text.
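History pruning of this kind can be sketched as a filter over the agents' short-term memory. The `MemoryItem` structure and the "evidence" and "decision" labels are hypothetical; the sketch only mirrors the stated rule that old evidence is dropped while Strategic Decisions are kept.

```python
from dataclasses import dataclass


@dataclass
class MemoryItem:
    sprint: int
    kind: str  # "evidence" or "decision" (hypothetical labels)
    text: str


def metabolic_flush(memory: list[MemoryItem], current_sprint: int) -> list[MemoryItem]:
    """Drop evidence from earlier sprints; keep all decisions and fresh evidence."""
    return [
        item for item in memory
        if item.kind == "decision" or item.sprint >= current_sprint
    ]


memory = [
    MemoryItem(1, "evidence", "Art. 6 excerpt"),
    MemoryItem(1, "decision", "Pursue pathway B"),
    MemoryItem(2, "evidence", "ISO excerpt"),
    MemoryItem(3, "evidence", "Budget annex"),
]
flushed = metabolic_flush(memory, current_sprint=3)
print([item.text for item in flushed])  # ['Pursue pathway B', 'Budget annex']
```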

The Workflow of a Cycle:

  1. Directive (Project Lead): Translates your mission into strategic paths.
  2. Structuring (Architect): Builds the formal logic (:=, ->).
  3. Falsification (Risk Agent): Challenges assumptions and identifies "Liability Gaps."
  4. Synthesis (Governor): Balances ROI, Risk, and Evidence into an Executive Briefing.
  5. Audit (Auditor): The final Gatekeeper. Verifies every claim. If it triggers a VETO, the cycle repeats with corrective memory.

📊 Step 3: Monitoring Live Telemetry

While the engine is running, the Streamlit Dashboard provides real-time insights into the "Health" of the decision:

  • SMI (Structural Maturity): Watch the orange line. It shows if the logic is becoming more stable or more chaotic.
  • ηₑ (Efficiency): Watch the purple line. High efficiency means the agents are citing hard evidence rather than generating "AI-noise."
  • EPI (Progress): Watch the red line. It tracks the "Learning Delta." If it drops toward zero at the end, the system has Converged—meaning no further truth can be extracted.

Fig 1: Live Evolution Trace & Epistemic Metabolism
Fig 2: Hardened Governance Report & Epistemic Health


🛡️ Step 4: The Final Handover (Human Sovereignty)

The EPG-Master never "decides" for you. Once the cycles are complete and the state is HARDENED, the system generates the 7-Pillar Dossier.

  1. Review the _VERIFICATION_TODO.md: This is your primary tool. It lists every assumption ([INT]) and every data gap ([GAP]) that the AI could not solve.
  2. Execute the Decision: Use the _PROPOSAL.md (Executive Summary) to brief the board, backed by the full forensic trail in the _FORENSIC.md.
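When a dossier contains many open points, the tagged lines can be pulled out with a short script. This sketch assumes only that assumptions and gaps are marked inline with [INT] and [GAP]; the actual layout of the file is not specified here, and `collect_open_items` is a hypothetical helper.

```python
import re

TAG_PATTERN = re.compile(r"\[(INT|GAP)\]")


def collect_open_items(text: str) -> dict[str, list[str]]:
    """Group lines by the [INT] (assumption) or [GAP] (data gap) tag they carry."""
    items: dict[str, list[str]] = {"INT": [], "GAP": []}
    for line in text.splitlines():
        # A line carrying both tags is listed under both headings.
        for tag in set(TAG_PATTERN.findall(line)):
            items[tag].append(line.strip())
    return items


sample = """
- [INT] Energy prices stay within the planning corridor.
- [GAP] No technical document covers the 2032 deadline.
- Reviewed and grounded in [TD] sources.
"""
open_items = collect_open_items(sample)
print(len(open_items["INT"]), len(open_items["GAP"]))  # 1 1
```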

💡 Pro-Tip: Memory Persistence

EPG-Master v1.2 uses Long-Term Memory (LTM). Every successful run hardens the agents' expertise. If you run a similar mission later, the agents will "remember" previous successful logic paths, leading to faster convergence and higher SMI.

📑 Mission Blueprints (Ready-to-Use Examples)

To demonstrate the forensic precision and the domain-agnostic nature of the EPG-Master, we have provided four mission blueprints in the docs/examples/ directory. You can use these prompts to test the system's reasoning across different industries:

  1. NAAC_Mission-Prompt.md

    • Focus: State-level Infrastructure & Sovereignty.
    • Challenge: Evaluating a €15 Billion National AI-Cloud under geopolitical stress scenarios (e.g., supply chain decoupling).
  2. BASF_Mission-Prompt.md

    • Focus: Industrial CAPEX & Logistics.
    • Challenge: Evaluating the strategic viability and risk profile of a €9 Billion industrial investment in China. The system must balance expected returns against complex geopolitical risks and financial constraints.
  3. CAGF_Governance-Prompt.md

    • Focus: Corporate Compliance & Data Protection.
    • Challenge: Building a group-wide Governance Framework to transition 150 legacy algorithms into EU AI Act compliance.
  4. Smurf_Sanity_Check-Prompt.md

    • Focus: System Integrity & Security.
    • Challenge: A "Negative Test" to prove the Domain Fidelity Guard correctly aborts execution when a non-strategic/fictional prompt is entered.

Instruction: Open the desired .md file, copy the content under the "The Prompt" section, and paste it into the EPG-Master dashboard input field.


6. The Output: The Seven Pillars of Forensic Truth

The EPG-Master does not provide a simple chat history. It generates a Fortress of Evidence. Upon completion of a mission, the system exports seven distinct, cryptographically sealed artifacts into the exports/ directory.

🏛️ Why Seven Pillars?

A multi-billion euro decision cannot rest on a single document. High-stakes governance requires a separation of concerns. Each pillar represents a different epistemic perspective—ensuring that the Executive Recommendation is backed by Mathematical Logic, Risk Falsification, and Normative Compliance.


🛡️ The Seven Artifacts

  1. _DATA.json (The Flight Recorder)

    • Value: Contains every raw metric (SMI, EPI, ηₑ, Confidence) and system log.
    • Purpose: Technical auditability. If a decision is questioned years later, the "Flight Recorder" provides the exact state of the machine during the process.
  2. _FORENSIC.md (The Strategic Proof)

    • Value: The Project Lead's deep-dive into the three strategic pathways.
    • Purpose: Shows exactly which technical documents ([TD]) support each part of the strategy.
  3. _RISK.md (The Falsification Report)

    • Value: The unvarnished critique by the Risk Agent.
    • Purpose: Proves that the strategy wasn't just "accepted" but survived a radical stress-test and active search for blind spots.
  4. _VERDICT.md (The Auditor’s Seal)

    • Value: The final independent judgment.
    • Purpose: Confirms whether the team followed the "Mission DNA" and the mandatory rules of evidence. This is your "Green Light."
  5. _PROPOSAL.md (The Executive Summary)

    • Value: Boardroom-ready synthesis translated into human narrative.
    • Purpose: The bridge between machine logic and management decision. It provides the "Why" and the "How" in professional language.
  6. _TRANSCRIPT.md (The Evolution Log)

    • Value: Full transparency of the internal debate.
    • Purpose: Documents how the team corrected itself across the Sprints. It reveals the "thinking process" of the virtual strike team.
  7. _VERIFICATION_TODO.md (The Human Sovereign Gate)

    • Value: The most critical document for the user.
    • Purpose: It extracts every assumption ([INT]) and every data gap ([GAP]). It tells you exactly where you, as the human leader, must look closer before signing off.

🔐 Forensic Integrity & The "Evolutionary Save"

You will notice that EPG-Master saves and re-saves these reports during the run. This is a deliberate Cybernetic Security Protocol:

  • Immutable Traceability: Every Sprint is a snapshot. By saving the intermediate states, we ensure that the final dossier isn't just a "lucky guess" from the last round, but the result of a traceable Consensus Evolution.
  • EPG-FORENSIC-SEAL: Each file contains an internal SHA-256 Hash and a Session-ID. This creates a mathematical link between all 7 files. If one file is manipulated, the chain of evidence is broken.
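The idea of linking files through a hash and a Session-ID can be illustrated with Python's hashlib. This is a sketch of the general technique, not the EPG-FORENSIC-SEAL format itself: the function names, the manifest structure, and the choice to hash the Session-ID together with the content are assumptions.

```python
import hashlib


def seal(session_id: str, content: str) -> str:
    """SHA-256 over the session ID and the artifact content."""
    return hashlib.sha256(f"{session_id}\n{content}".encode("utf-8")).hexdigest()


def build_manifest(session_id: str, artifacts: dict[str, str]) -> dict[str, str]:
    """Map each artifact name to its seal."""
    return {name: seal(session_id, body) for name, body in artifacts.items()}


def find_tampered(session_id: str, artifacts: dict[str, str],
                  manifest: dict[str, str]) -> list[str]:
    """Return the names of artifacts whose content no longer matches the manifest."""
    return [
        name for name, body in artifacts.items()
        if manifest.get(name) != seal(session_id, body)
    ]


artifacts = {"_PROPOSAL.md": "Recommend pathway B.", "_RISK.md": "Two blind spots found."}
manifest = build_manifest("session-001", artifacts)

artifacts["_RISK.md"] = "No blind spots found."  # simulated manipulation
print(find_tampered("session-001", artifacts, manifest))  # ['_RISK.md']
```

Because the Session-ID is part of every hash input, a file copied in from a different run fails verification even if its content is otherwise plausible.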

"In the boardroom, 'I think' is a liability. 'I have a forensic dossier' is sovereignty."