Cascading failures demo: lab & script #477

Open
Vera-bahval wants to merge 1 commit into GenAI-Security-Project:main from Vera-bahval:cascading_failures

@Vera-bahval

Cascading Failures Extension for FinBot

Adds a reproducible, observable harness for studying cascading failures in
FinBot's multi-agent invoice pipeline.

What's changed

  • No breaking changes. The existing orchestrator, runner, specialized agents,
    guardrails, and event emission are reused unchanged.
  • finbot/apps/vendor/routes/api.py -- two new endpoints registered on the
    existing vendor API router (/vendor/api/v1/cascade/...).
  • finbot/apps/vendor/routes/web.py -- one new page route (/vendor/cascade).
  • finbot/apps/vendor/templates/base.html -- one sidebar nav entry ("Cascade
    Lab") added.

What's added

finbot/agents/cascade.py

  • AgentStepResult -- per-delegation record (order, agent, success,
    confidence, reasoning, errors).
  • CascadeAnalysis -- aggregate metrics (initial/final confidence, cumulative
    degradation, total errors, failed agents, cascade type, whether the chain
    reached payments/communication).
  • CascadeOrchestratorAgent -- subclass of OrchestratorAgent. Overrides
    _capture_agent_context to record structured step results as delegations
    complete. Behaviourally identical to the base orchestrator.
  • classify_cascade() -- returns one of
    none | dirty_data | half_cascade | midchain_cascade | full_cascade.
  • run_cascade_orchestrator() -- drop-in coroutine that mirrors
    run_orchestrator_agent and returns the normal result enriched with
    agent_chain and cascade_analysis.
  • load_scenarios_file() / get_scenario(id) -- catalogue helpers.
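
To make the record shapes above concrete, here is a minimal sketch of the two data structures and one way the aggregate could be derived from the step list. The field names follow the bullets above; the types, the agent name strings ("payments", "communication"), the helper name summarize, and the definition of cumulative degradation as initial minus final confidence are all assumptions for illustration, not the code in finbot/agents/cascade.py.

```python
from dataclasses import dataclass, field


@dataclass
class AgentStepResult:
    """One delegation in the chain. Field types are assumed."""
    order: int
    agent: str
    success: bool
    confidence: float
    reasoning: str = ""
    errors: list[str] = field(default_factory=list)


@dataclass
class CascadeAnalysis:
    """Aggregate metrics over a whole chain. Field names are assumed."""
    initial_confidence: float
    final_confidence: float
    cumulative_degradation: float
    total_errors: int
    failed_agents: list[str]
    cascade_type: str
    reached_payments: bool
    reached_communication: bool


def summarize(steps: list[AgentStepResult], cascade_type: str = "none") -> CascadeAnalysis:
    """Hypothetical helper: fold per-step records into aggregate metrics."""
    ordered = sorted(steps, key=lambda s: s.order)
    initial = ordered[0].confidence if ordered else 0.0
    final = ordered[-1].confidence if ordered else 0.0
    agents = {s.agent for s in ordered}
    return CascadeAnalysis(
        initial_confidence=initial,
        final_confidence=final,
        # Assumption: degradation is the net confidence drop, never negative.
        cumulative_degradation=round(max(0.0, initial - final), 4),
        total_errors=sum(len(s.errors) for s in ordered),
        failed_agents=[s.agent for s in ordered if not s.success],
        cascade_type=cascade_type,
        reached_payments="payments" in agents,
        reached_communication="communication" in agents,
    )
```

Keeping the per-step records separate from the aggregate means the UI can render each node from AgentStepResult while the hero panel reads only CascadeAnalysis.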

finbot/agents/cascade_scenarios.json

Single source of truth for all cascade scenarios. Each entry declares
expected cascade type, severity, explanation, and a parameterised invoice
payload. Adding a scenario requires no code changes. The file also
documents the cascade-type taxonomy (label, severity, summary) used by the
UI legend.
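
As a rough illustration of what a catalogue entry and a lookup might look like, the sketch below embeds a tiny hypothetical catalogue and searches it by id. Every key name, the scenario id, and the payload values are invented for the example; the real schema is whatever finbot/agents/cascade_scenarios.json defines, and find_scenario is a stand-in, not the actual get_scenario(id).

```python
import json

# Hypothetical catalogue: key names and values are assumptions, not the real file.
EXAMPLE_CATALOGUE = json.loads("""
{
  "cascade_types": {
    "dirty_data": {
      "label": "Dirty data",
      "severity": "low",
      "summary": "Bad input caught by the first agent."
    }
  },
  "scenarios": [
    {
      "id": "example_dirty_data",
      "expected_cascade_type": "dirty_data",
      "severity": "low",
      "explanation": "Malformed invoice that the first agent should reject.",
      "invoice": {"amount": -100.0, "description": "Example payload"}
    }
  ]
}
""")


def find_scenario(catalogue: dict, scenario_id: str):
    """Return the scenario with the given id, or None if it is not listed."""
    for scenario in catalogue.get("scenarios", []):
        if scenario.get("id") == scenario_id:
            return scenario
    return None
```

With a layout like this, adding a scenario is a matter of appending one object to the scenarios array, which matches the "no code changes" goal stated above.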

Cascade Lab web UI

Interactive page in the vendor portal at /vendor/cascade that makes the
cascade chain observable without leaving the browser.

  • Lists scenarios from the JSON catalogue as cards with expected cascade
    type and severity.
  • "Run scenario" creates a one-off demo invoice and processes it through
    the real instrumented agent chain.
  • Renders the result as an animated agent pipeline
    (Invoice → Fraud → Payments → Communication), each node showing
    success/failure, confidence bar, extracted error signals, and reasoning
    summary. Steps reveal in order so cascade propagation is easy to read.
  • A top hero panel renders the detected cascade type, severity,
    expected-vs-observed match, and confidence-degradation metrics.

Endpoints

  • GET /vendor/api/v1/cascade/scenarios -- returns the JSON catalogue.
  • POST /vendor/api/v1/cascade/run -- body {"scenario_id": "..."};
    runs the instrumented orchestrator synchronously and returns
    {scenario, invoice, workflow_id, task_status, task_summary, agent_chain, cascade_analysis}.
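
For scripted use, the run endpoint can be called with the standard library alone. The sketch below assumes the stack is reachable at http://localhost:8000 (the address used under "How to use") and leaves out any session or authentication handling the vendor portal may require; the function names are illustrative.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed local address of the stack


def build_run_request(scenario_id: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for /vendor/api/v1/cascade/run."""
    body = json.dumps({"scenario_id": scenario_id}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/vendor/api/v1/cascade/run",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_scenario(scenario_id: str, base_url: str = BASE_URL) -> dict:
    """Send the request and decode the JSON response.

    The call is synchronous on the server side, so it returns only
    after the whole agent chain has finished.
    """
    with urllib.request.urlopen(build_run_request(scenario_id, base_url)) as resp:
        return json.load(resp)
```

A caller would then read result["agent_chain"] and result["cascade_analysis"] from the returned dictionary, using the response keys listed above.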

scripts/cascade_failure_demo.py

Standalone demo that runs the cascade-instrumented orchestrator in-process
(no HTTP, no auth). Seeds a demo vendor in the cascade-demo namespace,
submits invoices covering each scenario, and prints the agent chain plus
cascade analysis per run. Gracefully degrades to no-op event emission if
Redis is unreachable.

docs/cascade_failures.md

Design rationale, cascade taxonomy table, web UI walkthrough, programmatic
usage, and limitations.

Cascade taxonomy

Type              Error origin                     Reaches payments  Severity
dirty_data        Input, caught by first agent     No                Low
half_cascade      Early agent (invoice / fraud)    No                Medium
midchain_cascade  Middle agent (fraud / approval)  Yes               High
full_cascade      First agent on plausible input   Yes               Critical
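
One way to read the table as a decision rule is sketched below. The severities are taken from the table; the decision logic is an assumption made for illustration (in particular, treating "the chain stopped at the first agent" as the mark of dirty_data, and using the agent name "payments"), and it is not the logic of classify_cascade() in this PR.

```python
from typing import NamedTuple


class Step(NamedTuple):
    """Simplified step record (hypothetical): agent name and whether it raised errors."""
    agent: str
    has_errors: bool


# Severity per cascade type, as listed in the taxonomy table.
SEVERITY = {
    "dirty_data": "Low",
    "half_cascade": "Medium",
    "midchain_cascade": "High",
    "full_cascade": "Critical",
}


def classify(steps: list[Step]) -> str:
    """Assumed rule of thumb mapping an ordered chain to a cascade type."""
    first_bad = next((i for i, s in enumerate(steps) if s.has_errors), None)
    if first_bad is None:
        return "none"
    reached_payments = any(s.agent == "payments" for s in steps)
    if reached_payments:
        # Error at the very first agent that still reaches payments: full cascade.
        return "full_cascade" if first_bad == 0 else "midchain_cascade"
    # Error never reached payments: either rejected immediately or partway through.
    stopped_at_first = first_bad == 0 and len(steps) == 1
    return "dirty_data" if stopped_at_first else "half_cascade"
```

For example, a chain that errors at the first agent and still runs through payments classifies as full_cascade, which SEVERITY maps to Critical.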

How to use

Web UI (recommended)

  1. Start the stack: docker compose up -d --build
  2. Open http://localhost:8000/vendor/cascade
  3. Pick a scenario card → "Run scenario" → watch the animated chain and the
    detected cascade type / severity / degradation.

Standalone script

Requires a database (uv run python scripts/db.py setup) and an LLM
provider (OPENAI_API_KEY in .env, or Ollama).

uv run python scripts/cascade_failure_demo.py
