A context graph system with an LLM-powered query interface for SAP Order-to-Cash data. Users can visually explore interconnected business entities and ask natural language questions that are translated into SQL queries executed against a real database.
Live Demo: https://oc2-graph.vercel.app/ Repository: https://github.com/RoBWarrior/oc2-graph
- Ingests SAP O2C dataset (JSONL) and models it as a graph of interconnected entities
- Visualizes the graph with force-directed layout — nodes are clickable and expandable
- Provides a chat interface where users ask questions in natural language
- LLM translates questions → PostgreSQL SQL → executes against Supabase → returns natural language answers
- Guardrails reject off-topic queries (general knowledge, coding help, etc.)
┌─────────────────────────────────────────────────┐
│ Frontend (Vercel) │
│ React + Vite │
│ react-force-graph-2d — graph visualization │
│ Chat panel — NL query interface │
└──────────────────┬──────────────────────────────┘
│ HTTP (axios)
┌──────────────────▼──────────────────────────────┐
│ Backend (Render) │
│ Node.js + Express │
│ /api/graph → nodes + edges from Supabase │
│ /api/chat → NL → Groq → SQL → Supabase → NL │
└──────────────────┬──────────────────────────────┘
│ Supabase JS SDK
┌──────────────────▼──────────────────────────────┐
│ Supabase (PostgreSQL) │
│ 17 tables covering the full O2C flow │
│ execute_sql() RPC for safe dynamic queries │
└──────────────────┬──────────────────────────────┘
│ REST API
┌──────────────────▼──────────────────────────────┐
│ Groq API — llama-3.3-70b-versatile │
│ Stage 1: classify + generate SQL │
│ Stage 2: summarize results in natural language │
└─────────────────────────────────────────────────┘
| Type | Source Table | Color |
|---|---|---|
| SalesOrder | sales_order_headers | Blue |
| Customer | business_partners | Orange |
| Product | products | Purple |
| Delivery | outbound_delivery_headers | Green |
| BillingDocument | billing_document_headers | Yellow |
| JournalEntry | journal_entry_items | Pink |
| Payment | payments | Teal |
| Plant | outbound_delivery_items.plant | Cyan |
Customer ──placed──► SalesOrder ──contains──► Product
│
delivered via
│
▼
Delivery ──ships from──► Plant
SalesOrder ──billed as──► BillingDocument ──posted as──► JournalEntry ──cleared by──► Payment
Why Supabase over alternatives:
| Option | Verdict |
|---|---|
| SQLite | Fails on Render (ephemeral filesystem resets on deploy) |
| Firebase/Firestore | NoSQL — cross-collection JOINs are painful, NL→SQL impossible |
| Supabase (PostgreSQL) | ✅ Free, persistent, real SQL JOINs, RPC for dynamic execution |
Supabase gives us a real PostgreSQL database which is critical for the NL→SQL pipeline. The LLM generates standard SQL that runs directly via a custom execute_sql() RPC function.
Three-stage pipeline using Groq (llama-3.3-70b-versatile):
The system prompt includes:
- Full schema with all 17 table names and column names
- All key foreign key relationships
- Status code definitions
- Explicit rules for JOIN patterns (e.g. O2C flow traces)
- Instruction to return structured JSON only
Groq returns one of:
{"type": "off_topic", "message": "..."}or
{"type": "query", "sql": "SELECT ...", "explanation": "..."}SQL is validated server-side (SELECT-only) then executed via Supabase RPC.
Raw results (up to 20 rows) are sent back to Groq with the original question for a human-readable summary.
Why Groq over Gemini:
Google's Generative Language API times out from certain Indian ISPs. Groq's API is reliably accessible and significantly faster (sub-second responses).
Four layers of protection:
-
LLM-level — System prompt explicitly instructs the model to return
{"type": "off_topic"}for any non-O2C query (general knowledge, weather, coding help, creative writing) -
Server-level SQL validation — Before execution, the backend checks:
- Query must start with
SELECT - Query must not contain
DROP,DELETE,INSERT,UPDATE,TRUNCATE, orALTER
- Query must start with
-
Database-level RPC — The
execute_sql()Supabase function enforces SELECT-only at the PostgreSQL level withSECURITY DEFINER -
Row limits — All queries capped at 50 rows by default in the system prompt
o2c-graph/
├── backend/
│ ├── index.js # Express server, /api/graph, /api/chat
│ ├── seed.js # Loads JSONL dataset into Supabase
│ ├── schema.sql # Run in Supabase SQL Editor to create tables
│ ├── rpc_function.sql # Creates execute_sql() RPC in Supabase
│ ├── package.json
│ └── .env.example
├── frontend/
│ ├── src/
│ │ ├── App.jsx # Main UI: graph + chat panel
│ │ └── main.jsx # React entry point
│ ├── index.html
│ ├── vite.config.js
│ └── package.json
├── sessions/
│ └── claude_session.md # AI coding session transcript
└── README.md
- Create a new project at supabase.com (region: Southeast Asia)
- Go to SQL Editor → run
backend/schema.sql(creates all 17 tables) - Go to SQL Editor → run
backend/rpc_function.sql(creates safe SQL executor) - Go to Settings → API and copy:
- Project URL
anonpublic keyservice_rolesecret key
cd backend
npm install
cp .env.example .envFill in .env:
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_ANON_KEY=eyJ...
SUPABASE_SERVICE_ROLE_KEY=eyJ...
GROQ_API_KEY=gsk_...
PORT=3001
Seed the database (update DATA_DIR in seed.js to point to your dataset):
node seed.jsStart the server:
node index.jscd frontend
npm install
cp .env.example .envFill in .env:
VITE_API_URL=http://localhost:3001
Start dev server:
npm run dev- Go to console.groq.com
- Sign up → API Keys → Create new key (free, no credit card)
- Push to GitHub
- New Web Service → connect repo → root directory:
backend - Build command:
npm install - Start command:
node index.js - Add environment variables in Render dashboard
- New Project → connect repo → root directory:
frontend - Add environment variable:
VITE_API_URL= your Render backend URL - Deploy
| Query | What it does |
|---|---|
| Which products are in the most billing documents? | Aggregates billing_document_items by material |
| Trace the full flow for billing document 90504298 | JOINs across SO → Delivery → Billing → Journal → Payment |
| Show sales orders delivered but not billed | LEFT JOIN with NULL check on billing side |
| Which customers have the highest total order value? | SUM on sales_order_headers grouped by sold_to_party |
| Show cancelled billing documents | Filters billing_document_is_cancelled = 'X' |
| What was the total payment amount in April 2025? | SUM on payments filtered by posting_date |
This project was built using Claude (claude.ai) as the primary AI coding assistant.
Session logs are available in /sessions/claude_session.md — these include the full conversation covering architecture decisions, debugging, and iteration.