diff --git a/submissions/Priyanshu-Bhardwaj/level5/answers.md b/submissions/Priyanshu-Bhardwaj/level5/answers.md
new file mode 100644
index 000000000..f66b15cf7
--- /dev/null
+++ b/submissions/Priyanshu-Bhardwaj/level5/answers.md
@@ -0,0 +1,203 @@
+# Level 5 - Graph Thinking
+**Submitted by Priyanshu Bhardwaj**
+
+---
+
+## Q1. Model It (20 pts)
+
+See `schema.md` for the full diagram.
+
+### Summary
+
+**6 Node Labels:** `Project`, `Product`, `Station`, `Worker`, `Week`, `Certification`
+
+**8 Relationship Types:**
+
+| Relationship | Direction | Properties |
+|---|---|---|
+| `INCLUDES` | Project → Product | **`quantity`, `unit_factor`** |
+| `EXECUTED_AT` | Project → Station | **`week`, `planned_hours`, `actual_hours`** |
+| `PRIMARY_AT` | Worker → Station | — |
+| `CAN_COVER` | Worker → Station | — |
+| `HAS_CERT` | Worker → Certification | — |
+| `SCHEDULED_IN` | Station → Week | **`total_capacity`, `deficit`** |
+| `ACTIVE_DURING` | Project → Week | — |
+| `AVAILABLE_IN` | Worker → Week | **`hours`** |
+
+**Design decision:** `EXECUTED_AT`, `INCLUDES`, and `SCHEDULED_IN` carry properties because they represent volatile, measurable states (planned vs. actual hours, capacity deficits, and volume requirements) that dictate the factory's operational load.
+
+```mermaid
+graph TD
+    Project((Project)) -->|"INCLUDES<br/>{quantity: 600, unit_factor: 1.77}"| Product((Product))
+    Project -->|"EXECUTED_AT<br/>{week: 'w1', planned_hours: 48, actual_hours: 45.2}"| Station((Station))
+    Worker((Worker)) -->|"PRIMARY_AT"| Station
+    Worker -->|"CAN_COVER"| Station
+    Worker -->|"HAS_CERT"| Certification((Certification))
+    Station -->|"SCHEDULED_IN<br/>{total_capacity: 480, deficit: -132}"| Week((Week))
+    Project -->|"ACTIVE_DURING"| Week
+    Worker -->|"AVAILABLE_IN<br/>{hours: 40}"| Week
+```
+
+---
+
+## Q2. Why Not Just SQL? (20 pts)
+
+**Query:** *"Which workers are certified to cover Station 016 (Gjutning) when Per Hansen is on vacation, and which projects would be affected?"*
+
+### SQL Version
+
+```sql
+SELECT DISTINCT
+ w.name AS Covering_Worker,
+ p.project_name AS Affected_Project
+FROM factory_workers w
+JOIN factory_production p ON p.station_code = '016'
+WHERE w.can_cover_stations LIKE '%016%'
+ AND w.name != 'Per Hansen';
+```
+
+### Cypher Version
+
+```cypher
+MATCH (absent:Worker {name: "Per Hansen"})-[:PRIMARY_AT]->(s:Station {station_code: "016"})
+MATCH (cover:Worker)-[:CAN_COVER]->(s)
+WHERE cover <> absent
+MATCH (p:Project)-[exec:EXECUTED_AT]->(s)
+RETURN DISTINCT cover.name AS Covering_Worker, p.project_name AS Affected_Project
+```
+
+### What Graph Makes Obvious
+In this schema, SQL hides the many-to-many relationship of worker coverage behind a comma-separated string (`can_cover_stations`). Searching it requires `LIKE '%016%'`, which cannot use an ordinary index and also matches unintended codes such as `1016`; the normalized alternative is an extra junction table. In the graph, `CAN_COVER` is an explicit relationship: traversing from the absent worker to the station, to the replacement workers, and out to the affected projects follows the question directly and makes single-point-of-failure stations easy to spot.
+
+---
+
+## Q3. Spot the Bottleneck (20 pts)
+
+### 1. Cypher Query — Overruns >10% Grouped by Station
+
+```cypher
+MATCH (p:Project)-[exec:EXECUTED_AT]->(s:Station)
+WHERE exec.actual_hours > (exec.planned_hours * 1.10)
+RETURN
+ p.project_name,
+ s.station_name,
+ exec.week,
+ exec.planned_hours,
+ exec.actual_hours
+ORDER BY s.station_name, (exec.actual_hours - exec.planned_hours) DESC
+```
+
+### 2. Modelling the Alert as a Graph Pattern
+
+Instead of burying a bottleneck as an isolated property calculation, it should be modelled as an explicit event node: `(:BottleneckAlert)`.
+
+```cypher
+// Graph Pattern for Alert Modeling
+(p:Project)-[:TRIGGERED]->(b:BottleneckAlert)-[:OCCURRED_AT]->(s:Station)
+```
+
+This allows automated nightly graph jobs to create an alert node whenever actual hours exceed planned hours by more than the 10% threshold used above. Querying these nodes supports historical trend analysis, for example *"Which stations accumulate the most Bottleneck Alerts?"*, and linking them to `(:Week)` nodes shows the factory's peak stress periods.
+
+---
+
+## Q4. Vector + Graph Hybrid (20 pts)
+
+### 1. What to Embed
+Embed the **Project descriptions and requirements** (the contextual summary, complexity parameters, and client expectations) into the `Project` node as a dense vector. Example: *"450 meters of IQB beams for a hospital extension in Linköping, tight timeline."*
+
+### 2. Hybrid Query — Vector Similarity + Graph Filter
+
+```cypher
+// 1. Vector Search: Find structurally/thematically similar past projects
+CALL db.index.vector.queryNodes('project_embeddings', 5, $new_project_vector)
+YIELD node AS pastProject, score
+
+// 2. Graph Filter: Traverse to find which of those similar projects succeeded operationally
+MATCH (pastProject)-[exec:EXECUTED_AT]->(s:Station)
+WITH pastProject, score, sum(exec.actual_hours) AS total_actual, sum(exec.planned_hours) AS total_planned
+WHERE total_planned > 0 AND (total_actual - total_planned) / total_planned < 0.05
+
+RETURN
+ pastProject.project_name,
+ score,
+ total_planned,
+ total_actual
+ORDER BY score DESC
+```
+
+### 3. Why This Beats Plain Filtering
+Filtering solely by "IQB beams" treats a simple warehouse beam exactly the same as a complex, tight-tolerance beam for a hospital. Vector embeddings capture the context of the description, so similar jobs rank together even when the wording differs. Combining this with the graph returns projects that are not only semantically similar but whose recorded hours did not overrun the plan by 5% or more, which gives a more reliable baseline for quoting new jobs.
+
+---
+
+## Q5. My L6 Blueprint (20 pts)
+
+### Node Labels → CSV Column Mappings
+
+| Node | CSV | Mapped Columns |
+|------|-----|----------------|
+| `(:Project)` | factory_production.csv | `project_id`, `project_number`, `project_name` |
+| `(:Station)` | factory_production.csv | `station_code`, `station_name` |
+| `(:Worker)` | factory_workers.csv | `worker_id`, `name`, `role`, `type` |
+| `(:Week)` | factory_capacity.csv | `week` |
+
+### Relationship Types and What Creates Them
+
+| Relationship | Created By |
+|---|---|
+| `(Project)-[:EXECUTED_AT]->(Station)` | Grouping rows in `factory_production.csv`, storing `week`, `planned_hours`, `actual_hours` |
+| `(Worker)-[:CAN_COVER]->(Station)` | Splitting the `can_cover_stations` string in `factory_workers.csv` |
+| `(Station)-[:SCHEDULED_IN]->(Week)` | Mapping `factory_capacity.csv` data to track `total_capacity` and `deficit` |
+
+### 4 Streamlit Dashboard Panels
+
+#### Panel 1 — Project Overview
+Shows all projects with total planned hours, actual hours, and variance percentage.
+```cypher
+MATCH (p:Project)
+OPTIONAL MATCH (p)-[exec:EXECUTED_AT]->()
+WITH p, sum(exec.planned_hours) AS planned, sum(exec.actual_hours) AS actual
+RETURN
+ p.project_number,
+ p.project_name,
+ planned AS total_planned,
+ actual AS total_actual,
+ CASE WHEN planned > 0 THEN round(((actual - planned) / planned) * 100, 1) ELSE 0 END AS variance_pct
+ORDER BY p.project_number
+```
+
+#### Panel 2 — Station Load Across Weeks
+Visualizes hours per station across weeks to highlight where actual exceeds planned.
+```cypher
+MATCH (p:Project)-[exec:EXECUTED_AT]->(s:Station)
+RETURN
+ s.station_name,
+ exec.week AS week,
+ sum(exec.planned_hours) AS planned_load,
+ sum(exec.actual_hours) AS actual_load
+ORDER BY week, s.station_name
+```
+
+#### Panel 3 — Capacity Tracker
+Shows weekly capacity vs demand, flagging deficit weeks.
+```cypher
+MATCH (s:Station)-[sched:SCHEDULED_IN]->(w:Week)
+RETURN
+ w.id AS Week,
+ sum(sched.total_capacity) AS Capacity,
+ sum(sched.deficit) AS Deficit
+ORDER BY Week
+```
+
+#### Panel 4 — Worker Coverage Matrix
+Displays which workers cover which stations and highlights Single Points of Failure (SPOF).
+```cypher
+MATCH (w:Worker)-[:CAN_COVER]->(s:Station)
+WITH s, count(w) AS cover_count, collect(w.name) AS covered_by
+RETURN
+ s.station_name,
+ cover_count,
+ covered_by,
+ CASE WHEN cover_count = 1 THEN true ELSE false END AS is_spof
+ORDER BY cover_count ASC
+```
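Review note on Q4: the two-stage hybrid query (vector search first, graph filter second) can be mirrored in plain Python to make the order of the steps explicit. This is only a sketch under stated assumptions: the `PastProject` record, the toy 3-dimensional vectors, the project names, and the `hybrid_search` helper are all invented for illustration, and a real deployment would use the Neo4j vector index rather than a linear scan.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class PastProject:
    # Hypothetical record: an embedding plus hours summed over all stations.
    name: str
    embedding: list
    planned_hours: float
    actual_hours: float


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query_vec, projects, k=5, max_overrun=0.05):
    """Stage 1: top-k by similarity. Stage 2: keep projects under max_overrun."""
    ranked = sorted(projects, key=lambda p: cosine(query_vec, p.embedding), reverse=True)[:k]
    return [
        p for p in ranked
        if p.planned_hours > 0
        and (p.actual_hours - p.planned_hours) / p.planned_hours < max_overrun
    ]


# Toy data, invented for illustration only.
projects = [
    PastProject("Hospital beams", [0.9, 0.1, 0.0], 100.0, 103.0),
    PastProject("Warehouse beams", [0.1, 0.9, 0.0], 100.0, 101.0),
    PastProject("Clinic extension", [0.8, 0.2, 0.0], 100.0, 120.0),
]
matches = hybrid_search([1.0, 0.0, 0.0], projects, k=2)
print([p.name for p in matches])
```

Because the filter runs after the top-k cut, the result can contain fewer than `k` projects, which is also how the Cypher version behaves.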
diff --git a/submissions/Priyanshu-Bhardwaj/level5/schema.md b/submissions/Priyanshu-Bhardwaj/level5/schema.md
new file mode 100644
index 000000000..5f28aa169
--- /dev/null
+++ b/submissions/Priyanshu-Bhardwaj/level5/schema.md
@@ -0,0 +1,11 @@
+```mermaid
+graph TD
+    Project((Project)) -->|"INCLUDES<br/>{quantity: 600, unit_factor: 1.77}"| Product((Product))
+    Project -->|"EXECUTED_AT<br/>{week: 'w1', planned_hours: 48, actual_hours: 45.2}"| Station((Station))
+    Worker((Worker)) -->|"PRIMARY_AT"| Station
+    Worker -->|"CAN_COVER"| Station
+    Worker -->|"HAS_CERT"| Certification((Certification))
+    Station -->|"SCHEDULED_IN<br/>{total_capacity: 480, deficit: -132}"| Week((Week))
+    Project -->|"ACTIVE_DURING"| Week
+    Worker -->|"AVAILABLE_IN<br/>{hours: 40}"| Week
+```
diff --git a/submissions/Priyanshu-Bhardwaj/level6/.env.example b/submissions/Priyanshu-Bhardwaj/level6/.env.example
new file mode 100644
index 000000000..ef1eed100
--- /dev/null
+++ b/submissions/Priyanshu-Bhardwaj/level6/.env.example
@@ -0,0 +1,3 @@
+NEO4J_URI=neo4j+s://.databases.neo4j.io
+NEO4J_USERNAME=
+NEO4J_PASSWORD=
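Review note: both `app.py` and `seed_graph.py` read these three variables, and a missing one only shows up later as a driver error. A minimal sketch of an early check follows; the `load_neo4j_settings` helper and the example values are hypothetical and not part of the submission.

```python
import os

REQUIRED_KEYS = ("NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD")


def load_neo4j_settings(env=None):
    """Return (uri, user, password) or raise, naming every missing key."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing Neo4j settings: " + ", ".join(missing))
    return tuple(env[key] for key in REQUIRED_KEYS)


# Example with an in-memory mapping instead of the real environment
# (placeholder values, invented for illustration).
example = {
    "NEO4J_URI": "neo4j+s://example.invalid",
    "NEO4J_USERNAME": "neo4j",
    "NEO4J_PASSWORD": "",
}
try:
    load_neo4j_settings(example)
except RuntimeError as err:
    print(err)
```

Passing a mapping keeps the check testable without touching the process environment.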
diff --git a/submissions/Priyanshu-Bhardwaj/level6/DASHBOARD_URL.txt b/submissions/Priyanshu-Bhardwaj/level6/DASHBOARD_URL.txt
new file mode 100644
index 000000000..32ba11052
--- /dev/null
+++ b/submissions/Priyanshu-Bhardwaj/level6/DASHBOARD_URL.txt
@@ -0,0 +1 @@
+https://lpi-l6-7v5vyywp4psdzyvgtzyxka.streamlit.app
diff --git a/submissions/Priyanshu-Bhardwaj/level6/README.md b/submissions/Priyanshu-Bhardwaj/level6/README.md
new file mode 100644
index 000000000..521d0f408
--- /dev/null
+++ b/submissions/Priyanshu-Bhardwaj/level6/README.md
@@ -0,0 +1,80 @@
+# 🏭 Factory Production Knowledge Graph
+
+**A Streamlit Dashboard powered by a Neo4j Knowledge Graph for factory production and capacity planning.**
+
+This project transforms a complex, 46-sheet Excel-based manufacturing schedule into a connected graph database. It visualizes project load, identifies operational bottlenecks, tracks weekly capacity constraints, and highlights single points of failure in workforce coverage.
+
+### 🚀 Live Deployment
+**Access the deployed dashboard here:** [streamlit app](https://lpi-l6-7v5vyywp4psdzyvgtzyxka.streamlit.app/)
+
+---
+
+## ✨ Key Features (Dashboard Panels)
+
+1. **📊 Project Overview:** Tracks planned vs. actual hours and completion efficiency across all 8 major construction projects.
+2. **⚙️ Station Load Matrix:** Interactive visualizations highlighting overloaded production stations week-by-week.
+3. **📈 Capacity Tracker:** Monitors total factory capacity against planned demand, explicitly flagging weeks operating in a deficit.
+4. **👷 Worker Coverage:** A matrix showing which workers can cover which stations, flagging stations with only one covering worker as Single-Point-of-Failure (SPOF) risks.
+5. **✅ Automated Self-Test:** A built-in validation suite that tests the live Neo4j connection, verifies node/relationship counts, and executes variance Cypher queries.
+
+## 🛠️ Technology Stack
+* **Frontend:** Streamlit, Plotly Express
+* **Database:** Neo4j Aura (Graph Database)
+* **Data Processing:** Python, Pandas, Cypher query language
+* **Environment:** Python `python-dotenv`
+
+---
+
+## 💻 Local Setup & Installation
+
+Follow these steps to run the Knowledge Graph and Dashboard on your local machine.
+
+### 1. Prerequisites
+* Python 3.8+ installed.
+* A free [Neo4j Aura](https://neo4j.com/cloud/aura/) database instance.
+
+### 2. Clone the Repository
+```bash
+git clone https://github.com/PriyanshuBHardwaj20/lpi-l6.git
+cd lpi-l6
+```
+
+### 3. Set Up the Virtual Environment
+
+**Mac/Linux:**
+```bash
+python -m venv venv
+source venv/bin/activate
+pip install -r requirements.txt
+```
+
+**Windows:**
+```
+python -m venv venv
+venv\Scripts\activate
+pip install -r requirements.txt
+```
+
+### 4. Configure Environment Variables
+Create a file named `.env` in the project root (never commit it to version control) and add your Neo4j Aura credentials:
+
+```
+NEO4J_URI=neo4j+ssc://.databases.neo4j.io
+NEO4J_USERNAME=
+NEO4J_PASSWORD=
+```
+### 5. Seed the Knowledge Graph
+
+Before running the dashboard, you must populate the Neo4j database with the CSV data. The seeding script creates uniqueness constraints and uses MERGE operations, making it safe to run multiple times without duplicating data.
+```
+python seed_graph.py
+```
+Run the script and wait for the "Graph seeding complete! ✅" confirmation message.
+
+### 6. Run the Dashboard
+
+Once the database is seeded, launch the Streamlit application:
+```
+streamlit run app.py
+```
+The dashboard will automatically open in your default web browser at http://localhost:8501.
diff --git a/submissions/Priyanshu-Bhardwaj/level6/app.py b/submissions/Priyanshu-Bhardwaj/level6/app.py
new file mode 100644
index 000000000..02102cda8
--- /dev/null
+++ b/submissions/Priyanshu-Bhardwaj/level6/app.py
@@ -0,0 +1,196 @@
+import streamlit as st
+from neo4j import GraphDatabase
+import pandas as pd
+import plotly.express as px
+import os
+from dotenv import load_dotenv
+
+# Load local .env file
+load_dotenv()
+
+# Set page layout to wide
+st.set_page_config(page_title="Factory Graph Dashboard", layout="wide")
+
+# Connect to Neo4j
+@st.cache_resource
+def get_driver():
+    try:
+        # Try Streamlit Secrets first (for when deployed to the cloud)
+        uri = st.secrets["NEO4J_URI"]
+        user = st.secrets["NEO4J_USERNAME"]
+        pwd = st.secrets["NEO4J_PASSWORD"]
+    except Exception:
+        # Fall back to local .env variables if no secrets.toml exists
+        uri = os.getenv("NEO4J_URI")
+        user = os.getenv("NEO4J_USERNAME")
+        pwd = os.getenv("NEO4J_PASSWORD")
+
+    return GraphDatabase.driver(uri, auth=(user, pwd))
+
+driver = get_driver()
+
+def run_cypher(query, params=None):
+    with driver.session() as session:
+        result = session.run(query, params or {})
+        return pd.DataFrame([dict(r) for r in result])
+
+st.title("🏭 Factory Production Knowledge Graph")
+
+# Navigation via Tabs
+tab1, tab2, tab3, tab4, tab5 = st.tabs([
+ "📊 Project Overview",
+ "⚙️ Station Load",
+ "📈 Capacity Tracker",
+ "👷 Worker Coverage",
+ "✅ Self-Test"
+])
+
+# --- PAGE 1: Project Overview ---
+with tab1:
+    st.header("Project Overview")
+    query = """
+    MATCH (p:Project)
+    OPTIONAL MATCH (p)-[e:EXECUTED_AT]->()
+    WITH p, sum(e.planned_hours) AS planned, sum(e.actual_hours) AS actual
+    OPTIONAL MATCH (p)-[:INCLUDES]->(prod:Product)
+    RETURN p.id AS Project_ID, p.name AS Name,
+           round(planned, 1) AS Planned_Hours,
+           round(actual, 1) AS Actual_Hours,
+           CASE WHEN planned > 0 THEN round(((actual - planned) / planned) * 100, 1) ELSE 0 END AS Variance_Pct,
+           collect(DISTINCT prod.type) AS Products
+    ORDER BY Project_ID
+    """
+    df = run_cypher(query)
+
+    # Stylize variance: overruns (positive variance) in red, the rest in green
+    def style_variance(val):
+        return 'color: red' if val > 0 else 'color: green'
+
+    if not df.empty:  # an empty result has no Variance_Pct column to style
+        st.dataframe(df.style.map(style_variance, subset=['Variance_Pct']), use_container_width=True)
+
+# --- PAGE 2: Station Load ---
+with tab2:
+    st.header("Station Load Across Weeks")
+    query = """
+    MATCH (p:Project)-[e:EXECUTED_AT]->(s:Station)
+    RETURN s.name AS Station, e.week AS Week,
+           sum(e.planned_hours) AS Planned, sum(e.actual_hours) AS Actual
+    ORDER BY Week, Station
+    """
+    df = run_cypher(query)
+    if not df.empty:
+        # Calculate overload boolean for highlighting
+        df['Is Overloaded'] = df['Actual'] > df['Planned']
+
+        fig = px.bar(df, x="Week", y=["Planned", "Actual"], barmode="group",
+                     facet_col="Station", facet_col_wrap=3,
+                     title="Planned vs Actual Hours per Station",
+                     color_discrete_map={"Planned": "#1f77b4", "Actual": "#ff7f0e"})
+        st.plotly_chart(fig, use_container_width=True)
+
+        # Highlight overloaded stations explicitly
+        st.subheader("⚠️ Overloaded Stations Warning")
+        overloaded = df[df['Is Overloaded']][['Week', 'Station', 'Planned', 'Actual']]
+        st.dataframe(overloaded, use_container_width=True)
+
+# --- PAGE 3: Capacity Tracker ---
+with tab3:
+    st.header("Weekly Factory Capacity vs Demand")
+    query = """
+    MATCH (w:Week)-[c:HAS_CAPACITY]->(:Factory)
+    RETURN w.id AS Week, c.total_capacity AS Capacity,
+           c.total_planned AS Planned_Demand, c.deficit AS Deficit
+    ORDER BY Week
+    """
+    df = run_cypher(query)
+    if not df.empty:
+        fig = px.line(df, x="Week", y=["Capacity", "Planned_Demand"], markers=True,
+                      title="Capacity vs Planned Demand Tracker")
+        fig.add_bar(x=df["Week"], y=df["Deficit"], name="Deficit (Red if < 0)",
+                    marker_color=["red" if d < 0 else "lightgray" for d in df["Deficit"]])
+        st.plotly_chart(fig, use_container_width=True)
+
+        def highlight_deficit(val):
+            return 'background-color: #ffcccc' if val < 0 else ''
+        st.dataframe(df.style.map(highlight_deficit, subset=['Deficit']), use_container_width=True)
+
+# --- PAGE 4: Worker Coverage ---
+with tab4:
+    st.header("Worker Station Coverage Matrix")
+    query = """
+    MATCH (w:Worker)-[:CAN_COVER]->(s:Station)
+    RETURN w.name AS Worker, s.name AS Station
+    """
+    df = run_cypher(query)
+    if not df.empty:
+        # Create a Pivot Table (Matrix)
+        matrix = pd.crosstab(df['Worker'], df['Station']).replace({0: '-', 1: '✅'})
+        st.dataframe(matrix, use_container_width=True)
+
+    st.subheader("🚨 Single-Point-of-Failure Stations")
+    spof_query = """
+    MATCH (s:Station)<-[:CAN_COVER]-(w:Worker)
+    WITH s, count(w) AS coverage_count WHERE coverage_count = 1
+    RETURN s.name AS Station, coverage_count AS Workers_Certified
+    """
+    spof_df = run_cypher(spof_query)
+    if not spof_df.empty:  # only warn when at least one such station exists
+        st.error("The following stations only have ONE certified worker who can cover them:")
+        st.table(spof_df)
+
+# --- PAGE 5: Self-Test ---
+with tab5:
+    st.header("Mission Validation: Self-Test")
+
+    def run_self_test(driver):
+        checks = []
+        try:
+            with driver.session() as s:
+                s.run("RETURN 1").consume()
+            checks.append(("Neo4j connected", True, 3))
+        except Exception:
+            checks.append(("Neo4j connected", False, 3))
+            return checks
+
+        with driver.session() as s:
+            result = s.run("MATCH (n) RETURN count(n) AS c").single()
+            checks.append((f"{result['c']} nodes (min: 50)", result['c'] >= 50, 3))
+
+            result = s.run("MATCH ()-[r]->() RETURN count(r) AS c").single()
+            checks.append((f"{result['c']} relationships (min: 100)", result['c'] >= 100, 3))
+
+            result = s.run("CALL db.labels() YIELD label RETURN count(label) AS c").single()
+            checks.append((f"{result['c']} node labels (min: 6)", result['c'] >= 6, 3))
+
+            result = s.run("CALL db.relationshipTypes() YIELD relationshipType RETURN count(relationshipType) AS c").single()
+            checks.append((f"{result['c']} relationship types (min: 8)", result['c'] >= 8, 3))
+
+            # Adjusted Variance Query based on our EXECUTED_AT schema mapping
+            result = s.run("""
+                MATCH (p:Project)-[r:EXECUTED_AT]->(s:Station)
+                WHERE r.actual_hours > (r.planned_hours * 1.1)
+                RETURN p.name AS project, s.name AS station,
+                       r.planned_hours AS planned, r.actual_hours AS actual
+                LIMIT 10
+            """)
+            rows = [dict(r) for r in result]
+            checks.append((f"Variance query: {len(rows)} results", len(rows) > 0, 5))
+
+        return checks
+
+    results = run_self_test(driver)
+
+    total_score = 0
+    for text, passed, pts in results:
+        status = "✅" if passed else "❌"
+        earned = pts if passed else 0
+        total_score += earned
+        st.markdown(f"**{status} {text}** — *{earned}/{pts} pts*")
+
+    st.divider()
+    st.subheader(f"SELF-TEST SCORE: {total_score}/20")
+    if total_score == 20:
+        st.success("All checks passed! You are ready to deploy.")
+    else:
+        st.error("Some checks failed. Review your graph schema or queries.")
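Review note: the score summary in `app.py` compares against a hard-coded 20, which silently drifts if a check is added or re-weighted. A small sketch of deriving the maximum from the checks themselves, assuming the same `(text, passed, points)` tuples that `run_self_test` returns; `summarize_checks` is a hypothetical helper and the sample values are invented.

```python
def summarize_checks(checks):
    """Return (earned, possible, all_passed) for (text, passed, points) tuples."""
    earned = sum(points for _, passed, points in checks if passed)
    possible = sum(points for _, _, points in checks)
    # An empty list counts as "not passed" so a skipped run is never reported green.
    return earned, possible, bool(checks) and earned == possible


# Sample results shaped like run_self_test() output (values invented).
sample = [
    ("Neo4j connected", True, 3),
    ("62 nodes (min: 50)", True, 3),
    ("Variance query: 0 results", False, 5),
]
earned, possible, all_passed = summarize_checks(sample)
print(f"SELF-TEST SCORE: {earned}/{possible}")
```

One trade-off: when the connection check fails early, `possible` reflects only the checks that actually ran, so the denominator is smaller than 20.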
diff --git a/submissions/Priyanshu-Bhardwaj/level6/requirements.txt b/submissions/Priyanshu-Bhardwaj/level6/requirements.txt
new file mode 100644
index 000000000..c27b6ae5c
--- /dev/null
+++ b/submissions/Priyanshu-Bhardwaj/level6/requirements.txt
@@ -0,0 +1,5 @@
+streamlit
+neo4j
+pandas
+plotly
+python-dotenv
diff --git a/submissions/Priyanshu-Bhardwaj/level6/seed_graph.py b/submissions/Priyanshu-Bhardwaj/level6/seed_graph.py
new file mode 100644
index 000000000..bb028c6ee
--- /dev/null
+++ b/submissions/Priyanshu-Bhardwaj/level6/seed_graph.py
@@ -0,0 +1,112 @@
+import os
+import pandas as pd
+from neo4j import GraphDatabase
+from dotenv import load_dotenv
+
+# Load environment variables
+load_dotenv()
+URI = os.getenv("NEO4J_URI")
+AUTH = (os.getenv("NEO4J_USERNAME"), os.getenv("NEO4J_PASSWORD"))
+
+def run_query(driver, query, parameters=None):
+    with driver.session() as session:
+        session.run(query, parameters).consume()  # wait for completion so failures surface here
+
+def setup_constraints(driver):
+    print("Setting up constraints...")
+    constraints = [
+        "CREATE CONSTRAINT IF NOT EXISTS FOR (p:Project) REQUIRE p.id IS UNIQUE",
+        "CREATE CONSTRAINT IF NOT EXISTS FOR (s:Station) REQUIRE s.code IS UNIQUE",
+        "CREATE CONSTRAINT IF NOT EXISTS FOR (w:Worker) REQUIRE w.id IS UNIQUE",
+        "CREATE CONSTRAINT IF NOT EXISTS FOR (wk:Week) REQUIRE wk.id IS UNIQUE",
+        "CREATE CONSTRAINT IF NOT EXISTS FOR (pd:Product) REQUIRE pd.type IS UNIQUE",
+        "CREATE CONSTRAINT IF NOT EXISTS FOR (c:Certification) REQUIRE c.name IS UNIQUE",
+        "CREATE CONSTRAINT IF NOT EXISTS FOR (f:Factory) REQUIRE f.name IS UNIQUE",
+        "CREATE CONSTRAINT IF NOT EXISTS FOR (e:Etapp) REQUIRE e.name IS UNIQUE"
+    ]
+    for q in constraints:
+        run_query(driver, q)
+
+def load_data(driver):
+    print("Loading CSVs...")
+    production_df = pd.read_csv("data/factory_production.csv", dtype=str)
+    workers_df = pd.read_csv("data/factory_workers.csv", dtype=str)
+    capacity_df = pd.read_csv("data/factory_capacity.csv", dtype=str)
+
+    # 1. Load Capacity & Weeks (Relationships: 1. HAS_CAPACITY); MERGE on the bare relationship so changed CSV values update it
+    print("Seeding Capacity & Weeks...")
+    capacity_data = capacity_df.fillna(0).to_dict('records')
+    run_query(driver, """
+        UNWIND $rows AS row
+        MERGE (w:Week {id: row.week})
+        MERGE (f:Factory {name: "Main Factory"})
+        MERGE (w)-[r:HAS_CAPACITY]->(f)
+        SET r.own = toInteger(row.own_hours), r.hired = toInteger(row.hired_hours), r.overtime = toInteger(row.overtime_hours),
+            r.deficit = toInteger(row.deficit), r.total_capacity = toInteger(row.total_capacity), r.total_planned = toInteger(row.total_planned)
+    """, {"rows": capacity_data})
+
+    # 2. Load Workers (Relationships: 2. PRIMARY_AT, 3. CAN_COVER, 4. HAS_CERT)
+    print("Seeding Workers...")
+    workers_data = workers_df.fillna("").to_dict('records')
+    run_query(driver, """
+        UNWIND $rows AS row
+        MERGE (w:Worker {id: row.worker_id})
+        SET w.name = row.name, w.role = row.role, w.type = row.type
+
+        // Each optional column is handled in its own FOREACH, so an empty value
+        // skips only that step. A chained WITH ... WHERE would instead drop the
+        // worker from every later step (e.g. no cover stations, no certifications).
+
+        // Primary Station
+        FOREACH (code IN CASE WHEN row.primary_station <> ""
+                              THEN [toString(row.primary_station)] ELSE [] END |
+            MERGE (s:Station {code: code})
+            MERGE (w)-[:PRIMARY_AT]->(s))
+
+        // Cover Stations
+        FOREACH (cover_code IN [x IN split(row.can_cover_stations, ",") WHERE trim(x) <> ""] |
+            MERGE (cs:Station {code: trim(cover_code)})
+            MERGE (w)-[:CAN_COVER]->(cs))
+
+        // Certifications
+        FOREACH (cert_name IN [x IN split(row.certifications, ",") WHERE trim(x) <> ""] |
+            MERGE (c:Certification {name: trim(cert_name)})
+            MERGE (w)-[:HAS_CERT]->(c))
+    """, {"rows": workers_data})
+
+    # 3. Load Production (Relationships: 5. EXECUTED_AT, 6. INCLUDES, 7. ACTIVE_DURING, 8. ASSIGNED_TO)
+    print("Seeding Production...")
+    prod_data = production_df.fillna(0).to_dict('records')
+    run_query(driver, """
+        UNWIND $rows AS row
+        MERGE (p:Project {id: row.project_id})
+        SET p.name = row.project_name, p.number = toString(row.project_number)
+
+        MERGE (s:Station {code: toString(row.station_code)})
+        SET s.name = row.station_name
+
+        MERGE (pd:Product {type: row.product_type})
+        SET pd.unit = row.unit
+
+        MERGE (wk:Week {id: row.week})
+        MERGE (et:Etapp {name: row.etapp})
+
+        // Ensure idempotency for identical week/station overlaps by adding product to rel
+        MERGE (p)-[r:EXECUTED_AT {week: row.week, product: row.product_type}]->(s)
+        SET r.planned_hours = toFloat(row.planned_hours),
+            r.actual_hours = toFloat(row.actual_hours),
+            r.completed_units = toInteger(row.completed_units)
+
+        MERGE (p)-[inc:INCLUDES {product: row.product_type}]->(pd)
+        SET inc.qty = toFloat(row.quantity), inc.unit_factor = toFloat(row.unit_factor)
+
+        MERGE (p)-[:ACTIVE_DURING]->(wk)
+        MERGE (p)-[:ASSIGNED_TO]->(et)
+    """, {"rows": prod_data})
+
+    print("Graph seeding complete! ✅")
+
+if __name__ == "__main__":
+    with GraphDatabase.driver(URI, auth=AUTH) as driver:
+        setup_constraints(driver)
+        load_data(driver)
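
Review note: the comma-separated `can_cover_stations` and `certifications` columns could also be normalized in Python before the rows are sent to Neo4j, which keeps blank entries out of the graph regardless of how the Cypher is written. A minimal sketch, assuming a hypothetical `split_codes` helper that is not part of the submission:

```python
def split_codes(raw):
    """Split a comma-separated cell into trimmed, non-empty, de-duplicated codes."""
    codes = []
    # Note: "".split(",") yields [""], so empty parts must be filtered explicitly.
    for part in str(raw or "").split(","):
        code = part.strip()
        if code and code not in codes:
            codes.append(code)
    return codes


print(split_codes("011, 016,,016 "))
print(split_codes(""))
```

Each row could then carry ready-made lists, leaving the Cypher side to iterate over them without any string handling.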