From f7130240c457fb5c2a3a3eb2d05d2982ce0d84e8 Mon Sep 17 00:00:00 2001 From: Jaivardhan singh Date: Fri, 8 May 2026 12:56:02 +0530 Subject: [PATCH 1/7] fix level-5 submission path for jv-singh --- submissions/jv-singh/level5/answers.md | 1031 ++++++++++++++++++++++++ submissions/jv-singh/level5/schema.md | 108 +++ 2 files changed, 1139 insertions(+) create mode 100644 submissions/jv-singh/level5/answers.md create mode 100644 submissions/jv-singh/level5/schema.md diff --git a/submissions/jv-singh/level5/answers.md b/submissions/jv-singh/level5/answers.md new file mode 100644 index 000000000..4865c65d1 --- /dev/null +++ b/submissions/jv-singh/level5/answers.md @@ -0,0 +1,1031 @@ +# Level 5 — Graph Thinking: Knowledge Graph Foundations + +**Submission folder:** `submissions/jv-singh/level5/` +**Files:** + +```text +submissions/jv-singh/level5/ +├── answers.md +└── schema.md +``` + +## Dataset used + +I used the three CSV files from `challenges/data/`: + +1. `factory_production.csv` — project/product/station/week production facts. +2. `factory_workers.csv` — worker, station coverage, and certification data. +3. `factory_capacity.csv` — weekly total capacity and demand data. + +Important data-quality note: the Level 5 prompt says `factory_workers.csv` has 13 workers, but the actual file contains 14 workers (`W01` to `W14`). I used the actual CSV as the source of truth. Another naming mismatch exists in Q2: the question mentions **Per Gustafsson**, but the actual Station `016` primary worker in the CSV is **Per Hansen**. I wrote the query using an absent-worker parameter so either name can be used, and I explain the assumption in Q2. + +--- + +# Q1. Model It — Graph Schema Design + +## A. Final Answer + +The graph should model the factory as connected operational entities, not as isolated spreadsheet rows. 
+ +### Node labels + +| Node label | Meaning | Key properties | Source columns | +|---|---|---|---| +| `Project` | A construction/customer project | `project_id`, `project_number`, `name` | `project_id`, `project_number`, `project_name` from `factory_production.csv` | +| `Product` | Product family being produced | `product_type`, `unit` | `product_type`, `unit` | +| `Station` | Factory production station | `station_code`, `name` | `station_code`, `station_name` | +| `Worker` | Factory employee or hired worker | `worker_id`, `name`, `role`, `type`, `hours_per_week` | `worker_id`, `name`, `role`, `type`, `hours_per_week` from `factory_workers.csv` | +| `Week` | Planning week | `week`, `sort_order` | `week` from production and capacity files | +| `Etapp` | Project phase/stage | `etapp_id` | `etapp` | +| `BOP` | Bill-of-process / process grouping | `bop_id` | `bop` | +| `Certification` | Skill or certificate held by a worker | `name` | parsed from `certifications` | +| `CapacityPlan` | Weekly capacity record | `capacity_id`, `own_hours`, `hired_hours`, `overtime_hours`, `total_capacity`, `total_planned`, `deficit`, `is_deficit` | all columns in `factory_capacity.csv` | +| `BottleneckAlert` | Derived alert for overload or risk | `alert_id`, `severity`, `reason`, `planned_hours`, `actual_hours`, `variance_hours`, `variance_pct` | derived from production and capacity data | + +This gives more than the required 6 node labels. 
+ +### Relationship types + +| Relationship | Direction | Meaning | Key properties | +|---|---|---|---| +| `PRODUCES` | `(Project)-[:PRODUCES]->(Product)` | Project produces a product type | `quantity`, `unit`, `unit_factor` | +| `SCHEDULED_AT` | `(Project)-[:SCHEDULED_AT]->(Station)` | Project has work at a station in a week | `week`, `planned_hours`, `actual_hours`, `completed_units`, `variance_hours`, `variance_pct`, `is_over_10pct` | +| `HAS_WORK_IN` | `(Project)-[:HAS_WORK_IN]->(Week)` | Project has total work in a week | `planned_hours`, `actual_hours`, `station_count` | +| `HAS_ETAPP` | `(Project)-[:HAS_ETAPP]->(Etapp)` | Project belongs to or uses an etapp | `etapp` | +| `USES_BOP` | `(Project)-[:USES_BOP]->(BOP)` | Project uses a process grouping | `bop` | +| `REQUIRES_STATION` | `(Product)-[:REQUIRES_STATION]->(Station)` | Product type normally needs this station | `times_seen`, `total_planned_hours`, `total_actual_hours` | +| `PRIMARY_AT` | `(Worker)-[:PRIMARY_AT]->(Station)` | Worker’s main station | `role`, `hours_per_week` | +| `CAN_COVER` | `(Worker)-[:CAN_COVER]->(Station)` | Worker can cover this station | `is_primary`, `coverage_source`, `hours_per_week` | +| `HAS_CERTIFICATION` | `(Worker)-[:HAS_CERTIFICATION]->(Certification)` | Worker has this certification | `certification_name` | +| `HAS_CAPACITY` | `(Week)-[:HAS_CAPACITY]->(CapacityPlan)` | Week has capacity data | `total_capacity`, `total_planned`, `deficit` | +| `CAPACITY_PRESSURE_ON` | `(CapacityPlan)-[:CAPACITY_PRESSURE_ON]->(Station)` | A deficit week puts pressure on specific stations | `station_planned_hours`, `station_actual_hours`, `station_variance_hours`, `share_of_week_demand` | +| `FLAGS_PROJECT` | `(BottleneckAlert)-[:FLAGS_PROJECT]->(Project)` | Alert points to risky project | `reason` | +| `FLAGS_STATION` | `(BottleneckAlert)-[:FLAGS_STATION]->(Station)` | Alert points to risky station | `reason` | +| `FLAGS_WEEK` | `(BottleneckAlert)-[:FLAGS_WEEK]->(Week)` | Alert points to 
risky week | `reason` | + +This gives more than the required 8 relationship types. Several relationships carry data, especially `SCHEDULED_AT`, `PRODUCES`, `HAS_CAPACITY`, and `CAPACITY_PRESSURE_ON`. + +### Schema diagram + +See `schema.md` in this folder. It contains the Mermaid schema diagram and relationship-property table. + +## B. What I Did + +1. I treated every row in `factory_production.csv` as a production fact: one project, one product, one station, one week, planned hours, actual hours, and completed units. +2. I separated repeated names into nodes: `Project`, `Product`, `Station`, `Week`, `Etapp`, and `BOP`. +3. I treated worker coverage as a graph problem: workers connect to stations through `PRIMARY_AT` and `CAN_COVER`. +4. I treated certifications as separate nodes because one worker can have many certifications and one certification can be shared by many workers. +5. I added `CapacityPlan` nodes so weekly capacity records are not hidden as flat rows. +6. I added derived `BottleneckAlert` nodes for operational monitoring. + +## C. Why I Did It + +A spreadsheet row is good for storage, but weak for asking connected questions. A graph is better because factory planning is mainly about relationships: + +- Which project uses which station? +- Which worker can cover that station? +- Which week is overloaded? +- Which projects are driving the overload? +- Which stations have only one backup person? + +I used relationships with properties because production facts are not just connections. They also have measurements like `planned_hours`, `actual_hours`, and `completed_units`. + +## D. 
How It Works + +Example production row: + +```text +P01, Stålverket Borås, IQB, station 012, week w1, planned 32.0, actual 35.5 +``` + +Becomes: + +```cypher +(:Project {project_id: 'P01', name: 'Stålverket Borås'}) + -[:PRODUCES {quantity: 600, unit: 'meter', unit_factor: 1.77}]-> +(:Product {product_type: 'IQB'}) + +(:Project {project_id: 'P01'}) + -[:SCHEDULED_AT { + week: 'w1', + planned_hours: 32.0, + actual_hours: 35.5, + variance_hours: 3.5, + variance_pct: 10.94, + is_over_10pct: true + }]-> +(:Station {station_code: '012', name: 'Förmontering IQB'}) +``` + +Here `variance_pct` is stored as a percentage (3.5 / 32.0 × 100 ≈ 10.94), the same unit used by the alert nodes and dashboard queries. + +Example worker row: + +```text +W07, Per Hansen, primary_station 016, can_cover_stations 016,017 +``` + +Becomes: + +```cypher +(:Worker {worker_id: 'W07', name: 'Per Hansen'})-[:PRIMARY_AT]->(:Station {station_code: '016'}) +(:Worker {worker_id: 'W07'})-[:CAN_COVER]->(:Station {station_code: '016'}) +(:Worker {worker_id: 'W07'})-[:CAN_COVER]->(:Station {station_code: '017'}) +``` + +Now a query can jump from a project to a station to backup workers without manually joining many tables. + +## E. What Still Remains / Assumptions + +- I assume `station_code` uniquely identifies a station. +- I assume `product_type` uniquely identifies a product family. +- I assume `week` values like `w1`, `w2`, and `w3` can be sorted by the numeric part. +- I assume `BOP` means a process grouping or bill-of-process identifier. +- The schema can later be extended with real station-level capacity if the factory provides capacity per station, not just total weekly capacity. +- The schema can also support vector embeddings later by adding properties such as `embedding` or by using Neo4j vector indexes on project descriptions. + +--- + +# Q2. Why Not Just SQL? + +Question: **Which workers are certified to cover Station 016 (Gjutning) when Per Gustafsson is on vacation, and which projects would be affected?** + +## A. 
Final Answer + +### Assumption about the worker name + +The question mentions **Per Gustafsson**, but the actual CSV has **Per Hansen** as the primary worker for Station `016` (`Gjutning`). To avoid hard-coding a possibly wrong name, both queries below use an absent-worker parameter: + +```text +:absent_worker_name = 'Per Gustafsson' +``` + +For the actual CSV, I would run the same query with: + +```text +:absent_worker_name = 'Per Hansen' +``` + +### Equivalent SQL query + +Assumed SQL tables: + +- `workers(worker_id, name, role, primary_station, hours_per_week, type)` +- `worker_station_coverage(worker_id, station_code)` — parsed from `can_cover_stations` +- `stations(station_code, station_name)` +- `worker_certifications(worker_id, certification)` — parsed from `certifications` +- `production(project_id, project_name, product_type, station_code, week, planned_hours, actual_hours)` + +```sql +WITH affected_station AS ( + SELECT + s.station_code, + s.station_name + FROM stations s + WHERE s.station_code = '016' +), +backup_workers AS ( + SELECT + w.worker_id, + w.name, + w.role, + w.type, + w.hours_per_week, + STRING_AGG(wc.certification, ', ') AS certifications + FROM workers w + JOIN worker_station_coverage wsc + ON w.worker_id = wsc.worker_id + JOIN affected_station s + ON s.station_code = wsc.station_code + LEFT JOIN worker_certifications wc + ON wc.worker_id = w.worker_id + WHERE w.name <> :absent_worker_name + GROUP BY + w.worker_id, + w.name, + w.role, + w.type, + w.hours_per_week +), +affected_projects AS ( + SELECT DISTINCT + p.project_id, + p.project_name, + p.product_type, + p.week, + p.planned_hours, + p.actual_hours + FROM production p + JOIN affected_station s + ON s.station_code = p.station_code +) +SELECT + bw.worker_id, + bw.name AS backup_worker, + bw.role, + bw.type, + bw.hours_per_week, + bw.certifications, + ap.project_id, + ap.project_name, + ap.product_type, + ap.week, + ap.planned_hours, + ap.actual_hours +FROM backup_workers bw +CROSS JOIN 
affected_projects ap +ORDER BY + bw.name, + ap.week, + ap.project_id; +``` + +### Equivalent Cypher query + +```cypher +:param absent_worker_name => 'Per Hansen'; + +MATCH (station:Station {station_code: '016'}) +OPTIONAL MATCH (absent:Worker)-[:PRIMARY_AT|CAN_COVER]->(station) +WHERE absent.name = $absent_worker_name +MATCH (backup:Worker)-[:CAN_COVER]->(station) +WHERE backup.name <> $absent_worker_name +OPTIONAL MATCH (backup)-[:HAS_CERTIFICATION]->(cert:Certification) +MATCH (project:Project)-[work:SCHEDULED_AT]->(station) +RETURN + station.station_code AS station_code, + station.name AS station_name, + $absent_worker_name AS absent_worker, + backup.worker_id AS backup_worker_id, + backup.name AS backup_worker_name, + backup.role AS backup_worker_role, + backup.type AS backup_worker_type, + backup.hours_per_week AS backup_hours_per_week, + collect(DISTINCT cert.name) AS backup_certifications, + project.project_id AS affected_project_id, + project.name AS affected_project_name, + work.week AS affected_week, + work.planned_hours AS planned_hours, + work.actual_hours AS actual_hours +ORDER BY + backup_worker_name, + affected_week, + affected_project_id; +``` + +### Expected result using the actual CSV + +For Station `016` (`Gjutning`), the workers who can cover the station are: + +1. **Per Hansen** — primary worker for Station `016`; certifications include `Casting` and `Formwork`. +2. **Victor Elm** — foreman who can cover many stations, including `016`; certifications include `Leadership`, `CE`, and `ISO 9001`. + +If **Per Hansen** is absent, the only remaining listed cover for Station `016` is **Victor Elm**. This is a resilience risk because Station `016` has only one backup in the current data. 
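
The coverage check behind this result can be sketched in plain Python. This is a minimal illustration, not the ingestion code: `backup_workers` is a hypothetical helper, and the two rows are illustrative. Per Hansen's coverage list comes from the worker row quoted in Q1, while Victor Elm's full list is not reproduced in this answer, so only `016` stands in for it.

```python
def backup_workers(workers, station_code, absent_name):
    """Return names of workers who can cover station_code, excluding absent_name."""
    backups = []
    for worker in workers:
        # can_cover_stations is a comma-separated string in the CSV, e.g. "016,017"
        stations = {code.strip() for code in worker["can_cover_stations"].split(",")}
        if station_code in stations and worker["name"] != absent_name:
            backups.append(worker["name"])
    return sorted(backups)


# Illustrative rows only (assumption): Victor Elm covers more stations than "016"
workers = [
    {"name": "Per Hansen", "can_cover_stations": "016,017"},
    {"name": "Victor Elm", "can_cover_stations": "016"},
]

backups = backup_workers(workers, "016", "Per Hansen")
# One or zero remaining backups means the station is a single point of failure
single_point_of_failure = len(backups) <= 1
print(backups, single_point_of_failure)
```

The same function could be pointed at any station code to list stations whose coverage drops to one person when a given worker is away.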
+ +Affected projects at Station `016` are: + +| Project | Product | Week | Planned hours | Actual hours | Comment | +|---|---|---|---:|---:|---| +| `P03` Lagerhall Jönköping | IQB | `w2` | 28.0 | 35.0 | 25.0% over plan | +| `P05` Sjukhus Linköping ET2 | IQB | `w2` | 35.0 | 40.0 | 14.3% over plan | +| `P07` Idrottshall Västerås | IQB | `w2` | 20.0 | 22.0 | Exactly 10.0% over plan, so it does not exceed the 10% threshold | +| `P08` Bro E6 Halmstad | IQB | `w3` | 22.0 | 25.0 | 13.6% over plan | + +## B. What I Did + +1. I identified Station `016` as `Gjutning`. +2. I checked which workers have `016` in `can_cover_stations`. +3. I excluded the absent worker. +4. I connected the remaining backup workers to all projects scheduled at Station `016`. +5. I included certifications so the answer is not only “who can cover”, but also “why they are qualified”. + +## C. Why I Did It + +The question is not just asking for a list of workers. It is asking for operational impact: + +- Who can replace the absent worker? +- Are they certified? +- Which projects depend on the affected station? +- Is this a single-point-of-failure risk? + +The graph query expresses this naturally because all of these are relationships. + +## D. How It Works + +In SQL, the logic must be reconstructed using joins between workers, station coverage, certifications, stations, and production rows. That is correct, but it is verbose. + +In Cypher, the logic follows the real-world chain: + +```text +Backup Worker → can cover → Station 016 ← scheduled at ← Project +``` + +That path is exactly the business question. + +The graph version makes the dependency visible: + +```cypher +(backup:Worker)-[:CAN_COVER]->(station:Station)<-[work:SCHEDULED_AT]-(project:Project) +``` + +This tells us both the available replacement worker and the affected projects in one pattern. + +## E. 
What Still Remains / Assumptions + +- The CSV does not contain a worker named `Per Gustafsson`; I assume the intended absent worker is the Station `016` primary worker, `Per Hansen`. +- The CSV lists certifications as text; it does not say which exact certification is mandatory for each station. I assume a worker listed in `can_cover_stations` is operationally allowed to cover that station. +- If real factory rules are stricter, we should add `(:Station)-[:REQUIRES_CERTIFICATION]->(:Certification)` and filter backup workers by required certification. + +### Why graph is better here + +The graph version is better because the question is path-based. It moves from absent worker to station, from station to backup workers, and from station to affected projects. In SQL, the same question requires multiple joins and parsed many-to-many tables. In a graph, the same relationships are first-class data, so the impact chain is easier to read, validate, and extend. + +--- + +# Q3. Spot the Bottleneck + +## A. Final Answer + +### Capacity-level finding + +The capacity file shows weekly total factory capacity versus total planned demand. + +| Week | Total capacity | Total planned | Deficit | Capacity status | +|---|---:|---:|---:|---| +| `w1` | 480 | 612 | -132 | Overloaded | +| `w2` | 520 | 645 | -125 | Overloaded | +| `w3` | 480 | 398 | 82 | Enough capacity | +| `w4` | 500 | 550 | -50 | Overloaded | +| `w5` | 510 | 480 | 30 | Enough capacity | +| `w6` | 440 | 520 | -80 | Overloaded | +| `w7` | 520 | 600 | -80 | Overloaded | +| `w8` | 500 | 470 | 30 | Enough capacity | + +So the overloaded weeks are: + +```text +w1, w2, w4, w6, w7 +``` + +However, `factory_production.csv` only contains detailed station/project rows for `w1`, `w2`, and `w3`. Therefore, I can attribute station/project causes for `w1` and `w2`, but not for `w4`, `w6`, and `w7` without more production rows. 
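
As a quick cross-check of the capacity table, the deficit rule can be sketched in Python. The numbers are copied from the table above; `overloaded_weeks` and `weeks_with_station_detail` are names introduced only for this illustration.

```python
# (week, total_capacity, total_planned) copied from the capacity table above
capacity_rows = [
    ("w1", 480, 612),
    ("w2", 520, 645),
    ("w3", 480, 398),
    ("w4", 500, 550),
    ("w5", 510, 480),
    ("w6", 440, 520),
    ("w7", 520, 600),
    ("w8", 500, 470),
]


def overloaded_weeks(rows):
    """Return (week, deficit) pairs where planned demand exceeds capacity."""
    result = []
    for week, capacity, planned in rows:
        deficit = capacity - planned  # negative means the week is overloaded
        if deficit < 0:
            result.append((week, deficit))
    return result


overloaded = overloaded_weeks(capacity_rows)

# Only these weeks have station/project rows in factory_production.csv
weeks_with_station_detail = {"w1", "w2", "w3"}
attributable = [week for week, _ in overloaded if week in weeks_with_station_detail]
print(overloaded, attributable)
```

The `attributable` list makes the data gap explicit: deficits in the other overloaded weeks cannot be traced to stations or projects with the current production file.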
+ +### Station-level bottlenecks from production data + +Aggregating all available production rows by station: + +| Rank | Station | Planned hours | Actual hours | Variance hours | Variance % | Rows >10% over plan | Bottleneck interpretation | +|---:|---|---:|---:|---:|---:|---:|---| +| 1 | `012` Förmontering IQB | 324.0 | 345.0 | +21.0 | +6.5% | 2 | Largest total extra hours | +| 2 | `016` Gjutning | 105.0 | 122.0 | +17.0 | +16.2% | 3 | Worst percentage overrun and weak coverage | +| 3 | `014` Svets o montage IQB | 235.0 | 247.2 | +12.2 | +5.2% | 1 | Moderate overrun | +| 4 | `015` Montering IQP | 172.0 | 184.0 | +12.0 | +7.0% | 3 | Several jobs over 10% | +| 5 | `018` SB B/F-hall | 149.0 | 158.5 | +9.5 | +6.4% | 4 | Many small overruns | + +### Most important bottleneck: Station 016 (`Gjutning`) + +Station `016` is the most serious operational bottleneck even though it is not the largest by total hours. + +Reasons: + +1. It is **16.2% above planned hours overall**. +2. 3 of its 4 production rows are greater than 10% over plan. +3. It appears in overloaded week `w2`. +4. Worker coverage is thin: only **Per Hansen** and **Victor Elm** can cover it. +5. If Per Hansen is unavailable, only Victor Elm remains, and Victor is also listed as a cross-station foreman who covers many other stations. 
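
The Station `016` figures in the list above can be reproduced with a short Python sketch. The four rows come from the Q2 table of affected projects; `is_over_10pct` is a helper name assumed for this example, and it compares scaled values so the exact-10% boundary does not depend on how `1.10` is rounded in floating point.

```python
# (project_id, week, planned_hours, actual_hours) for Station 016, from the Q2 table
rows_016 = [
    ("P03", "w2", 28.0, 35.0),
    ("P05", "w2", 35.0, 40.0),
    ("P07", "w2", 20.0, 22.0),
    ("P08", "w3", 22.0, 25.0),
]


def is_over_10pct(planned, actual):
    """True when actual hours are strictly more than 10% above planned hours."""
    # Equivalent to actual > planned * 1.10, written without the 1.10 constant
    return actual * 10 > planned * 11


planned_total = sum(planned for _, _, planned, _ in rows_016)
actual_total = sum(actual for _, _, _, actual in rows_016)
variance_hours = actual_total - planned_total
variance_pct = round(100 * variance_hours / planned_total, 1)

over_rows = [pid for pid, _, planned, actual in rows_016 if is_over_10pct(planned, actual)]
print(planned_total, actual_total, variance_hours, variance_pct, over_rows)
```

`P07` is excluded because 22.0 is exactly 10% above 20.0, which matches the strict "greater than" rule used throughout this answer.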
+ +### Project/station rows greater than 10% over plan + +The exact rows where: + +```text +actual_hours > 1.1 × planned_hours +``` + +are: + +| Project | Station | Week | Planned | Actual | Variance | Variance % | +|---|---|---:|---:|---:|---:|---:| +| `P01` Stålverket Borås | `012` Förmontering IQB | `w1` | 32.0 | 35.5 | +3.5 | +10.9% | +| `P02` Kontorshus Mölndal | `012` Förmontering IQB | `w1` | 22.0 | 24.5 | +2.5 | +11.4% | +| `P03` Lagerhall Jönköping | `014` Svets o montage IQB | `w1` | 42.0 | 48.0 | +6.0 | +14.3% | +| `P02` Kontorshus Mölndal | `015` Montering IQP | `w1` | 19.0 | 21.0 | +2.0 | +10.5% | +| `P04` Parkering Helsingborg | `015` Montering IQP | `w1` | 16.0 | 18.0 | +2.0 | +12.5% | +| `P01` Stålverket Borås | `015` Montering IQP | `w2` | 25.0 | 28.0 | +3.0 | +12.0% | +| `P03` Lagerhall Jönköping | `016` Gjutning | `w2` | 28.0 | 35.0 | +7.0 | +25.0% | +| `P05` Sjukhus Linköping ET2 | `016` Gjutning | `w2` | 35.0 | 40.0 | +5.0 | +14.3% | +| `P08` Bro E6 Halmstad | `016` Gjutning | `w3` | 22.0 | 25.0 | +3.0 | +13.6% | +| `P04` Parkering Helsingborg | `018` SB B/F-hall | `w1` | 19.0 | 22.0 | +3.0 | +15.8% | +| `P05` Sjukhus Linköping ET2 | `018` SB B/F-hall | `w1` | 25.0 | 28.0 | +3.0 | +12.0% | +| `P07` Idrottshall Västerås | `018` SB B/F-hall | `w1` | 16.0 | 18.0 | +2.0 | +12.5% | +| `P06` Skola Uppsala | `018` SB B/F-hall | `w2` | 16.0 | 18.0 | +2.0 | +12.5% | +| `P02` Kontorshus Mölndal | `019` SP B/F-hall | `w2` | 14.0 | 15.5 | +1.5 | +10.7% | + +### Project-level contributors + +By total extra hours across available production rows: + +| Rank | Project | Planned | Actual | Variance | Variance % | Rows >10% | +|---:|---|---:|---:|---:|---:|---:| +| 1 | `P03` Lagerhall Jönköping | 392.0 | 406.5 | +14.5 | +3.7% | 2 | +| 2 | `P04` Parkering Helsingborg | 228.0 | 238.0 | +10.0 | +4.4% | 2 | +| 3 | `P08` Bro E6 Halmstad | 407.0 | 415.0 | +8.0 | +2.0% | 1 | +| 4 | `P05` Sjukhus Linköping ET2 | 613.0 | 619.5 | +6.5 | +1.1% | 2 | +| 5 | `P01` Stålverket 
Borås | 316.0 | 322.4 | +6.4 | +2.0% | 2 | + +### Cypher query: actual_hours > 1.1 × planned_hours, grouped by station + +```cypher +MATCH (p:Project)-[work:SCHEDULED_AT]->(s:Station) +WHERE work.actual_hours > work.planned_hours * 1.10 +RETURN + s.station_code AS station_code, + s.name AS station_name, + count(*) AS overrun_rows, + collect(DISTINCT p.project_id + ' ' + p.name) AS affected_projects, + round(sum(work.planned_hours), 2) AS planned_hours, + round(sum(work.actual_hours), 2) AS actual_hours, + round(sum(work.actual_hours - work.planned_hours), 2) AS variance_hours, + round( + 100.0 * (sum(work.actual_hours) - sum(work.planned_hours)) / sum(work.planned_hours), + 2 + ) AS variance_pct +ORDER BY + overrun_rows DESC, + variance_hours DESC; +``` + +### Graph pattern for bottleneck alerts + +I would model bottlenecks as explicit alert nodes: + +```text +(:BottleneckAlert) + -[:FLAGS_PROJECT]->(:Project) + -[:FLAGS_STATION]->(:Station) + -[:FLAGS_WEEK]->(:Week) +``` + +Example alert node: + +```cypher +(:BottleneckAlert { + alert_id: 'P03-016-w2-overrun', + severity: 'high', + reason: 'actual_hours > planned_hours by more than 10%', + planned_hours: 28.0, + actual_hours: 35.0, + variance_hours: 7.0, + variance_pct: 25.0 +}) +``` + +For capacity deficits, I would also create alerts like: + +```text +(:BottleneckAlert {alert_id: 'capacity-w2', reason: 'weekly capacity deficit'}) + -[:FLAGS_WEEK]->(:Week {week: 'w2'}) +``` + +Then connect that alert to the stations with the largest station-week variance or largest share of week demand. + +## B. What I Did + +1. I first looked at weekly total capacity versus total planned demand. +2. I identified deficit weeks: `w1`, `w2`, `w4`, `w6`, and `w7`. +3. I then checked production rows where actual hours exceeded planned hours by more than 10%. +4. I aggregated production by station to find which stations had the biggest total overruns. +5. 
I also checked worker coverage to see whether the station bottleneck is made worse by limited staffing. +6. I separated two concepts: + - **capacity deficit** = the whole factory does not have enough hours in that week. + - **execution overrun** = specific station/project rows used more hours than planned. + +## C. Why I Did It + +A real factory bottleneck is not just the biggest number. It is a combination of: + +1. demand pressure, +2. actual time overruns, +3. weak worker coverage, +4. timing during already overloaded weeks. + +That is why Station `016` is more alarming than Station `012` even though Station `012` has more total extra hours. Station `016` has a higher percentage overrun and only two listed workers who can cover it. + +## D. How It Works + +The basic overrun rule is: + +```text +actual_hours > planned_hours × 1.10 +``` + +Example: + +```text +P03 at Station 016 in week w2: +planned = 28.0 +actual = 35.0 +threshold = 28.0 × 1.10 = 30.8 +35.0 > 30.8, so this is a bottleneck row. +``` + +The capacity rule is: + +```text +deficit < 0 means the week is overloaded. +``` + +Example: + +```text +w2 total capacity = 520 +w2 total planned = 645 +deficit = -125 +``` + +So week `w2` is overloaded. In the detailed production data, week `w2` also contains major Station `016` overruns, so Station `016` is a strong candidate bottleneck. + +## E. What Still Remains / Assumptions + +- The production CSV only has rows for `w1`, `w2`, and `w3`, while capacity exists for `w1` through `w8`. I cannot accurately assign station/project causes for deficits in `w4`, `w6`, and `w7` without detailed production rows for those weeks. +- The capacity file is total weekly factory capacity, not station-specific capacity. If station-level capacity is later provided, the bottleneck model should compare station demand against station capacity directly. +- The worker file says who can cover each station, but it does not say availability by week. 
If vacation/absence calendars are added, the graph can detect week-specific staffing bottlenecks. + +--- + +# Q4. Vector + Graph Hybrid + +## A. Final Answer + +The factory receives new free-text requests such as: + +```text +450 meters of IQB beams for a hospital extension in Linköping, similar scope to previous hospital projects, tight timeline +``` + +A good system should not only search by product type. It should find past projects that are semantically similar and operationally similar. + +### What I would embed + +I would create an embedding text for every historical project using fields like: + +```text +Project name: Sjukhus Linköping ET2. +Products: IQB, SB, SR. +Quantity: 450 meters. +Stations used: FS IQB, Förmontering IQB, Montering IQB, Gjutning, SB B/F-hall, SR B/F-hall. +Weeks: w1, w2, w3. +Performance: planned hours, actual hours, variance percentage. +Context tags: hospital, Linköping, extension, tight timeline if available. +``` + +I would embed: + +1. project descriptions, +2. product/product-spec summaries, +3. station route summaries, +4. lessons-learned or variance summaries, +5. worker skill/certification summaries if the future use case includes staffing. + +### How similarity search works + +1. Convert the new project request into an embedding vector. +2. Compare that vector against stored project vectors. +3. Return the top similar historical projects. +4. Then apply graph filters to keep only the projects that are operationally relevant. 
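
Steps 1 to 3 above can be sketched with cosine similarity in plain Python. This is a toy illustration under stated assumptions: the three-dimensional vectors are hypothetical stand-ins (real embeddings come from an embedding model and have far more dimensions), and `top_similar` is a helper name made up for this example. Step 4, the graph filter, is left to the database.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_similar(query_vec, project_vecs, k):
    """Return the k (project_id, score) pairs with the highest similarity."""
    scored = [(pid, cosine_similarity(query_vec, vec)) for pid, vec in project_vecs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical toy vectors, not real embeddings of these projects
project_vecs = {
    "P05": [0.9, 0.1, 0.2],
    "P03": [0.2, 0.8, 0.1],
    "P08": [0.1, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]  # stands in for the embedded new request

candidates = top_similar(query_vec, project_vecs, k=2)
print(candidates)
```

In a real deployment the vector index would do this ranking; the sketch only shows what "closest vectors" means before graph constraints are applied.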
+ +### Hybrid vector + graph query + +Example in Neo4j-style Cypher: + +```cypher +// Step 1: vector search gives semantically similar projects +CALL db.index.vector.queryNodes('project_embedding_index', 10, $new_project_embedding) +YIELD node AS similarProject, score + +// Step 2: graph filter keeps projects that use the same product/station pattern +MATCH (similarProject)-[:PRODUCES]->(product:Product {product_type: $product_type}) +MATCH (similarProject)-[work:SCHEDULED_AT]->(station:Station) +WHERE station.station_code IN $required_station_codes + +WITH + similarProject, + score, + collect(DISTINCT station.name) AS matched_stations, + sum(work.planned_hours) AS planned_hours, + sum(work.actual_hours) AS actual_hours +WHERE planned_hours > 0 + AND abs(actual_hours - planned_hours) / planned_hours < 0.05 + +RETURN + similarProject.project_id AS project_id, + similarProject.name AS project_name, + score AS semantic_similarity, + matched_stations, + planned_hours, + actual_hours, + round(100.0 * (actual_hours - planned_hours) / planned_hours, 2) AS variance_pct +ORDER BY + semantic_similarity DESC, + abs(actual_hours - planned_hours) ASC +LIMIT 5; +``` + +### Why this is better than filtering only by product type + +Filtering only by product type may find projects that are all `IQB`, but not all `IQB` projects are operationally similar. A hospital extension with a tight timeline may behave differently from a bridge, a parking structure, or a warehouse. + +The hybrid method is better because it checks: + +1. semantic similarity — the request sounds like a past project, +2. product similarity — it uses similar products, +3. station-route similarity — it used the same factory stations, +4. performance similarity — it finished with low variance, +5. staffing/certification similarity — optional future filter. + +## B. What I Did + +1. I separated “meaning similarity” from “factory execution similarity”. +2. 
I used vectors for free text because text like “hospital extension in Linköping” is not easy to match with exact filters. +3. I used graph relationships for hard operational rules such as product type, station route, and variance. +4. I filtered for projects with variance under 5% because these are good planning references. + +## C. Why I Did It + +Vector search is good at fuzzy similarity. Graph search is good at exact connected constraints. Factory planning needs both. + +A vector search alone might return a project that sounds similar but used very different stations. A graph query alone might return the same product type but miss that the customer context and timeline are similar. Together, they produce recommendations that are both meaningful and operationally useful. + +## D. How It Works + +The new project request becomes a numeric vector. Historical projects also have vectors. The system first finds projects with close vectors. + +Then the graph asks: + +```text +Did those similar projects use the required stations? +Did they use the same product type? +Was the variance less than 5%? +``` + +Only projects that pass both the vector similarity and graph constraints are recommended. + +Example result: + +```text +New request: hospital extension, IQB beams, tight timeline +Recommended reference: P05 Sjukhus Linköping ET2 +Reason: semantically similar hospital project, used IQB-related stations, and gives real station-hour history. +``` + +## E. What Still Remains / Assumptions + +- The current CSV does not include rich project descriptions, customer type, deadline pressure, or location fields except what can be inferred from project names. +- To make vector search truly strong, future data should include descriptions, project category, location, complexity, constraints, and post-project notes. +- Neo4j vector indexing requires embeddings to be generated and stored first. 
+- The query assumes the application can provide `$new_project_embedding`, `$product_type`, and `$required_station_codes`. + +--- + +# Q5. Level 6 Plan — Build Blueprint + +## A. Final Answer + +Level 6 should build the Level 5 schema into a Neo4j graph and then use Streamlit to query Neo4j for dashboard pages. + +## 1. Node labels and CSV column mapping + +| Node label | Created from | CSV columns | +|---|---|---| +| `Project` | unique project rows from production | `project_id`, `project_number`, `project_name` | +| `Product` | unique product types from production | `product_type`, `unit` | +| `Station` | unique station rows from production | `station_code`, `station_name` | +| `Week` | unique weeks from production and capacity | `week` | +| `Etapp` | unique etapp values from production | `etapp` | +| `BOP` | unique bop values from production | `bop` | +| `Worker` | each worker row | `worker_id`, `name`, `role`, `primary_station`, `hours_per_week`, `type` | +| `Certification` | parsed certification names | `certifications` split by comma | +| `CapacityPlan` | each weekly capacity row | `week`, `own_staff_count`, `hired_staff_count`, `own_hours`, `hired_hours`, `overtime_hours`, `total_capacity`, `total_planned`, `deficit` | +| `BottleneckAlert` | derived from overrun/capacity rules | derived IDs and metrics | + +## 2. 
Relationships and how they are created + +| Relationship | Creation rule | +|---|---| +| `(Project)-[:PRODUCES]->(Product)` | from each production row, merge by project and product | +| `(Project)-[:SCHEDULED_AT]->(Station)` | from each production row; relationship carries week and hour metrics | +| `(Project)-[:HAS_WORK_IN]->(Week)` | aggregate project-week planned and actual hours | +| `(Project)-[:HAS_ETAPP]->(Etapp)` | from production `etapp` | +| `(Project)-[:USES_BOP]->(BOP)` | from production `bop` | +| `(Product)-[:REQUIRES_STATION]->(Station)` | aggregate all product/station combinations from production | +| `(Worker)-[:PRIMARY_AT]->(Station)` | from worker `primary_station`, except `all` should be handled as foreman/global coverage | +| `(Worker)-[:CAN_COVER]->(Station)` | split `can_cover_stations` by comma and connect worker to each station | +| `(Worker)-[:HAS_CERTIFICATION]->(Certification)` | split `certifications` by comma | +| `(Week)-[:HAS_CAPACITY]->(CapacityPlan)` | from each capacity row | +| `(CapacityPlan)-[:CAPACITY_PRESSURE_ON]->(Station)` | derived by joining week-level capacity with station-week production demand | +| `(BottleneckAlert)-[:FLAGS_PROJECT]->(Project)` | created for rows over 10% plan or for risky project/station/week combinations | +| `(BottleneckAlert)-[:FLAGS_STATION]->(Station)` | created for overloaded station | +| `(BottleneckAlert)-[:FLAGS_WEEK]->(Week)` | created for overloaded week | + +## 3. 
Data ingestion plan + +### Step 1: Create constraints + +```cypher +CREATE CONSTRAINT project_id IF NOT EXISTS +FOR (p:Project) REQUIRE p.project_id IS UNIQUE; + +CREATE CONSTRAINT product_type IF NOT EXISTS +FOR (p:Product) REQUIRE p.product_type IS UNIQUE; + +CREATE CONSTRAINT station_code IF NOT EXISTS +FOR (s:Station) REQUIRE s.station_code IS UNIQUE; + +CREATE CONSTRAINT worker_id IF NOT EXISTS +FOR (w:Worker) REQUIRE w.worker_id IS UNIQUE; + +CREATE CONSTRAINT week_id IF NOT EXISTS +FOR (w:Week) REQUIRE w.week IS UNIQUE; + +CREATE CONSTRAINT certification_name IF NOT EXISTS +FOR (c:Certification) REQUIRE c.name IS UNIQUE; +``` + +### Step 2: Load production CSV + +For each row: + +1. `MERGE` project. +2. `MERGE` product. +3. `MERGE` station. +4. `MERGE` week. +5. `MERGE` etapp. +6. `MERGE` BOP. +7. Create/update relationships using `MERGE`. +8. Store `planned_hours`, `actual_hours`, `completed_units`, and variance metrics on `SCHEDULED_AT`. + +### Step 3: Load workers CSV + +For each worker: + +1. `MERGE` worker. +2. `MERGE` primary station if not `all`. +3. Create `PRIMARY_AT`. +4. Split `can_cover_stations` and create `CAN_COVER` relationships. +5. Split `certifications` and create `HAS_CERTIFICATION` relationships. + +### Step 4: Load capacity CSV + +For each week: + +1. `MERGE` week. +2. `MERGE` capacity plan using ID like `capacity-w1`. +3. Create `HAS_CAPACITY` relationship. +4. Set `is_deficit = true` when `deficit < 0`. + +### Step 5: Create derived alerts + +Create `BottleneckAlert` nodes for: + +1. production rows where `actual_hours > planned_hours * 1.10`, +2. weeks where `deficit < 0`, +3. stations with only one or two covering workers and high demand. + +Use `MERGE`, not `CREATE`, so the script can run more than once safely. + +## 4. 
Dashboard panels for Level 6 + +### Panel 1: Project Overview + +**What it shows:** + +- all 8 projects, +- total planned hours, +- total actual hours, +- variance hours, +- variance percentage, +- products involved, +- stations involved. + +**Cypher query:** + +```cypher +MATCH (p:Project)-[work:SCHEDULED_AT]->(s:Station) +WITH + p, + collect(DISTINCT s.name) AS stations, + sum(work.planned_hours) AS planned_hours, + sum(work.actual_hours) AS actual_hours +OPTIONAL MATCH (p)-[:PRODUCES]->(prod:Product) +WITH p, stations, planned_hours, actual_hours, collect(DISTINCT prod.product_type) AS products +RETURN + p.project_id AS project_id, + p.project_number AS project_number, + p.name AS project_name, + products, + stations, + round(planned_hours, 2) AS planned_hours, + round(actual_hours, 2) AS actual_hours, + round(actual_hours - planned_hours, 2) AS variance_hours, + round(100.0 * (actual_hours - planned_hours) / planned_hours, 2) AS variance_pct +ORDER BY project_id; +``` + +### Panel 2: Station Load and Overrun Chart + +**What it shows:** + +- planned vs actual hours by station, +- station variance, +- stations where actual hours are higher than planned, +- highlight `016`, `012`, `015`, and `018` if overruns remain visible. + +**Cypher query:** + +```cypher +MATCH (p:Project)-[work:SCHEDULED_AT]->(s:Station) +WITH + s, + sum(work.planned_hours) AS planned_hours, + sum(work.actual_hours) AS actual_hours, + count(CASE WHEN work.actual_hours > work.planned_hours * 1.10 THEN 1 END) AS over_10pct_rows +RETURN + s.station_code AS station_code, + s.name AS station_name, + round(planned_hours, 2) AS planned_hours, + round(actual_hours, 2) AS actual_hours, + round(actual_hours - planned_hours, 2) AS variance_hours, + round(100.0 * (actual_hours - planned_hours) / planned_hours, 2) AS variance_pct, + over_10pct_rows +ORDER BY variance_hours DESC; +``` + +### Panel 3: Capacity Tracker + +**What it shows:** + +- weekly total capacity, +- weekly total planned demand, +- deficit/surplus, +- red flag for deficit weeks. 
+ +**Cypher query:** + +```cypher +MATCH (w:Week)-[:HAS_CAPACITY]->(c:CapacityPlan) +RETURN + w.week AS week, + c.own_hours AS own_hours, + c.hired_hours AS hired_hours, + c.overtime_hours AS overtime_hours, + c.total_capacity AS total_capacity, + c.total_planned AS total_planned, + c.deficit AS deficit, + c.is_deficit AS is_deficit +ORDER BY toInteger(replace(w.week, 'w', '')); +``` + +### Panel 4: Worker Coverage Matrix + +**What it shows:** + +- station rows, +- workers who can cover each station, +- number of certified/covering workers, +- single-point-of-failure stations. + +**Cypher query:** + +```cypher +MATCH (s:Station) +OPTIONAL MATCH (worker:Worker)-[:CAN_COVER]->(s) +WITH + s, + collect(DISTINCT worker.name) AS covering_workers, + count(DISTINCT worker) AS coverage_count +RETURN + s.station_code AS station_code, + s.name AS station_name, + coverage_count, + covering_workers, + CASE + WHEN coverage_count <= 1 THEN 'HIGH RISK' + WHEN coverage_count = 2 THEN 'MEDIUM RISK' + ELSE 'OK' + END AS coverage_risk +ORDER BY coverage_count ASC, station_code; +``` + +### Panel 5: Bottleneck Alerts + +**What it shows:** + +- every project/station/week row where actual hours are more than 10% above planned, +- severity, +- linked project, +- linked station, +- linked week. 
+ +**Cypher query:** + +```cypher +MATCH (p:Project)-[work:SCHEDULED_AT]->(s:Station) +WHERE work.actual_hours > work.planned_hours * 1.10 +RETURN + p.project_id AS project_id, + p.name AS project_name, + s.station_code AS station_code, + s.name AS station_name, + work.week AS week, + work.planned_hours AS planned_hours, + work.actual_hours AS actual_hours, + round(work.actual_hours - work.planned_hours, 2) AS variance_hours, + round(100.0 * (work.actual_hours - work.planned_hours) / work.planned_hours, 2) AS variance_pct, + CASE + WHEN work.actual_hours > work.planned_hours * 1.25 THEN 'HIGH' + WHEN work.actual_hours > work.planned_hours * 1.10 THEN 'MEDIUM' + ELSE 'LOW' + END AS severity +ORDER BY variance_pct DESC; +``` + +## B. What I Did + +1. I reused the schema from Q1 instead of inventing a new model. +2. I mapped every CSV column to either a node property, a relationship property, or a derived metric. +3. I designed ingestion to be idempotent with `MERGE`. +4. I planned dashboard panels around the scoring requirements: project overview, station load, capacity tracker, and worker coverage. +5. I added a fifth panel for bottleneck alerts because it directly supports factory decision-making. + +## C. Why I Did It + +Level 6 should prove that the graph is useful, not just that data was loaded. The dashboard should answer practical questions: + +- Which projects are over plan? +- Which stations are overloaded? +- Which weeks have capacity deficits? +- Which stations have weak worker coverage? +- Where should a production manager act first? + +This is why each panel is powered by a Cypher query instead of reading directly from CSV. + +## D. How It Works + +The ingestion script builds the graph once. Then the dashboard connects to Neo4j and runs Cypher queries. + +The flow is: + +```text +CSV files → seed_graph.py → Neo4j graph → Streamlit app → dashboard panels +``` + +Example: + +1. `seed_graph.py` creates `Project`, `Station`, and `SCHEDULED_AT` relationships. 
+2. The Station Load panel runs a Cypher query that sums `planned_hours` and `actual_hours` per station. +3. Streamlit displays the result as a bar chart. +4. A production manager can immediately see which station is using more time than expected. + +## E. What Still Remains / Assumptions + +- Level 6 still needs actual implementation in Python. +- Neo4j Aura or local Neo4j credentials must be created and stored safely in `.env` or Streamlit secrets. +- Streamlit deployment must be completed separately. +- The production data currently has detailed rows only for `w1`, `w2`, and `w3`, so capacity dashboard will show all weeks, but station-level attribution is only possible where production rows exist. +- The dashboard should not read raw CSV files after seeding; it should query Neo4j. + +--- + +# Final Section + +## ✔ Summary of what is COMPLETED + +- Completed the Level 5 graph schema design with more than 6 node labels and more than 8 relationship types. +- Created a separate `schema.md` containing a Mermaid schema diagram and relationship-property mapping. +- Mapped the CSV data into graph nodes, relationships, and properties. +- Wrote equivalent SQL and Cypher queries for the Station `016` worker-coverage question. +- Explained why the graph query is clearer and more natural than SQL for connected factory questions. +- Analyzed bottlenecks using capacity deficits, planned vs actual hours, station overruns, project overruns, and worker coverage risk. +- Identified Station `016` (`Gjutning`) as the highest-risk bottleneck because it combines high percentage overrun with weak worker coverage. +- Designed a practical vector + graph hybrid approach for matching new project requests to similar historical projects. +- Produced a detailed Level 6 blueprint including node labels, relationships, CSV mapping, ingestion plan, and dashboard Cypher queries. + +## ❗ What is STILL REMAINING for Level 6 or improvements + +- Implement `seed_graph.py`. 
+- Implement `app.py` in Streamlit. +- Create a Neo4j Aura or local Neo4j database. +- Load the graph using the three CSV files. +- Add the required Self-Test page. +- Deploy the Streamlit app. +- Add `DASHBOARD_URL.txt` with the deployed app URL. +- Add real station-level capacity if available. +- Add worker availability/vacation calendars if available. +- Add richer project descriptions for better vector search. + +## 🧠 Key insights I should remember + +1. A graph is useful because factory planning is relationship-heavy. +2. The most important bottleneck is not always the biggest total-hours station. +3. Station `012` has the largest total extra hours, but Station `016` is riskier because it has high percentage overruns and limited worker coverage. +4. Capacity deficit and production overrun are related but not identical. +5. The current data can explain `w1` and `w2` station/project causes, but not `w4`, `w6`, and `w7` because detailed production rows are missing for those weeks. +6. Level 5 Q5 should be treated as the implementation blueprint for Level 6. +7. In Level 6, the dashboard must query Neo4j, not read directly from CSV. diff --git a/submissions/jv-singh/level5/schema.md b/submissions/jv-singh/level5/schema.md new file mode 100644 index 000000000..101dc574d --- /dev/null +++ b/submissions/jv-singh/level5/schema.md @@ -0,0 +1,108 @@ +# Level 5 Schema Diagram — Factory Knowledge Graph + +This Mermaid diagram is the schema design for Level 5. It uses the three CSV files in `challenges/data/` as the source of truth. 
+ +```mermaid +erDiagram + PROJECT { + string project_id PK + string project_number + string name + } + + PRODUCT { + string product_type PK + string unit + } + + STATION { + string station_code PK + string name + } + + WORKER { + string worker_id PK + string name + string role + string type + float hours_per_week + } + + WEEK { + string week PK + int sort_order + } + + ETAPP { + string etapp_id PK + } + + BOP { + string bop_id PK + } + + CERTIFICATION { + string name PK + } + + CAPACITY_PLAN { + string capacity_id PK + float own_hours + float hired_hours + float overtime_hours + float total_capacity + float total_planned + float deficit + int own_staff_count + int hired_staff_count + boolean is_deficit + } + + BOTTLENECK_ALERT { + string alert_id PK + string severity + string reason + float planned_hours + float actual_hours + float variance_hours + float variance_pct + } + + PROJECT ||--o{ PRODUCT : PRODUCES_qty_unit_factor + PROJECT ||--o{ STATION : SCHEDULED_AT_week_planned_actual_completed + PROJECT ||--o{ WEEK : HAS_WORK_IN_planned_actual + PROJECT ||--o{ ETAPP : HAS_ETAPP + PROJECT ||--o{ BOP : USES_BOP + PRODUCT ||--o{ STATION : REQUIRES_STATION + WORKER ||--|| STATION : PRIMARY_AT + WORKER ||--o{ STATION : CAN_COVER + WORKER ||--o{ CERTIFICATION : HAS_CERTIFICATION + WEEK ||--|| CAPACITY_PLAN : HAS_CAPACITY + CAPACITY_PLAN ||--o{ STATION : CAPACITY_PRESSURE_ON + BOTTLENECK_ALERT ||--|| PROJECT : FLAGS_PROJECT + BOTTLENECK_ALERT ||--|| STATION : FLAGS_STATION + BOTTLENECK_ALERT ||--|| WEEK : FLAGS_WEEK +``` + +## Relationship Types and Main Properties + +| Relationship | From → To | Main source | Important properties | +|---|---|---|---| +| `PRODUCES` | `Project → Product` | `factory_production.csv` | `quantity`, `unit`, `unit_factor` | +| `SCHEDULED_AT` | `Project → Station` | `factory_production.csv` | `week`, `planned_hours`, `actual_hours`, `completed_units`, `variance_hours`, `variance_pct`, `is_over_10pct` | +| `HAS_WORK_IN` | `Project → Week` | 
`factory_production.csv` | `planned_hours`, `actual_hours`, `station_count` | +| `HAS_ETAPP` | `Project → Etapp` | `factory_production.csv` | `etapp` | +| `USES_BOP` | `Project → BOP` | `factory_production.csv` | `bop` | +| `REQUIRES_STATION` | `Product → Station` | `factory_production.csv` | `times_seen`, `total_planned_hours`, `total_actual_hours` | +| `PRIMARY_AT` | `Worker → Station` | `factory_workers.csv` | `role`, `hours_per_week` | +| `CAN_COVER` | `Worker → Station` | `factory_workers.csv` | `is_primary`, `coverage_source`, `hours_per_week` | +| `HAS_CERTIFICATION` | `Worker → Certification` | `factory_workers.csv` | `certification_name` | +| `HAS_CAPACITY` | `Week → CapacityPlan` | `factory_capacity.csv` | `own_hours`, `hired_hours`, `overtime_hours`, `total_capacity`, `total_planned`, `deficit` | +| `CAPACITY_PRESSURE_ON` | `CapacityPlan → Station` | derived from production + capacity | `station_planned_hours`, `station_actual_hours`, `station_variance_hours`, `share_of_week_demand` | +| `FLAGS_PROJECT` | `BottleneckAlert → Project` | derived | `reason` | +| `FLAGS_STATION` | `BottleneckAlert → Station` | derived | `reason` | +| `FLAGS_WEEK` | `BottleneckAlert → Week` | derived | `reason` | + +## Why this schema works + +The graph separates stable business entities (`Project`, `Product`, `Station`, `Worker`, `Week`) from measured and derived operational facts (the `SCHEDULED_AT`, `HAS_CAPACITY`, and `CAPACITY_PRESSURE_ON` relationships, plus `BottleneckAlert` nodes). This makes it easy to answer both planning questions, such as “which stations are overloaded?”, and resilience questions, such as “who can cover a station if a worker is absent?”. 
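The resilience question above can be sketched without a database. The following is a minimal illustration, not part of the submission: it mirrors the `(Worker)-[:CAN_COVER]->(Station)` relationship with a plain dictionary, using three rows copied from `factory_workers.csv`. The helper names `build_can_cover` and `backups_for` are hypothetical, and because only a subset of the file is included, the result for the full data set may contain more names.

```python
import csv
import io

# Three rows copied from factory_workers.csv; the remaining workers are omitted,
# so results below describe this subset only.
WORKERS_CSV = """worker_id,name,role,primary_station,can_cover_stations,certifications,hours_per_week,type
W07,Per Hansen,Operator,016,"016,017","Casting,Formwork",40,permanent
W08,Sofia Arden,Operator,017,"017","Surface treatment,Spray painting",40,permanent
W11,Victor Elm,Foreman,all,"011,012,013,014,015,016,017,018,019,021","Leadership,CE,ISO 9001",45,permanent
"""


def build_can_cover(csv_text):
    """Map station_code -> set of worker names (an in-memory stand-in for CAN_COVER)."""
    coverage = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # can_cover_stations is a comma-separated list inside one quoted CSV field.
        for code in row["can_cover_stations"].split(","):
            coverage.setdefault(code.strip(), set()).add(row["name"])
    return coverage


def backups_for(coverage, station_code, absent_worker):
    """Workers who can still cover the station when the named worker is absent."""
    return sorted(coverage.get(station_code, set()) - {absent_worker})


coverage = build_can_cover(WORKERS_CSV)
print(backups_for(coverage, "016", "Per Hansen"))  # ['Victor Elm']
```

In this subset, Station `016` is left with a single backup, which is the kind of single-point-of-failure the coverage relationship is meant to expose.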
From e8ab2bdcc24f0a96b007528d805863abcb74c3a3 Mon Sep 17 00:00:00 2001 From: Jaivardhan singh Date: Sun, 10 May 2026 12:00:56 +0530 Subject: [PATCH 2/7] level-6: add factory graph dashboard --- submissions/jv-singh/level6/.env.example | 3 + submissions/jv-singh/level6/DASHBOARD_URL.txt | 1 + submissions/jv-singh/level6/README.md | 132 +++++++ submissions/jv-singh/level6/app.py | 361 ++++++++++++++++++ .../jv-singh/level6/data/factory_capacity.csv | 9 + .../level6/data/factory_production.csv | 69 ++++ .../jv-singh/level6/data/factory_workers.csv | 15 + submissions/jv-singh/level6/requirements.txt | 5 + submissions/jv-singh/level6/seed_graph.py | 245 ++++++++++++ 9 files changed, 840 insertions(+) create mode 100644 submissions/jv-singh/level6/.env.example create mode 100644 submissions/jv-singh/level6/DASHBOARD_URL.txt create mode 100644 submissions/jv-singh/level6/README.md create mode 100644 submissions/jv-singh/level6/app.py create mode 100644 submissions/jv-singh/level6/data/factory_capacity.csv create mode 100644 submissions/jv-singh/level6/data/factory_production.csv create mode 100644 submissions/jv-singh/level6/data/factory_workers.csv create mode 100644 submissions/jv-singh/level6/requirements.txt create mode 100644 submissions/jv-singh/level6/seed_graph.py diff --git a/submissions/jv-singh/level6/.env.example b/submissions/jv-singh/level6/.env.example new file mode 100644 index 000000000..d9beac684 --- /dev/null +++ b/submissions/jv-singh/level6/.env.example @@ -0,0 +1,3 @@ +NEO4J_URI=neo4j+s://xxxxx.databases.neo4j.io +NEO4J_USER=neo4j +NEO4J_PASSWORD=your-password-here diff --git a/submissions/jv-singh/level6/DASHBOARD_URL.txt b/submissions/jv-singh/level6/DASHBOARD_URL.txt new file mode 100644 index 000000000..77a54a269 --- /dev/null +++ b/submissions/jv-singh/level6/DASHBOARD_URL.txt @@ -0,0 +1 @@ +TODO: deploy to Streamlit Community Cloud and replace this line with https://your-app.streamlit.app diff --git a/submissions/jv-singh/level6/README.md 
b/submissions/jv-singh/level6/README.md new file mode 100644 index 000000000..d231d318d --- /dev/null +++ b/submissions/jv-singh/level6/README.md @@ -0,0 +1,132 @@ +# Level 6 — Factory Graph + Dashboard + +This folder contains my Level 6 implementation for the factory knowledge graph challenge. + +## What is included + +- `seed_graph.py` — loads the three factory CSV files into Neo4j using idempotent `MERGE` queries. +- `app.py` — Streamlit dashboard with five pages: + - Project Overview + - Station Load + - Capacity Tracker + - Worker Coverage + - Self-Test +- `data/` — local copies of the required challenge CSV files. +- `.env.example` — safe credential template with no real secrets. +- `requirements.txt` — Python dependencies for local and Streamlit Cloud runs. +- `DASHBOARD_URL.txt` — placeholder to replace after deployment. + +## Graph schema + +### Node labels + +- `Project` +- `Product` +- `Station` +- `Worker` +- `Week` +- `Etapp` +- `BOP` +- `Certification` +- `CapacityPlan` +- `BottleneckAlert` + +### Relationship types + +- `PRODUCES` +- `SCHEDULED_AT` +- `HAS_WORK_IN` +- `HAS_ETAPP` +- `USES_BOP` +- `REQUIRES_STATION` +- `ACTIVE_IN` +- `PRIMARY_AT` +- `CAN_COVER` +- `HAS_CERTIFICATION` +- `HAS_CAPACITY` +- `CAPACITY_PRESSURE_ON` +- `FLAGS_PROJECT` +- `FLAGS_STATION` +- `FLAGS_WEEK` + +## Local setup + +From this folder: + +```bash +python -m venv .venv +source .venv/bin/activate +pip install -r requirements.txt +cp .env.example .env +``` + +Edit `.env` with your Neo4j Aura, Neo4j Desktop, or Docker credentials: + +```env +NEO4J_URI=neo4j+s://xxxxx.databases.neo4j.io +NEO4J_USER=neo4j +NEO4J_PASSWORD=your-password-here +``` + +## Seed Neo4j + +Run: + +```bash +python seed_graph.py +``` + +Expected output is a summary similar to: + +```text +Seed complete: nodes, relationships +Labels: ... +Relationship types: ... +``` + +The seed script is safe to run more than once. It creates uniqueness constraints and uses `MERGE` for nodes and relationships. 
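As a rough illustration of why re-running the seed is safe, the sketch below shows one way a production row could be turned into a parameterized `MERGE` statement. It is an assumption-laden example, not the contents of `seed_graph.py`: the names `MERGE_SCHEDULED_AT` and `row_to_params` are hypothetical, and keying the `SCHEDULED_AT` relationship on `week` plus `product_type` is a guess made because P05 has both an SB and an SD row at station `018` in `w1`.

```python
import csv
import io

# Header and one row copied from factory_production.csv.
PRODUCTION_CSV = """project_id,project_number,project_name,product_type,unit,quantity,unit_factor,station_code,station_name,etapp,bop,week,planned_hours,actual_hours,completed_units
P05,4505,Sjukhus Linköping ET2,SB,styck,50,5.00,018,SB B/F-hall,ET2,BOP3,w1,25.0,28.0,5
"""

# MERGE matches on the key properties only; SET then overwrites the measures,
# so loading the same row twice updates in place instead of duplicating.
# The relationship key (week + product_type) is an assumption for this sketch.
MERGE_SCHEDULED_AT = """
MERGE (p:Project {project_id: $project_id})
  SET p.project_number = $project_number, p.name = $project_name
MERGE (s:Station {station_code: $station_code})
  SET s.name = $station_name
MERGE (p)-[r:SCHEDULED_AT {week: $week, product_type: $product_type}]->(s)
  SET r.planned_hours = $planned_hours,
      r.actual_hours = $actual_hours,
      r.completed_units = $completed_units
"""


def row_to_params(row):
    """Convert one CSV row (all strings) into typed Cypher parameters."""
    return {
        "project_id": row["project_id"],
        "project_number": row["project_number"],
        "project_name": row["project_name"],
        "product_type": row["product_type"],
        "station_code": row["station_code"],  # kept as text so the leading zero survives
        "station_name": row["station_name"],
        "week": row["week"],
        "planned_hours": float(row["planned_hours"]),
        "actual_hours": float(row["actual_hours"]),
        "completed_units": int(row["completed_units"]),
    }


rows = list(csv.DictReader(io.StringIO(PRODUCTION_CSV)))
params = row_to_params(rows[0])
print(params["station_code"], params["planned_hours"], params["actual_hours"])  # 018 25.0 28.0
# With a live driver session this would be sent as:
#   session.run(MERGE_SCHEDULED_AT, **params)
```

Keeping `station_code` as a string matters here because values such as `018` would lose their leading zero if parsed as integers.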
+ +## Run the dashboard locally + +Run: + +```bash +streamlit run app.py +``` + +Open the local Streamlit URL, then use the sidebar to visit all pages. The **Self-Test** page should show green checks after the graph is seeded. + +## Deploy to Streamlit Community Cloud + +1. Push this repository/branch to GitHub. +2. Open Streamlit Community Cloud. +3. Create a new app using: + - repository: your fork of `lpi-developer-kit` + - branch: your submission branch + - main file path: `submissions/jv-singh/level6/app.py` +4. In app settings, add Streamlit secrets: + + ```toml + NEO4J_URI = "neo4j+s://xxxxx.databases.neo4j.io" + NEO4J_USER = "neo4j" + NEO4J_PASSWORD = "your-password-here" + ``` + +5. Save the app URL in `DASHBOARD_URL.txt`. +6. Verify the deployed app loads and the **Self-Test** page passes. + +## What is not completed yet + +The code and documentation in this folder are complete, but the Streamlit app is **not** deployed yet because deployment requires the submitter's GitHub/Streamlit account and Neo4j credentials. Before final submission, replace the placeholder in `DASHBOARD_URL.txt` with the deployed Streamlit URL. + +## Submission checklist + +- [x] `seed_graph.py` implemented. +- [x] `app.py` implemented with 4 required dashboard pages plus Self-Test. +- [x] Required CSV data copied into `data/`. +- [x] No real credentials committed. +- [ ] Neo4j Aura/Desktop instance created by the submitter. +- [ ] `python seed_graph.py` run against the submitter's Neo4j database. +- [ ] Streamlit Cloud app deployed. +- [ ] `DASHBOARD_URL.txt` updated with the live dashboard URL. +- [ ] Pull request title uses `level-6: JV Singh`. 
diff --git a/submissions/jv-singh/level6/app.py b/submissions/jv-singh/level6/app.py new file mode 100644 index 000000000..6bb578f92 --- /dev/null +++ b/submissions/jv-singh/level6/app.py @@ -0,0 +1,361 @@ +"""Streamlit dashboard for the Level 6 factory Neo4j graph.""" + +from __future__ import annotations + +import os +from pathlib import Path + +import pandas as pd +import plotly.express as px +import streamlit as st +from dotenv import load_dotenv +from neo4j import GraphDatabase + +BASE_DIR = Path(__file__).resolve().parent + +st.set_page_config( + page_title="Factory Graph Dashboard", + page_icon="🏭", + layout="wide", +) + + +def get_secret(name: str, default: str | None = None) -> str | None: + try: + if name in st.secrets: + return st.secrets[name] + except Exception: + pass + return os.getenv(name, default) + + +@st.cache_resource(show_spinner=False) +def get_driver(): + load_dotenv(BASE_DIR / ".env") + uri = get_secret("NEO4J_URI") + user = get_secret("NEO4J_USER", "neo4j") + password = get_secret("NEO4J_PASSWORD") + if not uri or not password: + return None + return GraphDatabase.driver(uri, auth=(user, password)) + + +def query_df(driver, cypher: str, **params) -> pd.DataFrame: + with driver.session() as session: + rows = [dict(record) for record in session.run(cypher, **params)] + return pd.DataFrame(rows) + + +def render_connection_help() -> None: + st.error("Neo4j credentials are not configured.") + st.markdown( + """ + Add these values either to a local `.env` file or to Streamlit Cloud + **Settings → Secrets** before running the dashboard: + + ```toml + NEO4J_URI = "neo4j+s://xxxxx.databases.neo4j.io" + NEO4J_USER = "neo4j" + NEO4J_PASSWORD = "your-password" + ``` + """ + ) + + +def run_self_test(driver): + checks = [] + try: + with driver.session() as session: + session.run("RETURN 1") + checks.append(("Neo4j connected", True, 3)) + except Exception as exc: # noqa: BLE001 - Streamlit should show the failed check instead of crashing. 
+ checks.append((f"Neo4j connected ({exc.__class__.__name__})", False, 3)) + return checks + + with driver.session() as session: + result = session.run("MATCH (n) RETURN count(n) AS c").single() + count = result["c"] + checks.append((f"{count} nodes (min: 50)", count >= 50, 3)) + + result = session.run("MATCH ()-[r]->() RETURN count(r) AS c").single() + count = result["c"] + checks.append((f"{count} relationships (min: 100)", count >= 100, 3)) + + result = session.run("CALL db.labels() YIELD label RETURN count(label) AS c").single() + count = result["c"] + checks.append((f"{count} node labels (min: 6)", count >= 6, 3)) + + result = session.run( + "CALL db.relationshipTypes() YIELD relationshipType RETURN count(relationshipType) AS c" + ).single() + count = result["c"] + checks.append((f"{count} relationship types (min: 8)", count >= 8, 3)) + + result = session.run( + """ + MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station) + WHERE r.actual_hours > r.planned_hours * 1.1 + RETURN p.name AS project, + s.name AS station, + r.planned_hours AS planned, + r.actual_hours AS actual + LIMIT 10 + """ + ) + rows = [dict(record) for record in result] + checks.append((f"Variance query: {len(rows)} results", len(rows) > 0, 5)) + return checks + + +def render_project_overview(driver) -> None: + st.header("Project Overview") + df = query_df( + driver, + """ + MATCH (p:Project)-[work:SCHEDULED_AT]->(s:Station) + WITH p, + collect(DISTINCT s.name) AS stations, + sum(work.planned_hours) AS planned_hours, + sum(work.actual_hours) AS actual_hours + OPTIONAL MATCH (p)-[:PRODUCES]->(prod:Product) + WITH p, stations, planned_hours, actual_hours, collect(DISTINCT prod.product_type) AS products + RETURN p.project_id AS project_id, + p.project_number AS project_number, + p.name AS project_name, + products, + stations, + round(planned_hours, 2) AS planned_hours, + round(actual_hours, 2) AS actual_hours, + round(actual_hours - planned_hours, 2) AS variance_hours, + round(10000.0 * (actual_hours - planned_hours) / planned_hours) / 100.0 AS 
variance_pct + ORDER BY project_id + """, + ) + if df.empty: + st.warning("No project data found. Run `python seed_graph.py` first.") + return + + total_planned = df["planned_hours"].sum() + total_actual = df["actual_hours"].sum() + variance_pct = ((total_actual - total_planned) / total_planned) * 100 if total_planned else 0 + col1, col2, col3, col4 = st.columns(4) + col1.metric("Projects", len(df)) + col2.metric("Planned hours", f"{total_planned:,.1f}") + col3.metric("Actual hours", f"{total_actual:,.1f}", f"{variance_pct:+.1f}%") + col4.metric("Projects >10% variance", int((df["variance_pct"] > 10).sum())) + + display = df.copy() + display["products"] = display["products"].apply(lambda values: ", ".join(values)) + display["stations"] = display["stations"].apply(lambda values: ", ".join(values)) + st.dataframe( + display, + use_container_width=True, + hide_index=True, + column_config={ + "variance_pct": st.column_config.NumberColumn("variance_pct", format="%.2f%%"), + "planned_hours": st.column_config.NumberColumn("planned_hours", format="%.1f"), + "actual_hours": st.column_config.NumberColumn("actual_hours", format="%.1f"), + }, + ) + + fig = px.bar( + df, + x="project_name", + y=["planned_hours", "actual_hours"], + barmode="group", + title="Planned vs actual hours by project", + ) + fig.update_layout(xaxis_title="Project", yaxis_title="Hours") + st.plotly_chart(fig, use_container_width=True) + + +def render_station_load(driver) -> None: + st.header("Station Load") + df = query_df( + driver, + """ + MATCH (p:Project)-[work:SCHEDULED_AT]->(s:Station) + RETURN s.station_code AS station_code, + s.name AS station_name, + work.week AS week, + sum(work.planned_hours) AS planned_hours, + sum(work.actual_hours) AS actual_hours, + round(sum(work.actual_hours - work.planned_hours), 2) AS variance_hours + ORDER BY station_code, week + """, + ) + if df.empty: + st.warning("No station load data found.") + return + + station_options = ["All stations", 
*df["station_name"].drop_duplicates().tolist()] + selected = st.selectbox("Station filter", station_options) + chart_df = df if selected == "All stations" else df[df["station_name"] == selected] + + melted = chart_df.melt( + id_vars=["station_code", "station_name", "week"], + value_vars=["planned_hours", "actual_hours"], + var_name="metric", + value_name="hours", + ) + fig = px.bar( + melted, + x="week", + y="hours", + color="metric", + facet_col="station_name" if selected == "All stations" else None, + facet_col_wrap=3, + barmode="group", + hover_data=["station_code"], + title="Weekly planned vs actual station load", + ) + fig.update_layout(height=760 if selected == "All stations" else 460) + st.plotly_chart(fig, use_container_width=True) + + overrun_df = df[df["actual_hours"] > df["planned_hours"]].copy() + st.subheader("Rows where actual hours exceed planned hours") + st.dataframe(overrun_df, use_container_width=True, hide_index=True) + + +def render_capacity_tracker(driver) -> None: + st.header("Capacity Tracker") + df = query_df( + driver, + """ + MATCH (week:Week)-[:HAS_CAPACITY]->(plan:CapacityPlan) + RETURN week.week AS week, + week.sort_order AS sort_order, + plan.own_hours AS own_hours, + plan.hired_hours AS hired_hours, + plan.overtime_hours AS overtime_hours, + plan.total_capacity AS total_capacity, + plan.total_planned AS total_planned, + plan.deficit AS deficit, + plan.is_deficit AS is_deficit + ORDER BY sort_order + """, + ) + if df.empty: + st.warning("No capacity data found.") + return + + deficit_weeks = int(df["is_deficit"].sum()) + col1, col2, col3 = st.columns(3) + col1.metric("Weeks tracked", len(df)) + col2.metric("Deficit weeks", deficit_weeks) + col3.metric("Worst deficit", f"{df['deficit'].min():,.0f} hours") + + fig = px.line( + df, + x="week", + y=["total_capacity", "total_planned"], + markers=True, + title="Total capacity vs planned demand", + ) + deficit_points = df[df["deficit"] < 0] + fig.add_scatter( + x=deficit_points["week"], + 
y=deficit_points["total_planned"], + mode="markers", + marker={"color": "red", "size": 13, "symbol": "x"}, + name="Deficit week", + ) + st.plotly_chart(fig, use_container_width=True) + + styled = df.drop(columns=["sort_order"]).style.apply( + lambda row: ["background-color: #ffd6d6" if row["deficit"] < 0 else "" for _ in row], + axis=1, + ) + st.dataframe(styled, use_container_width=True, hide_index=True) + + +def render_worker_coverage(driver) -> None: + st.header("Worker Coverage") + coverage = query_df( + driver, + """ + MATCH (s:Station) + OPTIONAL MATCH (w:Worker)-[:CAN_COVER]->(s) + WITH s, collect(DISTINCT w.name) AS workers + RETURN s.station_code AS station_code, + s.name AS station_name, + size([worker IN workers WHERE worker IS NOT NULL]) AS certified_workers, + [worker IN workers WHERE worker IS NOT NULL] AS workers + ORDER BY station_code + """, + ) + matrix = query_df( + driver, + """ + MATCH (w:Worker) + OPTIONAL MATCH (w)-[:CAN_COVER]->(s:Station) + RETURN w.name AS worker, collect(DISTINCT s.station_code) AS stations + ORDER BY worker + """, + ) + if coverage.empty: + st.warning("No worker coverage data found.") + return + + spof = coverage[coverage["certified_workers"] <= 1] + col1, col2, col3 = st.columns(3) + col1.metric("Workers", len(matrix)) + col2.metric("Stations", len(coverage)) + col3.metric("Single-point-of-failure stations", len(spof)) + + coverage_display = coverage.copy() + coverage_display["workers"] = coverage_display["workers"].apply(lambda values: ", ".join(values)) + styled = coverage_display.style.apply( + lambda row: ["background-color: #ffd6d6" if row["certified_workers"] <= 1 else "" for _ in row], + axis=1, + ) + st.subheader("Station certification coverage") + st.dataframe(styled, use_container_width=True, hide_index=True) + + all_stations = sorted(coverage["station_code"].dropna().unique().tolist()) + matrix_rows = [] + for _, row in matrix.iterrows(): + covered = set(row["stations"]) + matrix_rows.append({"worker": 
row["worker"], **{station: station in covered for station in all_stations}}) + matrix_df = pd.DataFrame(matrix_rows) + st.subheader("Worker-to-station matrix") + st.dataframe(matrix_df, use_container_width=True, hide_index=True) + + +def render_self_test(driver) -> None: + st.header("Self-Test") + st.caption("Automated checks required by the Level 6 challenge.") + checks = run_self_test(driver) + earned = 0 + possible = sum(points for _, _, points in checks) + for label, passed, points in checks: + if passed: + earned += points + st.markdown(f"{'✅' if passed else '❌'} **{label}** — `{points if passed else 0}/{points}`") + st.divider() + st.subheader(f"SELF-TEST SCORE: {earned}/{possible}") + st.progress(earned / possible if possible else 0) + + +def main() -> None: + st.title("🏭 Factory Graph Dashboard") + st.caption("Level 6: Neo4j knowledge graph + Streamlit dashboard") + driver = get_driver() + if driver is None: + render_connection_help() + return + + pages = { + "Project Overview": render_project_overview, + "Station Load": render_station_load, + "Capacity Tracker": render_capacity_tracker, + "Worker Coverage": render_worker_coverage, + "Self-Test": render_self_test, + } + selected_page = st.sidebar.radio("Navigation", list(pages.keys())) + st.sidebar.info("All dashboard pages query Neo4j directly; CSV files are used only by seed_graph.py.") + pages[selected_page](driver) + + +if __name__ == "__main__": + main() diff --git a/submissions/jv-singh/level6/data/factory_capacity.csv b/submissions/jv-singh/level6/data/factory_capacity.csv new file mode 100644 index 000000000..795ff52f0 --- /dev/null +++ b/submissions/jv-singh/level6/data/factory_capacity.csv @@ -0,0 +1,9 @@ +week,own_staff_count,hired_staff_count,own_hours,hired_hours,overtime_hours,total_capacity,total_planned,deficit +w1,10,2,400,80,0,480,612,-132 +w2,10,2,400,80,40,520,645,-125 +w3,10,2,400,80,0,480,398,82 +w4,10,2,400,80,20,500,550,-50 +w5,10,2,400,80,30,510,480,30 +w6,9,2,360,80,0,440,520,-80 
+w7,10,2,400,80,40,520,600,-80 +w8,10,2,400,80,20,500,470,30 \ No newline at end of file diff --git a/submissions/jv-singh/level6/data/factory_production.csv b/submissions/jv-singh/level6/data/factory_production.csv new file mode 100644 index 000000000..ca6ce43e1 --- /dev/null +++ b/submissions/jv-singh/level6/data/factory_production.csv @@ -0,0 +1,69 @@ +project_id,project_number,project_name,product_type,unit,quantity,unit_factor,station_code,station_name,etapp,bop,week,planned_hours,actual_hours,completed_units +P01,4501,Stålverket Borås,IQB,meter,600,1.77,011,FS IQB,ET1,BOP1,w1,48.0,45.2,28 +P01,4501,Stålverket Borås,IQB,meter,600,1.77,012,Förmontering IQB,ET1,BOP1,w1,32.0,35.5,25 +P01,4501,Stålverket Borås,IQB,meter,600,1.77,013,Montering IQB,ET1,BOP1,w1,28.0,26.0,22 +P01,4501,Stålverket Borås,IQB,meter,600,1.77,014,Svets o montage IQB,ET1,BOP1,w1,35.0,38.2,20 +P01,4501,Stålverket Borås,SB,styck,40,4.0,018,SB B/F-hall,ET1,BOP1,w1,16.0,14.5,4 +P01,4501,Stålverket Borås,SP,styck,180,2.0,019,SP B/F-hall,ET1,BOP1,w1,12.0,13.0,7 +P01,4501,Stålverket Borås,IQB,meter,600,1.77,011,FS IQB,ET1,BOP1,w2,48.0,50.0,32 +P01,4501,Stålverket Borås,IQB,meter,600,1.77,012,Förmontering IQB,ET1,BOP1,w2,32.0,30.0,28 +P01,4501,Stålverket Borås,IQP,styck,90,2.80,015,Montering IQP,ET1,BOP2,w2,25.0,28.0,9 +P01,4501,Stålverket Borås,SR,styck,8,45.0,021,SR B/F-hall,ET1,BOP2,w2,40.0,42.0,1 +P02,4502,Kontorshus Mölndal,IQB,meter,350,1.50,011,FS IQB,ET1,BOP1,w1,30.0,28.0,20 +P02,4502,Kontorshus Mölndal,IQB,meter,350,1.50,012,Förmontering IQB,ET1,BOP1,w1,22.0,24.5,18 +P02,4502,Kontorshus Mölndal,IQB,meter,350,1.50,013,Montering IQB,ET1,BOP1,w1,18.0,17.0,16 +P02,4502,Kontorshus Mölndal,IQP,styck,70,2.70,015,Montering IQP,ET1,BOP1,w1,19.0,21.0,7 +P02,4502,Kontorshus Mölndal,SD,styck,30,3.00,018,SB B/F-hall,ET1,BOP1,w1,9.0,8.5,3 +P02,4502,Kontorshus Mölndal,IQB,meter,350,1.50,011,FS IQB,ET1,BOP1,w2,30.0,32.0,24 +P02,4502,Kontorshus Mölndal,IQB,meter,350,1.50,014,Svets o montage 
IQB,ET1,BOP1,w2,25.0,23.0,20 +P02,4502,Kontorshus Mölndal,SP,styck,120,1.75,019,SP B/F-hall,ET1,BOP2,w2,14.0,15.5,8 +P03,4503,Lagerhall Jönköping,IQB,meter,900,1.89,011,FS IQB,ET1,BOP1,w1,72.0,70.0,40 +P03,4503,Lagerhall Jönköping,IQB,meter,900,1.89,012,Förmontering IQB,ET1,BOP1,w1,48.0,52.0,35 +P03,4503,Lagerhall Jönköping,IQB,meter,900,1.89,013,Montering IQB,ET1,BOP1,w1,38.0,36.5,30 +P03,4503,Lagerhall Jönköping,IQB,meter,900,1.89,014,Svets o montage IQB,ET1,BOP1,w1,42.0,48.0,28 +P03,4503,Lagerhall Jönköping,SB,styck,60,6.00,018,SB B/F-hall,ET1,BOP1,w1,36.0,38.0,6 +P03,4503,Lagerhall Jönköping,IQB,meter,900,1.89,011,FS IQB,ET1,BOP1,w2,72.0,75.0,45 +P03,4503,Lagerhall Jönköping,IQP,styck,110,2.90,015,Montering IQP,ET1,BOP2,w2,32.0,30.0,11 +P03,4503,Lagerhall Jönköping,IQB,meter,900,1.89,016,Gjutning,ET1,BOP2,w2,28.0,35.0,8 +P03,4503,Lagerhall Jönköping,IQB,meter,900,1.89,017,Målning,ET1,BOP2,w3,24.0,22.0,20 +P04,4504,Parkering Helsingborg,IQB,meter,450,1.65,011,FS IQB,ET1,BOP1,w1,38.0,36.0,24 +P04,4504,Parkering Helsingborg,IQB,meter,450,1.65,012,Förmontering IQB,ET1,BOP1,w1,25.0,27.0,20 +P04,4504,Parkering Helsingborg,IQB,meter,450,1.65,013,Montering IQB,ET1,BOP1,w1,20.0,19.0,18 +P04,4504,Parkering Helsingborg,IQP,styck,55,2.85,015,Montering IQP,ET1,BOP1,w1,16.0,18.0,6 +P04,4504,Parkering Helsingborg,SB,styck,25,7.50,018,SB B/F-hall,ET1,BOP1,w1,19.0,22.0,3 +P04,4504,Parkering Helsingborg,IQB,meter,450,1.65,011,FS IQB,ET1,BOP1,w2,38.0,40.0,28 +P04,4504,Parkering Helsingborg,SP,styck,100,2.00,019,SP B/F-hall,ET1,BOP2,w2,12.0,11.0,6 +P04,4504,Parkering Helsingborg,SR,styck,12,120.0,021,SR B/F-hall,ET1,BOP2,w2,60.0,65.0,1 +P05,4505,Sjukhus Linköping ET2,IQB,meter,1200,1.85,011,FS IQB,ET2,BOP3,w1,95.0,90.0,50 +P05,4505,Sjukhus Linköping ET2,IQB,meter,1200,1.85,012,Förmontering IQB,ET2,BOP3,w1,65.0,68.0,42 +P05,4505,Sjukhus Linköping ET2,IQB,meter,1200,1.85,013,Montering IQB,ET2,BOP3,w1,50.0,48.0,38 +P05,4505,Sjukhus Linköping ET2,IQB,meter,1200,1.85,014,Svets o 
montage IQB,ET2,BOP3,w1,58.0,62.0,35 +P05,4505,Sjukhus Linköping ET2,IQP,styck,150,2.88,015,Montering IQP,ET2,BOP3,w1,30.0,33.0,10 +P05,4505,Sjukhus Linköping ET2,SB,styck,50,5.00,018,SB B/F-hall,ET2,BOP3,w1,25.0,28.0,5 +P05,4505,Sjukhus Linköping ET2,SD,styck,45,2.75,018,SB B/F-hall,ET2,BOP3,w1,12.0,11.5,4 +P05,4505,Sjukhus Linköping ET2,IQB,meter,1200,1.85,011,FS IQB,ET2,BOP3,w2,95.0,98.0,55 +P05,4505,Sjukhus Linköping ET2,IQB,meter,1200,1.85,016,Gjutning,ET2,BOP3,w2,35.0,40.0,12 +P05,4505,Sjukhus Linköping ET2,IQB,meter,1200,1.85,017,Målning,ET2,BOP3,w2,28.0,26.0,25 +P05,4505,Sjukhus Linköping ET2,SR,styck,20,274.0,021,SR B/F-hall,ET2,BOP3,w3,120.0,115.0,2 +P06,4506,Skola Uppsala,IQB,meter,500,1.60,011,FS IQB,ET1,BOP1,w2,40.0,38.0,26 +P06,4506,Skola Uppsala,IQB,meter,500,1.60,012,Förmontering IQB,ET1,BOP1,w2,28.0,30.0,22 +P06,4506,Skola Uppsala,IQB,meter,500,1.60,013,Montering IQB,ET1,BOP1,w2,22.0,20.0,18 +P06,4506,Skola Uppsala,IQP,styck,80,2.75,015,Montering IQP,ET1,BOP1,w2,22.0,24.0,8 +P06,4506,Skola Uppsala,SB,styck,35,4.50,018,SB B/F-hall,ET1,BOP1,w2,16.0,18.0,4 +P06,4506,Skola Uppsala,SP,styck,140,1.50,019,SP B/F-hall,ET1,BOP2,w3,14.0,12.0,10 +P07,4507,Idrottshall Västerås,HSQ,meter,400,2.05,011,FS IQB,ET1,BOP1,w1,45.0,42.0,22 +P07,4507,Idrottshall Västerås,HSQ,meter,400,2.05,012,Förmontering IQB,ET1,BOP1,w1,30.0,33.0,18 +P07,4507,Idrottshall Västerås,HSQ,meter,400,2.05,014,Svets o montage IQB,ET1,BOP1,w1,35.0,32.0,16 +P07,4507,Idrottshall Västerås,SB,styck,45,3.50,018,SB B/F-hall,ET1,BOP1,w1,16.0,18.0,5 +P07,4507,Idrottshall Västerås,HSQ,meter,400,2.05,011,FS IQB,ET1,BOP1,w2,45.0,48.0,26 +P07,4507,Idrottshall Västerås,HSQ,meter,400,2.05,016,Gjutning,ET1,BOP2,w2,20.0,22.0,5 +P07,4507,Idrottshall Västerås,HSQ,meter,400,2.05,017,Målning,ET1,BOP2,w3,18.0,16.0,15 +P08,4508,Bro E6 Halmstad,IQB,meter,800,1.80,011,FS IQB,ET1,BOP1,w1,65.0,62.0,36 +P08,4508,Bro E6 Halmstad,IQB,meter,800,1.80,012,Förmontering IQB,ET1,BOP1,w1,42.0,45.0,30 +P08,4508,Bro E6 
Halmstad,IQB,meter,800,1.80,013,Montering IQB,ET1,BOP1,w1,35.0,38.0,25 +P08,4508,Bro E6 Halmstad,IQB,meter,800,1.80,014,Svets o montage IQB,ET1,BOP1,w1,40.0,44.0,22 +P08,4508,Bro E6 Halmstad,SP,styck,200,2.50,019,SP B/F-hall,ET1,BOP1,w1,20.0,18.0,8 +P08,4508,Bro E6 Halmstad,IQB,meter,800,1.80,011,FS IQB,ET1,BOP1,w2,65.0,68.0,42 +P08,4508,Bro E6 Halmstad,IQP,styck,95,2.93,015,Montering IQP,ET1,BOP2,w2,28.0,30.0,10 +P08,4508,Bro E6 Halmstad,IQB,meter,800,1.80,016,Gjutning,ET1,BOP2,w3,22.0,25.0,8 +P08,4508,Bro E6 Halmstad,SR,styck,15,180.0,021,SR B/F-hall,ET1,BOP2,w3,90.0,85.0,2 \ No newline at end of file diff --git a/submissions/jv-singh/level6/data/factory_workers.csv b/submissions/jv-singh/level6/data/factory_workers.csv new file mode 100644 index 000000000..3110285cc --- /dev/null +++ b/submissions/jv-singh/level6/data/factory_workers.csv @@ -0,0 +1,15 @@ +worker_id,name,role,primary_station,can_cover_stations,certifications,hours_per_week,type +W01,Erik Lindberg,Operator,011,"011,012","MIG/MAG,TIG,ISO 9606",40,permanent +W02,Anna Berg,Operator,011,"011,014","MIG/MAG,TIG",40,permanent +W03,Lars Jensen,Operator,012,"012,013","Surface treatment,CE marking",40,permanent +W04,Maria Stone,Operator,013,"013","Blasting,Surface protection",40,permanent +W05,Johan Peters,Operator,014,"014,015","Hydraulics,Mechanics,Crane",40,permanent +W06,Karen Nilsen,Inspector,015,"015","SIS,SS-EN 1090,NDT",40,permanent +W07,Per Hansen,Operator,016,"016,017","Casting,Formwork",40,permanent +W08,Sofia Arden,Operator,017,"017","Surface treatment,Spray painting",40,permanent +W09,Magnus Stone,Operator,018,"018,019","Sheet metal,Assembly",40,permanent +W10,Elin Frank,Operator,019,"019,018","Assembly,Welding",32,permanent +W11,Victor Elm,Foreman,all,"011,012,013,014,015,016,017,018,019,021","Leadership,CE,ISO 9001",45,permanent +W12,Lena Dale,Quality Manager,015,"015","ISO 9001,SS-EN 1090,Audit",40,permanent +W13,Ahmed Hassan,Operator,011,"011","MIG/MAG",40,hired +W14,Petra 
Steen,Operator,012,"012,013","Surface treatment",40,hired \ No newline at end of file diff --git a/submissions/jv-singh/level6/requirements.txt b/submissions/jv-singh/level6/requirements.txt new file mode 100644 index 000000000..9f4f48bc3 --- /dev/null +++ b/submissions/jv-singh/level6/requirements.txt @@ -0,0 +1,5 @@ +streamlit>=1.35 +neo4j>=5.20 +python-dotenv>=1.0 +pandas>=2.2 +plotly>=5.22 diff --git a/submissions/jv-singh/level6/seed_graph.py b/submissions/jv-singh/level6/seed_graph.py new file mode 100644 index 000000000..949ff66b1 --- /dev/null +++ b/submissions/jv-singh/level6/seed_graph.py @@ -0,0 +1,245 @@ +"""Seed the Level 6 factory knowledge graph into Neo4j. + +Run this once after setting NEO4J_URI, NEO4J_USER, and NEO4J_PASSWORD in +.env or in your shell. The script is idempotent: it uses Cypher MERGE and can +be run repeatedly without duplicating nodes/relationships. +""" + +from __future__ import annotations + +import os +from pathlib import Path + +import pandas as pd +from dotenv import load_dotenv +from neo4j import GraphDatabase + +BASE_DIR = Path(__file__).resolve().parent +DATA_DIR = BASE_DIR / "data" +PRODUCTION_CSV = DATA_DIR / "factory_production.csv" +WORKERS_CSV = DATA_DIR / "factory_workers.csv" +CAPACITY_CSV = DATA_DIR / "factory_capacity.csv" + + +def get_driver(): + load_dotenv(BASE_DIR / ".env") + uri = os.getenv("NEO4J_URI") + user = os.getenv("NEO4J_USER", "neo4j") + password = os.getenv("NEO4J_PASSWORD") + if not uri or not password: + raise RuntimeError( + "Missing Neo4j credentials. Set NEO4J_URI, NEO4J_USER, and " + "NEO4J_PASSWORD in submissions/jv-singh/level6/.env." 
+ ) + return GraphDatabase.driver(uri, auth=(user, password)) + + +def rows_from_csv(path: Path) -> list[dict]: + frame = pd.read_csv(path, dtype=str).fillna("")  # read as text so station codes like "011" keep their leading zero + return frame.to_dict("records") + + +def create_constraints(session) -> None: + constraints = [ + "CREATE CONSTRAINT project_id IF NOT EXISTS FOR (p:Project) REQUIRE p.project_id IS UNIQUE", + "CREATE CONSTRAINT product_type IF NOT EXISTS FOR (p:Product) REQUIRE p.product_type IS UNIQUE", + "CREATE CONSTRAINT station_code IF NOT EXISTS FOR (s:Station) REQUIRE s.station_code IS UNIQUE", + "CREATE CONSTRAINT worker_id IF NOT EXISTS FOR (w:Worker) REQUIRE w.worker_id IS UNIQUE", + "CREATE CONSTRAINT week_id IF NOT EXISTS FOR (w:Week) REQUIRE w.week IS UNIQUE", + "CREATE CONSTRAINT etapp_name IF NOT EXISTS FOR (e:Etapp) REQUIRE e.name IS UNIQUE", + "CREATE CONSTRAINT bop_name IF NOT EXISTS FOR (b:BOP) REQUIRE b.name IS UNIQUE", + "CREATE CONSTRAINT certification_name IF NOT EXISTS FOR (c:Certification) REQUIRE c.name IS UNIQUE", + "CREATE CONSTRAINT capacity_plan_id IF NOT EXISTS FOR (c:CapacityPlan) REQUIRE c.plan_id IS UNIQUE", + "CREATE CONSTRAINT alert_id IF NOT EXISTS FOR (a:BottleneckAlert) REQUIRE a.alert_id IS UNIQUE", + ] + for statement in constraints: + session.run(statement) + + +def load_production(session, rows: list[dict]) -> None: + query = """ + UNWIND $rows AS row + MERGE (project:Project {project_id: row.project_id}) + SET project.project_number = toString(row.project_number), + project.name = row.project_name + MERGE (product:Product {product_type: row.product_type}) + SET product.unit = row.unit + MERGE (station:Station {station_code: toString(row.station_code)}) + SET station.name = row.station_name + MERGE (week:Week {week: row.week}) + SET week.sort_order = toInteger(replace(row.week, 'w', '')) + MERGE (etapp:Etapp {name: row.etapp}) + MERGE (bop:BOP {name: row.bop}) + + MERGE (project)-[produces:PRODUCES]->(product) + SET produces.quantity = toFloat(row.quantity), + produces.unit = row.unit, +
produces.unit_factor = toFloat(row.unit_factor) + MERGE (project)-[scheduled:SCHEDULED_AT { + station_code: toString(row.station_code), + week: row.week, + product_type: row.product_type, + etapp: row.etapp, + bop: row.bop + }]->(station) + SET scheduled.planned_hours = toFloat(row.planned_hours), + scheduled.actual_hours = toFloat(row.actual_hours), + scheduled.completed_units = toInteger(row.completed_units), + scheduled.quantity = toFloat(row.quantity), + scheduled.variance_hours = toFloat(row.actual_hours) - toFloat(row.planned_hours), + scheduled.variance_pct = CASE + WHEN toFloat(row.planned_hours) = 0 THEN 0 + ELSE round(10000.0 * (toFloat(row.actual_hours) - toFloat(row.planned_hours)) / toFloat(row.planned_hours)) / 100.0 + END + MERGE (project)-[work_week:HAS_WORK_IN {week: row.week}]->(week) + ON CREATE SET work_week.planned_hours = 0, work_week.actual_hours = 0 + SET work_week.planned_hours = work_week.planned_hours + toFloat(row.planned_hours), + work_week.actual_hours = work_week.actual_hours + toFloat(row.actual_hours) + MERGE (project)-[:HAS_ETAPP]->(etapp) + MERGE (project)-[:USES_BOP]->(bop) + MERGE (product)-[requires:REQUIRES_STATION]->(station) + ON CREATE SET requires.times_seen = 0 + SET requires.times_seen = requires.times_seen + 1 + MERGE (station)-[:ACTIVE_IN]->(week) + """ + # Recompute aggregate relationships from scratch while preserving idempotency. 
+ session.run("MATCH (:Project)-[r:HAS_WORK_IN]->(:Week) DELETE r") + session.run("MATCH (:Product)-[r:REQUIRES_STATION]->(:Station) DELETE r") + session.run(query, rows=rows) + + alert_query = """ + UNWIND $rows AS row + WITH row + WHERE toFloat(row.actual_hours) > toFloat(row.planned_hours) * 1.10 + MATCH (project:Project {project_id: row.project_id}) + MATCH (station:Station {station_code: toString(row.station_code)}) + MATCH (week:Week {week: row.week}) + MERGE (alert:BottleneckAlert {alert_id: row.project_id + '-' + toString(row.station_code) + '-' + row.week}) + SET alert.kind = 'station_overrun', + alert.message = row.project_name + ' exceeded plan at ' + row.station_name + ' in ' + row.week, + alert.planned_hours = toFloat(row.planned_hours), + alert.actual_hours = toFloat(row.actual_hours), + alert.variance_pct = round(10000.0 * (toFloat(row.actual_hours) - toFloat(row.planned_hours)) / toFloat(row.planned_hours)) / 100.0 + MERGE (alert)-[:FLAGS_PROJECT]->(project) + MERGE (alert)-[:FLAGS_STATION]->(station) + MERGE (alert)-[:FLAGS_WEEK]->(week) + """ + session.run(alert_query, rows=rows) + + +def load_workers(session, rows: list[dict]) -> None: + query = """ + UNWIND $rows AS row + MERGE (worker:Worker {worker_id: row.worker_id}) + SET worker.name = row.name, + worker.role = row.role, + worker.primary_station = row.primary_station, + worker.hours_per_week = toInteger(row.hours_per_week), + worker.type = row.type + WITH worker, row, + [station IN split(row.can_cover_stations, ',') WHERE trim(station) <> ''] AS cover_stations, + [cert IN split(row.certifications, ',') WHERE trim(cert) <> ''] AS certifications + FOREACH (station_code IN CASE WHEN row.primary_station <> 'all' THEN [row.primary_station] ELSE [] END | + MERGE (primary:Station {station_code: toString(station_code)}) + ON CREATE SET primary.name = 'Station ' + toString(station_code) + MERGE (worker)-[:PRIMARY_AT]->(primary) + ) + FOREACH (station_code IN cover_stations | + MERGE (covered:Station
{station_code: toString(trim(station_code))}) + ON CREATE SET covered.name = 'Station ' + toString(trim(station_code)) + MERGE (worker)-[:CAN_COVER]->(covered) + ) + FOREACH (cert_name IN certifications | + MERGE (cert:Certification {name: trim(cert_name)}) + MERGE (worker)-[:HAS_CERTIFICATION]->(cert) + ) + """ + session.run(query, rows=rows) + + +def load_capacity(session, rows: list[dict]) -> None: + query = """ + UNWIND $rows AS row + MERGE (week:Week {week: row.week}) + SET week.sort_order = toInteger(replace(row.week, 'w', '')) + MERGE (plan:CapacityPlan {plan_id: 'capacity-' + row.week}) + SET plan.week = row.week, + plan.own_staff_count = toInteger(row.own_staff_count), + plan.hired_staff_count = toInteger(row.hired_staff_count), + plan.own_hours = toFloat(row.own_hours), + plan.hired_hours = toFloat(row.hired_hours), + plan.overtime_hours = toFloat(row.overtime_hours), + plan.total_capacity = toFloat(row.total_capacity), + plan.total_planned = toFloat(row.total_planned), + plan.deficit = toFloat(row.deficit), + plan.is_deficit = toFloat(row.deficit) < 0 + MERGE (week)-[has_capacity:HAS_CAPACITY]->(plan) + SET has_capacity.own_hours = toFloat(row.own_hours), + has_capacity.hired_hours = toFloat(row.hired_hours), + has_capacity.overtime_hours = toFloat(row.overtime_hours), + has_capacity.total_capacity = toFloat(row.total_capacity), + has_capacity.total_planned = toFloat(row.total_planned), + has_capacity.deficit = toFloat(row.deficit) + WITH row, week, plan + OPTIONAL MATCH (project:Project)-[scheduled:SCHEDULED_AT {week: row.week}]->(station:Station) + WITH row, week, plan, station, sum(scheduled.planned_hours) AS station_planned + WHERE station IS NOT NULL + MERGE (plan)-[pressure:CAPACITY_PRESSURE_ON {week: row.week}]->(station) + SET pressure.station_planned_hours = station_planned, + pressure.week_deficit = toFloat(row.deficit) + """ + session.run("MATCH (:CapacityPlan)-[r:CAPACITY_PRESSURE_ON]->(:Station) DELETE r") + session.run(query, rows=rows) + + 
deficit_alert_query = """ + UNWIND $rows AS row + WITH row + WHERE toFloat(row.deficit) < 0 + MATCH (week:Week {week: row.week}) + MERGE (alert:BottleneckAlert {alert_id: 'capacity-' + row.week}) + SET alert.kind = 'capacity_deficit', + alert.message = 'Capacity deficit in ' + row.week, + alert.total_capacity = toFloat(row.total_capacity), + alert.total_planned = toFloat(row.total_planned), + alert.deficit = toFloat(row.deficit) + MERGE (alert)-[:FLAGS_WEEK]->(week) + """ + session.run(deficit_alert_query, rows=rows) + + +def print_summary(session) -> None: + summary = session.run( + """ + MATCH (n) + WITH count(n) AS nodes + MATCH ()-[r]->() + RETURN nodes, count(r) AS relationships + """ + ).single() + labels = session.run("CALL db.labels() YIELD label RETURN collect(label) AS labels").single()["labels"] + rel_types = session.run( + "CALL db.relationshipTypes() YIELD relationshipType RETURN collect(relationshipType) AS rels" + ).single()["rels"] + print(f"Seed complete: {summary['nodes']} nodes, {summary['relationships']} relationships") + print(f"Labels: {', '.join(sorted(labels))}") + print(f"Relationship types: {', '.join(sorted(rel_types))}") + + +def main() -> None: + production_rows = rows_from_csv(PRODUCTION_CSV) + worker_rows = rows_from_csv(WORKERS_CSV) + capacity_rows = rows_from_csv(CAPACITY_CSV) + driver = get_driver() + try: + with driver.session() as session: + create_constraints(session) + load_production(session, production_rows) + load_workers(session, worker_rows) + load_capacity(session, capacity_rows) + print_summary(session) + finally: + driver.close() + + +if __name__ == "__main__": + main() From ba8dfd0acbed2e1c6e77efc3ef644a788c47d92d Mon Sep 17 00:00:00 2001 From: Jaivardhan singh Date: Sun, 10 May 2026 12:20:49 +0530 Subject: [PATCH 3/7] level-6: clarify setup and submission guide --- submissions/jv-singh/level6/README.md | 216 ++++++++++++++++++++++---- 1 file changed, 190 insertions(+), 26 deletions(-) diff --git
a/submissions/jv-singh/level6/README.md b/submissions/jv-singh/level6/README.md index d231d318d..4a1407417 100644 --- a/submissions/jv-singh/level6/README.md +++ b/submissions/jv-singh/level6/README.md @@ -1,6 +1,14 @@ # Level 6 — Factory Graph + Dashboard -This folder contains my Level 6 implementation for the factory knowledge graph challenge. +This folder contains the Level 6 implementation for the factory knowledge graph challenge. + +Final submission path: + +```text +submissions/jv-singh/level6/ +``` + +No real credentials are committed. Keep Neo4j credentials only in a local `.env` file or in Streamlit Cloud secrets. ## What is included @@ -49,18 +57,46 @@ This folder contains my Level 6 implementation for the factory knowledge graph c - `FLAGS_STATION` - `FLAGS_WEEK` -## Local setup +## Git workflow to keep VS Code in sync + +Use this flow after the PR is merged into your own GitHub repository/fork. + +1. Open VS Code. +2. Open the terminal in the repository folder. +3. Confirm the remote points to your GitHub repository: + + ```bash + git remote -v + ``` + +4. Switch to the branch you want to keep updated. If you are using `main`: + + ```bash + git checkout main + ``` + +5. Pull the latest merged work: + + ```bash + git pull + ``` + +6. Confirm the Level 6 folder exists: + + ```bash + test -d submissions/jv-singh/level6 && echo "Level 6 folder found" + ``` + +## Step 1 — Environment setup -From this folder: +From the repository root: ```bash -python -m venv .venv -source .venv/bin/activate -pip install -r requirements.txt +cd submissions/jv-singh/level6/ cp .env.example .env ``` -Edit `.env` with your Neo4j Aura, Neo4j Desktop, or Docker credentials: +Edit `.env` and replace the placeholder values with your real Neo4j credentials: ```env NEO4J_URI=neo4j+s://xxxxx.databases.neo4j.io @@ -68,15 +104,32 @@ NEO4J_USER=neo4j NEO4J_PASSWORD=your-password-here ``` -## Seed Neo4j +Important: never commit `.env`. 
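Reviewer sketch (not part of the patch): the environment step above can be checked before seeding with a small standard-library script. It is only an illustration. The helper names `parse_env_text` and `missing_keys` are hypothetical and do not exist in `seed_graph.py` or `app.py`; the three key names are the ones the README lists.

```python
from pathlib import Path

# Keys the README asks for in .env (names taken from the README itself).
REQUIRED_KEYS = ("NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes so TOML-style values also parse.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str, required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or empty, in declaration order."""
    values = parse_env_text(text)
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")  # assumption: run from submissions/jv-singh/level6/
    if not env_path.exists():
        print("No .env file found in the current directory.")
    else:
        missing = missing_keys(env_path.read_text(encoding="utf-8"))
        print("Missing keys:", ", ".join(missing) if missing else "none")
```

Running it from the Level 6 folder prints which of the three keys still need a value. It never prints the values themselves, so the output is safe to paste into a PR comment.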
-Run: +## Step 2 — Install dependencies and seed Neo4j + +From `submissions/jv-singh/level6/`: ```bash +python -m venv .venv +source .venv/bin/activate +pip install -r requirements.txt python seed_graph.py ``` -Expected output is a summary similar to: +On Windows PowerShell, activate the virtual environment with: + +```powershell +.venv\Scripts\Activate.ps1 +``` + +On Windows Command Prompt, use: + +```bat +.venv\Scripts\activate.bat +``` + +Expected seed output is a summary similar to: ```text Seed complete: nodes, relationships @@ -86,25 +139,36 @@ Relationship types: ... The seed script is safe to run more than once. It creates uniqueness constraints and uses `MERGE` for nodes and relationships. -## Run the dashboard locally +## Step 3 — Run the Streamlit app locally -Run: +From `submissions/jv-singh/level6/` with the virtual environment active: ```bash streamlit run app.py ``` -Open the local Streamlit URL, then use the sidebar to visit all pages. The **Self-Test** page should show green checks after the graph is seeded. +Then: -## Deploy to Streamlit Community Cloud +1. Open the browser URL printed by Streamlit. +2. Verify these sidebar pages load: + - Project Overview + - Station Load + - Capacity Tracker + - Worker Coverage + - Self-Test +3. Open **Self-Test**. +4. Confirm the self-test score is passing and the checks are green. -1. Push this repository/branch to GitHub. +## Step 4 — Deploy to Streamlit Community Cloud + +1. Push the repository/branch to GitHub. 2. Go to . 3. Create a new app using: - - repository: your fork of `lpi-developer-kit` - - branch: your submission branch + - repository: your `jvsing-life-atlas` repository/fork + - branch: the branch containing this submission - main file path: `submissions/jv-singh/level6/app.py` -4. In app settings, add Streamlit secrets: +4. In Streamlit Cloud, open **Settings → Secrets**. +5. 
Add Neo4j credentials in TOML format: ```toml NEO4J_URI = "neo4j+s://xxxxx.databases.neo4j.io" @@ -112,21 +176,121 @@ Open the local Streamlit URL, then use the sidebar to visit all pages. The **Sel NEO4J_PASSWORD = "your-password-here" ``` -5. Save the app URL in `DASHBOARD_URL.txt`. -6. Verify the deployed app loads and the **Self-Test** page passes. +6. Save the secrets and redeploy/reboot the Streamlit app if needed. +7. Open the deployed app URL. +8. Verify the **Self-Test** page passes in the deployed app, not only locally. -## What is not completed yet +## Step 5 — Final touch: update the dashboard URL -I completed the code and documentation in this folder, but I did **not** deploy the Streamlit app because deployment requires your GitHub/Streamlit account and your Neo4j credentials. Before final submission, replace the placeholder in `DASHBOARD_URL.txt` with the deployed Streamlit URL. +Replace the placeholder in `DASHBOARD_URL.txt` with the deployed app URL. -## Submission checklist +Example: + +```text +https://your-app-name.streamlit.app +``` + +Then commit the URL update: + +```bash +git add submissions/jv-singh/level6/DASHBOARD_URL.txt +git commit -m "level-6: add deployed dashboard url" +git push +``` + +## Testing checklist before final PR + +Run these commands from `submissions/jv-singh/level6/`. 
+ +### Syntax check + +```bash +python -m py_compile seed_graph.py app.py +``` + +### Data validation + +```bash +python - <<'PY' +import csv +from pathlib import Path +base=Path('data') +prod=list(csv.DictReader((base/'factory_production.csv').open())) +workers=list(csv.DictReader((base/'factory_workers.csv').open())) +cap=list(csv.DictReader((base/'factory_capacity.csv').open())) +print('production rows', len(prod)) +print('projects', len({r['project_id'] for r in prod})) +print('products', len({r['product_type'] for r in prod})) +print('stations', len({r['station_code'] for r in prod})) +print('weeks', len({r['week'] for r in prod} | {r['week'] for r in cap})) +print('workers', len(workers)) +print('capacity rows', len(cap)) +print('rows with >10% overrun', +sum(float(r['actual_hours']) > float(r['planned_hours'])*1.1 for r in prod)) +PY +``` + +Expected data validation output: + +```text +production rows 68 +projects 8 +products 7 +stations 10 +weeks 8 +workers 14 +capacity rows 8 +rows with >10% overrun 14 +``` + +### Git whitespace check + +Run this from the repository root after staging any final changes: + +```bash +git diff --cached --check +``` + +A pandas import warning is acceptable only before dependencies are installed. After `pip install -r requirements.txt`, pandas should be available. + +## Final PR submission + +Use this PR title: + +```text +level-6: JV Singh +``` + +Before submitting the final PR, confirm: - [x] `seed_graph.py` implemented. - [x] `app.py` implemented with 4 required dashboard pages plus Self-Test. - [x] Required CSV data copied into `data/`. - [x] No real credentials committed. -- [ ] Neo4j Aura/Desktop instance created by the submitter. -- [ ] `python seed_graph.py` run against the submitter's Neo4j database. +- [ ] Neo4j Aura/Desktop instance created. +- [ ] `python seed_graph.py` run against the Neo4j database. +- [ ] `streamlit run app.py` works locally. +- [ ] Local **Self-Test** page passes. 
- [ ] Streamlit Cloud app deployed. +- [ ] Streamlit Cloud secrets added in TOML format. +- [ ] Deployed **Self-Test** page passes. - [ ] `DASHBOARD_URL.txt` updated with the live dashboard URL. -- [ ] Pull request title uses `level-6: JV Singh`. +- [ ] PR title is `level-6: JV Singh`. + +## Troubleshooting + +### `Missing Neo4j credentials` + +Check that `.env` exists in `submissions/jv-singh/level6/` and contains `NEO4J_URI`, `NEO4J_USER`, and `NEO4J_PASSWORD`. + +### Streamlit Cloud cannot connect to Neo4j + +Check Streamlit Cloud **Settings → Secrets**. Secrets must be TOML, not `.env` syntax. Use quotes around each value. + +### Self-Test fails on node or relationship count + +Run `python seed_graph.py` again. The script is idempotent, so rerunning is safe. + +### Self-Test passes locally but fails after deployment + +The deployed app is probably missing secrets or pointing at a different Neo4j database. Add the same credentials in Streamlit Cloud and reboot the app. From 980ab49fd5f8e80184a7368ba2a4fc43930823be Mon Sep 17 00:00:00 2001 From: Jaivardhan singh Date: Sun, 10 May 2026 17:50:28 +0530 Subject: [PATCH 4/7] feat: complete local dashboard and graph setup (secure) --- submissions/jv-singh/level6/.env | 3 + submissions/jv-singh/level6/.gitignore | Bin 0 -> 40 bytes .../jv-singh/level6/.streamlit/secrets.toml | 3 + submissions/jv-singh/level6/app.py | 634 ++++++++++++++++-- 4 files changed, 595 insertions(+), 45 deletions(-) create mode 100644 submissions/jv-singh/level6/.env create mode 100644 submissions/jv-singh/level6/.gitignore create mode 100644 submissions/jv-singh/level6/.streamlit/secrets.toml diff --git a/submissions/jv-singh/level6/.env b/submissions/jv-singh/level6/.env new file mode 100644 index 000000000..6dea1017f --- /dev/null +++ b/submissions/jv-singh/level6/.env @@ -0,0 +1,3 @@ +NEO4J_URI=neo4j+s://2c622e79.databases.neo4j.io +NEO4J_USER=neo4j +NEO4J_PASSWORD=WtKQxv-uetXixhwbds71EDL0Zl_P7Wg2mCm6tFYRGj4 diff --git 
a/submissions/jv-singh/level6/.gitignore b/submissions/jv-singh/level6/.gitignore new file mode 100644 index 0000000000000000000000000000000000000000..5b840cf39a592efde72986c9dd8efc2a854cfb1b GIT binary patch literal 40 pcmezWPmdv$A&;SqftP`cL64!Bp@g9bD3ZvK%aFs63FPU6WdO%i2h{)o literal 0 HcmV?d00001 diff --git a/submissions/jv-singh/level6/.streamlit/secrets.toml b/submissions/jv-singh/level6/.streamlit/secrets.toml new file mode 100644 index 000000000..60eafdcda --- /dev/null +++ b/submissions/jv-singh/level6/.streamlit/secrets.toml @@ -0,0 +1,3 @@ +NEO4J_URI="neo4j+s://2c622e79.databases.neo4j.io" +NEO4J_USER="neo4j" +NEO4J_PASSWORD="WtKQxv-uetXixhwbds71EDL0Zl_P7Wg2mCm6tFYRGj4" diff --git a/submissions/jv-singh/level6/app.py b/submissions/jv-singh/level6/app.py index 6bb578f92..d00cb3584 100644 --- a/submissions/jv-singh/level6/app.py +++ b/submissions/jv-singh/level6/app.py @@ -1,5 +1,10 @@ """Streamlit dashboard for the Level 6 factory Neo4j graph.""" +# ----------------------------------------------------------------------------- +# FACTORY GRAPH DASHBOARD - Level 6 Neo4j & Streamlit +# UI/UX OVERHAUL - All core logic is untouched. 
+# ----------------------------------------------------------------------------- + from __future__ import annotations import os @@ -13,12 +18,360 @@ BASE_DIR = Path(__file__).resolve().parent +# Page config - wide layout, no emoji, clean favicon st.set_page_config( page_title="Factory Graph Dashboard", - page_icon="🏭", + page_icon=None, layout="wide", + initial_sidebar_state="expanded", +) + +# Global CSS injection - dark industrial premium theme +GLOBAL_CSS = """ + +""" + +# Shared Plotly theme dict (Height removed to prevent update_layout conflicts) +PLOTLY_THEME = dict( + template="plotly_dark", + paper_bgcolor="rgba(26,30,42,1)", + plot_bgcolor="rgba(26,30,42,1)", + font=dict(family="DM Mono, monospace", color="#94A3B8", size=11), + colorway=["#F59E0B", "#3B82F6", "#10B981", "#8B5CF6", "#EF4444", "#06B6D4"], + margin=dict(l=16, r=16, t=44, b=16), ) +# Shared hoverlabel style dict +HOVER_STYLE = dict( + bgcolor="#13161E", + bordercolor="#F59E0B", + font=dict(family="DM Mono", size=11, color="#E2E8F0"), +) + +# Shared axis style kwargs +AXIS_STYLE = dict( + xaxis_gridcolor="#1F2435", + xaxis_linecolor="#272C3D", + xaxis_tickfont_family="DM Mono", + xaxis_tickfont_size=10, + yaxis_gridcolor="#1F2435", + yaxis_linecolor="#272C3D", + yaxis_tickfont_family="DM Mono", + yaxis_tickfont_size=10, +) + +# Shared legend style kwargs +LEGEND_STYLE = dict( + legend_bgcolor="rgba(19,22,30,0.9)", + legend_bordercolor="#272C3D", + legend_borderwidth=1, + legend_font_family="DM Mono", + legend_font_size=10, + legend_font_color="#94A3B8", +) + +def card_start() -> None: + st.markdown('
', unsafe_allow_html=True) + +def card_end() -> None: + st.markdown('
', unsafe_allow_html=True) + +def _badge(text: str) -> None: + st.markdown(f'
{text}
', unsafe_allow_html=True) + +def _page_rule() -> None: + st.markdown( + '
', + unsafe_allow_html=True, + ) + +# ----------------------------------------------------------------------------- +# CORE LOGIC +# ----------------------------------------------------------------------------- def get_secret(name: str, default: str | None = None) -> str | None: try: @@ -28,7 +381,6 @@ def get_secret(name: str, default: str | None = None) -> str | None: pass return os.getenv(name, default) - @st.cache_resource(show_spinner=False) def get_driver(): load_dotenv(BASE_DIR / ".env") @@ -39,36 +391,18 @@ def get_driver(): return None return GraphDatabase.driver(uri, auth=(user, password)) - def query_df(driver, cypher: str, **params) -> pd.DataFrame: with driver.session() as session: rows = [dict(record) for record in session.run(cypher, **params)] return pd.DataFrame(rows) - -def render_connection_help() -> None: - st.error("Neo4j credentials are not configured.") - st.markdown( - """ - Add these values either to a local `.env` file or to Streamlit Cloud - **Settings → Secrets** before running the dashboard: - - ```toml - NEO4J_URI = "neo4j+s://xxxxx.databases.neo4j.io" - NEO4J_USER = "neo4j" - NEO4J_PASSWORD = "your-password" - ``` - """ - ) - - def run_self_test(driver): checks = [] try: with driver.session() as session: session.run("RETURN 1") checks.append(("Neo4j connected", True, 3)) - except Exception as exc: # noqa: BLE001 - Streamlit should show the failed check instead of crashing. + except Exception as exc: checks.append((f"Neo4j connected ({exc.__class__.__name__})", False, 3)) return checks @@ -106,9 +440,39 @@ def run_self_test(driver): checks.append((f"Variance query: {len(rows)} results", len(rows) > 0, 5)) return checks +# ----------------------------------------------------------------------------- +# RENDER FUNCTIONS +# ----------------------------------------------------------------------------- + +def render_connection_help() -> None: + st.markdown( + """ +
+
Configuration Required
+

+ Neo4j credentials not configured +

+

+ Add the values below to a local .env file or to Streamlit Cloud + Settings -> Secrets before running the dashboard. +

+
+ """, + unsafe_allow_html=True, + ) + st.code( + 'NEO4J_URI = "neo4j+s://xxxxx.databases.neo4j.io"\n' + 'NEO4J_USER = "neo4j"\n' + 'NEO4J_PASSWORD = "your-password"', + language="toml", + ) def render_project_overview(driver) -> None: + _badge("Module 01") st.header("Project Overview") + _page_rule() + df = query_df( driver, """ @@ -138,15 +502,23 @@ def render_project_overview(driver) -> None: total_planned = df["planned_hours"].sum() total_actual = df["actual_hours"].sum() variance_pct = ((total_actual - total_planned) / total_planned) * 100 if total_planned else 0 + + card_start() col1, col2, col3, col4 = st.columns(4) col1.metric("Projects", len(df)) col2.metric("Planned hours", f"{total_planned:,.1f}") col3.metric("Actual hours", f"{total_actual:,.1f}", f"{variance_pct:+.1f}%") col4.metric("Projects >10% variance", int((df["variance_pct"] > 10).sum())) + card_end() + + st.markdown("
", unsafe_allow_html=True) display = df.copy() display["products"] = display["products"].apply(lambda values: ", ".join(values)) display["stations"] = display["stations"].apply(lambda values: ", ".join(values)) + + card_start() + _badge("Data Table") st.dataframe( display, use_container_width=True, @@ -157,20 +529,38 @@ def render_project_overview(driver) -> None: "actual_hours": st.column_config.NumberColumn("actual_hours", format="%.1f"), }, ) + card_end() + st.markdown("
", unsafe_allow_html=True) + + card_start() + _badge("Visualization") fig = px.bar( df, x="project_name", y=["planned_hours", "actual_hours"], barmode="group", - title="Planned vs actual hours by project", + title="Planned vs Actual Hours by Project", + color_discrete_sequence=["#3B82F6", "#F59E0B"], + ) + + fig.update_layout( + **PLOTLY_THEME, + **AXIS_STYLE, + **LEGEND_STYLE, + xaxis_title="Project", + yaxis_title="Hours", + hoverlabel=HOVER_STYLE, + height=500, ) - fig.update_layout(xaxis_title="Project", yaxis_title="Hours") st.plotly_chart(fig, use_container_width=True) - + card_end() def render_station_load(driver) -> None: + _badge("Module 02") st.header("Station Load") + _page_rule() + df = query_df( driver, """ @@ -188,8 +578,12 @@ def render_station_load(driver) -> None: st.warning("No station load data found.") return + card_start() + _badge("Filter") station_options = ["All stations", *df["station_name"].drop_duplicates().tolist()] selected = st.selectbox("Station filter", station_options) + card_end() + chart_df = df if selected == "All stations" else df[df["station_name"] == selected] melted = chart_df.melt( @@ -198,6 +592,11 @@ def render_station_load(driver) -> None: var_name="metric", value_name="hours", ) + + st.markdown("
", unsafe_allow_html=True) + + card_start() + _badge("Visualization") fig = px.bar( melted, x="week", @@ -207,18 +606,38 @@ def render_station_load(driver) -> None: facet_col_wrap=3, barmode="group", hover_data=["station_code"], - title="Weekly planned vs actual station load", + title="Weekly Planned vs Actual Station Load", + color_discrete_sequence=["#3B82F6", "#F59E0B"], + ) + + fig.update_layout( + **PLOTLY_THEME, + **AXIS_STYLE, + **LEGEND_STYLE, + height=760 if selected == "All stations" else 460, + hoverlabel=HOVER_STYLE, ) - fig.update_layout(height=760 if selected == "All stations" else 460) st.plotly_chart(fig, use_container_width=True) + card_end() + st.markdown("
", unsafe_allow_html=True) + + card_start() + _badge("Overruns") overrun_df = df[df["actual_hours"] > df["planned_hours"]].copy() - st.subheader("Rows where actual hours exceed planned hours") + st.markdown( + f'

' + f'Overrun: {len(overrun_df)} row(s) where actual hours exceed planned

', + unsafe_allow_html=True, + ) st.dataframe(overrun_df, use_container_width=True, hide_index=True) - + card_end() def render_capacity_tracker(driver) -> None: + _badge("Module 03") st.header("Capacity Tracker") + _page_rule() + df = query_df( driver, """ @@ -240,37 +659,61 @@ def render_capacity_tracker(driver) -> None: return deficit_weeks = int(df["is_deficit"].sum()) + + card_start() col1, col2, col3 = st.columns(3) col1.metric("Weeks tracked", len(df)) col2.metric("Deficit weeks", deficit_weeks) col3.metric("Worst deficit", f"{df['deficit'].min():,.0f} hours") + card_end() + + st.markdown("
", unsafe_allow_html=True) + card_start() + _badge("Visualization") fig = px.line( df, x="week", y=["total_capacity", "total_planned"], markers=True, - title="Total capacity vs planned demand", + title="Total Capacity vs Planned Demand", + color_discrete_sequence=["#10B981", "#F59E0B"], ) deficit_points = df[df["deficit"] < 0] fig.add_scatter( x=deficit_points["week"], y=deficit_points["total_planned"], mode="markers", - marker={"color": "red", "size": 13, "symbol": "x"}, + marker={"color": "#EF4444", "size": 13, "symbol": "x"}, name="Deficit week", ) + + fig.update_layout( + **PLOTLY_THEME, + **AXIS_STYLE, + **LEGEND_STYLE, + hoverlabel=HOVER_STYLE, + height=500, + ) st.plotly_chart(fig, use_container_width=True) + card_end() + + st.markdown("
", unsafe_allow_html=True) + card_start() + _badge("Weekly Detail") styled = df.drop(columns=["sort_order"]).style.apply( - lambda row: ["background-color: #ffd6d6" if row["deficit"] < 0 else "" for _ in row], + lambda row: ["background-color: rgba(239,68,68,0.15)" if row["deficit"] < 0 else "" for _ in row], axis=1, ) st.dataframe(styled, use_container_width=True, hide_index=True) - + card_end() def render_worker_coverage(driver) -> None: + _badge("Module 04") st.header("Worker Coverage") + _page_rule() + coverage = query_df( driver, """ @@ -298,48 +741,121 @@ def render_worker_coverage(driver) -> None: return spof = coverage[coverage["certified_workers"] <= 1] + + card_start() col1, col2, col3 = st.columns(3) col1.metric("Workers", len(matrix)) col2.metric("Stations", len(coverage)) - col3.metric("Single-point-of-failure stations", len(spof)) + col3.metric( + "Single-point-of-failure stations", + len(spof), + "Risk" if len(spof) > 0 else None, + delta_color="inverse", + ) + card_end() + + st.markdown("
", unsafe_allow_html=True) + card_start() + _badge("Certification Coverage") coverage_display = coverage.copy() coverage_display["workers"] = coverage_display["workers"].apply(lambda values: ", ".join(values)) styled = coverage_display.style.apply( - lambda row: ["background-color: #ffd6d6" if row["certified_workers"] <= 1 else "" for _ in row], + lambda row: [ + "background-color: rgba(239,68,68,0.15)" if row["certified_workers"] <= 1 else "" + for _ in row + ], axis=1, ) - st.subheader("Station certification coverage") st.dataframe(styled, use_container_width=True, hide_index=True) + card_end() + + st.markdown("
", unsafe_allow_html=True) + card_start() + _badge("Worker <-> Station Matrix") all_stations = sorted(coverage["station_code"].dropna().unique().tolist()) matrix_rows = [] for _, row in matrix.iterrows(): covered = set(row["stations"]) matrix_rows.append({"worker": row["worker"], **{station: station in covered for station in all_stations}}) matrix_df = pd.DataFrame(matrix_rows) - st.subheader("Worker-to-station matrix") st.dataframe(matrix_df, use_container_width=True, hide_index=True) - + card_end() def render_self_test(driver) -> None: + _badge("Module 05") st.header("Self-Test") + _page_rule() st.caption("Automated checks required by the Level 6 challenge.") + checks = run_self_test(driver) earned = 0 possible = sum(points for _, _, points in checks) + + card_start() for label, passed, points in checks: if passed: earned += points - st.markdown(f"{'✅' if passed else '❌'} **{label}** — `{points if passed else 0}/{points}`") + status = "pass" if passed else "fail" + pts_label = f"{points if passed else 0}/{points} pts" + st.markdown( + f""" +
+ + {label} + {pts_label} +
+ """, + unsafe_allow_html=True, + ) + card_end() + + st.markdown("
", unsafe_allow_html=True) st.divider() - st.subheader(f"SELF-TEST SCORE: {earned}/{possible}") + + score_color = "#10B981" if earned == possible else ("#F59E0B" if earned >= possible * 0.6 else "#EF4444") + st.markdown( + f""" +
+ + Self-Test Score +
+ + {earned}/{possible} + +
+ """, + unsafe_allow_html=True, + ) st.progress(earned / possible if possible else 0) +# ----------------------------------------------------------------------------- +# MAIN +# ----------------------------------------------------------------------------- def main() -> None: - st.title("🏭 Factory Graph Dashboard") - st.caption("Level 6: Neo4j knowledge graph + Streamlit dashboard") + st.markdown(GLOBAL_CSS, unsafe_allow_html=True) + + st.markdown( + """ +
+
+

Factory Graph Dashboard

+

+ LEVEL 6 - NEO4J KNOWLEDGE GRAPH - STREAMLIT +

+
+
+
+ """, + unsafe_allow_html=True, + ) + driver = get_driver() if driver is None: render_connection_help() @@ -352,10 +868,38 @@ def main() -> None: "Worker Coverage": render_worker_coverage, "Self-Test": render_self_test, } - selected_page = st.sidebar.radio("Navigation", list(pages.keys())) - st.sidebar.info("All dashboard pages query Neo4j directly; CSV files are used only by seed_graph.py.") - pages[selected_page](driver) + with st.sidebar: + st.markdown( + """ +
+ + Navigation + +
+ """, + unsafe_allow_html=True, + ) + + # FIXED: Added a valid string "Navigation Menu" to fix the empty label accessibility warning + selected_page = st.radio("Navigation Menu", list(pages.keys()), label_visibility="collapsed") + + st.markdown("
", unsafe_allow_html=True) + st.info("All pages query Neo4j directly. CSV files are only used by seed_graph.py.") + + st.markdown( + """ +
+ Factory Graph - Level 6 +
+ """, + unsafe_allow_html=True, + ) + + pages[selected_page](driver) if __name__ == "__main__": - main() + main() \ No newline at end of file From 90b3abf1e71673c0b8566bdbf6015e32c804c3b7 Mon Sep 17 00:00:00 2001 From: Jaivardhan singh Date: Sun, 10 May 2026 21:45:13 +0530 Subject: [PATCH 5/7] fix: added navigation bar horizontally for ease --- submissions/jv-singh/level6/app.py | 189 ++++++++++++++++++----------- 1 file changed, 121 insertions(+), 68 deletions(-) diff --git a/submissions/jv-singh/level6/app.py b/submissions/jv-singh/level6/app.py index d00cb3584..7575d2d6f 100644 --- a/submissions/jv-singh/level6/app.py +++ b/submissions/jv-singh/level6/app.py @@ -2,7 +2,7 @@ # ----------------------------------------------------------------------------- # FACTORY GRAPH DASHBOARD - Level 6 Neo4j & Streamlit -# UI/UX OVERHAUL - All core logic is untouched. +# UI/UX OVERHAUL - Horizontal tab navigation # ----------------------------------------------------------------------------- from __future__ import annotations @@ -16,6 +16,7 @@ from dotenv import load_dotenv from neo4j import GraphDatabase + BASE_DIR = Path(__file__).resolve().parent # Page config - wide layout, no emoji, clean favicon @@ -23,10 +24,10 @@ page_title="Factory Graph Dashboard", page_icon=None, layout="wide", - initial_sidebar_state="expanded", + initial_sidebar_state="collapsed", # Sidebar collapsed — nav is now top tabs ) -# Global CSS injection - dark industrial premium theme +# Global CSS injection - dark industrial premium theme + horizontal tab overrides GLOBAL_CSS = """ """ -# Shared Plotly theme dict (Height removed to prevent update_layout conflicts) +# Shared Plotly theme dict PLOTLY_THEME = dict( template="plotly_dark", paper_bgcolor="rgba(26,30,42,1)", @@ -324,14 +385,12 @@ margin=dict(l=16, r=16, t=44, b=16), ) -# Shared hoverlabel style dict HOVER_STYLE = dict( bgcolor="#13161E", bordercolor="#F59E0B", font=dict(family="DM Mono", size=11, color="#E2E8F0"), ) -# Shared axis style 
kwargs AXIS_STYLE = dict( xaxis_gridcolor="#1F2435", xaxis_linecolor="#272C3D", @@ -343,7 +402,6 @@ yaxis_tickfont_size=10, ) -# Shared legend style kwargs LEGEND_STYLE = dict( legend_bgcolor="rgba(19,22,30,0.9)", legend_bordercolor="#272C3D", @@ -353,15 +411,19 @@ legend_font_color="#94A3B8", ) + def card_start() -> None: st.markdown('
', unsafe_allow_html=True) + def card_end() -> None: st.markdown('
', unsafe_allow_html=True) + def _badge(text: str) -> None: st.markdown(f'
{text}
', unsafe_allow_html=True) + def _page_rule() -> None: st.markdown( '
pd.DataFrame: with driver.session() as session: rows = [dict(record) for record in session.run(cypher, **params)] return pd.DataFrame(rows) + def run_self_test(driver): checks = [] try: with driver.session() as session: session.run("RETURN 1") checks.append(("Neo4j connected", True, 3)) - except Exception as exc: + except Exception as exc: checks.append((f"Neo4j connected ({exc.__class__.__name__})", False, 3)) return checks @@ -440,8 +506,9 @@ def run_self_test(driver): checks.append((f"Variance query: {len(rows)} results", len(rows) > 0, 5)) return checks + # ----------------------------------------------------------------------------- -# RENDER FUNCTIONS +# RENDER FUNCTIONS (all logic unchanged) # ----------------------------------------------------------------------------- def render_connection_help() -> None: @@ -468,6 +535,7 @@ def render_connection_help() -> None: language="toml", ) + def render_project_overview(driver) -> None: _badge("Module 01") st.header("Project Overview") @@ -543,7 +611,6 @@ def render_project_overview(driver) -> None: title="Planned vs Actual Hours by Project", color_discrete_sequence=["#3B82F6", "#F59E0B"], ) - fig.update_layout( **PLOTLY_THEME, **AXIS_STYLE, @@ -556,6 +623,7 @@ def render_project_overview(driver) -> None: st.plotly_chart(fig, use_container_width=True) card_end() + def render_station_load(driver) -> None: _badge("Module 02") st.header("Station Load") @@ -609,7 +677,6 @@ def render_station_load(driver) -> None: title="Weekly Planned vs Actual Station Load", color_discrete_sequence=["#3B82F6", "#F59E0B"], ) - fig.update_layout( **PLOTLY_THEME, **AXIS_STYLE, @@ -633,6 +700,7 @@ def render_station_load(driver) -> None: st.dataframe(overrun_df, use_container_width=True, hide_index=True) card_end() + def render_capacity_tracker(driver) -> None: _badge("Module 03") st.header("Capacity Tracker") @@ -687,7 +755,6 @@ def render_capacity_tracker(driver) -> None: marker={"color": "#EF4444", "size": 13, "symbol": "x"}, 
name="Deficit week", ) - fig.update_layout( **PLOTLY_THEME, **AXIS_STYLE, @@ -709,6 +776,7 @@ def render_capacity_tracker(driver) -> None: st.dataframe(styled, use_container_width=True, hide_index=True) card_end() + def render_worker_coverage(driver) -> None: _badge("Module 04") st.header("Worker Coverage") @@ -783,6 +851,7 @@ def render_worker_coverage(driver) -> None: st.dataframe(matrix_df, use_container_width=True, hide_index=True) card_end() + def render_self_test(driver) -> None: _badge("Module 05") st.header("Self-Test") @@ -832,6 +901,7 @@ def render_self_test(driver) -> None: ) st.progress(earned / possible if possible else 0) + # ----------------------------------------------------------------------------- # MAIN # ----------------------------------------------------------------------------- @@ -839,6 +909,7 @@ def render_self_test(driver) -> None: def main() -> None: st.markdown(GLOBAL_CSS, unsafe_allow_html=True) + # ── Top header (unchanged visual) ──────────────────────────────────────── st.markdown( """
@@ -851,7 +922,7 @@ def main() -> None:
+ border-radius:2px;margin:0.75rem 0 1.5rem 0;"> """, unsafe_allow_html=True, ) @@ -861,45 +932,27 @@ def main() -> None: render_connection_help() return - pages = { - "Project Overview": render_project_overview, - "Station Load": render_station_load, - "Capacity Tracker": render_capacity_tracker, - "Worker Coverage": render_worker_coverage, - "Self-Test": render_self_test, - } - - with st.sidebar: - st.markdown( - """ -
- - Navigation - -
- """, - unsafe_allow_html=True, - ) - - # FIXED: Added a valid string "Navigation Menu" to fix the empty label accessibility warning - selected_page = st.radio("Navigation Menu", list(pages.keys()), label_visibility="collapsed") - - st.markdown("
", unsafe_allow_html=True) - st.info("All pages query Neo4j directly. CSV files are only used by seed_graph.py.") - - st.markdown( - """ -
- Factory Graph - Level 6 -
- """, - unsafe_allow_html=True, - ) + # ── Horizontal tab navigation ───────────────────────────────────────────── + tab_labels = [ + "01 Project Overview", + "02 Station Load", + "03 Capacity Tracker", + "04 Worker Coverage", + "05 Self-Test", + ] + tab_renderers = [ + render_project_overview, + render_station_load, + render_capacity_tracker, + render_worker_coverage, + render_self_test, + ] + + tabs = st.tabs(tab_labels) + for tab, renderer in zip(tabs, tab_renderers): + with tab: + renderer(driver) - pages[selected_page](driver) if __name__ == "__main__": main() \ No newline at end of file From d09eb535512e6e7c977de01729a32b248237ce48 Mon Sep 17 00:00:00 2001 From: Jaivardhan singh Date: Sun, 10 May 2026 21:47:31 +0530 Subject: [PATCH 6/7] chore: add real deployed dashboard URL --- submissions/jv-singh/level6/DASHBOARD_URL.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/submissions/jv-singh/level6/DASHBOARD_URL.txt b/submissions/jv-singh/level6/DASHBOARD_URL.txt index 77a54a269..80fb58a08 100644 --- a/submissions/jv-singh/level6/DASHBOARD_URL.txt +++ b/submissions/jv-singh/level6/DASHBOARD_URL.txt @@ -1 +1 @@ -TODO: deploy to Streamlit Community Cloud and replace this line with https://your-app.streamlit.app +https://lpi-developer-kit-6ljqxbc89s5efpisjg78lp.streamlit.app \ No newline at end of file From e82960dfbd93ca1470eabab7bda0aa4550cc0e36 Mon Sep 17 00:00:00 2001 From: JAIVARDHAN Date: Sun, 24 May 2026 22:02:23 +0530 Subject: [PATCH 7/7] Remove committed Level 6 secret files and ignore them --- .gitignore | 4 ++++ submissions/jv-singh/level6/.env | 3 --- submissions/jv-singh/level6/.streamlit/secrets.toml | 3 --- 3 files changed, 4 insertions(+), 6 deletions(-) delete mode 100644 submissions/jv-singh/level6/.env delete mode 100644 submissions/jv-singh/level6/.streamlit/secrets.toml diff --git a/.gitignore b/.gitignore index a4db0d416..17e2c0d98 100644 --- a/.gitignore +++ b/.gitignore @@ -3,3 +3,7 @@ dist/ *.tsbuildinfo .DS_Store 
docs/intern-data/ + +# Level 6 local secrets +submissions/jv-singh/level6/.env +submissions/jv-singh/level6/.streamlit/secrets.toml diff --git a/submissions/jv-singh/level6/.env b/submissions/jv-singh/level6/.env deleted file mode 100644 index 6dea1017f..000000000 --- a/submissions/jv-singh/level6/.env +++ /dev/null @@ -1,3 +0,0 @@ -NEO4J_URI=neo4j+s://2c622e79.databases.neo4j.io -NEO4J_USER=neo4j -NEO4J_PASSWORD=WtKQxv-uetXixhwbds71EDL0Zl_P7Wg2mCm6tFYRGj4 diff --git a/submissions/jv-singh/level6/.streamlit/secrets.toml b/submissions/jv-singh/level6/.streamlit/secrets.toml deleted file mode 100644 index 60eafdcda..000000000 --- a/submissions/jv-singh/level6/.streamlit/secrets.toml +++ /dev/null @@ -1,3 +0,0 @@ -NEO4J_URI="neo4j+s://2c622e79.databases.neo4j.io" -NEO4J_USER="neo4j" -NEO4J_PASSWORD="WtKQxv-uetXixhwbds71EDL0Zl_P7Wg2mCm6tFYRGj4"
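
Note on PATCH 7/7: once `.env` and `.streamlit/secrets.toml` are ignored, the three connection values have to reach the app from outside the repository. The following is a minimal sketch of that idea, assuming the values are supplied as environment variables under the same names the deleted files used (`NEO4J_URI`, `NEO4J_USER`, `NEO4J_PASSWORD`). The helper names `load_neo4j_settings` and `redact` are hypothetical and are not taken from `app.py`; how the real `get_driver` reads its configuration is not shown in this patch.

```python
import os
from typing import Mapping, Optional

# Variable names taken from the deleted .env / secrets.toml files.
REQUIRED_KEYS = ("NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD")


def load_neo4j_settings(env: Optional[Mapping[str, str]] = None) -> Optional[dict]:
    """Return the three connection settings, or None if any is missing or blank.

    Hypothetical helper: `env` defaults to os.environ, and can be replaced
    with a plain dict in tests.
    """
    source = os.environ if env is None else env
    settings = {key: (source.get(key) or "").strip() for key in REQUIRED_KEYS}
    if not all(settings.values()):
        return None
    return settings


def redact(settings: Mapping[str, str]) -> dict:
    """Return a copy that is safe to log: the password value is masked."""
    return {
        key: ("***" if key == "NEO4J_PASSWORD" else value)
        for key, value in settings.items()
    }


if __name__ == "__main__":
    loaded = load_neo4j_settings()
    if loaded is None:
        print("Neo4j settings are incomplete; set", ", ".join(REQUIRED_KEYS))
    else:
        print(redact(loaded))
```

Returning `None` instead of raising mirrors the way `main()` falls back to `render_connection_help()` when no driver is available; a caller could treat a `None` result the same way.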