Madhvansh Choksi — building production AI systems for industrial water treatment.
I don't make demos. I make systems that run 24/7 on real cooling towers, replacing chemical dosing decisions that used to require a human operator staring at a SCADA screen.
Currently shipping TGF — an autonomous cooling tower control system that predicts water chemistry 6–24 hours ahead and optimizes chemical dosing across 7 simultaneous treatment chemicals using Model Predictive Control.
The short version: Nalco and ChemTreat charge $50K+/year to reactively dose 1–2 chemicals. TGF predicts the future and optimizes all 7 at once — delivering 15–30% chemical savings with zero critical failures.
Industrial cooling towers waste billions of dollars annually on chemical treatment because dosing is reactive: operators wait for pH to drift, then dump chemicals. Meanwhile, scale fouls heat exchangers, biofouling clogs fills, and plants lose efficiency.
Every major vendor (Nalco/Ecolab, ChemTreat, Solenis) sells the same thing, a fluorescent tracer that measures one chemical accurately, and then charges you for the privilege of vendor lock-in.

    Sensors (pH, Conductivity, Temp, ORP)
               │
               ▼
    ┌───────────────────────┐     ┌────────────────────────────┐
    │ Chronos-2 Forecaster  │────▶│ Statistical Fallback       │
    │ (Zero-shot p10/50/90) │     │ (when GPU unavailable)     │
    └──────────┬────────────┘     └────────────────────────────┘
               │
               ▼
    ┌───────────────────────┐     ┌────────────────────────────┐
    │ Physics Engine        │◀───▶│ Chemical Residual Tracker  │
    │ (LSI/RSI/CoC/Risk)    │     │ (Mass balance × 7 chems)   │
    └──────────┬────────────┘     └───────────┬────────────────┘
               │                              │
               ▼                              ▼
    ┌──────────────────────────────────────────────┐
    │  MPC Dosing Optimizer                        │
    │  scipy L-BFGS-B · 2-hour receding horizon    │
    │  Cost = chemical_cost + risk_penalties       │
    └─────────────────┬────────────────────────────┘
                      │
                      ▼
    ┌──────────────────────────────────────────────┐
    │  Safety Layer                                │
    │  Sensor fault detection · Hard limits        │
    │  Rate limiting · PID backup · Emergency stop │
    └─────────────────┬────────────────────────────┘
                      │
                      ▼
           Pump Commands + Blowdown

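
To make the last stage of the pipeline concrete, here is a minimal sketch of how a safety layer could post-process an optimizer request before it reaches a pump. It is not the TGF implementation: `PumpLimits`, `sensor_ok`, `safe_command`, and all numeric values are hypothetical names and numbers chosen for illustration, and the `backup` argument stands in for whatever the PID backup controller would output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PumpLimits:
    """Hypothetical per-pump limits; real values would come from plant config."""
    max_rate: float  # hard upper limit on the dosing rate (e.g. mL/min)
    max_step: float  # largest allowed change between consecutive control cycles


def sensor_ok(ph: float) -> bool:
    """Crude plausibility check: NaN or out-of-range pH counts as a sensor fault."""
    return ph == ph and 0.0 <= ph <= 14.0  # `ph == ph` is False for NaN


def safe_command(requested: float, previous: float, limits: PumpLimits,
                 ph: float, backup: float) -> float:
    """Return the dosing rate that would actually be sent to the pump."""
    # On a sensor fault, ignore the optimizer and use the backup controller output.
    target = requested if sensor_ok(ph) else backup
    # Rate limiting: move at most max_step away from the previous command.
    lowest = previous - limits.max_step
    highest = previous + limits.max_step
    target = min(max(target, lowest), highest)
    # Hard limits are applied last so that they always win.
    return min(max(target, 0.0), limits.max_rate)


if __name__ == "__main__":
    limits = PumpLimits(max_rate=10.0, max_step=2.0)
    print(safe_command(9.0, 5.0, limits, ph=7.8, backup=4.0))
    print(safe_command(9.0, 5.0, limits, ph=float("nan"), backup=4.0))
```

Applying the hard limits after the rate limiter means no combination of inputs can push the command outside `[0, max_rate]`. An emergency stop is not modeled here.
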
| Metric | Value |
|---|---|
| LSI in optimal range | 86.2% |
| CRITICAL risk cycles | 0.0% |
| Preemptive decisions | 78% (forecast-driven, not reactive) |
| Chemical adequacy | 52–86% across all 7 chemicals |
| Risk profile | 47.5% LOW · 44.7% MODERATE · 7.8% HIGH |
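
For readers unfamiliar with the LSI metric in the table above, the sketch below computes the Langelier and Ryznar indices using the widely cited simplified approximation for saturation pH. Whether the TGF physics engine uses this exact approximation is an assumption; the function names and the sample water analysis are illustrative only.

```python
import math


def saturation_ph(temp_c: float, tds_mg_l: float,
                  ca_hardness_mg_l: float, alkalinity_mg_l: float) -> float:
    """Saturation pH (pHs); hardness and alkalinity are in mg/L as CaCO3."""
    a = (math.log10(tds_mg_l) - 1.0) / 10.0            # dissolved solids term
    b = -13.12 * math.log10(temp_c + 273.0) + 34.55    # temperature term
    c = math.log10(ca_hardness_mg_l) - 0.4             # calcium hardness term
    d = math.log10(alkalinity_mg_l)                    # alkalinity term
    return (9.3 + a + b) - (c + d)


def langelier_index(ph: float, phs: float) -> float:
    """LSI = pH - pHs. Positive suggests scaling, negative suggests corrosion."""
    return ph - phs


def ryznar_index(ph: float, phs: float) -> float:
    """RSI = 2 * pHs - pH."""
    return 2.0 * phs - ph


if __name__ == "__main__":
    # Illustrative water analysis, not plant data.
    phs = saturation_ph(temp_c=25.0, tds_mg_l=320.0,
                        ca_hardness_mg_l=150.0, alkalinity_mg_l=34.0)
    print(round(langelier_index(7.5, phs), 2), round(ryznar_index(7.5, phs), 2))
```

For the sample values shown, the LSI comes out at about -0.73, which indicates undersaturated water with a corrosive tendency.
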

| | Nalco TRASAR | ChemTreat | TGF |
|---|---|---|---|
| Chemicals tracked | 1–2 (fluorescent) | 1 (tracer) | All 7 (mass balance) |
| Prediction | None (reactive) | None (reactive) | 6–24h ahead (Chronos-2) |
| Dosing strategy | Threshold-based | Threshold-based | MPC-optimized |
| Multi-vendor | ❌ Locked in | ❌ Locked in | ✅ Configurable |
| Cost optimization | ❌ | ❌ | ✅ INR-minimizing |
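
The "mass balance" tracking in the comparison above can be illustrated with a single-chemical residual model. This is a simplified sketch under stated assumptions, not the TGF tracker: it assumes a well-mixed basin, a constant first-order decay rate, and an explicit Euler step, and every name and number in it is hypothetical.

```python
def step_residual(conc_mg_l: float, dose_mg_per_h: float, volume_l: float,
                  blowdown_l_per_h: float, decay_per_h: float, dt_h: float) -> float:
    """Advance the estimated residual concentration by one time step.

    dC/dt = dose / V - (blowdown / V + k) * C
    """
    added = dose_mg_per_h / volume_l
    removed = (blowdown_l_per_h / volume_l + decay_per_h) * conc_mg_l
    # Concentrations cannot go negative, so clamp at zero.
    return max(0.0, conc_mg_l + (added - removed) * dt_h)


def steady_state(dose_mg_per_h: float, volume_l: float,
                 blowdown_l_per_h: float, decay_per_h: float) -> float:
    """Concentration at which dosing exactly balances blowdown and decay."""
    return dose_mg_per_h / (blowdown_l_per_h + decay_per_h * volume_l)


if __name__ == "__main__":
    conc = 0.0
    for _ in range(2000):  # 500 simulated hours at 15-minute steps
        conc = step_residual(conc, dose_mg_per_h=2000.0, volume_l=10000.0,
                             blowdown_l_per_h=100.0, decay_per_h=0.01, dt_h=0.25)
    print(conc, steady_state(2000.0, 10000.0, 100.0, 0.01))
```

In a setup like this, a periodic lab measurement would simply overwrite the running estimate, which keeps model drift bounded between calibrations.
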
- MPC over RL (SAC/PPO): Hard safety constraints are guaranteed, not learned. Works with 5K samples. Explainable to plant operators.
- Chronos-2 over PatchTST: Zero-shot works immediately on new towers. PatchTST needs fine-tuning on data we don't have yet.
- Mass balance over virtual sensors: We tried ML-based virtual sensors for hardness/alkalinity — R²=0.37 was too unreliable. Physics-based mass balance with weekly lab calibration is honest engineering.
- Statistical fallback always ready: Every Chronos-2 call has a Holt-Winters backup. No single point of failure.
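
The fallback pattern from the last bullet can be sketched as a wrapper that tries the primary forecaster and drops to a statistical model on any failure. To stay dependency-free, this sketch swaps in Holt's linear (double exponential) smoothing, which has no seasonal term, rather than full Holt-Winters. `forecast_with_fallback`, `holt_forecast`, and the smoothing constants are hypothetical, and the `primary` callable merely stands in for a Chronos-2 call.

```python
from typing import Callable, List, Sequence

Forecaster = Callable[[Sequence[float], int], Sequence[float]]


def holt_forecast(history: Sequence[float], horizon: int,
                  alpha: float = 0.5, beta: float = 0.3) -> List[float]:
    """Holt's linear smoothing: track a level and a trend, then extrapolate."""
    if len(history) < 2:
        raise ValueError("need at least two observations")
    level = history[1]
    trend = history[1] - history[0]
    for y in history[2:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + (h + 1) * trend for h in range(horizon)]


def forecast_with_fallback(history: Sequence[float], horizon: int,
                           primary: Forecaster) -> List[float]:
    """Use the primary model if it works, otherwise the statistical backup."""
    try:
        return list(primary(history, horizon))
    except Exception:
        # Any failure (missing GPU, timeout, bad output) lands here.
        return holt_forecast(history, horizon)


if __name__ == "__main__":
    def broken_model(history: Sequence[float], horizon: int) -> Sequence[float]:
        raise RuntimeError("GPU unavailable")

    print(forecast_with_fallback([7.6, 7.7, 7.8, 7.9], 3, broken_model))
```

Catching a broad `Exception` is deliberate in this sketch: the control loop needs some forecast every cycle, so any primary failure should degrade to the backup rather than halt dosing.
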
tech_stack = {
"ai_ml": ["PyTorch", "Chronos-2", "MOMENT", "TransNAS", "scikit-learn"],
"optimization": ["scipy (L-BFGS-B)", "Model Predictive Control"],
"backend": ["FastAPI", "SQLite (WAL)", "uvicorn"],
"physics": ["Langelier SI", "Ryznar SI", "Arrhenius decay", "Mass balance"],
"infra": ["Real-time dashboards", "Alert systems", "Sensor simulation"],
"research": ["State Space Models (S4/Mamba)", "CoreML", "Time-series AD"],
"languages": ["Python", "JavaScript", "Java", "SQL", "HTML/CSS"],
}

| Project | What it does | Stack |
|---|---|---|
| Cooling Tower Dashboard | Production monitoring dashboard for 15 cooling towers at Atul Ltd. Real-time status, priority-based issue tracking, and Chart.js visualizations. Deployed on Vercel and Netlify. | HTML · CSS · JS · Chart.js |
| SAiDL Spring 2025 | Research assignments — State Space Models (S4/Mamba), CoreML exploration | PyTorch · Jupyter |
| DSA | Data structures & algorithms | — |
| OOPs | Object-oriented programming patterns | Java |
- MOMENT foundation model integration — reconstruction-based anomaly detection plugged into the TGF control loop (architecture is ready, awaiting model fine-tuning)
- Multi-tower orchestration — coordinated dosing across cooling tower farms sharing blowdown/makeup water
- Edge deployment — running the full MPC stack on industrial edge hardware (Raspberry Pi + sensor HATs for proof-of-concept)
I use Claude extensively in my development workflow — from debugging MPC cost functions to exploring Chronos-2 integration patterns to writing the physics engine's Arrhenius decay model. The TGF codebase is deeply intertwined with Claude-assisted development.
What Claude Max would unlock:
- Faster iteration on the MOMENT anomaly detection integration
- Multi-file refactoring across the 20+ module codebase
- Exploring novel MPC formulations with longer planning horizons
- Writing comprehensive test suites for safety-critical dosing logic
This isn't a side project. It's production software targeting a $10B+ industrial water treatment market where the incumbents haven't innovated in decades. Claude is the only AI assistant that can reason about the physics and the code simultaneously.
