Demand Forecasting Agent

An AI agent that acts as an intelligent supply chain analyst. Given a retail dataset, it autonomously explores data, engineers features, trains forecasting models, and answers natural language questions about demand — combining machine learning fundamentals with modern AI agent architecture.

Overview

The Problem

Supply chain planning depends on accurate demand forecasts. Traditional approaches require analysts to manually load data, run models, interpret metrics, and generate reports — a time-consuming pipeline that doesn't scale.

The Solution

Instead of a static forecasting script, this project implements a reasoning agent that decides what to do based on the question asked. Ask "which products are most volatile?" and it runs a volatility analysis. Ask "what if demand spikes 30%?" and it simulates the scenario with inventory impact calculations. The agent chains multiple tools together when needed — finding the hardest product to forecast, predicting its demand, and generating a chart, all from a single request.

The agent is also backed by a RAG pipeline: it can retrieve product descriptions and client policy documents from a vector store to ground its answers in domain context, not just model outputs.

Architecture

User (HTTP request or terminal)
            │
            ▼
┌───────────────────────────────┐
│        FastAPI (api.py)       │  ← REST endpoint (POST /ask)
│   Receives JSON { question }  │
└──────────────┬────────────────┘
               │
               ▼
┌───────────────────────────────┐
│      Agent (LangGraph)        │
│   Claude LLM ←→ Agent State   │
│   Reasoning + tool selection  │
└──────┬────────────────────────┘
       │ tool calls
   ┌───┼──────────────────────────────┐
   ▼   ▼         ▼         ▼         ▼
┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────────┐
│ Data │ │Fore- │ │Analy-│ │ Viz  │ │  RAG     │
│Tools │ │cast  │ │sis   │ │Tools │ │  Tool    │
│      │ │Tools │ │Tools │ │      │ │          │
└──┬───┘ └──┬───┘ └──┬───┘ └──┬───┘ └────┬─────┘
   └────────┴─────────┴────────┘          │
            │                             ▼
   ┌────────┴──────────┐     ┌────────────────────┐
   │   ML Pipeline     │     │  ChromaDB          │
   │  (pure Python)    │     │  Vector Store      │
   │                   │     │                    │
   │  data_loader      │     │  Product docs      │
   │  feature_eng.     │     │  Policy docs       │
   │  model (XGBoost)  │     │  all-MiniLM-L6-v2  │
   │  visualizations   │     └────────────────────┘
   └───────────────────┘

Key design decisions:

ML pipeline is framework-agnostic. model.py and feature_engineering.py are pure Python with pandas/scikit-learn/XGBoost — zero dependency on LangChain. The tools layer is a thin wrapper. If the agent framework changes, only the wrapper needs updating.
RAG is decoupled from the ML pipeline. ChromaDB and sentence-transformers run independently. The agent decides when to call the RAG tool based on the question; it doesn't run on every request.
Model caching in agent state. The trained XGBoost model is cached in memory after the first training call. Subsequent prediction and analysis requests reuse the cached model instantly instead of retraining.
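
The caching decision can be sketched in a few lines. This is a minimal illustration, not the project's code: `AgentState`, `get_or_train_model`, and `fake_train` are hypothetical names, and the only behaviour taken from the text above is that the model is trained once and then reused from memory.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class AgentState:
    """Hypothetical shared state holding the cached model."""
    model: Optional[Any] = None
    train_calls: int = 0


def get_or_train_model(state: AgentState, train_fn: Callable[[], Any]) -> Any:
    """Return the cached model, training only when the cache is empty."""
    if state.model is None:
        state.model = train_fn()
        state.train_calls += 1
    return state.model


def fake_train() -> dict:
    # Stand-in for the real XGBoost training routine (assumption).
    return {"name": "stub-model"}


state = AgentState()
first = get_or_train_model(state, fake_train)   # trains
second = get_or_train_model(state, fake_train)  # served from the cache
```

Because the tools only call a helper like this, the ML code itself never needs to know that an agent framework exists.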

Demos

Core Workflow

End-to-end pipeline: load data → train model → predict → visualize.

📹 Watch demo

Analytical Investigation

A real supply chain investigation: identify volatile products → compare across stores → simulate demand spike → visualize.

📹 Watch demo

Autonomous Tool Chaining

One prompt, three chained tool calls. The agent finds the hardest product to forecast, predicts its demand, and generates the chart — all from a single ambiguous request.

📹 Watch demo

Docker Demo

End-to-end HTTP workflow: query the agent via FastAPI → RAG retrieval grounds the answer in product context → what-if simulation with chart.

📹 Watch demo

Features

Data exploration — Loads and summarizes 913K rows of retail sales data (10 stores × 50 items × 5 years)
Automated feature engineering — Generates 24 features: time-based (day of week, month, quarter), lag features (1/7/14/28 day), rolling statistics (mean and std over 7/14/30 day windows), and holiday indicators
XGBoost forecasting — Trains with early stopping, time-based train/test split, and per-item evaluation
Natural language Q&A — Ask questions like "predict demand for item 5 in store 1" or "which products are hardest to forecast?"
What-if simulation — Simulate demand spikes and see inventory shortfall impact
Store comparison — Compare demand patterns across locations for inventory allocation
RAG-powered context — Retrieve product descriptions and client policy documents via semantic search (ChromaDB + sentence-transformers)
REST API — Query the agent over HTTP via FastAPI (POST /ask)
Visualization — Generates sales trends, forecast vs actual charts, weekly patterns, volatility rankings, feature importance, demand distributions, and store comparisons
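
Two of the analysis features listed above, volatility ranking and what-if simulation, reduce to simple arithmetic. The sketch below is one plausible reading, not the project's implementation: it assumes volatility is measured by the coefficient of variation (the CV reported in the example conversation) and that a shortfall is spiked demand minus planned stock. The function names and all numbers are made up for illustration.

```python
from statistics import mean, pstdev


def coefficient_of_variation(sales):
    """CV = standard deviation / mean; higher means more volatile demand."""
    return pstdev(sales) / mean(sales)


def simulate_spike(forecast, stock, spike_pct):
    """Apply a percentage spike to the forecast and compare it with stock.

    `stock` is the planned inventory per day (an illustrative input; the
    project's inventory logic is not documented here).
    """
    spiked = [f * (1 + spike_pct / 100) for f in forecast]
    shortfalls = [max(d - s, 0.0) for d, s in zip(spiked, stock)]
    understocked_days = sum(1 for x in shortfalls if x > 0)
    return {
        "total_shortfall": sum(shortfalls),
        "understocked_share": understocked_days / len(forecast),
    }


daily_sales = [10, 12, 8, 14, 6]  # made-up numbers
cv = coefficient_of_variation(daily_sales)

result = simulate_spike(
    forecast=[20, 20, 20, 20],
    stock=[25, 25, 35, 25],
    spike_pct=50,
)
```

With these inputs, three of the four days fall short of the spiked demand, which is the kind of summary the agent reports back in natural language.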

Tech Stack

Component	Technology	Why
Language	Python 3.10+	Industry standard for ML
ML Model	XGBoost	Best performance on structured tabular data with engineered features
Data	pandas, NumPy	Standard data manipulation
Evaluation	scikit-learn	MAE, RMSE, MAPE metrics
Agent Framework	LangGraph	Explicit state management, stable API, replaces deprecated AgentExecutor
LLM	Claude (Anthropic)	Strong reasoning and tool-calling capabilities
Visualization	matplotlib	Reliable static chart generation
Features	holidays	US holiday calendar for demand signals
RAG	ChromaDB + sentence-transformers	Local vector store, no external API needed
API	FastAPI + Uvicorn	Lightweight async REST framework
Containerization	Docker + docker-compose	Reproducible one-command deployment

Tech Decisions

XGBoost over LSTM/Neural Networks: Tree-based models often match or outperform deep learning on structured tabular data with hand-crafted features. XGBoost trains in seconds on this dataset, provides interpretable feature importances, and doesn't require normalization or sequence windowing. LSTMs would add significant complexity for marginal benefit on this data type.

Time-based split over random split: The train/test split is by date (train up to Sept 2017, test Oct-Dec 2017), not random. Random splitting would leak future information into training data, invalidating the evaluation entirely.
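
A date-based split of this kind can be sketched with pandas. The helper name `time_based_split` and the toy frame are assumptions; the cutoff of 2017-10-01 is inferred from "train up to Sept 2017, test Oct-Dec 2017".

```python
import pandas as pd


def time_based_split(df, date_col, cutoff):
    """Rows before `cutoff` go to train, rows on or after it go to test."""
    cutoff = pd.Timestamp(cutoff)
    train = df[df[date_col] < cutoff]
    test = df[df[date_col] >= cutoff]
    return train, test


# Toy frame covering the last five months of the dataset's date range.
dates = pd.date_range("2017-08-01", "2017-12-31", freq="D")
df = pd.DataFrame({"date": dates, "sales": range(len(dates))})

train, test = time_based_split(df, "date", "2017-10-01")
```

Every training row is strictly earlier than every test row, so no feature computed on the training set can see test-period sales.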

Shift inside groupby transform: Rolling features use .transform(lambda x: x.rolling(...).mean().shift(1)) with the shift inside the transform. Placing shift outside would cause cross-group leakage — the first row of one product would incorrectly use another product's last rolling value.
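
The difference is easy to see on a toy frame. The column names, the two-row window, and the data below are illustrative assumptions, and the frame is assumed to be sorted by group and then by date.

```python
import pandas as pd

df = pd.DataFrame({
    "item": ["A", "A", "A", "B", "B", "B"],
    "sales": [10.0, 20.0, 30.0, 100.0, 200.0, 300.0],
})

# Shift inside the transform: each group is shifted on its own,
# so the first row of every group has no history (NaN).
df["roll_ok"] = df.groupby("item")["sales"].transform(
    lambda x: x.rolling(2, min_periods=1).mean().shift(1)
)

# Shift outside the transform: the shift runs over the whole column,
# so the first row of item B inherits the last rolling value of item A.
df["roll_leaky"] = (
    df.groupby("item")["sales"]
    .transform(lambda x: x.rolling(2, min_periods=1).mean())
    .shift(1)
)
```

In `roll_leaky`, the first row of item B holds item A's final rolling mean instead of a missing value, which is exactly the cross-group leakage described above.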

ChromaDB over a hosted vector DB: All embeddings are computed locally with all-MiniLM-L6-v2 (sentence-transformers). No external API key, no network dependency, persisted to disk. The right choice for a self-contained portfolio project.
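
The retrieval step itself is a nearest-neighbour search over embeddings. The sketch below is a dependency-free stand-in, not the project's code: it swaps all-MiniLM-L6-v2 for plain word counts and ChromaDB for a Python list, purely to show the shape of a similarity search. The function names and the policy sentence are invented.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(query, docs, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


docs = [
    "Item 4 is a punnet of fresh strawberries, 400g.",
    "Policy: perishable goods are reordered every two days.",  # invented
]
top = search("what do you know about item 4", docs)
```

A real embedding model also matches paraphrases that share no words with the query, which word counts cannot do.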

FastAPI over Flask: Async by default, automatic OpenAPI docs at /docs, and native Pydantic validation. Lower boilerplate for a single-endpoint agent wrapper.

Dataset

Store Item Demand Forecasting from Kaggle.

10 stores × 50 items × 1,826 days = 913,000 rows
Daily sales data from 2013-01-01 to 2017-12-31
Clean data with zero missing values
No built-in features (price, promotions) — all 24 features are engineered
Item descriptions are simulated: the Kaggle competition provides no metadata, so product categories and policy documents were generated to demonstrate RAG retrieval on realistic supply chain text
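
The row count follows directly from the date range, which makes it a cheap sanity check after loading the file. A small sketch using only the figures stated above (the variable names are illustrative):

```python
from datetime import date

start, end = date(2013, 1, 1), date(2017, 12, 31)
n_days = (end - start).days + 1  # inclusive of both endpoints
n_stores, n_items = 10, 50

# One row per store, item, and day.
expected_rows = n_stores * n_items * n_days
```

The range includes one leap year (2016), which accounts for 1,826 days rather than 1,825.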

Results

Metric	Value	Meaning
MAE	5.93	Forecast is off by ~6 units on average
RMSE	7.68	Typical error with large misses penalized more
MAPE	13.0%	Average percentage error (acceptable for retail)
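
For reference, the three metrics in the table are defined as follows. The Tech Stack lists scikit-learn for evaluation; the pure-Python versions below are only a definitional sketch with made-up numbers, and skipping zero actuals in MAPE is an assumption.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error; squaring penalizes large misses more."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mape(actual, predicted):
    """Mean absolute percentage error in percent, skipping zero actuals."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


actual = [50, 40, 60, 50]      # made-up numbers
predicted = [45, 44, 66, 50]

scores = {
    "MAE": mae(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "MAPE": mape(actual, predicted),
}
```

RMSE is never smaller than MAE on the same data, and a wide gap between the two indicates a few large misses.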

Top features by importance: sales_rolling_mean_7 (34%), sales_rolling_mean_14 (26%), sales_lag_7 (25%) — confirming strong weekly seasonality and recent trend dependence.

Project Structure

demand-forecasting-agent/
│
├── main.py                      # Entry point — run the agent in terminal
├── api.py                       # FastAPI endpoint wrapping the agent
├── config.py                    # All hyperparameters and paths
├── docker-compose.yml           # One-command Docker setup
├── Dockerfile
├── requirements.txt
├── .env.example                 # API key template
│
├── src/
│   ├── __init__.py
│   ├── data_loader.py           # Load and validate raw data
│   ├── feature_engineering.py   # Feature creation (lags, rolling, holidays)
│   ├── model.py                 # XGBoost training, prediction, evaluation
│   ├── visualizations.py        # Chart generation
│   │
│   ├── tools/                   # Agent tool wrappers (@tool decorator)
│   │   ├── __init__.py          # Tool registry (all_tools list)
│   │   ├── data_tools.py        # explore_dataset, get_item_details
│   │   ├── forecast_tools.py    # train_forecast_model, predict_demand
│   │   ├── analysis_tools.py    # find_volatile, simulate_spike, compare_stores
│   │   ├── viz_tools.py         # Chart generation tools
│   │   └── rag_tool.py          # @tool wrapper — exposes similarity search to the agent
│   │
│   ├── agent/
│   │   ├── __init__.py          # Shared AgentState (model cache, data cache)
│   │   ├── graph.py             # LangGraph ReAct agent definition
│   │   └── prompts.py           # System prompt for the LLM
│   │
│   └── rag/
│       ├── __init__.py
│       ├── documents.py         # Simulated product descriptions and client policy docs
│       └── vector_store.py      # ChromaDB setup, document ingestion, similarity search
│
├── tests/
│   ├── test_data_loader.py
│   ├── test_feature_engineering.py
│   ├── test_model.py
│   ├── test_visualizations.py
│   ├── test_tools.py
│   └── test_agent.py
│
├── data/
│   ├── raw/                     # Place train.csv here
│   └── chroma_db/               # ChromaDB persisted vector store (auto-created)
└── outputs/                     # Agent-generated charts

How to Run

Prerequisites

An Anthropic API key (get one here)
train.csv from the Kaggle competition placed in data/raw/

git clone https://github.com/Massi99RM/demand-forecasting-agent.git
cd demand-forecasting-agent
cp .env.example .env
# Edit .env and add your ANTHROPIC_API_KEY

Option 1: Docker (recommended)

No Python environment setup needed. Requires Docker Desktop.

docker compose up

The API will be available at http://localhost:8000. Query the agent:

curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "What data do we have?"}'
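
The same request can be built from Python with the standard library. This is a sketch under the assumptions of the curl example (endpoint `/ask`, JSON body with a `question` field); the helper name `build_ask_request` is hypothetical, and the response format is not documented here, so the example stops short of parsing one.

```python
import json
import urllib.request


def build_ask_request(question, base_url="http://localhost:8000"):
    """Build (but do not send) the POST /ask request from the curl example."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_ask_request("What data do we have?")

# To send it, the container must be running:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```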

Interactive API docs available at http://localhost:8000/docs.

Note: docker-compose.yml mounts data/raw, data/chroma_db, and outputs automatically. Make sure train.csv is in data/raw/ before running.

Option 2: Terminal (for development)

pip install -r requirements.txt
python main.py

The agent starts an interactive chat session in the terminal.

Run Tests (no API key needed)

python tests/test_data_loader.py
python tests/test_feature_engineering.py
python tests/test_model.py
python tests/test_visualizations.py
python tests/test_tools.py
python tests/test_agent.py

Example Conversation

📊 You: What data do we have?
🤖 Agent: Here's what we're working with:
   - 5 years of daily sales data (2013-2017)
   - 10 stores × 50 items = 500 time series
   - 913,000 total records
   - Average daily sales: 52 units per item

📊 You: Train a forecasting model
🤖 Agent: Model trained successfully!
   - MAE: 5.93 units (forecast off by ~6 units on average)
   - MAPE: 13.0% (acceptable for retail forecasting)

📊 You: Which products are most volatile?
🤖 Agent: Store 6, Item 5 is the most unpredictable (CV: 0.373).
   Item 5 appears across multiple stores in the top volatile list,
   suggesting inherently unstable demand patterns.

📊 You: What if demand for item 5 spikes 50%?
🤖 Agent: The model can't handle it without intervention.
   - 89% of days would be understocked
   - ~700 unit shortfall over 3 months
   Recommendation: increase safety stock by 60-70%.

📊 You: What do you know about item 4?
🤖 Agent: [RAG retrieval] Item 4 is a punnet of strawberries (400g). Here are the key characteristics:
   Product Details:
   - Type: Fresh strawberries in 400g punnets.

License

MIT
