Real-world DSPy workflows for pharma/medtech teams. This project provides a flexible multi-classification system for Ozempic-related text analysis. Currently supports three classification tasks:
- AE vs PC Detection - Distinguish Adverse Events from Product Complaints
- AE Category Classification - Categorize adverse events into specific medical categories
- PC Category Classification - Categorize product complaints into specific quality categories
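Together these tasks form a two-stage flow: route text to Adverse Event or Product Complaint first, then apply the matching category classifier. A minimal sketch of that routing (the classifier functions here are hypothetical keyword stand-ins, not this repo's tuned DSPy programs):

```python
# Two-stage routing sketch. The classifiers below are toy stand-ins
# for the tuned DSPy programs this project produces.
def classify_ae_pc(text: str) -> str:
    # Stage 1: crude keyword stand-in for the real AE-vs-PC model.
    ae_terms = ("nausea", "hives", "vomiting", "rash")
    return "Adverse Event" if any(t in text.lower() for t in ae_terms) else "Product Complaint"

def classify_category(text: str, kind: str) -> str:
    # Stage 2: dispatch to the category model matching the stage-1 label.
    if kind == "Adverse Event":
        return "Gastrointestinal disorders"  # placeholder category
    return "Device malfunction"              # placeholder category

def classify(text: str) -> dict:
    kind = classify_ae_pc(text)
    return {"classification": kind, "category": classify_category(text, kind)}

print(classify("Severe nausea after the first injection"))
```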
The framework shows how to:
- Programmatically optimize prompts with DSPy
- Support multiple classification tasks with dynamic signatures
- Track experiments with MLflow (SQLite backend for easy querying)
- Persist tuned artifacts to disk (separate from source)
- Serve classifiers via FastAPI with typed Pydantic contracts
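The "dynamic signatures" idea can be sketched without DSPy itself: each task contributes an instruction and a label set, and the pipeline builds the prompt contract from that spec. The names below are illustrative, not the repo's actual config:

```python
# Illustrative per-task signature specs; the real project keeps similar
# metadata in CLASSIFICATION_CONFIGS (src/common/classifier.py).
SIGNATURE_SPECS = {
    "ae-pc": {
        "instruction": "Decide whether the text describes an Adverse Event or a Product Complaint.",
        "labels": ["Adverse Event", "Product Complaint"],
    },
    "ae-category": {
        "instruction": "Assign the adverse event to a medical category.",
        "labels": ["Gastrointestinal disorders", "Pancreatitis", "Hypoglycemia"],
    },
}

def build_prompt(task: str, complaint: str) -> str:
    # Assemble a task-specific prompt from the shared spec.
    spec = SIGNATURE_SPECS[task]
    labels = ", ".join(spec["labels"])
    return f"{spec['instruction']}\nAllowed labels: {labels}\nText: {complaint}"

print(build_prompt("ae-pc", "Hives after injection"))
```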
- Python 3.13+
- `uv` for env + dependency management (no `pip`/`poetry`)
- OpenAI-compatible API key
- Default provider: OpenRouter using `nvidia/nemotron-3-nano-30b-a3b:free`
- Override via environment variables without touching code
| Variable | Description | Default |
|---|---|---|
| `OPENROUTER_API_KEY` | Primary key for OpenRouter | — |
| `OPENAI_API_KEY` / `DSPY_API_KEY` | Override for OpenAI/custom key | — |
| `DSPY_MODEL_NAME` | Model ID | `nvidia/nemotron-3-nano-30b-a3b:free` |
| `DSPY_LOCAL_BASE` | Base URL for local provider | `http://localhost:8080/v1` |
| `DSPY_HTTP_HEADERS` | JSON blob for extra HTTP headers | `{}` |
| `OPENROUTER_HTTP_REFERER`, `OPENROUTER_APP_TITLE` | OpenRouter analytics headers | — |
| `DSPY_RUN_ID` | Training run identifier | auto-generated |
| `DSPY_ARTIFACT_AUTO_UPDATE` | Auto-update artifact model metadata on load | `false` |
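A sketch of how the fallback chain implied by the table might be resolved in code. This is illustrative only — the actual logic lives in `src/common/config.py`, and the key precedence here is an assumption:

```python
import os

# Resolve settings with the precedence implied by the table:
# explicit DSPY_*/OpenAI overrides first, then the OpenRouter key, then defaults.
def resolve_settings(env: dict) -> dict:
    return {
        "api_key": env.get("DSPY_API_KEY") or env.get("OPENAI_API_KEY") or env.get("OPENROUTER_API_KEY", ""),
        "model": env.get("DSPY_MODEL_NAME", "nvidia/nemotron-3-nano-30b-a3b:free"),
        "local_base": env.get("DSPY_LOCAL_BASE", "http://localhost:8080/v1"),
    }

settings = resolve_settings(dict(os.environ))
```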
Copy `.env.example` and fill in whichever keys you need:

```bash
cp .env.example .env
```

Install dependencies and activate the environment:

```bash
uv sync --extra dev
source .venv/bin/activate
```

Generate the synthetic datasets:

```bash
uv run python scripts/datagen/adverse_event_sample_data.py
uv run python scripts/datagen/complaint_category_sample_data.py
uv run python scripts/datagen/ae_pc_classification_sample_data.py
```

This creates a clean layout:
```
.
├── artifacts/                        # Saved DSPy artifacts (git-tracked)
│   ├── ozempic_classifier_ae-pc_optimized.json
│   ├── ozempic_classifier_ae-category_optimized.json
│   └── ozempic_classifier_pc-category_optimized.json
├── data/                             # Synthetic train/test data organized by task
│   ├── ae-pc-classification/         # AE vs PC detection
│   │   ├── train.json
│   │   └── test.json
│   ├── ae-category-classification/   # AE category classification
│   │   ├── train.json
│   │   └── test.json
│   └── pc-category-classification/   # PC category classification
│       ├── train.json
│       └── test.json
├── mlflow/                           # MLflow experiment tracking (auto-created)
│   ├── mlflow.db                     # SQLite database for runs/metrics
│   └── artifacts/                    # Logged artifacts
├── scripts/
│   ├── datagen/                      # Data generation scripts
│   └── deploy/                       # Deployment scripts
├── src/
│   ├── api/                          # FastAPI app
│   ├── common/                       # Shared logic (config, datasets, classifier)
│   ├── pipeline/                     # Optimization pipeline
│   └── serving/                      # Pydantic request/response + helpers
└── inference_demo.py                 # Simple batch inference helper
```
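The layout maps each classification type to its files predictably; a sketch of that convention with `pathlib` (a hypothetical helper, not the repo's actual `src/common/paths.py`):

```python
from pathlib import Path

# Map a classification type to its data splits and tuned artifact,
# following the directory layout shown above.
def task_paths(classification_type: str, root: Path = Path(".")) -> dict:
    data_dir = root / "data" / f"{classification_type}-classification"
    return {
        "train": data_dir / "train.json",
        "test": data_dir / "test.json",
        "artifact": root / "artifacts" / f"ozempic_classifier_{classification_type}_optimized.json",
    }

paths = task_paths("ae-pc")
```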
This project uses Ruff for both formatting and linting (line length: 120).
Format and fix all issues:

```bash
uv run ruff format .       # Format all Python files
uv run ruff check --fix .  # Fix all auto-fixable linting issues
```

Check for issues without fixing:

```bash
uv run ruff check .           # Check for linting issues
uv run ruff format --check .  # Check formatting without changing files
```

Note: Ruff's formatter preserves triple-quoted strings (`"""`) as-is by design. For files with long triple-quoted strings (like data generation scripts), you may need to wrap them manually if desired.
VSCode users: Format on save is enabled by default using Ruff. Install the recommended extensions (Python, Ruff) when prompted.
Train a classifier for a specific task using the --classification-type flag:
```bash
uv run python -m src.pipeline.main --classification-type ae-pc
uv run python -m src.pipeline.main --classification-type ae-category
uv run python -m src.pipeline.main --classification-type pc-category
```

| Flag | Short | Description |
|---|---|---|
| `--classification-type` | `-t` | Classification type: `ae-pc`, `ae-category`, `pc-category` (default: `ae-pc`) |
| `--verbose` | `-v` | Show detailed output (per-example evaluation, MIPROv2 progress) |
| `--inspect` | `-i` | Show DSPy prompts/responses after optimization completes |
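These flags could be declared with `argparse` roughly as follows (a sketch; the repo's actual parser in `src/pipeline/main.py` may differ):

```python
import argparse

# CLI surface matching the flag table: a task selector plus two booleans.
parser = argparse.ArgumentParser(prog="src.pipeline.main")
parser.add_argument(
    "--classification-type", "-t",
    choices=["ae-pc", "ae-category", "pc-category"],
    default="ae-pc",
)
parser.add_argument("--verbose", "-v", action="store_true")
parser.add_argument("--inspect", "-i", action="store_true")

args = parser.parse_args(["-t", "ae-category", "-v"])
```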
```bash
# Quiet output (default) - just key progress messages
uv run python -m src.pipeline.main -t ae-pc

# Verbose - see evaluation details and optimizer progress
uv run python -m src.pipeline.main -t ae-pc --verbose

# Inspect prompts after training
uv run python -m src.pipeline.main -t ae-pc --inspect

# Both verbose and inspect
uv run python -m src.pipeline.main -t ae-pc -v -i
```

The run will:
- Configure DSPy with your provider settings.
- Load the appropriate `data/<type>-classification/train.json` and `test.json`.
- Evaluate the baseline classifier.
- Optimize via `MIPROv2` (with `auto="medium"`).
- Evaluate the optimized program.
- Write the artifact to `artifacts/ozempic_classifier_<type>_optimized.json`.
- Log params, metrics, and artifacts to MLflow (`mlflow/mlflow.db`).
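The baseline/optimized comparison in those steps reduces to a simple accuracy delta. A sketch with hypothetical helpers, mirroring the `baseline_accuracy`/`optimized_accuracy`/`improvement` metrics logged to MLflow:

```python
# Exact-match accuracy over (predicted, gold) label pairs, and the
# improvement figure reported alongside baseline/optimized accuracy.
def accuracy(predictions: list, labels: list) -> float:
    assert len(predictions) == len(labels)
    return sum(p == g for p, g in zip(predictions, labels)) / len(labels)

baseline = accuracy(["AE", "PC", "PC"], ["AE", "AE", "PC"])   # 2 of 3 correct
optimized = accuracy(["AE", "AE", "PC"], ["AE", "AE", "PC"])  # 3 of 3 correct
improvement = optimized - baseline
```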
Training runs are automatically tracked in a local SQLite database. Query your experiments:

```bash
sqlite3 mlflow/mlflow.db "
SELECT
  e.name as experiment,
  r.name as run_name,
  r.status,
  m.key,
  m.value
FROM runs r
JOIN experiments e ON r.experiment_id = e.experiment_id
LEFT JOIN metrics m ON r.run_uuid = m.run_uuid
ORDER BY r.start_time DESC;
"
```

Compare baseline vs. optimized accuracy per run:

```bash
sqlite3 mlflow/mlflow.db "
SELECT
  r.name,
  MAX(CASE WHEN m.key = 'baseline_accuracy' THEN m.value END) as baseline,
  MAX(CASE WHEN m.key = 'optimized_accuracy' THEN m.value END) as optimized,
  MAX(CASE WHEN m.key = 'improvement' THEN m.value END) as improvement
FROM runs r
JOIN metrics m ON r.run_uuid = m.run_uuid
GROUP BY r.run_uuid
ORDER BY r.start_time DESC;
"
```

Or launch the MLflow UI:
```bash
mlflow ui --backend-store-uri sqlite:///mlflow/mlflow.db
```

Serve the API locally:

```bash
uv run uvicorn src.api.app:app --reload
```

- API Root: `http://localhost:8000/` (shows available endpoints)
- Swagger/OpenAPI UI: `http://localhost:8000/docs`
- ReDoc UI: `http://localhost:8000/redoc`
- Health endpoint: `GET /health`
The API provides three classification endpoints:
- `POST /classify/ae-pc` - Classify as Adverse Event or Product Complaint (first-stage classification)
- `POST /classify/ae-category` - Classify adverse events into specific medical categories (e.g., Gastrointestinal disorders, Pancreatitis, Hypoglycemia)
- `POST /classify/pc-category` - Classify product complaints into quality/defect categories (e.g., Device malfunction, Packaging defect)
```bash
curl -X POST http://localhost:8000/classify/ae-pc \
  -H "Content-Type: application/json" \
  -d '{
    "complaint": "After injecting Ozempic I had severe hives and needed an EpiPen."
  }'
```

Response:

```json
{
  "classification": "Adverse Event",
  "justification": "Describes a systemic allergic reaction following Ozempic use.",
  "classification_type": "ae-pc"
}
```

```bash
curl -X POST http://localhost:8000/classify/ae-category \
  -H "Content-Type: application/json" \
  -d '{
    "complaint": "I experienced severe nausea and vomiting after taking Ozempic."
  }'
```

```bash
curl -X POST http://localhost:8000/classify/pc-category \
  -H "Content-Type: application/json" \
  -d '{
    "complaint": "The pen arrived with a cracked dose dial."
  }'
```

If an artifact is missing, the API returns `503 Service Unavailable` with instructions to rerun the pipeline.
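The missing-artifact behavior can be sketched as a small guard (hypothetical names and stub response; the real check lives in the serving layer):

```python
from pathlib import Path

# Return an HTTP-style (status, body) pair: 503 with remediation advice
# when the tuned artifact is absent, otherwise a stub classification.
def classify_or_503(artifact: Path, complaint: str):
    if not artifact.exists():
        return 503, {
            "detail": f"Artifact {artifact} not found. "
                      "Run: uv run python -m src.pipeline.main -t ae-pc"
        }
    # Placeholder result standing in for the tuned classifier's output.
    return 200, {"classification": "Adverse Event", "classification_type": "ae-pc"}

status, body = classify_or_503(Path("artifacts/missing.json"), "hives after injection")
```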
```python
from src.common.config import configure_lm
from src.serving.service import ComplaintRequest, get_classification_function

configure_lm()
predict = get_classification_function()
payload = ComplaintRequest(complaint="Pen arrived with a broken dose dial.")
result = predict(payload)
print(result.classification, result.justification)
```

Pass `model_path="artifacts/ozempic_classifier_optimized.json"` (or another artifact) to pin a different tuned model per tenant or use case.

```bash
uv run python inference_demo.py
```

Runs a few sample complaints through the classifier and shows the full DSPy prompt/response for each using `dspy.inspect_history()`. Useful for demos and understanding how DSPy translates to actual LLM requests.
```bash
docker build -t dspy-reference .
docker run --rm \
  --env-file .env \
  -p 8080:8080 \
  -v "$(pwd)/data:/data" \
  dspy-reference
```

- The image uses the pre-baked `.venv` from `uv sync --frozen --no-dev` and serves FastAPI on `0.0.0.0:8080`.
- Mount `$(pwd)/data` to `/data` when you need persistence (e.g., refreshed artifacts, uploads, SQLite files).
- Override the port by passing `-e PORT=9000`; the default command reads `PORT` and falls back to `8080`.
- Run portability smoke checks for Railway-like and Foundry-like runtimes:

```bash
bash scripts/test_docker_portability.sh
```
- Push this repo (with the Dockerfile) to GitHub and create a Railway project using the Docker template.
- In the Railway dashboard, set the required env vars (`OPENROUTER_API_KEY`, `DSPY_MODEL_NAME`, etc.). Railway automatically sets `PORT`; no build args are needed.
- Attach a persistent volume mounted at `/data` if you need on-disk artifacts or databases.
- Each deploy builds directly from the Dockerfile's multi-stage workflow; use `railway up` or manual deploys after committing changes.

The container always starts via `uvicorn src.api.app:app --host 0.0.0.0 --port ${PORT:-8080}`, matching the local dev commands.
This repo is configured to support a Foundry-friendly workflow: ship a Docker image that embeds an OpenAPI contract (as the server.openapi image label), then import FastAPI routes as Foundry functions via Detect from OpenAPI specification.
More context + screenshots: `docs/foundry-auto-deploy.md`
Generate and validate the Foundry-constrained OpenAPI artifact:

```bash
uv run python scripts/deploy/foundry_openapi.py --generate --spec-path openapi.foundry.json
uv run python scripts/deploy/foundry_openapi.py --spec-path openapi.foundry.json
```

The generated Foundry profile uses `servers: [{"url": "http://localhost:5000"}]`.

Validate both the spec and a built image (checks `linux/amd64`, numeric non-root user, and the `server.openapi` label):

```bash
uv run python scripts/deploy/foundry_openapi.py \
  --spec-path openapi.foundry.json \
  --image-ref "<registry>/<repo>/<image>:<tag>"
```

The full build/push/import sequence is in `docs/foundry-openapi-runbook.md`. GitHub workflow automation and required secrets/variables are documented in `docs/deploy-ci.md`.
To run a local LLM server using llama.cpp:

```bash
cd llama.cpp

# Build llama.cpp
cmake -B build
cmake --build build --config Release

# Download the model from Hugging Face (save to models directory)
# Visit the model page on HF for the curl command, e.g.:
# curl -L -o models/Nemotron-3-Nano-30B-A3B-UD-Q3_K_XL.gguf <HF_URL>
```

Start the server and point DSPy at it:

```bash
./serve.sh -m ~/llama.cpp/models/Nemotron-3-Nano-30B-A3B-UD-Q3_K_XL.gguf
export DSPY_LOCAL_BASE=http://localhost:8080/v1
export DSPY_MODEL_NAME=local-model
```

- Replace `data/*-classification/*.json` with real labeled datasets, or update `src/common/data_utils.py` to read from your storage systems.
- Add new classification types by:
  - Adding a new entry to `CLASSIFICATION_CONFIGS` in `src/common/classifier.py`
  - Adding a new entry to `CLASSIFICATION_TYPES` in `src/common/paths.py`
  - Creating training data scripts in `scripts/datagen/`
  - Training with `--classification-type <new-type>`
- Add additional pipelines (extraction, severity grading, etc.) by following the same pattern: shared logic in `src/common`, tuning flows in `src/pipeline`, serving code in `src/api`/`src/serving`.
- The LM client is OpenAI-compatible; switching to Anthropic, Azure OpenAI, or self-hosted proxies is just a matter of environment variables.
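As a sketch of the first step, a new config entry might look like this. The field names here are illustrative assumptions — check the existing entries in `src/common/classifier.py` for the real shape:

```python
# Hypothetical shape for a new classification type's config entry,
# e.g. a severity-grading task added alongside the existing three.
CLASSIFICATION_CONFIGS = {
    "severity": {
        "description": "Grade adverse event severity",
        "labels": ["Mild", "Moderate", "Severe"],
        "input_field": "complaint",
        "output_field": "classification",
    },
}
```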
MIT – see LICENSE for details.
Created by Anand Pant

