
DSPy Reference Examples

Stack: Python · DSPy · MLflow · FastAPI · Pydantic · Ruff · uv

Models: Nemotron-3-Nano-30B · GPT-OSS-120B


Real-world DSPy workflows for pharma/medtech teams. This project provides a flexible multi-classification system for Ozempic-related text analysis and currently supports three classification tasks:

  1. AE vs PC Detection - Distinguish Adverse Events from Product Complaints
  2. AE Category Classification - Categorize adverse events into specific medical categories
  3. PC Category Classification - Categorize product complaints into specific quality categories

The framework shows how to:

  • Programmatically optimize prompts with DSPy
  • Support multiple classification tasks with dynamic signatures
  • Track experiments with MLflow (SQLite backend for easy querying)
  • Persist tuned artifacts to disk (separate from source)
  • Serve classifiers via FastAPI with typed Pydantic contracts

Architecture overview

[Diagram: DSPy pipeline and serving flow]


Requirements

  • Python 3.13+
  • uv for env + dependency management (no pip/poetry)
  • OpenAI-compatible API key
    • Default provider: OpenRouter using nvidia/nemotron-3-nano-30b-a3b:free
    • Override via environment variables without touching code

Environment variables

  • OPENROUTER_API_KEY: primary API key for OpenRouter (no default)
  • OPENAI_API_KEY / DSPY_API_KEY: override for an OpenAI or custom key (no default)
  • DSPY_MODEL_NAME: model ID (default: nvidia/nemotron-3-nano-30b-a3b:free)
  • DSPY_LOCAL_BASE: base URL for a local provider (default: http://localhost:8080/v1)
  • DSPY_HTTP_HEADERS: JSON blob of extra HTTP headers (default: {})
  • OPENROUTER_HTTP_REFERER, OPENROUTER_APP_TITLE: OpenRouter analytics headers (no default)
  • DSPY_RUN_ID: training run identifier (default: auto-generated)
  • DSPY_ARTIFACT_AUTO_UPDATE: auto-update artifact model metadata on load (default: false)

Copy .env.example and fill in whichever keys you need:

cp .env.example .env
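
For example, a minimal .env for the default OpenRouter setup might look like this (placeholder values, not real keys):

# Required for the default provider
OPENROUTER_API_KEY=sk-or-v1-xxxxxxxx
# Optional overrides
DSPY_MODEL_NAME=nvidia/nemotron-3-nano-30b-a3b:free
# DSPY_LOCAL_BASE=http://localhost:8080/v1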

Project Setup

uv sync --extra dev
source .venv/bin/activate

Generate training data for all classification types

uv run python scripts/datagen/adverse_event_sample_data.py
uv run python scripts/datagen/complaint_category_sample_data.py
uv run python scripts/datagen/ae_pc_classification_sample_data.py

This creates a clean layout:

.
├── artifacts/                           # Saved DSPy artifacts (git-tracked)
│   ├── ozempic_classifier_ae-pc_optimized.json
│   ├── ozempic_classifier_ae-category_optimized.json
│   └── ozempic_classifier_pc-category_optimized.json
├── data/                                # Synthetic train/test data organized by task
│   ├── ae-pc-classification/            # AE vs PC detection
│   │   ├── train.json
│   │   └── test.json
│   ├── ae-category-classification/      # AE category classification
│   │   ├── train.json
│   │   └── test.json
│   └── pc-category-classification/      # PC category classification
│       ├── train.json
│       └── test.json
├── mlflow/                              # MLflow experiment tracking (auto-created)
│   ├── mlflow.db                        # SQLite database for runs/metrics
│   └── artifacts/                       # Logged artifacts
├── scripts/
│   ├── datagen/                         # Data generation scripts
│   └── deploy/                          # Deployment scripts
├── src/
│   ├── api/                             # FastAPI app
│   ├── common/                          # Shared logic (config, datasets, classifier)
│   ├── pipeline/                        # Optimization pipeline
│   └── serving/                         # Pydantic request/response + helpers
└── inference_demo.py                    # Simple batch inference helper

Code Formatting

This project uses Ruff for both formatting and linting (line length: 120).

Format and fix all issues:

uv run ruff format .              # Format all Python files
uv run ruff check --fix .         # Fix all auto-fixable linting issues

Check for issues without fixing:

uv run ruff check .               # Check for linting issues
uv run ruff format --check .      # Check formatting without changing files

Note: Ruff's formatter preserves triple-quoted strings (""") as-is by design. For files with long triple-quoted strings (like data generation scripts), you may need to manually wrap them if desired.

VSCode users: Format on save is enabled by default using Ruff. Install the recommended extensions (Python, Ruff) when prompted.
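
For reference, the line length above corresponds to a standard Ruff stanza in pyproject.toml (assuming the repo keeps its Ruff config there):

[tool.ruff]
line-length = 120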


1. Optimize / Refresh the Classifier

Train a classifier for a specific task using the --classification-type flag:

Train AE vs PC classifier (default)

uv run python -m src.pipeline.main --classification-type ae-pc

Train AE category classifier

uv run python -m src.pipeline.main --classification-type ae-category

Train PC category classifier

uv run python -m src.pipeline.main --classification-type pc-category

CLI Options

  • --classification-type / -t: classification type, one of ae-pc, ae-category, pc-category (default: ae-pc)
  • --verbose / -v: show detailed output (per-example evaluation, MIPROv2 progress)
  • --inspect / -i: show DSPy prompts/responses after optimization completes

# Quiet output (default) - just key progress messages
uv run python -m src.pipeline.main -t ae-pc

# Verbose - see evaluation details and optimizer progress
uv run python -m src.pipeline.main -t ae-pc --verbose

# Inspect prompts after training
uv run python -m src.pipeline.main -t ae-pc --inspect

# Both verbose and inspect
uv run python -m src.pipeline.main -t ae-pc -v -i

The run will (the core optimization steps are sketched in code after the list):

  1. Configure DSPy with your provider settings.
  2. Load the appropriate data/<type>-classification/train.json and test.json.
  3. Evaluate the baseline classifier.
  4. Optimize via MIPROv2 (with auto="medium").
  5. Evaluate the optimized program.
  6. Write the artifact to artifacts/ozempic_classifier_<type>_optimized.json.
  7. Log params, metrics, and artifacts to MLflow (mlflow/mlflow.db).
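
Steps 4 through 6 reduce to a handful of DSPy calls. A minimal, self-contained sketch, with an illustrative signature and metric rather than the repo's exact code:

import dspy
from dspy.teleprompt import MIPROv2

dspy.configure(lm=dspy.LM("openrouter/nvidia/nemotron-3-nano-30b-a3b:free"))

# Illustrative program; the repo builds signatures dynamically per task
classifier = dspy.ChainOfThought("complaint -> classification, justification")

# Illustrative metric: exact match on the predicted label
def accuracy_metric(example, prediction, trace=None):
    return example.classification == prediction.classification

# In practice, load data/ae-pc-classification/train.json; one inline example here
trainset = [
    dspy.Example(
        complaint="After injecting Ozempic I had severe hives and needed an EpiPen.",
        classification="Adverse Event",
    ).with_inputs("complaint")
]

optimizer = MIPROv2(metric=accuracy_metric, auto="medium")
optimized = optimizer.compile(classifier, trainset=trainset)
optimized.save("artifacts/ozempic_classifier_ae-pc_optimized.json")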

Experiment Tracking with MLflow

Training runs are automatically tracked in a local SQLite database. Query your experiments:

List all runs with metrics

sqlite3 mlflow/mlflow.db "
SELECT 
    e.name as experiment,
    r.name as run_name,
    r.status,
    m.key,
    m.value
FROM runs r
JOIN experiments e ON r.experiment_id = e.experiment_id
LEFT JOIN metrics m ON r.run_uuid = m.run_uuid
ORDER BY r.start_time DESC;
"

Compare baseline vs optimized accuracy across runs

sqlite3 mlflow/mlflow.db "
SELECT 
    r.name,
    MAX(CASE WHEN m.key = 'baseline_accuracy' THEN m.value END) as baseline,
    MAX(CASE WHEN m.key = 'optimized_accuracy' THEN m.value END) as optimized,
    MAX(CASE WHEN m.key = 'improvement' THEN m.value END) as improvement
FROM runs r
JOIN metrics m ON r.run_uuid = m.run_uuid
GROUP BY r.run_uuid
ORDER BY r.start_time DESC;
"

Or launch the MLflow UI:

mlflow ui --backend-store-uri sqlite:///mlflow/mlflow.db
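
Under the hood, pointing MLflow at this store only takes the SQLite URI. A minimal logging sketch using metric names that match the queries above (experiment name and values are illustrative):

import mlflow

mlflow.set_tracking_uri("sqlite:///mlflow/mlflow.db")
mlflow.set_experiment("ozempic-classifier")  # illustrative name

with mlflow.start_run():
    mlflow.log_param("classification_type", "ae-pc")
    mlflow.log_metric("baseline_accuracy", 0.78)   # illustrative values
    mlflow.log_metric("optimized_accuracy", 0.91)
    mlflow.log_metric("improvement", 0.13)
    mlflow.log_artifact("artifacts/ozempic_classifier_ae-pc_optimized.json")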

2. Serve the Classifier via FastAPI

uv run uvicorn src.api.app:app --reload

  • API Root: http://localhost:8000/ (shows available endpoints)
  • Swagger/OpenAPI UI: http://localhost:8000/docs
  • ReDoc UI: http://localhost:8000/redoc
  • Health endpoint: GET /health

Classification Endpoints

The API provides three classification endpoints:

  1. POST /classify/ae-pc - Classify as Adverse Event or Product Complaint (first-stage classification)
  2. POST /classify/ae-category - Classify adverse events into specific medical categories (e.g., Gastrointestinal disorders, Pancreatitis, Hypoglycemia)
  3. POST /classify/pc-category - Classify product complaints into quality/defect categories (e.g., Device malfunction, Packaging defect)

Example: AE vs PC Classification

curl -X POST http://localhost:8000/classify/ae-pc \
     -H "Content-Type: application/json" \
     -d '{
           "complaint": "After injecting Ozempic I had severe hives and needed an EpiPen."
         }'

Response:

{
  "classification": "Adverse Event",
  "justification": "Describes a systemic allergic reaction following Ozempic use.",
  "classification_type": "ae-pc"
}

Example: AE Category Classification

curl -X POST http://localhost:8000/classify/ae-category \
     -H "Content-Type: application/json" \
     -d '{
          "complaint": "I experienced severe nausea and vomiting after taking Ozempic."
         }'

Example: PC Category Classification

curl -X POST http://localhost:8000/classify/pc-category \
     -H "Content-Type: application/json" \
     -d '{
          "complaint": "The pen arrived with a cracked dose dial."
         }'

If an artifact is missing, the API returns 503 Service Unavailable with instructions to rerun the pipeline.
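
For orientation, the route shape and the 503 guard can be sketched as follows (illustrative names, not the repo's exact app code):

from pathlib import Path

import dspy
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()
ARTIFACT = Path("artifacts/ozempic_classifier_ae-pc_optimized.json")

class ComplaintRequest(BaseModel):
    complaint: str

class ClassificationResponse(BaseModel):
    classification: str
    justification: str
    classification_type: str

classifier = dspy.ChainOfThought("complaint -> classification, justification")
if ARTIFACT.exists():
    classifier.load(str(ARTIFACT))  # restore the optimized prompts/demos

@app.post("/classify/ae-pc", response_model=ClassificationResponse)
def classify_ae_pc(req: ComplaintRequest) -> ClassificationResponse:
    if not ARTIFACT.exists():
        raise HTTPException(status_code=503, detail="Artifact missing; rerun the optimization pipeline.")
    prediction = classifier(complaint=req.complaint)
    return ClassificationResponse(
        classification=prediction.classification,
        justification=prediction.justification,
        classification_type="ae-pc",
    )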


3. Use the Pydantic Interface Directly

from src.common.config import configure_lm
from src.serving.service import ComplaintRequest, get_classification_function

configure_lm()
predict = get_classification_function()

payload = ComplaintRequest(complaint="Pen arrived with a broken dose dial.")
result = predict(payload)
print(result.classification, result.justification)

Pass model_path="artifacts/ozempic_classifier_optimized.json" (or another artifact) to pin a different tuned model per tenant or use-case.
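
For instance, assuming model_path is a keyword argument of get_classification_function:

predict = get_classification_function(model_path="artifacts/ozempic_classifier_ae-category_optimized.json")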


Demo Script

uv run python inference_demo.py

Runs a few sample complaints through the classifier and shows the full DSPy prompt/response for each using dspy.inspect_history(). Useful for demos and understanding how DSPy translates to actual LLM requests.


4. Docker & Railway Deployment

Build & Run Locally

docker build -t dspy-reference .
docker run --rm \
  --env-file .env \
  -p 8080:8080 \
  -v "$(pwd)/data:/data" \
  dspy-reference

  • The image uses the pre-baked .venv from uv sync --frozen --no-dev and serves FastAPI on 0.0.0.0:8080 (a condensed multi-stage sketch follows this list).
  • Mount $(pwd)/data to /data when you need persistence (e.g., refreshed artifacts, uploads, sqlite files).
  • Override the port by passing -e PORT=9000; the default command reads PORT and falls back to 8080.
  • Run portability smoke checks for Railway-like and Foundry-like runtimes:
    bash scripts/test_docker_portability.sh
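
A condensed sketch of that multi-stage pattern (illustrative, not the repo's exact Dockerfile):

# Build stage: resolve dependencies with uv
FROM python:3.13-slim AS builder
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
WORKDIR /app
COPY pyproject.toml uv.lock ./
RUN uv sync --frozen --no-dev

# Runtime stage: copy the pre-baked .venv and serve on $PORT
FROM python:3.13-slim
WORKDIR /app
COPY --from=builder /app/.venv .venv
COPY src/ src/
COPY artifacts/ artifacts/
ENV PATH="/app/.venv/bin:$PATH"
CMD uvicorn src.api.app:app --host 0.0.0.0 --port ${PORT:-8080}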

Deploy to Railway

  1. Push this repo (with the Dockerfile) to GitHub and create a Railway project using the Docker template.
  2. In the Railway dashboard, set the required env vars (OPENROUTER_API_KEY, DSPY_MODEL_NAME, etc.). Railway automatically sets PORT; no build args are needed.
  3. Attach a persistent volume mounted at /data if you need on-disk artifacts or databases.
  4. Each deploy builds directly from the Dockerfile’s multi-stage workflow; use railway up or manual deploys after committing changes.

The container always starts via uvicorn src.api.app:app --host 0.0.0.0 --port ${PORT:-8080}, matching the local dev commands.


5. Foundry OpenAPI Compute Module Deployment

This repo is configured to support a Foundry-friendly workflow: ship a Docker image that embeds an OpenAPI contract (as the server.openapi image label), then import FastAPI routes as Foundry functions via Detect from OpenAPI specification.
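
One way to attach the contract at build time is an image label whose value is the spec itself (a sketch; the scripted flow lives in docs/foundry-openapi-runbook.md):

docker build --platform linux/amd64 \
  --label "server.openapi=$(tr -d '\n' < openapi.foundry.json)" \
  -t "<registry>/<repo>/<image>:<tag>" .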

[Screenshot: Foundry function call from imported OpenAPI]

More context + screenshots:

  • docs/foundry-auto-deploy.md

Generate and validate the Foundry-constrained OpenAPI artifact:

uv run python scripts/deploy/foundry_openapi.py --generate --spec-path openapi.foundry.json
uv run python scripts/deploy/foundry_openapi.py --spec-path openapi.foundry.json

The generated Foundry profile uses servers: [{"url":"http://localhost:5000"}].

Validate both the spec and a built image (checks linux/amd64, numeric non-root user, and server.openapi label):

uv run python scripts/deploy/foundry_openapi.py \
  --spec-path openapi.foundry.json \
  --image-ref "<registry>/<repo>/<image>:<tag>"

Full build/push/import sequence is in docs/foundry-openapi-runbook.md. GitHub workflow automation and required secrets/variables are documented in docs/deploy-ci.md.


Local LLM Server (llama.cpp)

To run a local LLM server using llama.cpp:

cd llama.cpp

# Build llama.cpp
cmake -B build
cmake --build build --config Release

# Download the model from Hugging Face (save to models directory)
# Visit the model page on HF for the curl command, e.g.:
# curl -L -o models/Nemotron-3-Nano-30B-A3B-UD-Q3_K_XL.gguf <HF_URL>

Start the server

./serve.sh -m ~/llama.cpp/models/Nemotron-3-Nano-30B-A3B-UD-Q3_K_XL.gguf
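
serve.sh presumably wraps llama.cpp's llama-server binary; the equivalent direct invocation (standard llama.cpp flags) would be roughly:

./build/bin/llama-server \
  -m models/Nemotron-3-Nano-30B-A3B-UD-Q3_K_XL.gguf \
  --port 8080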

Then configure DSPy to use your local server by setting:

export DSPY_LOCAL_BASE=http://localhost:8080/v1
export DSPY_MODEL_NAME=local-model

Notes & Next Steps

  • Replace data/*-classification/*.json with real labeled datasets or update src/common/data_utils.py to read from your storage systems.
  • Add new classification types (a hypothetical config sketch follows this list) by:
    1. Adding a new entry to CLASSIFICATION_CONFIGS in src/common/classifier.py
    2. Adding a new entry to CLASSIFICATION_TYPES in src/common/paths.py
    3. Creating training data scripts in scripts/datagen/
    4. Training with --classification-type <new-type>
  • Add additional pipelines (extraction, severity grading, etc.) by following the same pattern: shared logic in src/common, tuning flows in src/pipeline, serving code in src/api and src/serving.
  • The LM client is OpenAI-compatible; switching to Anthropic, Azure OpenAI, or self-hosted proxies is just a matter of environment variables.
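
Purely as an illustration of that pattern (the real schema lives in src/common/classifier.py; field names here are hypothetical):

# Hypothetical entry shape; mirror the existing entries in CLASSIFICATION_CONFIGS
CLASSIFICATION_CONFIGS["severity"] = {
    "labels": ["Mild", "Moderate", "Severe"],
    "instructions": "Grade the severity of the reported adverse event.",
}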

License

MIT – see LICENSE for details.


Author

Created by Anand Pant
