OcularGuard Agent

Agentic Multimodal AI for Post-Surgical Eye Care

OcularGuard is an AI agent built for the Kaggle Med-Gemma Impact Challenge. It monitors and triages post-surgical ophthalmology patients who have multiple concurrent conditions.



Overview

The Problem

Post-surgical eye care patients with multiple conditions (retinal detachment, corneal implants, glaucoma) face:

  • Fragmented Care: Data silos across different specialists
  • Silent Deterioration: Critical changes between clinic visits go unnoticed
  • Complex Monitoring: Multiple interacting conditions requiring specialized expertise

Our Solution

OcularGuard Agent uses MedGemma and HAI-DEF models to provide:

  • 🔍 Multimodal Data Integration: Visual (fundus/anterior images) + Numerical (IOP readings) + Text (symptoms)
  • 🤖 Agentic Workflow: LangGraph-powered multi-step clinical reasoning
  • 🧠 Cross-Condition Analysis: Holistic evaluation across retinal, corneal, and glaucoma conditions
  • 🎯 Confidence Scoring: Mathematical uncertainty quantification with epistemic humility (abstains when confidence <60%)
  • 📱 Edge Deployment: Quantized models for offline mobile use
  • 🔒 HIPAA Compliance: End-to-end encryption and audit logging

Quick Start

Prerequisites

  • Python 3.10 or higher
  • Poetry 1.7+
  • CUDA-capable GPU (recommended: 24GB+ VRAM for MedGemma-27B, 8GB+ for MedGemma-4B)
  • 50GB+ free disk space

Installation

# 1. Clone the repository
git clone https://github.com/kamongi/OcularGuard.git
cd OcularGuard

# 2. Install Poetry (if not already installed)
curl -sSL https://install.python-poetry.org | python3 -

# 3. Install dependencies
poetry install

# 4. Activate virtual environment
#    (Poetry 2.x no longer ships `poetry shell` in core; use `eval $(poetry env activate)` there)
poetry shell

# 5. Set up environment variables
cp .env.example .env
# Edit .env with your API keys (Hugging Face, Kaggle, etc.)

# 6. Download models (this may take a while)
python scripts/download_models.py --model medgemma-4b

# 7. Run the Gradio demo
python -m ocularguard.ui.app

The Gradio interface will be available at http://localhost:7860

Quick Test

# Run tests
pytest tests/ -v

# Check code quality
black --check src/
mypy src/ocularguard/

# Generate coverage report
pytest tests/ --cov=ocularguard --cov-report=html

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Patient Data Input                       │
│  📷 Eye Images  │  🔢 IOP Readings  │  💬 Symptoms          │
└──────────────────────┬──────────────────────────────────────┘
                       │
        ┌──────────────┴──────────────┐
        │   Data Preprocessing        │
        │   • PHI De-identification   │
        │   • Image Normalization     │
        │   • Multimodal Fusion       │
        └──────────────┬──────────────┘
                       │
        ┌──────────────┴──────────────────────────────┐
        │      LangGraph Agentic Workflow             │
        │                                              │
        │  ┌──────┐     ┌───────────┐    ┌─────────┐ │
        │  │Triage│ --> │ Diagnosis │ -->│Education│ │
        │  └──────┘     └───────────┘    └─────────┘ │
        │      │              │                 │     │
        │      └──> Emergency Detection <───────┘     │
        │                                              │
        │  Models: MedGemma-27B, MedSigLIP, HeAR     │
        └──────────────┬──────────────────────────────┘
                       │
        ┌──────────────┴──────────────┐
        │   Results & Recommendations  │
        │  • Severity Assessment       │
        │  • Clinical Findings         │
        │  • Patient Education         │
        │  • Emergency Alerts          │
        └──────────────────────────────┘
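
The stage ordering in the diagram can be sketched in plain Python. This is a minimal stand-in, not the project's LangGraph graph: the node functions, the `urgent_flag` input, and the `emergency` state key are all hypothetical names chosen for illustration, and the early exit is one assumed reading of the "Emergency Detection" arrows.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Node = Callable[[State], State]


def triage(state: State) -> State:
    # Hypothetical rule: the caller marks the case urgent up front.
    return {**state, "emergency": bool(state.get("urgent_flag", False))}


def diagnosis(state: State) -> State:
    # Placeholder for a model call that would produce clinical findings.
    return {**state, "findings": ["placeholder finding"]}


def education(state: State) -> State:
    # Placeholder for generated patient guidance.
    return {**state, "education": "placeholder patient guidance"}


def run_pipeline(state: State, nodes: List[Node]) -> State:
    """Run nodes in order; stop early once any node sets state['emergency']."""
    trace: List[str] = []
    for node in nodes:
        state = node(state)
        trace.append(node.__name__)
        if state.get("emergency"):
            trace.append("emergency_alert")
            break
    return {**state, "trace": trace}


stages = [triage, diagnosis, education]
routine = run_pipeline({"urgent_flag": False}, stages)
urgent = run_pipeline({"urgent_flag": True}, stages)
print(routine["trace"])
print(urgent["trace"])
```

In this sketch a routine case visits all three stages, while an urgent case leaves the pipeline straight after triage. A real graph would express the same routing with conditional edges.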

Key Features

1. Multimodal Data Ingestion

  • Visual: Fundus photographs, anterior segment images, OCT scans
  • Numerical: IOP readings, visual acuity, medication schedules
  • Text/Audio: Patient symptom reports via text or voice (HeAR model)
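
One way to carry the three modalities through the system is a single record type. The sketch below is an assumption about how such a record might look; `PatientObservation` and its field names are hypothetical and do not come from the `ocularguard.data` package.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PatientObservation:
    """Hypothetical container for one patient submission."""

    patient_id: str
    iop_mmhg: Optional[float] = None          # numerical modality
    symptoms: str = ""                        # text modality
    image_paths: List[str] = field(default_factory=list)  # visual modality

    def modalities(self) -> List[str]:
        """List which modalities are actually present in this record."""
        present: List[str] = []
        if self.image_paths:
            present.append("visual")
        if self.iop_mmhg is not None:
            present.append("numerical")
        if self.symptoms.strip():
            present.append("text")
        return present

    def validate(self) -> None:
        """Reject records that are empty or physically impossible."""
        if self.iop_mmhg is not None and self.iop_mmhg < 0:
            raise ValueError("IOP must be non-negative")
        if not self.modalities():
            raise ValueError("at least one modality is required")


obs = PatientObservation(
    patient_id="P12345",
    iop_mmhg=28.0,
    symptoms="Sudden vision loss with floaters",
    image_paths=["path/to/image.png"],
)
obs.validate()
print(obs.modalities())
```

Keeping every modality optional lets a patient submit, for example, only an IOP reading, while `validate()` still rejects a completely empty record.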

2. Agentic Workflow (LangGraph)

from ocularguard.agents.graph import create_ocularguard_graph

# Create the agentic workflow
graph = create_ocularguard_graph()

# Run analysis
result = graph.invoke({
    "patient_id": "P12345",
    "fundus_image": "path/to/image.png",
    "iop_reading": 28.0,
    "symptoms": "Sudden vision loss with floaters"
})

print(f"Severity: {result['severity']}")
print(f"Recommendations: {result['recommendations']}")

3. Cross-Condition Reasoning

OcularGuard simultaneously evaluates:

  • Retinal Status: Detachment, PVR, choroidal issues
  • Corneal Graft: Rejection signs, edema, clarity
  • Glaucoma: IOP trends, optic nerve damage
  • Myopic Complications: Choroidal neovascularization
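
A simple way to fold per-condition results into one overall severity is to take the worst case and report which conditions drove it. The sketch below assumes this "maximum severity wins" rule; the `Severity` levels and the `combine` function are hypothetical and may differ from how OcularGuard actually weighs interacting conditions.

```python
from enum import IntEnum
from typing import Dict, List, Tuple


class Severity(IntEnum):
    """Hypothetical ordered severity scale."""

    ROUTINE = 0
    ELEVATED = 1
    URGENT = 2


def combine(assessments: Dict[str, Severity]) -> Tuple[Severity, List[str]]:
    """Return the worst severity and the conditions that reached it."""
    if not assessments:
        raise ValueError("no condition assessments supplied")
    overall = max(assessments.values())
    drivers = sorted(name for name, sev in assessments.items() if sev == overall)
    return overall, drivers


overall, drivers = combine(
    {
        "retinal": Severity.ROUTINE,
        "corneal_graft": Severity.ELEVATED,
        "glaucoma": Severity.URGENT,
        "myopic": Severity.ROUTINE,
    }
)
print(overall.name, drivers)
```

Returning the driving conditions alongside the overall level keeps the result explainable: a reviewer can see that the glaucoma assessment, not the graft, triggered the escalation.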

4. Edge Deployment

# Quantize model for mobile
python scripts/quantize_for_edge.py --model medgemma-4b --format int8

# Deploy to mobile (Android example)
cd deployments/mobile/android
./gradlew assembleRelease
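
For readers unfamiliar with INT8 quantization, the core idea can be shown on a short list of weights. This is an illustrative sketch of symmetric per-tensor quantization in plain Python; it is not the contents of `scripts/quantize_for_edge.py`, and the function names are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integers."""
    return [q * scale for q in quantized]


weights = [0.4, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, max_error)
```

Each weight is stored in one byte instead of four, and the round-trip error stays within half a quantization step (`scale / 2`), which is the trade-off that makes offline mobile inference feasible.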

5. Confidence Scoring & Epistemic Humility

OcularGuard implements mathematical confidence scoring using log probabilities:

from ocularguard.models.medgemma import create_medgemma

model = create_medgemma(model_size="4b")

# Generate with confidence
result = model.ocular_triage(
    iop_reading=28.0,
    symptoms="Rainbow halos around lights",
    fundus_image="path/to/image.png"
)

print(f"Urgency: {result['urgency']}")
print(f"Confidence: {result['confidence']:.1%}")
print(f"Level: {result['confidence_level']}")  # HIGH, MODERATE, or LOW

Three-Tier Confidence Framework:

  • HIGH (>90%): Auto-triage safe, clear signals
  • MODERATE (60-90%): Advisory mode, verify with doctor
  • LOW (<60%): Abstention - human review required immediately

When confidence falls below 60%, the system acknowledges uncertainty and requires specialist review rather than providing potentially incorrect guidance.
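
The tiers above can be sketched as follows. The README states that confidence comes from log probabilities but not how they are aggregated, so this example assumes the geometric mean of token probabilities (the exponential of the mean log probability). Both function names are hypothetical.

```python
import math
from typing import List


def confidence_from_logprobs(token_logprobs: List[float]) -> float:
    """Assumed aggregation: exp(mean log-prob), a value in (0, 1]."""
    if not token_logprobs:
        raise ValueError("need at least one token log probability")
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def confidence_level(confidence: float) -> str:
    """Map a confidence score to the three tiers listed above."""
    if confidence > 0.90:
        return "HIGH"
    if confidence >= 0.60:
        return "MODERATE"
    return "LOW"  # abstain and require human review


for logprobs in ([-0.05, -0.02, -0.10], [-0.3, -0.4, -0.2], [-1.0, -0.8]):
    score = confidence_from_logprobs(logprobs)
    print(f"{score:.1%} -> {confidence_level(score)}")
```

Averaging in log space keeps long answers from being penalized simply for having more tokens. Note that a score of exactly 90% falls in MODERATE here, because the HIGH tier is defined as strictly above 90%.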

Competition Alignment

| Track | Our Approach | Status |
|---|---|---|
| Main Track | Human-centered AI for complex post-surgical care | ✅ Implemented |
| Agentic Workflow | LangGraph multi-step reasoning with MedGemma | ✅ Implemented |
| Edge AI | INT8-quantized MedGemma-4B for offline mobile use | ✅ Implemented |

Project Structure

OcularGuard/
├── src/ocularguard/        # Core application code
│   ├── agents/             # LangGraph agentic workflow
│   ├── models/             # MedGemma, MedSigLIP wrappers
│   ├── data/               # Data ingestion & preprocessing
│   ├── ui/                 # Gradio interface
│   ├── security/           # HIPAA encryption & compliance
│   └── utils/              # Configuration, logging
├── data/                   # Models cache & test cases
├── tests/                  # Test suite
├── scripts/                # Utility scripts
├── configs/                # YAML configurations
└── docs/                   # Documentation

Development

Code Style

# Format code
black src/ tests/
isort src/ tests/

# Type checking
mypy src/ocularguard/

# Linting
flake8 src/ tests/

Testing

# Run all tests
pytest tests/ -v

# Run specific test types
pytest tests/unit/ -v           # Unit tests
pytest tests/integration/ -v    # Integration tests
pytest tests/validation/ -v     # Clinical validation

# With coverage
pytest tests/ --cov=ocularguard --cov-report=html

Clinical Use Case

Patient Profile:

  • Post-retinal detachment surgery (gas bubble tamponade)
  • Corneal graft (DSEK/DMEK endothelial keratoplasty)
  • Pre-existing glaucoma
  • High myopia

OcularGuard Workflow:

  1. Patient uploads fundus photo + enters IOP + describes symptoms
  2. Agent performs cross-condition triage
  3. Detects urgent signs (IOP spike, graft rejection, redetachment)
  4. Generates personalized patient education
  5. Alerts provider if emergency detected
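
Step 3 of the workflow could start from a simple rule check before any model is consulted. The sketch below is hypothetical: `detect_urgent_signs` is not part of the project, and the IOP limit and red-flag terms passed in the usage example are placeholder values for illustration only, not clinical thresholds.

```python
from typing import Iterable, List


def detect_urgent_signs(
    iop_mmhg: float,
    symptoms: str,
    iop_limit_mmhg: float,
    red_flag_terms: Iterable[str],
) -> List[str]:
    """Return human-readable reasons for escalation; empty list if none."""
    reasons: List[str] = []
    if iop_mmhg >= iop_limit_mmhg:
        reasons.append(
            f"IOP {iop_mmhg:.1f} mmHg at or above limit {iop_limit_mmhg:.1f} mmHg"
        )
    text = symptoms.lower()
    for term in red_flag_terms:
        if term.lower() in text:
            reasons.append(f"symptom mentions '{term}'")
    return reasons


# Placeholder limit and terms, supplied by the caller rather than hard-coded.
reasons = detect_urgent_signs(
    iop_mmhg=28.0,
    symptoms="Sudden vision loss with floaters",
    iop_limit_mmhg=25.0,
    red_flag_terms=["vision loss", "severe pain"],
)
print(reasons)
```

Passing the limit and terms as arguments keeps the clinical decisions in configuration that a specialist can review, instead of burying them in code.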

Models

  • MedGemma-27B: Primary clinical reasoning model
  • MedGemma-4B: Lightweight edge deployment model
  • MedSigLIP: Vision encoder for zero-shot detection
  • CXR Foundation: Chest X-ray analysis (optional)
  • HeAR: Health Acoustic Representations (optional)

HIPAA Compliance

  • ✅ AES-256-GCM encryption at rest
  • ✅ TLS 1.3 for data in transit
  • ✅ Comprehensive audit logging
  • ✅ PHI de-identification
  • ✅ Role-based access control
  • ✅ Secure deletion policies
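
As an illustration of the PHI de-identification item, patient identifiers can be replaced with keyed pseudonyms before data reaches logs or models. This is a minimal sketch using only the standard library; the `pseudonymize` function and the `anon-` prefix are hypothetical, and the README does not say which de-identification method OcularGuard uses.

```python
import hashlib
import hmac


def pseudonymize(patient_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym with HMAC-SHA256 (keyed, not reversible)."""
    digest = hmac.new(
        secret_key, patient_id.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return f"anon-{digest[:16]}"


# Demo key only; a real deployment would load the key from a secret store.
demo_key = b"demo-key-not-for-production"
alias = pseudonymize("P12345", demo_key)
print(alias)
```

Because the pseudonym is keyed, the same patient always maps to the same alias for longitudinal tracking, while someone without the key cannot recompute the mapping from a list of known IDs.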

License

Apache License 2.0 - see LICENSE file for details.

Team

OcularGuard Team

Acknowledgments

  • Google Research for MedGemma and HAI-DEF models
  • Kaggle for hosting the competition
  • The medical imaging community for public datasets

Citation

If you use OcularGuard in your research or publications, please cite it as:

BibTeX

@software{ocularguard2026,
  author       = {Patrick Kamongi},
  title        = {OcularGuard: Agentic Multimodal AI for Post-Surgical Eye Care},
  year         = {2026},
  publisher    = {GitHub},
  url          = {https://github.com/kamongi/OcularGuard},
  note         = {Kaggle Med-Gemma Impact Challenge Submission}
}

APA Format

Kamongi, P. (2026). OcularGuard: Agentic Multimodal AI for Post-Surgical Eye Care [Computer software]. GitHub. https://github.com/kamongi/OcularGuard

Chicago Format

Kamongi, Patrick. "OcularGuard: Agentic Multimodal AI for Post-Surgical Eye Care." GitHub, 2026. https://github.com/kamongi/OcularGuard.

Disclaimer

⚠️ This is a research demonstration for the Kaggle competition and is not intended for clinical use. Always consult qualified healthcare professionals for medical advice.


Built with ❤️ for the Med-Gemma Impact Challenge
