OmniBioAI ModelHub is a production-grade experiment tracking and model lifecycle management system for AI/ML models within the OmniBioAI ecosystem — purpose-built for biomedical AI plugins.
It provides:
- Experiment tracking — log params, metrics, tags from training runs
- Step-indexed metric history with sparkline visualization
- Immutable model versioning (write-once)
- Cryptographic integrity verification (SHA256)
- Staged promotion workflows (latest → staging → production)
- Alias management with full audit trail
- MySQL-backed run storage with filesystem fallback
- Plugin-first design: PluginRunClient for TES container environments
- Local-first, cloud-ready storage abstraction
- REST API (FastAPI) + CLI (omr) + Python SDK
The registry is implemented as a standalone Python library (package name: omnibioai-model-registry, CLI entrypoint: omr) and ships a self-contained FastAPI service.
- ✅ Experiment tracking (RunLogger, PluginRunClient)
- ✅ MySQL-backed metric + param storage
- ✅ Immutable and verifiable model storage
- ✅ Audit-ready promotion workflow
- ✅ 18 REST endpoints (tracking + registry + governance)
- ✅ 11 CLI commands
- ✅ ModelHub UI with Experiments tab + metric sparklines
- ✅ Local-first, cloud-ready design
Biomedical AI requires:
- Reproducibility
- Auditability
- Governance
- Offline / air-gapped deployment
- Cross-infrastructure execution parity
Traditional ML tooling often assumes cloud-first infrastructure, mutable artifacts, and weak provenance guarantees.
OmniBioAI ModelHub is designed differently.
It treats AI models as scientific artifacts that must be immutable, verifiable, and reproducible across environments.
Two clients cover the two primary execution contexts.
RunLogger writes directly to the registry root on the local filesystem. No network is required.

    from omnibioai_model_registry import RunLogger, register_model

    # training_curve is a placeholder: an iterable of (accuracy, val_loss)
    # pairs produced by your own training loop.
    with RunLogger(task="celltype_sc", model_name="human_pbmc") as run:
        run.log_params({"lr": 0.001, "epochs": 50, "batch_size": 32})
        for epoch, (acc, loss) in enumerate(training_curve):
            run.log_metric("accuracy", acc, step=epoch)
            run.log_metric("val_loss", loss, step=epoch)
        run.set_tag("team", "bioml")

    # Link the registered version back to the tracking run.
    register_model(
        task="celltype_sc",
        model_name="human_pbmc",
        version="2026-06-14_001",
        artifacts_dir="/path/to/artifacts",
        metadata={"lineage": {"run_id": run.run_id}},
    )

Filesystem layout produced by RunLogger:

    {registry_root}/tasks/{task}/models/{model_name}/runs/{run_id}/
      params.json          # {"lr": 0.001, "epochs": 50, "batch_size": 32}
      tags.json            # {"team": "bioml"}
      metrics/
        accuracy.jsonl     # one JSON record per step
        val_loss.jsonl
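
Because a run directory is plain JSON and JSONL, it can be inspected without the library. The sketch below loads one run back into memory. It assumes each line of a metric file is an object with "step" and "value" keys; the record schema is not specified here, so those field names and the load_run helper are hypothetical.

```python
import json
from pathlib import Path


def load_run(run_dir):
    """Load params, tags, and per-metric step history from a run directory."""
    run_dir = Path(run_dir)
    params = json.loads((run_dir / "params.json").read_text())
    tags = json.loads((run_dir / "tags.json").read_text())

    history = {}
    for metric_file in sorted((run_dir / "metrics").glob("*.jsonl")):
        points = []
        for line in metric_file.read_text().splitlines():
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            # ASSUMPTION: each record carries "step" and "value" keys.
            points.append((record["step"], record["value"]))
        # Sort by step so the history is ordered even if writes were not.
        history[metric_file.stem] = sorted(points)
    return {"params": params, "tags": tags, "metrics": history}
```

Sorting by step keeps the result usable for plotting even when records were appended out of order.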
PluginRunClient posts metrics and params to the ModelHub REST API. It is designed for training jobs running inside TES-scheduled containers that cannot access the registry filesystem directly.

    import os

    from omnibioai_model_registry import PluginRunClient

    with PluginRunClient(
        task="celltype_sc",
        model_name="human_pbmc",
        registry_url=os.environ["MODEL_REGISTRY_BASE_URL"],
    ) as run:
        run.log_params({"lr": 0.001})
        run.log_metric("accuracy", 0.95, step=0)
        run.set_tag("plugin_version", "1.2.3")

Both clients share the same log_params / log_metric / set_tag interface. The storage backend is the only difference.
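
One way to take advantage of the shared interface is to write training code against a structural type instead of a concrete client. The sketch below is illustrative only: RunLike, InMemoryRun, and record_epoch are hypothetical names, and the method signatures are inferred from the calls shown above, not taken from the library.

```python
from typing import Any, Dict, List, Optional, Protocol, Tuple


class RunLike(Protocol):
    """Structural type for the interface both tracking clients expose."""

    def log_params(self, params: Dict[str, Any]) -> None: ...

    def log_metric(self, name: str, value: float, step: Optional[int] = None) -> None: ...

    def set_tag(self, key: str, value: str) -> None: ...


class InMemoryRun:
    """Hypothetical stand-in so the example runs without a registry."""

    def __init__(self) -> None:
        self.params: Dict[str, Any] = {}
        self.metrics: Dict[str, List[Tuple[Optional[int], float]]] = {}
        self.tags: Dict[str, str] = {}

    def log_params(self, params: Dict[str, Any]) -> None:
        self.params.update(params)

    def log_metric(self, name: str, value: float, step: Optional[int] = None) -> None:
        self.metrics.setdefault(name, []).append((step, value))

    def set_tag(self, key: str, value: str) -> None:
        self.tags[key] = value


def record_epoch(run: RunLike, epoch: int, accuracy: float) -> None:
    """Depends only on the shared interface, not on the storage backend."""
    run.log_metric("accuracy", accuracy, step=epoch)
```

With this shape, the same record_epoch call could be handed a RunLogger on a workstation or a PluginRunClient inside a container, assuming their signatures match the ones inferred here.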
OmniBioAI follows a four-plane architecture:
| Plane | Responsibility |
|---|---|
| Control Plane | UI, registries, metadata, governance |
| Compute Plane | Workflow execution, HPC/cloud adapters |
| Data Plane | Artifacts, datasets, outputs |
| AI Plane | Reasoning, RAG, agents, interpretation |
The ModelHub belongs to the Control Plane and provides AI artifact governance, deterministic inference references, promotion and audit workflows, and infrastructure-independent model resolution.
Each model version is write-once: no overwrites, no silent mutation, full historical trace. This guarantees scientific reproducibility.
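
A minimal sketch of the write-once rule, assuming versions are stored as directories: publishing refuses to touch a version that already exists. The publish_version helper and VersionExistsError are hypothetical names, not the library's API.

```python
import shutil
from pathlib import Path


class VersionExistsError(Exception):
    """Raised when a version directory is already present (hypothetical)."""


def publish_version(artifacts_dir, versions_root, version):
    """Copy artifacts into <versions_root>/<version>, never overwriting."""
    target = Path(versions_root) / version
    if target.exists():
        raise VersionExistsError(f"{version} is already registered at {target}")
    # copytree also raises FileExistsError if the destination appears between
    # the check above and the copy, so an existing version is never merged into.
    shutil.copytree(artifacts_dir, target)
    return target
```

A new training result is published under a new version string; the old directory stays untouched and remains resolvable.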
Every model package includes a SHA256 manifest (sha256sums.txt) that hashes the package contents (excluding itself). This enables bit-level reproducibility, tamper detection, and trustworthy deployment in regulated environments.
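
The manifest idea can be sketched with the standard library. The on-disk format of sha256sums.txt is not specified here; this example assumes the coreutils-style line format (hex digest, two spaces, relative path), and the helper names are hypothetical.

```python
import hashlib
from pathlib import Path

MANIFEST_NAME = "sha256sums.txt"


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model weights are not read at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(package_dir):
    """Map relative path -> SHA256 for every file except the manifest itself."""
    package_dir = Path(package_dir)
    entries = {}
    for path in sorted(package_dir.rglob("*")):
        rel = path.relative_to(package_dir).as_posix()
        if path.is_file() and rel != MANIFEST_NAME:
            entries[rel] = file_sha256(path)
    return entries


def write_manifest(package_dir):
    """Write the manifest (ASSUMED format: '<digest>  <relative path>')."""
    lines = [f"{digest}  {rel}" for rel, digest in build_manifest(package_dir).items()]
    (Path(package_dir) / MANIFEST_NAME).write_text("\n".join(lines) + "\n")


def verify_manifest(package_dir):
    """Return the sorted paths that are missing, changed, or unlisted."""
    package_dir = Path(package_dir)
    expected = {}
    for line in (package_dir / MANIFEST_NAME).read_text().splitlines():
        if line.strip():
            digest, rel = line.split("  ", 1)
            expected[rel] = digest
    actual = build_manifest(package_dir)
    return sorted(
        rel for rel in expected.keys() | actual.keys()
        if expected.get(rel) != actual.get(rel)
    )
```

An empty list from verify_manifest means every listed file is present with a matching digest and no unlisted file has been added.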
Each model stores structured metadata via model_meta.json:
- training code version (git commit)
- dataset reference (e.g., DVC / object store ref)
- hyperparameters and preprocessing
- lineage.run_id linking back to the originating tracking run
Models move through controlled stages:
latest → staging → production
All promotions are explicit, append-only, and audited (audit/promotions.jsonl).
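
To illustrate what "append-only and audited" can look like on disk, here is a sketch that repoints an alias and appends one JSON line per promotion. The record fields (ts, alias, version, actor, reason), the alias file contents, and the promote helper are assumptions for illustration; the real schema may differ.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def promote(model_dir, alias, version, actor, reason):
    """Point an alias at an existing version and append an audit record."""
    model_dir = Path(model_dir)
    if not (model_dir / "versions" / version).is_dir():
        raise FileNotFoundError(f"unknown version: {version}")

    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "alias": alias,
        "version": version,
        "actor": actor,
        "reason": reason,
    }

    # The alias file is the only mutable pointer.
    # ASSUMPTION: it stores {"version": "<version>"}.
    (model_dir / "aliases").mkdir(exist_ok=True)
    (model_dir / "aliases" / f"{alias}.json").write_text(json.dumps({"version": version}))

    # Mode "a" only ever adds lines, so earlier records are never rewritten.
    (model_dir / "audit").mkdir(exist_ok=True)
    with open(model_dir / "audit" / "promotions.jsonl", "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record
```

Replaying the JSONL file from top to bottom reconstructs which version each alias pointed to at any moment.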
v0.2.0 supports a local filesystem backend (localfs) with a MySQL-backed tracking layer. S3 / Azure Blob backends are on the roadmap.
omnibioai-model-registry/
├── omnibioai_model_registry/
│ ├── api.py
│ ├── config.py
│ ├── refs.py
│ ├── errors.py
│ ├── run.py # RunLogger — filesystem-based tracking
│ ├── plugin_client.py # PluginRunClient — HTTP-based tracking for TES plugins
│ ├── db.py # MySQL connection + table bootstrap
│ ├── tracking.py # Pure-SQL tracking functions
│ ├── storage/
│ ├── package/
│ ├── audit/
│ ├── cli/
│ └── service/
├── frontend/
│ └── omnibioai-model-registry-ui/ # ModelHub UI (React + TypeScript)
├── tests/
├── pyproject.toml
└── README.md
Registered models follow a strict, portable structure:

    <OMNIBIOAI_MODEL_REGISTRY_ROOT>/
      tasks/<task>/models/<model_name>/
        versions/<version>/
          model.pt
          model_genes.txt
          label_map.json
          model_meta.json
          metrics.json
          feature_schema.json
          sha256sums.txt
        aliases/
          latest.json
          staging.json
          production.json
        audit/
          promotions.jsonl

This guarantees deterministic loading, integrity validation, and cross-environment portability.
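
Given this layout, resolving a reference such as human_pbmc@latest reduces to a couple of path lookups. The sketch below is not the library's resolve_model; resolve_ref is a hypothetical helper, and it assumes alias files store {"version": "<version>"}.

```python
import json
from pathlib import Path


def resolve_ref(registry_root, task, ref):
    """Resolve '<model_name>@<alias-or-version>' to a version directory."""
    model_name, sep, selector = ref.partition("@")
    if not sep or not model_name or not selector:
        raise ValueError(f"expected '<model_name>@<alias-or-version>', got {ref!r}")

    model_dir = Path(registry_root) / "tasks" / task / "models" / model_name
    alias_file = model_dir / "aliases" / f"{selector}.json"
    if alias_file.is_file():
        # ASSUMPTION: the alias file holds {"version": "<version>"}.
        selector = json.loads(alias_file.read_text())["version"]

    version_dir = model_dir / "versions" / selector
    if not version_dir.is_dir():
        raise FileNotFoundError(f"no such version: {version_dir}")
    return version_dir
```

Because aliases are checked first and versions second, the same call accepts either human_pbmc@production or human_pbmc@2026-06-14_001.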
Set the registry root and install the package in editable mode:

    export OMNIBIOAI_MODEL_REGISTRY_ROOT=~/local_registry/model_registry
    pip install -e .

Verify:

    python -c "import omnibioai_model_registry as m; print('OK', m.__file__)"
    omr --help

Build the wheel and source distribution:

    pip install build
    python -m build

Artifacts are written to dist/:

- dist/omnibioai_model_registry-0.2.0-py3-none-any.whl
- dist/omnibioai_model_registry-0.2.0.tar.gz

Install the wheel:

    pip install dist/*.whl

The CLI provides 11 commands covering the full model lifecycle.
Register a version and set its alias:

    omr register \
      --task celltype_sc \
      --model human_pbmc \
      --version 2026-06-14_001 \
      --artifacts /tmp/model_pkg \
      --set-alias latest

Resolve, promote, verify, inspect, tag, and stage:

    omr resolve --task celltype_sc --ref human_pbmc@latest
    omr promote --task celltype_sc --model human_pbmc --version 2026-06-14_001 --alias production
    omr verify --task celltype_sc --ref human_pbmc@production
    omr show --task celltype_sc --ref human_pbmc@production --json
    omr list --task celltype_sc
    omr metrics --task celltype_sc --ref human_pbmc@latest
    omr aliases --task celltype_sc --model human_pbmc
    omr tag --task celltype_sc --ref human_pbmc@2026-06-14_001 --key team --value bioml
    omr stage --task celltype_sc --model human_pbmc --version 2026-06-14_001 --stage production

Valid stages: none, staging, production, archived.

    omr compare --task celltype_sc --model human_pbmc --versions 2026-02-14_001 2026-06-14_001

The same lifecycle is available from the Python SDK:

    from omnibioai_model_registry import register_model, resolve_model, promote_model

    register_model(
        task="celltype_sc",
        model_name="human_pbmc",
        version="2026-06-14_001",
        artifacts_dir="/tmp/model_pkg",
        metadata={
            "framework": "pytorch",
            "model_type": "classifier",
            "provenance": {
                "git_commit": "abc123",
                "training_data_ref": "s3://bucket/datasets/pbmc_v1",
                "trainer_version": "0.2.0",
            },
        },
        set_alias="latest",
        actor="manish",
        reason="initial training",
    )

    # Resolve by alias (or version)
    path = resolve_model("celltype_sc", "human_pbmc@latest", verify=True)
    print("Resolved model dir:", path)

    # Promote to production
    promote_model(
        task="celltype_sc",
        model_name="human_pbmc",
        alias="production",
        version="2026-06-14_001",
        actor="manish",
        reason="validated metrics",
    )

Start the REST service:

    pip install -e .
    uvicorn omnibioai_model_registry.service.app.main:app --host 0.0.0.0 --port 8095

Health check:

    curl -s http://127.0.0.1:8095/health | python -m json.tool

Registry
| Method | Path | Description |
|---|---|---|
| POST | /v1/register | Register a model version |
| GET | /v1/resolve | Resolve a model reference to a path |
| POST | /v1/promote | Promote a version to an alias |
| POST | /v1/verify | Verify SHA256 integrity |
| GET | /v1/show | Return model_meta.json for a ref |
| GET | /v1/models | List all registered model versions |
Tracking (requires MySQL — HTTP 503 if DB_HOST is unset)
| Method | Path | Description |
|---|---|---|
| POST | /v1/runs/log-metric | Log a single metric point |
| POST | /v1/runs/log-param | Log a single parameter |
| POST | /v1/runs/log-batch | Log metrics, params, and tags |
| GET | /v1/runs/get | Fetch a full run snapshot |
| GET | /v1/runs/list | List runs for a (task, model) |
Governance
| Method | Path | Description |
|---|---|---|
| GET | /v1/aliases | List all aliases for a model |
| GET | /v1/metrics | Return version metrics + step history from DB/JSONL |
| GET | /v1/compare | Compare metrics across two or more versions |
| GET | /v1/artifacts | List files in a version package with SHA256 + sizes |
| PUT | /v1/tags | Set a tag on a model version |
| POST | /v1/versions/patch | Patch description or tags on a version |
| POST | /v1/stage | Set lifecycle stage (none/staging/production/archived) |
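
As a rough illustration of calling these endpoints from Python, the sketch below builds a GET request for /v1/resolve using only the standard library. The query parameter names (task, ref) are an assumption that mirrors the CLI flags; the request schema is not documented here, so check the service's OpenAPI page before relying on them. build_resolve_request is a hypothetical helper.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_resolve_request(base_url, task, ref):
    """Build (but do not send) a GET request for /v1/resolve."""
    # ASSUMPTION: the endpoint accepts "task" and "ref" query parameters.
    query = urlencode({"task": task, "ref": ref})
    return Request(f"{base_url.rstrip('/')}/v1/resolve?{query}", method="GET")


request = build_resolve_request("http://127.0.0.1:8095", "celltype_sc", "human_pbmc@latest")
print(request.full_url)
```

Against a running service, the request can be sent with urllib.request.urlopen(request) and the JSON body decoded from the response.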
When DB_HOST is set, the service bootstraps five tables on startup:
omr_runs — run lifecycle (run_id, status, started_at, finished_at)
omr_params — key/value params per run
omr_metrics — step-indexed metric values per run
omr_tags — key/value tags per run
omr_version_tags — key/value tags per model version
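
To make "step-indexed metric values per run" concrete, the following sketch models the metrics table with SQLite standing in for MySQL so it runs anywhere. The column names, types, and primary key are assumptions for illustration; the actual schema created by the service may differ.

```python
import sqlite3

# ASSUMED schema: one row per (run, metric name, step).
SCHEMA = """
CREATE TABLE IF NOT EXISTS omr_metrics (
    run_id TEXT    NOT NULL,
    name   TEXT    NOT NULL,
    step   INTEGER NOT NULL,
    value  REAL    NOT NULL,
    PRIMARY KEY (run_id, name, step)
)
"""


def open_store():
    """Open an in-memory database and create the metrics table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def log_metric(conn, run_id, name, value, step):
    """Insert one metric point; the context manager commits or rolls back."""
    with conn:
        conn.execute(
            "INSERT INTO omr_metrics (run_id, name, step, value) VALUES (?, ?, ?, ?)",
            (run_id, name, step, value),
        )


def metric_history(conn, run_id, name):
    """Return [(step, value), ...] ordered by step for one metric of one run."""
    cursor = conn.execute(
        "SELECT step, value FROM omr_metrics "
        "WHERE run_id = ? AND name = ? ORDER BY step",
        (run_id, name),
    )
    return cursor.fetchall()
```

Ordering by step in the query is what lets a UI draw a sparkline directly from the result, regardless of insertion order.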
Environment variables:

    export DB_HOST=localhost
    export DB_PORT=3306
    export DB_USER=omr
    export DB_PASSWORD=secret
    export DB_NAME=model_registry

When DB_HOST is absent, the service runs in filesystem-only mode. Tracking endpoints return HTTP 503; all registry and governance endpoints remain fully functional.
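
The mode switch can be pictured as a small configuration loader that returns nothing when DB_HOST is unset. This is a sketch, not the service's code: DbConfig and load_db_config are hypothetical names, and the fallback values for the other variables are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class DbConfig:
    host: str
    port: int
    user: str
    password: str
    name: str


def load_db_config(env: Optional[Mapping[str, str]] = None) -> Optional[DbConfig]:
    """Return DB settings, or None to signal filesystem-only mode."""
    env = os.environ if env is None else env
    host = env.get("DB_HOST")
    if not host:
        return None  # tracking endpoints would answer HTTP 503 in this mode
    return DbConfig(
        host=host,
        port=int(env.get("DB_PORT", "3306")),  # ASSUMED default
        user=env.get("DB_USER", ""),
        password=env.get("DB_PASSWORD", ""),
        name=env.get("DB_NAME", ""),
    )
```

Passing the environment as a mapping keeps the function easy to exercise in tests without mutating os.environ.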
Run the test suite:

    pip install -e ".[dev]"
    pytest -q

The ModelHub is a control-plane component of OmniBioAI.
Companion repositories:
- omnibioai → AI-powered bioinformatics workbench
- omnibioai-tes → execution orchestration across local/HPC/cloud
- omnibioai-rag → reasoning and literature intelligence
- omnibioai-lims → laboratory data management
- omnibioai-workflow-bundles → reproducible pipelines
- omnibioai-sdk → Python client access
The ModelHub provides the AI artifact governance layer shared by all.
- Experiment tracking with RunLogger + PluginRunClient
- MySQL-backed run/metric/param/tag storage
- ModelHub UI with Experiments tab + metric sparklines
- Stage management (none → staging → production → archived)
- Alias listing, metric comparison, artifact browser endpoints
- S3 / Azure Blob storage backends
- Step-history sparklines in UI pulled from DB (currently single-point)
- Model signature validation (input/output schema enforcement)
- RBAC — per-task access control
- Parallel coordinates plot for hyperparameter search
- Auto-link run_id → model version in UI (Registered As chip)
- Pagination + filtering on GET /v1/models and GET /v1/runs/list
- Promotion policies (metric threshold gates)
- Regulatory-ready audit and lineage export (PDF/CSV)
- Enterprise biomedical AI governance platform
- Deeper LIMS integration (sample → dataset → run → model chain)