An AI-agent-first Python SDK for the OpenDataProducts.org standards family. It gives agents, agent hosts, and automation systems one consistent surface for loading, detecting, validating, explaining, searching, traversing, and summarizing documents across:
- Open Data Product Specification (ODPS),
- Open Data Product Catalog (ODPC),
- Open Data Product Graphs (ODPG), and
- Open Data Product Vocabulary (ODPV).
The package still includes developer-facing Python helpers, but the primary contract is agent-ready: structured validation results, lightweight artifact summaries, reference discovery, Data Contract orchestration, bundled retrieval resources, a unified CLI, an MCP stdio server, and an ARWS agent manifest.
- One cross-spec entry point: Agents can call `load_document`, `validate_document`, `explain_document`, and `resolve_references` across ODPS, ODPC, ODPG, and ODPV files.
- Structured outputs: Validation, references, resources, summaries, and graph reasoning helpers return predictable objects that are easy for agents to inspect.
- Small-context workflows: `load_summary` returns metadata, size, hash, spec, kind, and id without returning full document bodies.
- Retrieval-ready resources: Bundled schemas, prompt templates, vocabulary records, catalog object records, and graph object records are discoverable through `list_resources` and MCP tools.
- Agent-ready ODPC and ODPV helpers: Catalog building, catalog artifact checks, vocabulary term resolution, canonical term packets, relationship compatibility checks, and term context packets are available through Python, CLI, and MCP surfaces where safe.
- Graph reasoning for agents: ODPG helpers support graph summaries, traversal, strategic analysis, and trusted focus-node context extraction.
- Data Contract orchestration: Optional `datacontract-cli` integration validates external contracts while the SDK resolves ODPS contract references, extracts schemas, checks static product-contract alignment, and returns agent-ready reports.
- Host integration: MCP-capable tools can launch `open-data-products serve`, while ARWS-compatible systems can read the generated manifest.
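
To make the small-context idea concrete, here is a minimal stdlib-only sketch of a lightweight artifact summary: it reports a file's size and content hash without returning the document body. It is an illustration, not the SDK's `load_summary` implementation; the function name `summarize_file`, the returned keys, and the choice of SHA-256 are all assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def summarize_file(path: str) -> dict:
    """Return size and hash for a file without exposing its contents.

    Hypothetical helper for illustration only. Fields such as spec, kind,
    and id would require parsing the document and are omitted here.
    """
    data = Path(path).read_bytes()
    return {
        "path": str(path),
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),  # hash choice is an assumption
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "product.yaml"
        sample.write_text("product:\n  name: demo\n", encoding="utf-8")
        print(summarize_file(str(sample)))
```

An agent can keep a record like this in context and fetch the full document only when the hash or size suggests it has changed.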
Use the top-level API when building AI agents, automation, validation pipelines, or tools that need to work across the Open Data Products standards family without knowing the spec namespace ahead of time:
```python
from open_data_products import (
    explain_document,
    generate_local_artifact,
    generate_local_artifacts,
    load_generation_prompt,
    list_resources,
    load_document,
    resolve_references,
    validate_document,
)

document = load_document("examples/product.yaml")
result = validate_document(document)
print(result.valid, result.spec, result.kind)
print(explain_document(document))

for reference in resolve_references(document):
    print(reference.pointer, reference.ref)

for resource in list_resources():
    print(resource.id, resource.spec, resource.type)

prompt = load_generation_prompt("odps_data_product_fragment.md")
signal = generate_local_artifact(
    "signal",
    "open_data_products/generation/source_docs/turnaround-delay-signal.txt",
    "open_data_products/generation/fragments",
)
all_artifacts = generate_local_artifacts(
    "open_data_products/generation/source_docs",
    "open_data_products/generation/fragments",
)
```

The top-level CLI exposes the same workflow with machine-readable output:
```bash
open-data-products validate examples/product.yaml --json
open-data-products explain examples/product.yaml --json
open-data-products refs graph.yaml --json
open-data-products resources --json
open-data-products summary examples/product.yaml   # lightweight reference: size, hash, spec
open-data-products manifest --json                 # ARWS agent manifest
open-data-products serve                           # MCP server over stdio
```

Data Contract support is optional and product-oriented. The SDK recognizes
native ODPS `/product/contract` references (`$ref`, `contractURL`, and inline
`spec`) as well as practical extension-style references such as
`extensions.dataContract.href`. External contract lint/export uses
`datacontract-cli` when installed; inline ODPS contract specs are used for
static summaries and alignment without running live source tests.
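
The reference shapes named above can be pictured with a small detector over an already-parsed ODPS document. This is a rough sketch, not the SDK's `resolve_product_contracts` logic: the function name `find_contract_references`, the returned tuples, and the exact location of the `extensions` object are assumptions made for illustration.

```python
from typing import Any


def find_contract_references(document: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (pointer, kind) pairs for contract references in a parsed ODPS dict.

    The key layout is assumed from the shapes named in the text; check the
    ODPS schema before relying on it.
    """
    found: list[tuple[str, str]] = []
    product = document.get("product") or {}
    contract = product.get("contract") or {}

    # Native shapes under /product/contract.
    for key, kind in (("$ref", "ref"), ("contractURL", "url"), ("spec", "inline")):
        if key in contract:
            found.append((f"/product/contract/{key}", kind))

    # Extension-style shape. Where `extensions` lives is an assumption, so both
    # the document root and the product object are checked.
    for base, holder in (("", document), ("/product", product)):
        extension = (holder.get("extensions") or {}).get("dataContract") or {}
        if "href" in extension:
            found.append((f"{base}/extensions/dataContract/href", "extension"))
    return found


if __name__ == "__main__":
    sample = {"product": {"contract": {"contractURL": "https://example.com/contract.yaml"}}}
    print(find_contract_references(sample))
```

Returning pointers rather than resolved content keeps the check static, in line with the note that inline specs are summarized without running live source tests.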
```python
from open_data_products import (
    check_product_contract_alignment,
    extract_contract_schema,
    generate_product_contract_report,
    resolve_product_contracts,
    summarize_contract,
    validate_contract,
)

for reference in resolve_product_contracts("examples/product.yaml"):
    print(reference.pointer, reference.href)

print(validate_contract("examples/contract.yaml").passed)
print(extract_contract_schema("examples/contract.yaml").field_count)
print(check_product_contract_alignment("examples/product.yaml", "examples/contract.yaml").summary)
print(generate_product_contract_report("examples/product.yaml").summary)
```

Run `open-data-products serve` to expose the SDK as a local MCP server, or
`open-data-products manifest --json` to render the ARWS manifest. See
Agent surface for Codex/Claude Code setup, MCP tools,
and bundled skills.
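
For hosts that speak MCP directly rather than through a client library, the stdio transport exchanges newline-delimited JSON-RPC 2.0 messages. The sketch below only builds the opening messages a client would write to the server's stdin (`initialize`, the `notifications/initialized` notification, then `tools/list`); it does not launch `open-data-products serve`. The helper name `build_handshake`, the client name and version, and the protocol version string are placeholder assumptions.

```python
import json


def build_handshake(client_name: str = "example-host") -> list[str]:
    """Build newline-delimited JSON-RPC messages for an MCP stdio session."""
    messages = [
        {
            "jsonrpc": "2.0",
            "id": 1,
            "method": "initialize",
            "params": {
                "protocolVersion": "2024-11-05",  # assumed; use what your host supports
                "capabilities": {},
                "clientInfo": {"name": client_name, "version": "0.0.0"},
            },
        },
        # Notifications carry no "id" and expect no response.
        {"jsonrpc": "2.0", "method": "notifications/initialized"},
        {"jsonrpc": "2.0", "id": 2, "method": "tools/list"},
    ]
    # One compact JSON object per line, with no embedded newlines.
    return [json.dumps(message, separators=(",", ":")) for message in messages]


if __name__ == "__main__":
    for line in build_handshake():
        print(line)
```

A real client would wait for the `initialize` response before sending the notification and the `tools/list` request; most hosts handle this sequencing for you.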
Use `open_data_products.<spec>` namespaces for every standard:

| Namespace | Standard | Status |
|---|---|---|
| `open_data_products.odps` | Open Data Product Specification | Implemented |
| `open_data_products.odpc` | Open Data Product Catalog | Catalog helpers implemented |
| `open_data_products.odpg` | Open Data Product Graph | Graph helpers implemented |
| `open_data_products.odpv` | Open Data Product Vocabulary | Vocabulary tools implemented |

| Area | What agents and developers can do |
|---|---|
| Cross-spec API | Detect, load, validate, explain, summarize, and resolve references across ODPS, ODPC, ODPG, and ODPV |
| MCP + ARWS | Run a local stdio MCP server, expose safe tools, and generate an ARWS agent manifest |
| ODPS | Create, load, validate, serialize, and inspect ODPS v4.1 data product documents |
| ODPC | Build catalogs from fragments, validate catalogs, explain catalog metadata, search bundled catalog object guidance, and generate/check derived catalog schema artifacts |
| ODPG | Validate graphs, summarize nodes and edges, traverse relationships, analyze governance/strategy signals, and extract agent context |
| ODPV | Load, validate, search, generate vocabulary artifacts, resolve terms and aliases, explain canonical term packets, check relationships, and produce agent context for shared ODP terminology |
| Data Contracts | Resolve ODPS contract references, validate external contracts through optional `datacontract-cli`, extract schemas, check static alignment, and generate product-level reports |
| Bundled resources | Discover schemas, examples, vocabulary records, catalog object records, and graph object records through the resource registry |
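
The relationship traversal listed for ODPG can be pictured as a depth-limited breadth-first walk from a start node. The following is a generic sketch over a plain edge list, not the SDK's traversal code; the function name `traverse`, the edge tuples, and every node ID except `AGENT-AVIATION-001` (which appears in this README's CLI examples) are made up for illustration.

```python
from collections import deque


def traverse(edges, start, depth):
    """Return {node: hops} for nodes reachable from `start` within `depth` hops.

    Edges are (source, target) pairs followed in the source -> target direction.
    """
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, []).append(target)

    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == depth:
            continue  # do not expand beyond the depth limit
        for neighbor in adjacency.get(node, []):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


if __name__ == "__main__":
    sample_edges = [
        ("AGENT-AVIATION-001", "PRODUCT-A"),
        ("PRODUCT-A", "USECASE-B"),
        ("USECASE-B", "KPI-C"),
    ]
    print(traverse(sample_edges, "AGENT-AVIATION-001", 2))
```

Bounding the walk by depth keeps the returned neighborhood small enough to pass to an agent as context.
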
ODPS support is scoped to the 4.x generation of the specification. The SDK primarily targets ODPS v4.1 and keeps backward-compatible support for ODPS v4.0 documents.
ODPS field validation includes ISO language, country, currency, date/time, phone, email, and URI formats where those standards apply.
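
As a rough illustration of what format-level checks of this kind involve, the sketch below tests the shape of a few values using only the standard library. It is not the SDK's validator: the field names in `CHECKS` are hypothetical, the patterns check form only (not membership in the official ISO code lists), and `datetime.fromisoformat` accepts only a subset of ISO 8601 on older Python versions.

```python
import re
from datetime import datetime
from urllib.parse import urlparse

# Shape-only patterns: they check form, not membership in the official code lists.
LANGUAGE_RE = re.compile(r"^[a-z]{2}$")      # ISO 639-1 shape
COUNTRY_RE = re.compile(r"^[A-Z]{2}$")       # ISO 3166-1 alpha-2 shape
CURRENCY_RE = re.compile(r"^[A-Z]{3}$")      # ISO 4217 shape
PHONE_RE = re.compile(r"^\+[1-9]\d{1,14}$")  # E.164: "+" then at most 15 digits


def is_iso_datetime(value: str) -> bool:
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


def is_uri_with_authority(value: str) -> bool:
    # Requires a scheme and an authority; stricter than RFC 3986,
    # which also allows URIs such as "mailto:..." with no authority.
    parts = urlparse(value)
    return bool(parts.scheme and parts.netloc)


# Hypothetical field names mapped to (check, label).
CHECKS = {
    "language": (LANGUAGE_RE.match, "ISO 639-1 language code"),
    "country": (COUNTRY_RE.match, "ISO 3166-1 country code"),
    "currency": (CURRENCY_RE.match, "ISO 4217 currency code"),
    "phone": (PHONE_RE.match, "E.164 phone number"),
    "updated": (is_iso_datetime, "ISO 8601 date/time"),
    "homepage": (is_uri_with_authority, "URI"),
}


def check_formats(record: dict) -> list[str]:
    """Return one error string per present field that fails its shape check."""
    errors = []
    for field, (is_valid, label) in CHECKS.items():
        if field in record and not is_valid(str(record[field])):
            errors.append(f"Invalid {label}: {record[field]!r}")
    return errors


if __name__ == "__main__":
    print(check_formats({"language": "xyz", "currency": "EUR"}))
```

Collecting every failure before reporting, rather than stopping at the first, gives callers a complete list to act on in one pass.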
```bash
pip install open-data-products

# Optional Data Contract validation adapter:
pip install "open-data-products[contracts]"

# For development:
pip install "open-data-products[dev]"
```

This README is intentionally a short landing page. Use the focused references below for implementation details:
- API reference: Agent API, spec helper namespaces, ODPS models, validators, serialization, and examples.
- Agent surface: MCP server, ARWS manifest, and bundled skills for agent hosts.
- Command guide: what each common CLI command does, what it reads, and what it writes.
- LLM generation: Ollama or configured external LLM source-doc to ODPC fragment and ODPG graph workflow.
- Data Contract workflows: ODPS contract resolution, optional `datacontract-cli`, alignment, and reports.
- Capability drift reports: dated SDK alignment reports against upstream specification tooling.
- Tooling development model: human-facing explanation of how spec-level scripts mature into consolidated SDK capabilities.
- Functional test report: public API, CLI, and MCP functional coverage matrix.
- Example scripts: runnable ODPS examples, including v4.1 strategy and MCP access examples.
- Course-style guides: simple human SDK workflows and LLM generation lessons.
- Sample apps: independent CLIs built on top of the SDK.
- Agent handoff: compact machine-readable routing for AI agents.
Most commands print human-readable output by default; add --json when agents,
CI jobs, or scripts need a stable machine-readable response. See the
command guide for what each command reads, checks, and
produces.
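
For scripts and CI jobs that consume `--json` output, one possible pattern is to run the command as a subprocess and parse its stdout. The sketch below is hedged accordingly: the helper name `run_json` is hypothetical, the `{"valid": true}` payload is an assumed shape used only for the demo, and a stand-in command replaces the real CLI so the example runs without the SDK installed.

```python
import json
import subprocess
import sys


def run_json(argv: list[str]):
    """Run a command, require exit code 0, and parse its stdout as JSON."""
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere. With the SDK installed,
    # argv would instead be something like:
    #   ["open-data-products", "validate", "examples/product.yaml", "--json"]
    stand_in = [sys.executable, "-c", 'import json; print(json.dumps({"valid": True}))']
    print(run_json(stand_in))
```

Whether the CLI exits with a nonzero status for an invalid document is not stated here, so `check=True` may need to be relaxed if failures should be read from the JSON payload instead.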
```bash
# Cross-spec validation and summaries
open-data-products validate examples/product.yaml --json
open-data-products explain examples/odpc_catalog.yaml --json
open-data-products refs open_data_products/odpg/data/graph/graph.yaml --json
open-data-products summary examples/product.yaml

# Bundled agent resources
open-data-products resources --json
open-data-products resources --id generation.prompt.system --json
open-data-products resources --id odpc.objects --json
open-data-products resources --id odpv.terms --json
open-data-products resources --id odpg.objects --json
```

The LLM generation commands require Ollama or configured provider credentials.
```bash
# LLM generation
open-data-products generate --json
open-data-products generate --config open_data_products/generation/generation.config.yaml --json
open-data-products generate --config open_data_products/generation/generation.config.yaml --provider groq --json
open-data-products generate --config open_data_products/generation/generation.config.yaml --provider claude --json
open-data-products generate --config open_data_products/generation/generation.config.yaml --kind signal --json
```

```bash
# Generated fragment artifacts
open-data-products validate open_data_products/generation/fragments/odpg_graph.yaml --json
open-data-products odpg-generate open_data_products/generation/fragments/odpg_graph.yaml --output /tmp/odp-generation-graph.html --json

# ODPC catalog helpers
open-data-products odpc-build examples/odpc_catalog_fragments/ --output /tmp/odp-catalog.yaml --json
open-data-products odpc-build examples/odpc_catalog_fragments/ --output /tmp/odp-catalog.yaml --html /tmp/odp-catalog.html --json
open-data-products odpc-summary /tmp/odp-catalog.yaml --json
open-data-products odpc-search "catalog data" --limit 3 --json

# ODPV vocabulary helpers
open-data-products odpv-summary --json
open-data-products odpv-search "governance policy risk" --limit 3 --json
open-data-products odpv-resolve "reusable data asset" --json
open-data-products odpv-explain DataProduct --json
open-data-products odpv-relationship DataProduct supports UseCase --json
open-data-products odpv-context DataProduct --json

# ODPG graph reasoning
open-data-products odpg-summary open_data_products/odpg/data/graph/graph.yaml
open-data-products odpg-traverse open_data_products/odpg/data/graph/graph.yaml --start AGENT-AVIATION-001 --depth 2
open-data-products odpg-analyze open_data_products/odpg/data/graph/graph.yaml
open-data-products odpg-agent-context open_data_products/odpg/data/graph/graph.yaml --node AGENT-AVIATION-001 --depth 2
open-data-products odpg-convert --input examples/graph.graphml --output /tmp/odp-converted-graph.yaml --json
open-data-products odpg-generate open_data_products/odpg/data/graph/graph.yaml --output /tmp/odp-graph-explorer.html --json

# Product-level Data Contract inspection
open-data-products product resolve-contracts examples/product.yaml --json
open-data-products product contract-schema examples/contract.yaml --json
```

See Data Contract workflows for product contract
resolution, optional `datacontract-cli` integration, alignment checks, reports,
and supported ODPS contract reference shapes.
Live LLM generation requires Ollama or a configured provider API key; see
LLM generation for runnable provider examples.
- `open_data_products.generation`: editable prompt templates and provider-backed generation helpers for ODPS, ODPC, and ODPG YAML artifacts. Defaults to local Ollama/Qwen 2.5 and can use configured external providers such as OpenAI.
- `open_data_products.odps`: ODPS v4.1 models, standards-aware validation, YAML/JSON I/O, compliance helpers, and `pricing_to_402`.
- `open_data_products.odpc`: ODPC catalog building, loading, validation, explanation, and object guidance search.
- `open_data_products.odpg`: ODPG graph validation, summary, traversal, analysis, agent context, object search, external graph conversion, and graph explorer generation.
- `open_data_products.odpv`: ODPV vocabulary loading, validation, search, and generated vocabulary artifacts.
```bash
git clone https://github.com/Open-Data-Product-Initiative/odps-python
cd odps-python
pip install -e ".[dev]"
python examples/basic_usage.py
```

The library requires the following runtime packages:

- PyYAML: YAML format support
- jsonschema: ODPC and ODPG schema validation

The library provides detailed validation error messages that reference specific standards:
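```python
# `odp` is an ODPS document object created or loaded earlier;
# ODPSValidationError is the SDK's validation exception (see the API reference).
try:
    odp.validate()
except ODPSValidationError as e:
    print(e)
    # Output: "Validation errors: Invalid ISO 639-1 language code: 'xyz';
    #          dataHolder email must be a valid RFC 5322 email address"
```

See `examples/odps_v41_example.py` for a demonstration of key v4.1 features including: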
- ProductStrategy with business objectives
- KPI definitions with targets and calculations
- AI agent integration via MCP
- Enhanced $ref support
Run the example:

```bash
python examples/odps_v41_example.py
```

- Basic ODPS Creation
- Comprehensive ODPS Document
- Advanced Features
- ODPC catalog fragments plus generated catalog YAML and standalone HTML
See LLM generation for source documents, prompts, provider configuration, generated fragments, ODPG graph YAML, and graph explorer output.
The examples/apps/ folder contains independent, runnable Python
sample apps built on top of the SDK. Each app lives in its own folder with a
cli.py entry point and can be run directly from the repository root.
- ODP Document Inspector CLI: inspect any ODPS, ODPC, ODPG, or ODPV YAML/JSON document and print validation, explanation, references, and bundled resource metadata.
- ODPV Vocabulary Finder CLI: search bundled ODPV terms by natural-language query and print definitions, scores, matched fields, and related terms.
- ODPS Pricing 402 Builder CLI: build an HTTP 402 payment envelope from an ODPS product with pricing plans.
```bash
python examples/apps/document_inspector/cli.py examples/apps/pricing_402_builder/priced_product.yaml
python examples/apps/vocabulary_finder/cli.py "governance policy risk" --limit 5 --json
python examples/apps/pricing_402_builder/cli.py examples/apps/pricing_402_builder/priced_product.yaml --json
```

We extend our gratitude to the following:

Open Data Product Initiative Team - Special thanks to the team at opendataproducts.org for creating and maintaining the emerging Open Data Product standards family, including the Open Data Product Specification (ODPS), Open Data Product Catalog (ODPC), Open Data Product Graphs (ODPG), and Open Data Product Vocabulary (ODPV). Their vision of standardizing data product descriptions, catalogs, graphs, and shared vocabulary has made this SDK possible. These specifications represent years of collaborative effort from industry experts, data practitioners, and open source contributors who are driving the future of data standardization.
Chris Howard / Kitard - Special thanks to Chris Howard from Accenture for creating the original odps-python library. His foundational work made it possible to extend the project into the broader Open Data Products SDK and agent toolkit.
devlouie - Special thanks to devlouie for contributing the MCP layer and Agent Surface on top of the SDK, helping make the Open Data Products standards family easier to use from agentic tools and workflows.
Data Contract CLI - Special thanks to Stefan Negele, Jochen Christ, and Simon Harrer for creating Data Contract CLI, the open source execution engine this SDK can optionally use for external Data Contract validation, export, and ecosystem interoperability.
Python Community - For the exceptional ecosystem of libraries and tools that power this implementation, including PyYAML, jsonschema, and the countless other packages that make Python development a joy.
Data Community - For embracing open standards and driving the need for better data product specifications and tooling that benefits everyone in the data ecosystem.
Documentation Support - Documentation assistance provided by Claude (Anthropic).
Contributions are welcome. Please read CONTRIBUTING.md for guidelines, browse the open issues, and consider helping with new features, bug fixes, examples, documentation, or agent-facing workflow improvements.
Apache License 2.0 - see LICENSE file for details.
- Open Data Product Specification v4.1
- ODPS Schema
- Open Data Product Catalog (ODPC)
- Open Data Product Graphs (ODPG)
- Open Data Product Vocabulary (ODPV)
- Open Data Product Standards Knowledge Base
- ISO 639-1 Language Codes
- ISO 3166-1 Country Codes
- ISO 4217 Currency Codes
- ISO 8601 Date/Time Format
- E.164 Phone Number Format
- RFC 5322 Email Format
- RFC 3986 URI Format
