From b69cf187d15399bf49620a7db378cec582d2a834 Mon Sep 17 00:00:00 2001 From: Bill Murdock Date: Mon, 15 Jun 2026 07:54:38 -0400 Subject: [PATCH] Add RFC-0008 (Skill Registry) and RFC-0009 (Harness Integration) RFC-0008 adds a governed, metadata-first Skill Registry to MLflow for AI agent capabilities (skills, subagents, hooks, skill bundles). RFC-0009 adds harness-specific installation, bundle import, lock files, and trace instrumentation for Claude Code, Codex CLI, Cursor, and other agent harnesses. Co-Authored-By: Claude Opus 4.6 --- .../0008-skill-registry.md | 822 +++++++++++ .../implementation-details.md | 1239 +++++++++++++++++ .../0009-skill-harness-integration.md | 530 +++++++ .../implementation-details.md | 462 ++++++ 4 files changed, 3053 insertions(+) create mode 100644 rfcs/0008-skill-registry/0008-skill-registry.md create mode 100644 rfcs/0008-skill-registry/implementation-details.md create mode 100644 rfcs/0009-skill-harness-integration/0009-skill-harness-integration.md create mode 100644 rfcs/0009-skill-harness-integration/implementation-details.md diff --git a/rfcs/0008-skill-registry/0008-skill-registry.md b/rfcs/0008-skill-registry/0008-skill-registry.md new file mode 100644 index 0000000..d2e114e --- /dev/null +++ b/rfcs/0008-skill-registry/0008-skill-registry.md @@ -0,0 +1,822 @@ +--- +start_date: 2026-04-22 +mlflow_issue: https://github.com/mlflow/mlflow/issues/22833 +rfc_pr: https://github.com/mlflow/rfcs/pull/10 +--- + +# RFC: Skill Registry + +| Author(s) | Bill Murdock (Red Hat) | +| :--------------------- | :-- | +| **Date Last Modified** | 2026-06-12 | +| **AI Assistant(s)** | Claude Code (Opus 4.6) | + +# Summary + +Add a Skill Registry to MLflow: a governed, metadata-first registry for +AI agent capabilities. The registry stores metadata and typed source +pointers (to Git repos, OCI registries, ZIP archives, etc.). It can +also store content directly via MLflow artifact storage, but the +primary design is metadata-first. 
It provides enterprise governance +on top of existing distribution mechanisms: lifecycle management, +usage analytics via traces, and federated discovery across sources. + +The registry manages four entity types under the `mlflow.genai.skills` +SDK namespace (CLI: `mlflow skills`), each with full lifecycle +(versioning, aliases, tags, status): + +- **Skills**: a directory containing a SKILL.md entry point plus + supporting files (scripts, templates, reference material) +- **Subagents**: sub-agent definitions that can be invoked by a + parent agent +- **Hooks**: event-triggered actions (harness-specific) +- **Skill bundles**: versioned, governed units that group related + capabilities and map to the "plugin" concept in agent harnesses. + Bundles can also reference MCP servers from the MCP Server Registry + (RFC-0004) via cross-registry membership. + +`mlflow skills pull` provides a harness-agnostic way to fetch +registered content from its source. Harness-specific installation +(manifest generation, directory placement) is covered in a companion +RFC (RFC-0009). + +# User journeys + +These journeys illustrate the end-to-end workflows that the Skill +Registry enables. Each shows both CLI and UI paths. + +## Register a skill bundle + +1. 
Register individual capability versions pointing to their sources: + ```bash + mlflow skills register --name code-review --version 1.0.0 \ + --source https://github.com/acme/agent-skills/tree/v1.0.0/code-review + mlflow subagents register --name security-auditor --version 1.0.0 \ + --source https://github.com/acme/agent-skills/tree/v1.0.0/security-auditor + mlflow hooks register --name pre-commit-scan --version 1.0.0 \ + --source https://github.com/acme/agent-skills/tree/v1.0.0/pre-commit-scan + ``` + **SDK equivalent:** + ```python + import mlflow + + mlflow.genai.skills.register_skill( + name="code-review", + version="1.0.0", + description="Reviews pull requests for correctness, style, and security", + source_type="git", + source="https://github.com/acme/agent-skills/tree/v1.0.0/code-review", + ) + ``` + **UI path:** Navigate to the Skills page, click "Register Skill," + fill in name, version, source type, and source URL, then submit. + Repeat for subagents and hooks using the type selector. +2. Create a skill bundle version that pins these members: + ```bash + mlflow skill-bundles create-version --name pr-workflow --version 1.0.0 \ + --skill code-review:1.0.0 \ + --subagent security-auditor:1.0.0 \ + --hook pre-commit-scan:1.0.0 + ``` + **UI path:** Navigate to the Bundles tab, click "Create Bundle," + add members by searching and selecting from registered capabilities. +3. Transition the bundle version from draft to active: + ```bash + mlflow skill-bundles update-version --name pr-workflow \ + --version 1.0.0 --status active + ``` + **UI path:** Open the bundle version detail page, use the status + dropdown to change from "draft" to "active." +4. Set an alias for stable downstream resolution: + ```bash + mlflow skill-bundles set-alias --name pr-workflow \ + --alias production --version 1.0.0 + ``` + **UI path:** In the bundle detail page, click "Add Alias" and map + `production` to version `1.0.0`. + +## Discover a skill for a specific purpose + +1. 
Search the registry by keyword: + ```bash + mlflow skills search --filter "name LIKE '%review%'" --status active + ``` + **UI path:** Navigate to the Skills page, type "review" in the + search bar, and filter by status "active" using the dropdown. +2. Browse the returned list of matching skills with names, + descriptions, and latest versions. + **UI path:** Scan the card-based list view. Each card shows the + skill name, description, latest version badge, status badge, and + tags. +3. Get details on a promising result: + ```bash + mlflow skills get --name code-review + ``` + **UI path:** Click a card to open the detail view with metadata, + version history, aliases, tags, and bundle memberships. +4. Inspect a specific version's source and metadata: + ```bash + mlflow skills get-version --name code-review --version 1.0.0 + ``` +5. Pull the skill locally to read the content and decide whether + it fits: + ```bash + mlflow skills pull --name code-review --version 1.0.0 \ + --destination ./review-skill + ``` + +## Install a skill bundle, run the agent, browse traces + +1. Install the bundle for a harness + ([RFC-0009](../0009-skill-harness-integration/0009-skill-harness-integration.md)): + ```bash + mlflow skills install-bundle --name pr-workflow --alias production \ + --harness claude-code --lock + ``` + This pulls the bundle content, generates harness-specific + manifests, writes a lock file, and writes a trace manifest + (`mlflow-skills-manifest.json`) with installed registry + coordinates. Harnesses that support hooks + (e.g., Claude Code) can use this manifest to automatically + create SKILL spans when a registered skill is invoked, with no + manual `skill_context()` calls needed (see + [Trace integration](#trace-integration)). +2. Run the agent. The harness loads the installed plugin and invokes + skills during a conversation. +3. Open the MLflow UI and navigate to the Traces page. Click the + "Skills" tab to filter for traces with SKILL spans. +4. 
Find the trace for the agent run. Skill invocations appear as + SKILL spans in the trace tree, annotated with registry coordinates + (skill name, version, registry). +5. Click a SKILL span to see which registered skill version was used + and how long it took. Click the skill name link to navigate to the + skill's registry detail page. + +## Evaluate two bundle versions with LLM judges + +MLflow's +[LLM judges](https://mlflow.org/docs/latest/genai/eval-monitor/scorers/) +can autonomously explore execution traces via MCP tools. Because +skill invocations produce traced SKILL spans, LLM judges can +analyze how skills were used during an agent run. + +1. Register a new version of the bundle with updated members: + ```bash + mlflow skills register --name code-review --version 2.0.0 \ + --source https://github.com/acme/agent-skills/tree/v2.0.0/code-review + mlflow skill-bundles create-version --name pr-workflow --version 2.0.0 \ + --skill code-review:2.0.0 \ + --subagent security-auditor:1.0.0 \ + --hook pre-commit-scan:1.0.0 + ``` +2. Install v1.0.0 and run it on a set of test inputs. Traces are + recorded in MLflow under experiment A. +3. Install v2.0.0 and run it on the same test inputs. Traces are + recorded under experiment B. +4. Use `mlflow.genai.evaluate()` with a `make_judge` scorer that + uses the `{{ trace }}` template variable to score both sets of + traces against quality criteria (correctness, helpfulness, safety). +5. Compare the evaluation results side by side in the MLflow UI to + determine whether v2.0.0 is an improvement. +6. If v2.0.0 is better, transition it to active and update the + production alias: + ```bash + mlflow skill-bundles update-version --name pr-workflow \ + --version 2.0.0 --status active + mlflow skill-bundles set-alias --name pr-workflow \ + --alias production --version 2.0.0 + ``` + +## CI pipeline for automated regression detection + +1. A CI job (e.g., GitHub Actions) triggers on pushes to the skill + source repo. +2. 
The job registers a new skill bundle version from the updated + source: + ```bash + mlflow skills register --name code-review --version 1.1.0 \ + --source https://github.com/acme/agent-skills/tree/v1.1.0/code-review + mlflow skill-bundles create-version --name pr-workflow --version 1.1.0 \ + --skill code-review:1.1.0 \ + --subagent security-auditor:1.0.0 \ + --hook pre-commit-scan:1.0.0 + ``` +3. The job installs the new bundle version and runs it against a + benchmark dataset, collecting traces in a dedicated MLflow + experiment. +4. The job runs + [LLM judge](https://mlflow.org/docs/latest/genai/eval-monitor/scorers/) + evaluation on the collected traces, producing scored results. +5. The job fetches the benchmark results from the previous production + version (stored as MLflow metrics or evaluation artifacts). +6. The job compares the new scores against the previous scores. If + any quality metric regresses beyond a configured threshold, it + sends an alert (Slack, email, or fails the CI check). +7. If no regression is detected, the job transitions the new version + to active and optionally updates the production alias. + +See [implementation-details.md: SDK and CLI code +examples](implementation-details.md#sdk-and-cli-code-examples) for +additional SDK examples including cross-registry bundles, OCI subpath +registration, and discovery/search operations. + +## Motivation + +### The problem + +AI agent capabilities (skills, sub-agents, MCP server configurations, +and hooks) are becoming a critical asset class in enterprise AI +platforms. As organizations adopt agentic AI, they accumulate these +capabilities across teams, repositories, and agent harnesses. + +A cross-harness portable format is emerging around these capabilities. +The registry is format-agnostic but is designed to interoperate with +the conventions gaining adoption across agent harnesses: + +- **SKILL.md**: a markdown file with structured instructions for the + agent. 
Supported by Claude Code, Codex CLI, Cursor, GitHub Copilot, + OpenClaw, Kilo Code, and Antigravity. This is the most broadly + portable format for skills and subagents. +- **MCP server configs**: JSON configuration for Model Context + Protocol servers. MCP is a universal tool extension protocol + supported by nearly all major harnesses. +- **Hooks**: event-triggered shell commands or scripts. Less + standardized; Claude Code and Codex CLI have the most mature hook + support. +- **Plugin bundles**: packaging of skills, subagents, MCP configs, and + hooks into a single installable unit. Formats range from + harness-specific (Claude Code and Codex CLI `plugin.json` manifests) + to cross-harness (e.g., Lola's "AI Context Modules," which use + directory auto-discovery to target multiple harnesses from a single + package). + +Today, these capabilities are managed as ad-hoc files in Git +repositories. This works well for individual developers and small +teams. GitHub provides versioning, collaboration, and access control. + +However, enterprises face governance challenges that Git alone does not +address: + +1. **No status lifecycle.** Git has no concept of "this version is + approved for production use" vs. "this is deprecated." Teams resort + to branch naming conventions or external tracking to manage + promotion. + +2. **Fragmented discovery.** Capabilities may live in multiple Git + repos, OCI registries, or other distribution systems. There is no + single discovery layer across all of these. + +3. **No cross-type bundling.** Agent harnesses like Claude Code and + Codex CLI support plugins that bundle skills, subagents, MCP + servers, and hooks together. But there is no agent-neutral way to + represent these bundles for governance and discovery. + +4. 
**No trace-to-skill linkage.** MLflow already traces agent + conversations (Claude Code via `mlflow autolog claude`, SDK + applications via framework autologgers such as + `mlflow.langchain.autolog()` and `mlflow.anthropic.autolog()`). + These traces capture LLM calls, tool use, and token consumption, + but there is no way to know which governed, versioned skill was + active during any part of a trace. This RFC introduces + `mlflow.skill_context()` for manual instrumentation and an + install-time trace manifest for automatic instrumentation via + harness hooks (see [Trace integration](#trace-integration)). + Without a registry, organizations cannot answer questions like + "which skill versions are most used?" or "show me all traces where + the deprecated code-review v1.0 was loaded." + +5. **No pull mechanism.** Once a user discovers a capability in the + registry, there is no standard way to fetch its content from the + source system. Users must manually copy source pointers and run + harness-specific install steps. + +### Out of scope + +- **Artifact storage as the only path.** The registry supports both + external source pointers (Git, OCI, ZIP) and direct artifact storage + (`source_type="mlflow"`). However, it is not an artifact-only store; + the metadata-first, source-pointer model remains the primary design. +- **Authoring or development tools.** The registry manages published + capabilities, not the process of writing them. +- **Format specification.** The registry is format-agnostic. It does + not define what a skill must contain or how it must be structured. + The SKILL.md convention is an ecosystem convention, not a registry + requirement. +- **Agent routing or orchestration.** The registry is a metadata and + governance layer. It does not decide which skills to invoke at + runtime or how agents compose capabilities. 
+- **MCP server hosting.** MCP server deployment and runtime management + are covered by the MCP Server Registry (RFC-0004) and the MCP + Gateway. +- **Prompts.** MLflow already has a Prompt Registry for versioned + prompt template management. Skills and prompts serve different + purposes: a skill provides instructions and tools for agent + autonomy, while a prompt provides templated text for structured + generation. Skills may reference prompts, but they belong in + separate registries because they have different lifecycles and + different audiences (harness-based agents vs. custom agentic code). + The two registries are complementary but separate. + +## Detailed design + +### Entities and data model + +```mermaid +erDiagram +Skill ||--o{ SkillVersion : "has versions" +Skill ||--o{ SkillTag : "has tags" +Skill ||--o{ SkillAlias : "has aliases" +Subagent ||--o{ SubagentVersion : "has versions" +Subagent ||--o{ SubagentTag : "has tags" +Subagent ||--o{ SubagentAlias : "has aliases" +Hook ||--o{ HookVersion : "has versions" +Hook ||--o{ HookTag : "has tags" +Hook ||--o{ HookAlias : "has aliases" +SkillBundle ||--o{ SkillBundleVersion : "has versions" +SkillBundle ||--o{ SkillBundleTag : "has tags" +SkillBundle ||--o{ SkillBundleAlias : "has aliases" +SkillBundleVersion ||--o{ SkillBundleVersionMember : "has members" +SkillBundleVersionMember }o--|| SkillVersion : "skill member" +SkillBundleVersionMember }o--|| SubagentVersion : "subagent member" +SkillBundleVersionMember }o--|| HookVersion : "hook member" +SkillBundleVersionMember }o--|| MCPServerVersion : "mcp_server member" + +SkillBundleVersionMember { + string member_type + string member_name + string member_version + string member_subpath +} +``` + +`SkillBundleVersionMember` is the membership row for an entry in a +bundle version. The member target is determined by `member_type`; MCP +server references may be resolved against the MCP Server Registry +(RFC-0004) rather than enforced as local database foreign keys.
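The membership row described above can be pictured as a small immutable record. The following is only an illustrative sketch: the class name `BundleMember`, the `MEMBER_TYPES` set, and the `is_cross_registry` helper are hypothetical and are not the dataclasses defined in implementation-details.md. Only the four field names come from the diagram.

```python
from dataclasses import dataclass
from typing import Optional

# Member types taken from the ER diagram labels; the set itself is hypothetical.
MEMBER_TYPES = {"skill", "subagent", "hook", "mcp_server"}


@dataclass(frozen=True)
class BundleMember:
    """Hypothetical stand-in for one SkillBundleVersionMember row."""

    member_type: str
    member_name: str
    member_version: str
    member_subpath: Optional[str] = None

    def __post_init__(self):
        # Reject member types that the diagram does not list.
        if self.member_type not in MEMBER_TYPES:
            raise ValueError(f"unknown member_type: {self.member_type!r}")

    @property
    def is_cross_registry(self) -> bool:
        # MCP server members are resolved against the MCP Server Registry
        # instead of a local foreign key.
        return self.member_type == "mcp_server"
```

Freezing the record mirrors the idea that a bundle version is a point-in-time snapshot: changing membership means creating a new bundle version, not mutating a row.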
+ +#### Skill + +A skill is a directory containing a SKILL.md entry point plus +supporting files (scripts, templates, reference material). The +`Skill` entity is the logical governed asset, scoped to a workspace. +Key fields include `name` (unique within workspace), `display_name`, +`status` (read-only, derived from the parent-resolved version), +`latest_version` (read-only, highest active semver), and `aliases`. + +**MCP servers.** MCP servers are registered in the MCP Server Registry +(RFC-0004), not in this registry. Skill bundles can reference MCP +registry entries in their `mcp_servers` list. MCP configs embedded in +bundle-level artifacts (e.g., `.mcp.json` inside an OCI image) are +treated as artifact content discovered by harness adapters during +installation (RFC-0009), not as separately registered entities. + +#### SkillVersion + +A versioned record containing a typed source pointer (`git`, `oci`, +`zip`, or `mlflow`), status, and tags. The `(name, version)` pair is +unique within a workspace. Source pointers and version strings are +immutable after creation; to point to different content, register a +new version. The optional `subpath` field identifies content within a +shared artifact (used with OCI and ZIP). The optional `content_digest` +field enables integrity verification. + +#### Subagent and Hook + +`Subagent` (a sub-agent definition invocable by a parent agent) and +`Hook` (an event-triggered action, e.g., a shell command before a +commit) follow the same structure as `Skill`: top-level governed +assets with the same fields, versions, tags, aliases, and lifecycle. +`SubagentVersion` and `HookVersion` follow the same structure as +`SkillVersion`. + +All registry entity types share the same version, tag, alias, and +lifecycle patterns. The store interface, REST API, and SDK expose +parallel operations for each type. 
+ +#### SkillBundle + +A skill bundle groups related capabilities (skills, subagents, hooks, +and MCP servers) into a governed unit that maps to the "plugin" +concept in agent harnesses. Follows the same top-level pattern as +Skill: versions, tags, aliases, and derived status. + +**Why bundles instead of tags?** Tags could express "these skills +are related" but cannot provide versioned membership snapshots +(reproducible point-in-time combinations), cross-registry references +(MCP servers from RFC-0004), bundle-level source pointers (a single +OCI image), independent lifecycle (deprecate a bundle without +deprecating its members), or direct mapping to the harness plugin +concept. + +#### SkillBundleVersion + +A versioned snapshot of a bundle's membership. A bundle version is +one of two kinds: + +- **Assembled:** captures `(name, version)` tuples for skills, + subagents, hooks, and MCP servers. Skill, subagent, and hook + versions have their own sources. `pull` fetches members individually. +- **Monolithic:** has its own source pointer (e.g., a single OCI + image containing a complete plugin) and member references. Skill, + subagent, and hook versions may omit their own sources when their + content lives inside the bundle artifact. `pull` fetches the bundle + artifact as a unit. + +A bundle version cannot have both a bundle-level source and +skill/subagent/hook member versions with their own sources. This avoids +confusion about which source is authoritative for registry-managed +capability content. + +Dataclass definitions, field tables, source type details, and +cross-registry reference handling for all entity types are in +[implementation-details.md](implementation-details.md#skill-entity). + +#### Aliases and tags + +All entity types use the same alias pattern: a frozen `(name, alias, +version)` tuple mapping a stable name (e.g., `production`) to a +specific version string. Tags are `(key, value)` pairs at both the +entity level and version level. 
Subagent, Hook, and SkillBundle +follow the same patterns. + +### Status and lifecycle + +This lifecycle aligns with the MCP Server Registry (RFC-0004). + +#### Per-version status + +Each `SkillVersion`, `SubagentVersion`, `HookVersion`, and +`SkillBundleVersion` has an independent status: + +| State | Meaning | Downstream surfacing | +|---|---|---| +| `draft` | Registered but not yet ready for downstream use | Not surfaced to consumers | +| `active` | Ready for downstream use | Surfaced to discovery, traces, consumers | +| `deprecated` | Still functional but no longer recommended | Surfaced with deprecation signal | +| `deleted` | Soft-deleted; preserved internally for history, no longer active | Not surfaced by normal get/search/list APIs | + +New versions default to `draft` upon creation. + +Allowed transitions: + +| From | To | +|---|---| +| `draft` | `active`, `deleted` | +| `active` | `draft`, `deprecated` | +| `deprecated` | `active`, `deleted` | + +`draft` allows a version to be registered and reviewed before being +made visible to consumers. `active` can +return to `draft` (unpublish) for cases where a version needs to be +pulled back for further review. `deprecated` can return to `active` +(re-activate) for cases where a deprecation was premature. `deleted` +is terminal. + +Normal version delete operations (`delete_skill_version`, +`delete_subagent_version`, `delete_hook_version`, and +`delete_skill_bundle_version`) transition the version to `deleted` +rather than physically removing the version row, subject to the allowed +lifecycle transitions above. Active versions must first be unpublished +or deprecated before they can be deleted. As in the Model Registry, +normal get/search/latest resolution excludes deleted versions, while +internal audit/provenance paths may still retain enough metadata to +explain historical traces and bundle snapshots. Deleting a version also +removes aliases that point to that version. 
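The allowed-transitions table above amounts to a small state machine. A minimal sketch of how a store might check a requested transition follows; the names `VersionStatus`, `ALLOWED_TRANSITIONS`, and `transition` are hypothetical and are not part of the proposed store interface.

```python
from enum import Enum


class VersionStatus(str, Enum):
    DRAFT = "draft"
    ACTIVE = "active"
    DEPRECATED = "deprecated"
    DELETED = "deleted"


# Copied from the allowed-transitions table; `deleted` is terminal.
ALLOWED_TRANSITIONS = {
    VersionStatus.DRAFT: {VersionStatus.ACTIVE, VersionStatus.DELETED},
    VersionStatus.ACTIVE: {VersionStatus.DRAFT, VersionStatus.DEPRECATED},
    VersionStatus.DEPRECATED: {VersionStatus.ACTIVE, VersionStatus.DELETED},
    VersionStatus.DELETED: set(),
}


def transition(current: VersionStatus, target: VersionStatus) -> VersionStatus:
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(
            f"illegal transition: {current.value} -> {target.value}"
        )
    return target
```

Under this sketch, deleting an `active` version directly raises an error, which matches the rule that active versions must first be unpublished or deprecated.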
+ +Top-level entity delete operations (`delete_skill`, `delete_subagent`, +`delete_hook`, and `delete_skill_bundle`) are administrative hard +deletes that remove the parent and cascade to child rows, following the +Model Registry registered-model pattern. These operations are subject +to referential-integrity checks: for example, a skill version referenced +by a bundle version cannot be physically removed until the referencing +bundle version is removed or otherwise no longer references it. Normal +retirement should use version deprecation or version soft delete rather +than top-level hard delete. + +#### Entity-level status + +`Skill.status`, `Subagent.status`, `Hook.status`, and +`SkillBundle.status` are read-only. They are derived from the +parent-resolved version: the highest semantic version among `active` +versions if one exists, otherwise the highest semantic version among +non-`deleted` non-`active` versions. Deleted versions never drive +parent status. This follows the MCP Server Registry pattern +(RFC-0004). + +#### `latest_version` resolution + +Version strings must follow [semantic versioning](https://semver.org/) +(e.g., `1.0.0`, `2.1.0-beta.1`). `get_latest_skill_version(name)` +returns the highest semantic version among `active` versions. +Prerelease identifiers participate in semantic-version ordering, +while build metadata does not. Deprecated versions do not participate +in `latest` resolution. `latest_version` is a read-only computed +field on the parent entity (not manually pinnable); aliases cover the +use case of pointing a stable name (e.g., `production`) at a specific +version. Parent status uses the separate fallback rule above, so an +entity can still derive `draft` or `deprecated` status when it has no +active versions. 
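To make the `latest_version` rule concrete, here is a rough, dependency-free sketch of semantic-version ordering restricted to `active` versions. The helpers `semver_key` and `latest_active_version` are hypothetical names, the regular expression validates only loosely, and a real implementation would more likely rely on an established semver library.

```python
import re

# major.minor.patch, optional -prerelease, optional +build (ignored for ordering)
_SEMVER = re.compile(
    r"^(\d+)\.(\d+)\.(\d+)"
    r"(?:-([0-9A-Za-z.-]+))?"
    r"(?:\+[0-9A-Za-z.-]+)?$"
)


def semver_key(version):
    """Build a sort key that follows semver precedence rules."""
    m = _SEMVER.match(version)
    if m is None:
        raise ValueError(f"not a semantic version: {version!r}")
    major, minor, patch = (int(m.group(i)) for i in (1, 2, 3))
    pre = m.group(4)
    if pre is None:
        # A release sorts above any prerelease of the same core version.
        return (major, minor, patch, 1, ())
    # Numeric identifiers compare as integers and sort below alphanumeric ones.
    ids = tuple(
        (0, int(p), "") if p.isdigit() else (1, 0, p)
        for p in pre.split(".")
    )
    return (major, minor, patch, 0, ids)


def latest_active_version(versions):
    """`versions` is an iterable of (version_string, status) pairs."""
    active = [v for v, status in versions if status == "active"]
    return max(active, key=semver_key) if active else None
```

Returning `None` when nothing is active reflects the separate fallback rule: parent status can still be derived from non-active versions even though `latest_version` has no value.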
+ +The alias name `latest` is reserved: `set_skill_alias(..., +alias="latest", ...)` is rejected, while +`get_skill_version_by_alias(..., alias="latest")` is treated as a +convenience alias for `get_latest_skill_version(...)`. + +The same pattern applies to `Subagent`, `Hook`, `SkillBundle`, and +their corresponding `get_latest_*_version` methods. This aligns with +the MCP Server Registry (RFC-0004). + +### Implementation details + +Database schema (table definitions), store interface (method +signatures), SDK convenience functions, REST API endpoints, +pagination/filtering, and Python SDK/CLI mapping are in +[implementation-details.md](implementation-details.md). + +### Pull semantics + +`pull` is a client-side operation. The SDK reads the source pointer +from the registry via the REST API, then fetches content directly +from the source system to the caller's local filesystem. The registry +server is not involved in content transfer. `pull` is +source-type-aware: + +| Source type | Pull behavior | +|---|---| +| `git` | `git clone` or `git archive` of the referenced path/ref | +| `oci` | `oci pull` of the referenced image/tag; if `subpath` is set, extract only that path from the image | +| `zip` | HTTP download and extract; if `subpath` is set, extract only that path from the archive | +| `mlflow` | Download the version's MLflow-managed artifact directory tree using the same artifact APIs and credentials as other MLflow artifact operations | + +**Single skill pull.** Fetches the content at the skill version's +`source` to the destination directory. If `subpath` is set, only the +content at that path within the artifact is extracted. Returns an +error if the skill version has no `source`; source-less embedded skill +versions are pullable only through their containing monolithic bundle. + +**Skill bundle pull.** For monolithic bundles, fetch the bundle +artifact as a single unit to the destination directory. 
For assembled +bundles, pull each member individually from its own `source` to a +subdirectory of the destination, named by the member's name. If a +skill, subagent, or hook member in an assembled bundle has no `source`, +the pull fails rather than producing a partial local bundle. + +If `content_digest` is set, `pull` verifies the fetched content +matches the digest and returns an error on mismatch. This +verification is client-side. The server stores the digest as metadata +but does not re-verify artifact store contents on each request. + +`pull` is harness-agnostic. It downloads content but does not generate +harness-specific manifests or place files in harness-specific +directories. Harness-specific installation is covered in RFC-0009. + +See [implementation-details.md: Pull semantics +details](implementation-details.md#pull-semantics-details) for source +authentication mechanisms per source type, source availability error +handling, and credential management. + +### Workspace scoping + +All skill registry operations are workspace-scoped, following MLflow's +existing workspace-aware registry patterns (model registry, MCP +registry). Cross-workspace sharing is out of scope for this RFC and +should be solved at the platform level across all MLflow registries. + +### Permissions + +The skill registry integrates with MLflow's existing permission +framework (READ / EDIT / MANAGE), applied at the `Skill`, `Subagent`, +`Hook`, and `SkillBundle` level. Versions, tags, aliases, and +memberships inherit permissions from their parent entity. 
+ +| Permission | Operations | +|---|---| +| `READ` | Search entities, get versions, resolve aliases, list tags and memberships | +| `EDIT` | Create entities, create versions, set tags, update description, status transitions (activate, deprecate), set aliases | +| `MANAGE` | Delete aliases, delete tags, soft-delete versions, hard-delete entities, manage permissions | + +This follows the same pattern as the model registry and MCP Server +Registry (RFC-0004): status transitions and alias setting are gated +by `EDIT` (`can_update`), while destructive operations (deletes) are +gated by `MANAGE` (`can_delete`). +- **Creator gets MANAGE.** When a user creates an entity (skill, + subagent, hook, or bundle), they automatically receive MANAGE + permission, following the MLflow model registry pattern. + +### UI + +The Skills page lives under the GenAI workflow in the MLflow sidebar, +alongside Experiments, Prompts, MCP Servers, and AI Gateway. + +#### List view + +The list view shows skills, subagents, hooks, and bundles together +using a card-based layout consistent with the MCP Server Registry +(RFC-0004). Each card displays: + +- Entity type badge (skill, subagent, hook, or bundle) +- Name and optional display name +- Description (truncated to 2-3 lines) +- Latest version badge (e.g., "v1.0.0") +- Status badge with color coding: draft (gray), active (green), + deprecated (amber) +- Source type indicator (Git, OCI, ZIP, MLflow) +- Tag chips + +The filter bar provides: + +- **Type dropdown**: skill, subagent, hook, bundle (multi-select) +- **Status dropdown**: draft, active, deprecated +- **Source type dropdown**: git, oci, zip, mlflow +- **Search**: by name or description + +A "Register Skill" button (with a dropdown for subagent, hook, or +bundle) initiates registration. 
+ +#### Detail view: skills, subagents, hooks + +The detail view for an individual capability shows: + +- **Metadata section**: name, display name, description, status, + workspace, source type, created by, created at, last updated +- **Version table**: Version, Registered at, Status, Source type, + Created by, Description. Clicking a version row navigates to the + version detail page showing source, subpath, content digest, and + tags. +- **Aliases**: alias name to version mapping (e.g., + `production -> 1.0.0`) +- **Tags**: key-value list with edit controls +- **Bundle memberships**: list of bundles that include this capability, + with links to each bundle's detail page +- **Related traces**: link to the GenAI Traces page filtered by this + skill's name, showing recent SKILL spans that reference this + capability + +#### Detail view: bundles + +The bundle detail view shows: + +- **Metadata section** (as above) +- **Members table** for the selected bundle version, grouped by type: + Type (skill/subagent/hook/mcp_server), Name, Pinned version, Source + type, Status. Each row links to the member's detail page. + Cross-registry members (MCP servers) link to the MCP Server Registry + detail page. +- **Version history table**: Version, Registered at, Status, Created + by, Member count +- **Aliases and tags** (as above) + +#### Trace integration display + +The GenAI Traces page includes a "Skills" tab alongside the existing +"Prompts" tab, showing SKILL spans for each trace. The trace detail +view displays SKILL spans with registry coordinates (skill name, +version, workspace) and links to the skill's registry detail page. +Skill version detail pages surface related traces using the same +association data. 
+ +### Trace integration + +MLflow already traces agent conversations across multiple frameworks: +Claude Code (via `mlflow autolog claude`), SDK applications (via +framework autologgers such as `mlflow.langchain.autolog()` and +`mlflow.anthropic.autolog()`), and others. These +traces capture LLM calls, tool use, and timing as a tree of spans. +The skill registry closes the observability loop by letting agent +developers indicate which registered skill is active during each +part of a trace. + +#### `mlflow.skill_context()` context manager + +The primary instrumentation API is a context manager that creates a +span of type `SKILL` and attaches registry coordinates as span +attributes: + +```python +import mlflow + +# `llm` is a placeholder for any client instrumented by an autologger. +with mlflow.skill_context( + name="code-review", version="1.0.0" +) as span: + # All spans created inside this block (including those from + # autologgers) become children of this SKILL span. + result = llm.chat([{"role": "user", "content": "Review this code..."}]) +``` + +The context manager creates a span with `mlflow.skill.name`, +`mlflow.skill.version`, and `mlflow.skill.workspace` +attributes that link the span back to a specific skill version in +the registry. See [implementation-details.md: skill_context() span +attributes](implementation-details.md#skill_context-span-attributes) +for the full attribute table. + +#### Scope of skill_context() + +`skill_context()` applies to skills only; it does not apply to +subagents, hooks, or bundles. Bundle-level analytics are derived by +aggregating over traces of individual member skills. + +#### Workspace resolution + +When `skill_context()` is called, the workspace is resolved from +the `mlflow-skills-manifest.json` written by the install commands +(`mlflow skills install` / `install-bundle`), which are defined in +[RFC-0009](../0009-skill-harness-integration/0009-skill-harness-integration.md). +The manifest always contains the workspace for each installed skill or +other bundle entry.
+For SDK users calling `skill_context()` directly without a manifest,
+the workspace defaults to the current MLflow tracking URI's workspace
+context, consistent with other MLflow operations.
+
+#### Skill stacks via nesting
+
+Skills can invoke other skills. Nesting `skill_context()` calls
+produces a skill stack in the trace tree:
+
+```
++-- Span: "code-review" (type: SKILL, version: 1.0.0)
+|   +-- Span: ChatCompletion (type: LLM)
+|   +-- Span: "style-check" (type: SKILL, version: 2.0.0)
+|   |   +-- Span: ChatCompletion (type: LLM)
+|   +-- Span: ChatCompletion (type: LLM)
+```
+
+Walking up the ancestor chain and collecting SKILL spans reconstructs
+the skill stack for any span.
+
+#### What this enables
+
+Skill-annotated traces enable adoption tracking (which versions are
+most used), deprecation impact analysis (which traces used a
+deprecated version), per-skill cost attribution (aggregate token
+usage and latency per SKILL span), and regression detection (compare
+trace outcomes across skill versions).
+
+#### Autologger compatibility
+
+Because `skill_context()` creates a standard MLflow span, it works
+with existing autologgers without modification. When an autologger
+(Claude, LangChain, OpenAI, etc.) creates a span inside a
+`skill_context()` block, that span automatically becomes a child of
+the SKILL span. No changes to the autologgers are needed.
+
+For harness-specific integration (e.g., Claude Code automatically
+wrapping skill loads in `skill_context()` spans), see RFC-0009.
+
+#### Registry validation
+
+`skill_context()` does not validate that the named skill exists in
+the registry at call time. Validating on every invocation would add
+latency and create a hard dependency on registry availability. The
+trace records the `{workspace, name, version}` coordinates
+regardless; the MLflow UI performs a best-effort lookup when
+displaying traces and shows a "not found in registry" indicator if
+the coordinates do not resolve.
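The ancestor walk described under "Skill stacks via nesting" can be sketched in a few lines. This is only an illustration under stated assumptions: `SpanRecord` is a hypothetical, simplified span shape invented for this example (it is not MLflow's span class), and `skill_stack` is a hypothetical helper name. The sample data mirrors the trace tree shown in that section.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SpanRecord:
    """Hypothetical, simplified span; not MLflow's actual span class."""

    span_id: str
    parent_id: str | None
    span_type: str
    name: str
    version: str | None = None


def skill_stack(
    span_id: str, spans: dict[str, SpanRecord]
) -> list[tuple[str, str | None]]:
    """Collect (name, version) of SKILL spans from the given span up to
    the root, then return them outermost first. A span that is itself a
    SKILL span is included in its own stack."""
    stack: list[tuple[str, str | None]] = []
    current = spans.get(span_id)
    while current is not None:
        if current.span_type == "SKILL":
            stack.append((current.name, current.version))
        # Step to the parent; stop at the root (no parent_id).
        current = spans.get(current.parent_id) if current.parent_id else None
    stack.reverse()
    return stack


# Sample data mirroring the trace tree in "Skill stacks via nesting".
spans = {
    s.span_id: s
    for s in [
        SpanRecord("1", None, "SKILL", "code-review", "1.0.0"),
        SpanRecord("2", "1", "LLM", "ChatCompletion"),
        SpanRecord("3", "1", "SKILL", "style-check", "2.0.0"),
        SpanRecord("4", "3", "LLM", "ChatCompletion"),
    ]
}
inner_stack = skill_stack("4", spans)  # LLM call made inside style-check
outer_stack = skill_stack("2", spans)  # LLM call made directly by code-review
```

The same walk could feed per-skill cost attribution: each LLM span's token usage would be credited to the innermost entry of its stack, or to every entry, depending on the reporting policy chosen.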
+ +#### Relationship to MCP trace linking + +The MCP Registry (RFC-0004) uses after-the-fact, trace-level +association (`link_mcp_server_versions_to_trace()`). Skills use +span-level, inline annotation because skills are ambient (active +during inference) and can nest. Both approaches produce trace +metadata that the MLflow UI can display together. + +## Drawbacks + +- **Source pointer validity.** For external sources (git, oci, zip), + the registry cannot guarantee pointers remain valid. The optional + `content_digest` field mitigates content tampering but does not + prevent link rot. Users who need self-contained storage can use + `source_type="mlflow"` to store content directly in MLflow artifact + storage. + +# Alternatives + +## Store skill artifacts only in MLflow (no source pointers) + +Make MLflow artifact storage the sole storage mechanism, with no +support for external source pointers. + +Rejected because most organizations already manage skills in Git or +OCI. Source pointers federate across existing distribution mechanisms +without requiring migration. The current design supports both: +`source_type="mlflow"` for direct artifact storage alongside +`source_type="git"`, `"oci"`, and `"zip"` for external sources. + +## Use Git alone (no registry) + +Continue using Git repositories as the sole mechanism for skill +management. + +This is sufficient for individual developers and small teams. This RFC +proposes a governance layer on top of Git for enterprises that need +status lifecycle and federated discovery. +The two approaches are complementary. + +# Adoption strategy + +New feature, not a breaking change. Phased rollout: + +- **Phase 1 (this RFC):** Registry entities (Skill, Subagent, Hook, SkillBundle), store, REST API, SDK, CLI, UI, `mlflow skills pull`, and `mlflow.skill_context()` for trace integration. +- **Phase 2 (RFC-0009):** Harness-specific `mlflow skills install` / `install-bundle` for Claude Code, Codex CLI, and Cursor. 
Automatic `skill_context()` wrapping in harness-specific autologgers. +- **Phase 3 (follow-up):** Usage analytics dashboards, install count tracking, cross-workspace export/import (following cross-registry patterns), and shared base extraction with the MCP registry. diff --git a/rfcs/0008-skill-registry/implementation-details.md b/rfcs/0008-skill-registry/implementation-details.md new file mode 100644 index 0000000..33c3e23 --- /dev/null +++ b/rfcs/0008-skill-registry/implementation-details.md @@ -0,0 +1,1239 @@ +# RFC-0008: Skill Registry Implementation Details + +This document contains implementation-level specifications for +RFC-0008 (Skill Registry). It covers database schema, store interface +method signatures, SDK convenience functions, REST API endpoints, +pagination/filtering, and the Python SDK/CLI mapping. These details +support implementers; the main RFC covers the design rationale. + +## Database schema + +Tables are created via a single Alembic migration. All tables are +workspace-scoped. 
+ +### `skills` + +| Column | Type | Notes | +|--------|------|-------| +| `workspace` | `String(63)` | PK, default `'default'` | +| `name` | `String(256)` | PK | +| `display_name` | `String(256)` | mutable human-readable label | +| `description` | `String(5000)` | | +| `created_by` | `String(256)` | | +| `last_updated_by` | `String(256)` | | +| `creation_timestamp` | `BigInteger` | millis since epoch | +| `last_updated_timestamp` | `BigInteger` | millis since epoch | + +### `skill_versions` + +| Column | Type | Notes | +|--------|------|-------| +| `workspace` | `String(63)` | PK, FK | +| `name` | `String(256)` | PK, FK | +| `version` | `String(256)` | PK, valid semantic version | +| `version_major` | `Integer` | extracted from validated semantic version | +| `version_minor` | `Integer` | extracted from validated semantic version | +| `version_patch` | `Integer` | extracted from validated semantic version | +| `display_name` | `String(256)` | mutable human-readable label | +| `source_type` | `String(20)` | nullable; `git`, `oci`, `zip`, etc. | +| `source` | `String(2048)` | nullable pointer to skill content | +| `subpath` | `String(2048)` | nullable; path within the artifact | +| `content_digest` | `String(512)` | optional integrity digest | +| `status` | `String(20)` | default `'draft'` | +| `created_by` | `String(256)` | | +| `last_updated_by` | `String(256)` | | +| `creation_timestamp` | `BigInteger` | millis since epoch | +| `last_updated_timestamp` | `BigInteger` | millis since epoch | + +FK: `(workspace, name)` references `skills`, CASCADE delete. This +supports administrative hard deletion of the parent `Skill`; normal +version deletion is a status transition to `deleted` and does not +physically remove the version row. + +**Semantic version ordering**: `version_major`, `version_minor`, and +`version_patch` are materialized from the validated semantic version +string at write time. 
`get_latest_skill_version` filters to `active` +rows, orders by these numeric fields descending, then applies full +semantic-version precedence in application code when candidates share +the same `major.minor.patch` and differ by prerelease identifiers. +Build metadata is ignored for precedence. + +**Index**: `ix_skill_versions_latest_lookup` on `(workspace, name, +status, version_major, version_minor, version_patch)` supports +latest-resolution lookups. + +### `skill_tags` + +| Column | Type | Notes | +|--------|------|-------| +| `workspace` | `String(63)` | PK, FK | +| `name` | `String(256)` | PK, FK | +| `key` | `String(256)` | PK | +| `value` | `Text` | | + +### `skill_version_tags` + +| Column | Type | Notes | +|--------|------|-------| +| `workspace` | `String(63)` | PK, FK | +| `name` | `String(256)` | PK, FK | +| `version` | `String(256)` | PK, FK | +| `key` | `String(256)` | PK | +| `value` | `Text` | | + +### `skill_aliases` + +| Column | Type | Notes | +|--------|------|-------| +| `workspace` | `String(63)` | PK, FK | +| `name` | `String(256)` | PK, FK | +| `alias` | `String(256)` | PK | +| `version` | `String(256)` | target version string | + +### Subagent tables + +The `subagents`, `subagent_versions`, `subagent_tags`, +`subagent_version_tags`, and `subagent_aliases` tables follow the +same structure as the corresponding skill tables above, including +`version_major/minor/patch` columns and the latest-lookup index. FK +relationships mirror the skill tables: `subagent_versions` references +`subagents` with CASCADE delete, etc. + +### Hook tables + +The `hooks`, `hook_versions`, `hook_tags`, `hook_version_tags`, +and `hook_aliases` tables follow the same structure as the +corresponding skill tables, including `version_major/minor/patch` +columns and the latest-lookup index. FK relationships mirror the +skill tables. 
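The semantic version ordering rule for `skill_versions` (mirrored by the subagent and hook version tables) can be illustrated with a small in-memory sketch. It assumes Semantic Versioning 2.0.0 precedence: numeric `major.minor.patch` first, then prerelease identifiers, with build metadata ignored. The function names (`parse`, `compare_versions`, `latest_active`) are hypothetical, the regular expression is a simplified pattern that does not reject leading zeros, and a real store would do the first-stage ordering in SQL using the materialized columns.

```python
from __future__ import annotations

import re
from functools import cmp_to_key

# Simplified SemVer pattern (assumption: leading zeros are not rejected).
_SEMVER = re.compile(
    r"^([0-9]+)\.([0-9]+)\.([0-9]+)"
    r"(?:-([0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?"
    r"(?:\+[0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*)?$"
)


def parse(version: str) -> tuple[int, int, int, list[str]]:
    """Split a version into numeric parts and prerelease identifiers.
    Build metadata (after '+') is matched but discarded."""
    match = _SEMVER.match(version)
    if match is None:
        raise ValueError(f"not a valid semantic version: {version!r}")
    major, minor, patch, pre = match.groups()
    return int(major), int(minor), int(patch), pre.split(".") if pre else []


def _compare_prerelease(a: list[str], b: list[str]) -> int:
    # A version without a prerelease outranks one with a prerelease.
    if not a or not b:
        return (not a) - (not b)
    for x, y in zip(a, b):
        if x == y:
            continue
        x_num, y_num = x.isdigit(), y.isdigit()
        if x_num and y_num:
            return -1 if int(x) < int(y) else 1
        if x_num != y_num:
            return -1 if x_num else 1  # numeric ranks below alphanumeric
        return -1 if x < y else 1  # ASCII order for alphanumeric identifiers
    # All shared identifiers equal: the longer list has higher precedence.
    return (len(a) > len(b)) - (len(a) < len(b))


def compare_versions(a: str, b: str) -> int:
    pa, pb = parse(a), parse(b)
    if pa[:3] != pb[:3]:
        return -1 if pa[:3] < pb[:3] else 1
    return _compare_prerelease(pa[3], pb[3])


def latest_active(rows: list[tuple[str, str]]) -> str | None:
    """rows are (version, status) pairs; only 'active' rows are candidates."""
    active = [version for version, status in rows if status == "active"]
    return max(active, key=cmp_to_key(compare_versions)) if active else None


rows = [
    ("1.0.0", "active"),
    ("1.2.0-beta.1", "active"),
    ("1.2.0-beta.2", "active"),
    ("2.0.0", "draft"),
]
resolved = latest_active(rows)
```

In this sample the `draft` version `2.0.0` is skipped, and the two `1.2.0` prereleases tie on the numeric columns, so the prerelease comparison decides between them.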
+ +### `skill_bundles` + +| Column | Type | Notes | +|--------|------|-------| +| `workspace` | `String(63)` | PK, default `'default'` | +| `name` | `String(256)` | PK | +| `display_name` | `String(256)` | mutable human-readable label | +| `description` | `String(5000)` | | +| `created_by` | `String(256)` | | +| `last_updated_by` | `String(256)` | | +| `creation_timestamp` | `BigInteger` | millis since epoch | +| `last_updated_timestamp` | `BigInteger` | millis since epoch | + +### `skill_bundle_versions` + +| Column | Type | Notes | +|--------|------|-------| +| `workspace` | `String(63)` | PK, FK | +| `name` | `String(256)` | PK, FK | +| `version` | `String(256)` | PK, valid semantic version | +| `version_major` | `Integer` | extracted from validated semantic version | +| `version_minor` | `Integer` | extracted from validated semantic version | +| `version_patch` | `Integer` | extracted from validated semantic version | +| `display_name` | `String(256)` | mutable human-readable label | +| `source_type` | `String(20)` | optional; `git`, `oci`, `zip`, etc. | +| `source` | `String(2048)` | optional pointer to bundle artifact | +| `subpath` | `String(2048)` | nullable; path within the artifact | +| `content_digest` | `String(512)` | optional integrity digest | +| `status` | `String(20)` | default `'draft'` | +| `created_by` | `String(256)` | | +| `last_updated_by` | `String(256)` | | +| `creation_timestamp` | `BigInteger` | millis since epoch | +| `last_updated_timestamp` | `BigInteger` | millis since epoch | + +FK: `(workspace, name)` references `skill_bundles`, CASCADE delete. +Semantic version ordering and index follow the same pattern as +`skill_versions`. 
+ +### `skill_bundle_version_members` + +| Column | Type | Notes | +|--------|------|-------| +| `workspace` | `String(63)` | PK | +| `bundle_name` | `String(256)` | PK, FK to `skill_bundle_versions` | +| `bundle_version` | `String(256)` | PK, FK to `skill_bundle_versions` | +| `member_type` | `String(20)` | PK; `skill`, `subagent`, `hook`, or `mcp_server` | +| `member_name` | `String(256)` | PK | +| `member_version` | `String(256)` | PK | +| `member_subpath` | `String(2048)` | nullable; member path inside bundle artifact | + +FK: `(workspace, bundle_name, bundle_version)` references `skill_bundle_versions`, CASCADE delete. + +The `member_type` column distinguishes member categories. When +`member_type` is `skill`, a FK to `skill_versions` enforces +referential integrity with RESTRICT delete. Similarly for `subagent` +(FK to `subagent_versions`) and `hook` (FK to `hook_versions`). + +**Cross-registry references (`member_type='mcp_server'`).** There is no +database-level FK for MCP registry references. Referential integrity +is enforced at the application layer: the store validates that the +referenced `MCPServerVersion` exists when creating a bundle version +and returns `RESOURCE_DOES_NOT_EXIST` if it does not. This avoids +deployment-ordering dependencies between RFC-0004 and RFC-0008 +migrations and allows either registry to be deployed independently. 
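The application-layer check for `mcp_server` members might look roughly like the sketch below. Everything named here is an assumption for illustration: `ResourceDoesNotExistError` stands in for whatever exception the store raises with the `RESOURCE_DOES_NOT_EXIST` error code, and `mcp_version_exists` is a hypothetical callable representing a lookup against the MCP Server Registry.

```python
from __future__ import annotations

from typing import Callable, Iterable


class ResourceDoesNotExistError(Exception):
    """Hypothetical stand-in for the store's RESOURCE_DOES_NOT_EXIST error."""

    error_code = "RESOURCE_DOES_NOT_EXIST"


def validate_mcp_members(
    mcp_servers: Iterable[tuple[str, str]],
    mcp_version_exists: Callable[[str, str], bool],
) -> None:
    """Raise if any (name, version) pair is unknown to the MCP registry.

    This runs in application code because no database-level FK exists
    for cross-registry members.
    """
    missing = [
        (name, version)
        for name, version in mcp_servers
        if not mcp_version_exists(name, version)
    ]
    if missing:
        listed = ", ".join(f"{name}@{version}" for name, version in missing)
        raise ResourceDoesNotExistError(f"MCPServerVersion not found: {listed}")


# Illustrative in-memory registry; a real store would query RFC-0004 tables.
registered = {("github-mcp", "1.0.0")}


def exists(name: str, version: str) -> bool:
    return (name, version) in registered


validate_mcp_members([("github-mcp", "1.0.0")], exists)  # passes silently
```

A store would presumably run this check inside the same transaction that inserts the `skill_bundle_version_members` rows, so that a failed lookup leaves no partial bundle version behind.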
+ +### `skill_bundle_tags` + +| Column | Type | Notes | +|--------|------|-------| +| `workspace` | `String(63)` | PK, FK | +| `name` | `String(256)` | PK, FK | +| `key` | `String(256)` | PK | +| `value` | `Text` | | + +### `skill_bundle_version_tags` + +| Column | Type | Notes | +|--------|------|-------| +| `workspace` | `String(63)` | PK, FK | +| `name` | `String(256)` | PK, FK | +| `version` | `String(256)` | PK, FK | +| `key` | `String(256)` | PK | +| `value` | `Text` | | + +### `skill_bundle_aliases` + +| Column | Type | Notes | +|--------|------|-------| +| `workspace` | `String(63)` | PK, FK | +| `name` | `String(256)` | PK, FK | +| `alias` | `String(256)` | PK | +| `version` | `String(256)` | target bundle version string | + +**Workspace handling.** All tables use `(workspace, ...)` as the leading +primary key components. Single-tenant deployments use `'default'`. + +**Timestamps.** Set at the application layer via +`get_current_time_millis()`, not via DDL defaults. + +**Deletion semantics.** The registry follows the mixed deletion pattern +used by the Model Registry and RFC-0004: + +- Top-level entity delete operations (`delete_skill`, + `delete_subagent`, `delete_hook`, and `delete_skill_bundle`) are + administrative hard deletes. They physically remove the parent row and + cascade to child rows, subject to referential-integrity checks. +- Version delete operations (`delete_skill_version`, + `delete_subagent_version`, `delete_hook_version`, and + `delete_skill_bundle_version`) are soft deletes. They set + `status='deleted'` when allowed by the lifecycle transition rules, + update `last_updated_timestamp`, remove aliases that point to the + deleted version, and exclude the version from normal + get/search/list/latest resolution. Active versions must first be + unpublished or deprecated before they can be deleted. +- The `deleted` status is terminal. 
Internal audit or provenance paths + may retain enough metadata to explain historical traces and bundle + snapshots, but deleted versions are not surfaced to consumers. + +## Store interface + +The store interface follows the mixin pattern established by the MCP +Server Registry (RFC-0004). Methods raise `NotImplementedError` rather +than using `@abstractmethod`, allowing stores that don't support skills +(e.g., `FileStore`) to work without stubbing every method. + +In the store interface, `delete_*` methods on top-level entities are +hard deletes, while `delete_*_version` methods are soft deletes that +transition the version to `deleted`. + +```python +class SkillRegistryMixin: + # --- Skill operations --- + + def create_skill( + self, name: str, + display_name: str | None = None, + description: str | None = None, + ) -> Skill: + raise NotImplementedError + + def get_skill(self, name: str) -> Skill: + raise NotImplementedError + + def search_skills( + self, + filter_string: str | None = None, + max_results: int = 100, + order_by: list[str] | None = None, + page_token: str | None = None, + ) -> PagedList[Skill]: + raise NotImplementedError + + def update_skill( + self, + name: str, + display_name: str | None = None, + description: str | None = None, + ) -> Skill: + raise NotImplementedError + + def delete_skill(self, name: str) -> None: + raise NotImplementedError + + # --- SkillVersion operations --- + + def create_skill_version( + self, + name: str, + version: str, + display_name: str | None = None, + source_type: str | None = None, + source: str | None = None, + subpath: str | None = None, + content_digest: str | None = None, + ) -> SkillVersion: + raise NotImplementedError + + def get_skill_version( + self, name: str, version: str, + ) -> SkillVersion: + raise NotImplementedError + + def get_skill_version_by_alias( + self, name: str, alias: str, + ) -> SkillVersion: + raise NotImplementedError + + def get_latest_skill_version(self, name: str) -> SkillVersion: + 
raise NotImplementedError + + def search_skill_versions( + self, + name: str, + filter_string: str | None = None, + max_results: int = 100, + order_by: list[str] | None = None, + page_token: str | None = None, + ) -> PagedList[SkillVersion]: + raise NotImplementedError + + def update_skill_version( + self, + name: str, + version: str, + status: SkillStatus | None = None, + ) -> SkillVersion: + raise NotImplementedError + + def delete_skill_version( + self, name: str, version: str, + ) -> None: + raise NotImplementedError + + # --- Skill tag operations --- + + def set_skill_tag( + self, name: str, key: str, value: str, + ) -> None: + raise NotImplementedError + + def delete_skill_tag(self, name: str, key: str) -> None: + raise NotImplementedError + + def set_skill_version_tag( + self, name: str, version: str, + key: str, value: str, + ) -> None: + raise NotImplementedError + + def delete_skill_version_tag( + self, name: str, version: str, key: str, + ) -> None: + raise NotImplementedError + + # --- Skill alias operations --- + + def set_skill_alias( + self, name: str, alias: str, version: str, + ) -> None: + raise NotImplementedError + + def delete_skill_alias( + self, name: str, alias: str, + ) -> None: + raise NotImplementedError + + # --- Subagent operations --- + # Same shape as Skill: create, get, search, update, delete, + # plus version, tag, and alias operations. 
+ + def create_subagent( + self, name: str, + display_name: str | None = None, + description: str | None = None, + ) -> Subagent: + raise NotImplementedError + + def get_subagent(self, name: str) -> Subagent: + raise NotImplementedError + + def search_subagents( + self, + filter_string: str | None = None, + max_results: int = 100, + order_by: list[str] | None = None, + page_token: str | None = None, + ) -> PagedList[Subagent]: + raise NotImplementedError + + def update_subagent( + self, name: str, + display_name: str | None = None, + description: str | None = None, + ) -> Subagent: + raise NotImplementedError + + def delete_subagent(self, name: str) -> None: + raise NotImplementedError + + def create_subagent_version( + self, name: str, version: str, + display_name: str | None = None, + source_type: str | None = None, + source: str | None = None, + subpath: str | None = None, + content_digest: str | None = None, + ) -> SubagentVersion: + raise NotImplementedError + + # Remaining subagent version, tag, and alias operations + # follow the same pattern as skill operations above. + + # --- Hook operations --- + # Same shape as Skill: create, get, search, update, delete, + # plus version, tag, and alias operations. 
+ + def create_hook( + self, name: str, + display_name: str | None = None, + description: str | None = None, + ) -> Hook: + raise NotImplementedError + + def get_hook(self, name: str) -> Hook: + raise NotImplementedError + + def search_hooks( + self, + filter_string: str | None = None, + max_results: int = 100, + order_by: list[str] | None = None, + page_token: str | None = None, + ) -> PagedList[Hook]: + raise NotImplementedError + + def update_hook( + self, name: str, + display_name: str | None = None, + description: str | None = None, + ) -> Hook: + raise NotImplementedError + + def delete_hook(self, name: str) -> None: + raise NotImplementedError + + def create_hook_version( + self, name: str, version: str, + display_name: str | None = None, + source_type: str | None = None, + source: str | None = None, + subpath: str | None = None, + content_digest: str | None = None, + ) -> HookVersion: + raise NotImplementedError + + # Remaining hook version, tag, and alias operations + # follow the same pattern as skill operations above. 
+ + # --- SkillBundle operations --- + + def create_skill_bundle( + self, name: str, + display_name: str | None = None, + description: str | None = None, + ) -> SkillBundle: + raise NotImplementedError + + def get_skill_bundle(self, name: str) -> SkillBundle: + raise NotImplementedError + + def search_skill_bundles( + self, + filter_string: str | None = None, + max_results: int = 100, + order_by: list[str] | None = None, + page_token: str | None = None, + ) -> PagedList[SkillBundle]: + raise NotImplementedError + + def update_skill_bundle( + self, + name: str, + display_name: str | None = None, + description: str | None = None, + ) -> SkillBundle: + raise NotImplementedError + + def delete_skill_bundle(self, name: str) -> None: + raise NotImplementedError + + # --- SkillBundleVersion operations --- + + def create_skill_bundle_version( + self, + name: str, + version: str, + display_name: str | None = None, + skills: list[tuple[str, str]] | None = None, + subagents: list[tuple[str, str]] | None = None, + hooks: list[tuple[str, str]] | None = None, + mcp_servers: list[tuple[str, str]] | None = None, + source_type: str | None = None, + source: str | None = None, + subpath: str | None = None, + content_digest: str | None = None, + ) -> SkillBundleVersion: + raise NotImplementedError + + def get_skill_bundle_version( + self, name: str, version: str, + ) -> SkillBundleVersion: + raise NotImplementedError + + def get_skill_bundle_version_by_alias( + self, name: str, alias: str, + ) -> SkillBundleVersion: + raise NotImplementedError + + def get_latest_skill_bundle_version( + self, name: str, + ) -> SkillBundleVersion: + raise NotImplementedError + + def search_skill_bundle_versions( + self, + name: str, + filter_string: str | None = None, + max_results: int = 100, + order_by: list[str] | None = None, + page_token: str | None = None, + ) -> PagedList[SkillBundleVersion]: + raise NotImplementedError + + def update_skill_bundle_version( + self, + name: str, + version: str, + 
status: SkillStatus | None = None, + ) -> SkillBundleVersion: + raise NotImplementedError + + def delete_skill_bundle_version( + self, name: str, version: str, + ) -> None: + raise NotImplementedError + + # --- SkillBundle tag operations --- + + def set_skill_bundle_tag( + self, name: str, key: str, value: str, + ) -> None: + raise NotImplementedError + + def delete_skill_bundle_tag( + self, name: str, key: str, + ) -> None: + raise NotImplementedError + + def set_skill_bundle_version_tag( + self, name: str, version: str, + key: str, value: str, + ) -> None: + raise NotImplementedError + + def delete_skill_bundle_version_tag( + self, name: str, version: str, key: str, + ) -> None: + raise NotImplementedError + + # --- SkillBundle alias operations --- + + def set_skill_bundle_alias( + self, name: str, alias: str, version: str, + ) -> None: + raise NotImplementedError + + def delete_skill_bundle_alias( + self, name: str, alias: str, + ) -> None: + raise NotImplementedError + +``` + +## SDK convenience functions + +The `mlflow.genai.skills` namespace provides convenience functions that +combine store operations, matching the pattern established by +`mlflow.genai.register_mcp_server()` in RFC-0004. + +```python +def register_skill( + name: str, + version: str, + display_name: str | None = None, + description: str | None = None, + source_type: str | None = None, + source: str | None = None, + subpath: str | None = None, + content_path: str | None = None, + content_digest: str | None = None, +) -> SkillVersion: + """Register a skill version. Auto-creates the parent Skill if + it does not exist. 
If content_path is provided, uploads the + local directory to MLflow artifact storage and sets source_type + and source automatically.""" + + +def register_subagent( + name: str, + version: str, + display_name: str | None = None, + description: str | None = None, + source_type: str | None = None, + source: str | None = None, + subpath: str | None = None, + content_path: str | None = None, + content_digest: str | None = None, +) -> SubagentVersion: + """Register a subagent version. Auto-creates the parent + Subagent if it does not exist.""" + + +def register_hook( + name: str, + version: str, + display_name: str | None = None, + description: str | None = None, + source_type: str | None = None, + source: str | None = None, + subpath: str | None = None, + content_path: str | None = None, + content_digest: str | None = None, +) -> HookVersion: + """Register a hook version. Auto-creates the parent Hook if + it does not exist.""" + + +def pull( + name: str | None = None, + bundle: str | None = None, + version: str | None = None, + alias: str | None = None, + destination: str = ".", +) -> str: + """Pull skill, subagent, hook, or bundle content from registered + sources to a local directory. Specify name for a single + capability or bundle for a skill bundle.""" +``` + +## REST API + +The REST API uses RESTful nested resource paths, following the pattern +from the MCP Server Registry proposal. + +### Skill endpoints + +All paths relative to `/ajax-api/3.0/mlflow/skills`. 
+ +| Method | Path | Description | +|---|---|---| +| `POST` | `/` | Create a skill | +| `GET` | `/` | Search skills | +| `GET` | `/{name}` | Get skill by name | +| `PATCH` | `/{name}` | Update skill fields | +| `DELETE` | `/{name}` | Hard-delete skill (cascades, subject to references) | +| `POST` | `/{name}/versions` | Create a skill version | +| `GET` | `/{name}/versions` | Search versions | +| `GET` | `/{name}/versions/{version}` | Get a specific version | +| `PATCH` | `/{name}/versions/{version}` | Update version | +| `DELETE` | `/{name}/versions/{version}` | Soft-delete a version (`status='deleted'`) | +| `POST` | `/{name}/tags` | Set a skill-level tag | +| `DELETE` | `/{name}/tags/{key}` | Delete a skill-level tag | +| `POST` | `/{name}/versions/{version}/tags` | Set a version-level tag | +| `DELETE` | `/{name}/versions/{version}/tags/{key}` | Delete a version tag | +| `POST` | `/{name}/aliases` | Set an alias | +| `GET` | `/{name}/aliases/{alias}` | Resolve alias to `SkillVersion` | +| `DELETE` | `/{name}/aliases/{alias}` | Delete an alias | + +### Subagent endpoints + +All paths relative to `/ajax-api/3.0/mlflow/subagents`. Same +structure as skill endpoints: CRUD on subagents and subagent versions, +plus tags and aliases. Parent delete is a hard delete; version delete +sets `status='deleted'`. + +### Hook endpoints + +All paths relative to `/ajax-api/3.0/mlflow/hooks`. Same structure as +skill endpoints: CRUD on hooks and hook versions, plus tags and +aliases. Parent delete is a hard delete; version delete sets +`status='deleted'`. + +### Skill bundle endpoints + +All paths relative to `/ajax-api/3.0/mlflow/skill-bundles`. 
+ +| Method | Path | Description | +|---|---|---| +| `POST` | `/` | Create a skill bundle | +| `GET` | `/` | Search skill bundles | +| `GET` | `/{name}` | Get bundle by name | +| `PATCH` | `/{name}` | Update bundle fields | +| `DELETE` | `/{name}` | Hard-delete bundle (cascades versions and memberships) | +| `POST` | `/{name}/versions` | Create a bundle version with members | +| `GET` | `/{name}/versions` | Search bundle versions | +| `GET` | `/{name}/versions/{version}` | Get a specific bundle version | +| `PATCH` | `/{name}/versions/{version}` | Update bundle version status | +| `DELETE` | `/{name}/versions/{version}` | Soft-delete a bundle version (`status='deleted'`) | +| `POST` | `/{name}/tags` | Set a bundle-level tag | +| `DELETE` | `/{name}/tags/{key}` | Delete a bundle-level tag | +| `POST` | `/{name}/versions/{version}/tags` | Set a bundle version tag | +| `DELETE` | `/{name}/versions/{version}/tags/{key}` | Delete a bundle version tag | +| `POST` | `/{name}/aliases` | Set a bundle alias | +| `GET` | `/{name}/aliases/{alias}` | Resolve bundle alias to version | +| `DELETE` | `/{name}/aliases/{alias}` | Delete a bundle alias | + +### Pagination and filtering + +Search endpoints use page-token-based pagination and `filter_string` +expressions following existing MLflow conventions. + +**Skills, subagents, hooks, and bundles:** `name LIKE '%review%'`, +`status = 'active'`, `tags.team = 'platform'` + +**Versions (all entity types):** `status = 'active'`, +`source_type = 'git'` + +**Skill bundle versions:** `status = 'active'`, +`tags.approved = 'true'` + +## Python SDK and CLI + +The `mlflow.genai.skills` module exposes top-level functions delegating to +`MlflowClient`, with a 1:1 mapping to the store mixin methods above. +CLI command groups (`mlflow skills`, `mlflow subagents`, +`mlflow hooks`, and `mlflow skill-bundles`) provide the same +operations from the command line. See the basic examples in the main +RFC for usage. 
+
+`pull` is implemented in the SDK/CLI layer, not the store mixin. The
+client calls `get_skill_version` (or the corresponding subagent/hook
+method, or resolves an alias) to obtain the source pointer, then
+fetches content locally using source-type-specific logic (git clone,
+OCI pull, ZIP download, or MLflow artifact download). This keeps the
+store as a pure data-access layer.
+
+## Skill entity
+
+A skill is a directory containing a SKILL.md entry point plus
+supporting files (scripts, templates, reference material). The
+`Skill` entity is the logical governed asset, scoped to a workspace.
+
+```python
+# Deferred annotations: SkillAlias is not defined in this snippet.
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+from enum import StrEnum  # requires Python 3.11+
+
+
+class SkillStatus(StrEnum):
+    DRAFT = "draft"
+    ACTIVE = "active"
+    DEPRECATED = "deprecated"
+    DELETED = "deleted"
+
+
+@dataclass
+class Skill:
+    name: str
+    display_name: str | None = None
+    description: str | None = None
+    workspace: str | None = None
+    status: SkillStatus | None = None
+    tags: dict[str, str] = field(default_factory=dict)
+    aliases: list[SkillAlias] = field(default_factory=list)
+    latest_version: str | None = None
+    created_by: str | None = None
+    last_updated_by: str | None = None
+    creation_timestamp: int | None = None
+    last_updated_timestamp: int | None = None
+```
+
+| Field | Type | Description |
+|---|---|---|
+| `name` | `str` | Stable logical asset name, unique within a workspace |
+| `display_name` | `str` | Mutable human-readable label for UI display |
+| `status` | `SkillStatus` | Read-only; derived from the parent-resolved version: highest active semantic version if present, otherwise highest non-deleted non-active semantic version |
+| `aliases` | `list[SkillAlias]` | Stable version pointers (e.g., `production` -> `1.2.0`) |
+| `latest_version` | `str` | Read-only; highest semantic version among `active` versions |
+| `workspace` | `str` | Visibility boundary |
+
+## SkillVersion entity
+
+A versioned record containing a typed source pointer, status, and
+tags. + +```python +class SkillSourceType(StrEnum): + GIT = "git" + OCI = "oci" + ZIP = "zip" + MLFLOW = "mlflow" + + +@dataclass +class SkillVersion: + name: str + version: str + display_name: str | None = None + source_type: SkillSourceType | None = None + source: str | None = None + subpath: str | None = None + status: SkillStatus = SkillStatus.DRAFT + content_digest: str | None = None + tags: dict[str, str] = field(default_factory=dict) + aliases: list[str] = field(default_factory=list) + workspace: str | None = None + created_by: str | None = None + last_updated_by: str | None = None + creation_timestamp: int | None = None + last_updated_timestamp: int | None = None +``` + +| Field | Type | Description | +|---|---|---| +| `version` | `str` | Publisher-supplied version string. Semantic versioning is required (e.g., `1.0.0`, `2.1.0-beta.1`) | +| `display_name` | `str` | Mutable human-readable label for UI display | +| `source_type` | `SkillSourceType` | Optional distribution mechanism: `git`, `oci`, `zip`, `mlflow` | +| `source` | `str` | Pointer to the content in the source system. Required for standalone pull. May be omitted only when the version's content lives within a bundle-level artifact, in which case the containing bundle membership identifies the embedded content path | +| `subpath` | `str` | Optional path within the artifact where this skill's content lives. Used with OCI and ZIP source types when multiple skills share a single artifact. Not needed for Git (use tree URLs) or MLflow artifacts (path is scoped at upload) | +| `content_digest` | `str` | Optional digest for integrity verification (e.g., `sha256:abc123...`). 
Aligns with OCI digest terminology | +| `status` | `SkillStatus` | Per-version lifecycle: `draft`, `active`, `deprecated`, `deleted` | +| `aliases` | `list[str]` | Alias names currently pointing at this version (read-only, projected from alias table) | + +## SkillVersion field details + +**Source type extensibility.** The `source_type` enum is intentionally +small for the initial implementation. New source types (e.g., `s3`, +`azure-blob`, `opensharing`) can be added without schema changes +since the column stores a string value. In particular, the +[OpenSharing](https://github.com/OpenSharing-IO/OpenSharing) protocol +(Linux Foundation) defines AgentSkill as a first-class asset type +using the same SKILL.md directory structure. An `opensharing` source +type would let the registry govern and track skills whose content is +shared via OpenSharing's credential-vending protocol. + +**Subpath usage by source type.** The `subpath` field separates "what +to download" from "where inside the downloaded content the relevant +asset lives." Its applicability varies by source type: + +| Source type | `subpath` usage | +|---|---| +| `oci` | Path within the OCI image (e.g., `plugins/code-review`). Used when multiple skills share a single image. | +| `zip` | Path within the archive (e.g., `plugins/code-review`). Used when multiple skills share a single archive. | +| `git` | Not used. Git tree URLs already encode the repository, ref, and path in a single `source` string (e.g., `https://github.com/acme/skills/tree/v1.0.0/code-review`). | +| `mlflow` | Not used. The artifact path is scoped to the specific skill version at upload time. | + +**Git source parsing challenge.** Git tree URLs are convenient for +users, but they are hosting-provider conventions rather than a single +Git transport format. Implementers must decide which Git URL forms are +accepted, how repository, ref, and path are extracted, and how +reproducibility is handled when a ref is mutable. 
Resolving this +challenge is left to the implementer, but feasible options include the +following non-limiting set: + +- Support only GitHub-style tree URLs in the initial implementation. +- Require Git sources to use immutable commit SHAs rather than branch + or tag names. +- Interpret `source` as the clone URL and use `subpath` for the asset + path, with a future `ref` field if needed. +- Allow mutable refs but require `content_digest` so pulls can detect + content drift. + +**MLflow artifact storage (`source_type="mlflow"`).** In addition to +external source pointers, the registry supports storing skill content +directly in MLflow's artifact storage. This serves users who do not +have external Git/OCI infrastructure, who want agent capabilities +stored alongside their models, or who operate in airgapped +environments where external sources are not reachable. + +Content is stored as a directory tree of individual files under an +artifact path, consistent with how MLflow stores model artifacts. For +example, a skill with a SKILL.md, scripts, and reference material is +stored as separate artifacts under a version-specific prefix: + +``` +skills/code-review/1.0.0/ + SKILL.md + scripts/analyze.sh + scripts/lint-config.json + reference/style-guide.md +``` + +The `source` field contains the artifact URI as resolved by MLflow's +artifact storage (e.g., `mlflow-artifacts:/skills/code-review/1.0.0/` +when using the artifact proxy, or a direct artifact-store URI +otherwise). `source_type="mlflow"` means "stored in MLflow-managed +artifact storage," not a specific URI scheme. Pull downloads the +directory tree from the artifact store. The MLflow UI can browse +individual files within a stored skill version when artifact proxying +is enabled. + +The upload API accepts a local directory path and stores each file as +a separate artifact. The `content_digest` is computed over the full +directory contents at upload time. 
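One way a digest over the full directory contents could be computed is sketched below. The RFC does not specify the hashing scheme, so the helper name `compute_content_digest`, the use of SHA-256 over sorted relative paths and file bytes, and the length-prefix framing are all assumptions made for illustration, not the registry's actual algorithm.

```python
import hashlib
from pathlib import Path


def compute_content_digest(root: str) -> str:
    """Hypothetical helper: hash a skill directory deterministically."""
    base = Path(root)
    hasher = hashlib.sha256()
    # Sort by POSIX-style relative path so the digest does not depend on
    # filesystem enumeration order or the OS path separator.
    files = sorted(
        (p for p in base.rglob("*") if p.is_file()),
        key=lambda p: p.relative_to(base).as_posix(),
    )
    for path in files:
        rel = path.relative_to(base).as_posix().encode("utf-8")
        data = path.read_bytes()
        # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
        hasher.update(len(rel).to_bytes(8, "big"))
        hasher.update(rel)
        hasher.update(len(data).to_bytes(8, "big"))
        hasher.update(data)
    return "sha256:" + hasher.hexdigest()
```

Under this scheme, a client could recompute the digest over the pulled directory and compare it with the stored `content_digest`, provided both sides use the same framing.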
+ +**Version uniqueness.** The combination of `(name, version)` is unique +within a workspace. A skill version represents a single logical +version of a capability; `source_type` and `source` describe where to +find it but are not part of its identity. + +**Content integrity.** The optional `content_digest` field stores a +digest of the skill content at registration time (e.g., +`sha256:abc123...`). For `source_type="mlflow"`, the server computes +the digest at upload time and stores it on the version; on pull, the +client recomputes the digest over the downloaded content and rejects +the result if it does not match, detecting out-of-band modification +of the underlying artifact store. For external source types (git, oci, +zip), `content_digest` is client-supplied: for OCI sources, this is +the native image digest; for Git sources, a digest of the file +contents at the pinned commit; for ZIP sources, a digest of the +archive. The registry stores the digest but does not verify it on +read; verification is the consumer's responsibility. + +**Immutability contract.** `source_type`, `source`, `subpath`, +`content_digest`, and `version` are immutable after creation. To point +to different content, register a new version. Mutable fields (`display_name`, +`status`, `tags`) can be updated independently. + +## SkillBundle entity + +A skill bundle groups related capabilities (skills, subagents, hooks, +and MCP servers) into a governed unit that maps to the "plugin" +concept in agent harnesses. Follows the same top-level pattern as +Skill: versions, tags, and aliases. 
+ +```python +@dataclass +class SkillBundle: + name: str + display_name: str | None = None + description: str | None = None + workspace: str | None = None + status: SkillStatus | None = None + tags: dict[str, str] = field(default_factory=dict) + aliases: list["SkillBundleAlias"] = field(default_factory=list) + latest_version: str | None = None + created_by: str | None = None + last_updated_by: str | None = None + creation_timestamp: int | None = None + last_updated_timestamp: int | None = None +``` + +`SkillBundle.status` is read-only and uses the same parent-resolved +version rule as `Skill`: highest active semantic version if present, +otherwise highest non-deleted non-active semantic version. Latest +version resolution remains active-only: the highest semantic version +in `active` status. + +## SkillBundleVersion entity + +A versioned snapshot of a skill bundle's membership. Each version +captures a specific set of capabilities that work together, organized +by type. + +```python +@dataclass +class SkillBundleVersion: + name: str + version: str + display_name: str | None = None + source_type: SkillSourceType | None = None + source: str | None = None + subpath: str | None = None + content_digest: str | None = None + status: SkillStatus = SkillStatus.DRAFT + tags: dict[str, str] = field(default_factory=dict) + skills: list[tuple[str, str]] = field(default_factory=list) + subagents: list[tuple[str, str]] = field(default_factory=list) + hooks: list[tuple[str, str]] = field(default_factory=list) + mcp_servers: list[tuple[str, str]] = field(default_factory=list) + aliases: list[str] = field(default_factory=list) + workspace: str | None = None + created_by: str | None = None + last_updated_by: str | None = None + creation_timestamp: int | None = None + last_updated_timestamp: int | None = None +``` + +Each member list contains `(name, version)` tuples, with optional +membership metadata such as `member_subpath` stored on the membership row. 
The +`skills`, `subagents`, and `hooks` lists reference entities in this +registry. The `mcp_servers` list references entries in the MCP Server +Registry (RFC-0004). + +## SkillBundleVersion field details + +**Version uniqueness.** The combination of `(name, version)` is unique +within a workspace. + +**Bundle-level source.** A bundle version is either monolithic or +assembled, never both: + +- **Monolithic:** has its own `source_type`, `source`, `subpath`, + and `content_digest`, pointing to a single artifact (e.g., an OCI + image or Git repo) that contains the complete plugin. `pull` + fetches the bundle artifact as a unit. The bundle version generally + has member references so embedded skills, subagents, and hooks remain + governed and traceable, but those member versions may omit their own + `source` because the bundle artifact is the authoritative source. +- **Assembled:** has individual member references with `(name, + version)` tuples. Each skill, subagent, and hook member has its own + source. `pull` fetches members individually. If a skill, subagent, or + hook member has no source, `pull` fails rather than producing a + partial local bundle. + +A monolithic bundle artifact is a generic package of content (skill +files, agent definitions, hook scripts). It may or may not be +harness-ready; the adapter does not assume either way. Harness +adapters (RFC-0009) generate harness-specific manifests from +registry metadata at install time, since the registry is the +governed source of truth. Correctness of the artifact layout is +the publisher's responsibility; the registry does not validate +artifact contents at registration time. + +**Immutability contract.** The member lists and source fields of a +bundle version are immutable after creation. To change the set of +members or source pointer, register a new bundle version. Mutable +fields (`display_name`, `status`, `tags`) can be updated independently. 
+ +Members with `member_type` of `skill`, `subagent`, or `hook` +reference entities in this registry. For monolithic bundles, those +member versions may omit `source` because their content is embedded in +the bundle artifact. The optional membership `member_subpath` identifies +where the member lives inside the bundle artifact. For assembled bundles, +`member_subpath` is usually empty because the member's own `source` and `subpath` +identify its content. Members with +`member_type='mcp_server'` reference an `MCPServerVersion` in the +MCP server registry (RFC-0004). This cross-registry reference +enables: + +- **Deduplication.** Two bundles that both need `github-mcp` + reference the same MCP registry entry. No duplicate configs. +- **Runtime status.** The MCP registry tracks deployment state via + hosted bindings (`is_deployed`, `endpoint_url`). Install-time + tooling can check whether a referenced MCP server is already + running rather than starting a duplicate. +- **Single source of truth.** MCP server definitions are governed in + the MCP registry; skill bundles reference them rather than carrying + standalone copies. + +A member can appear in multiple bundles and multiple bundle versions. +Membership is at the version level, so a bundle version is a +reproducible snapshot of "these specific asset versions work together." + +**Bundle-level source and embedded MCP configs.** When a bundle +version is monolithic (a single OCI image or Git repo containing a +complete plugin), the artifact may include MCP configs alongside +skills and subagents. Embedded skills, subagents, and hooks should +generally be registered as member versions so they remain governed and +traceable. MCP servers are different because they belong in the MCP +Server Registry (RFC-0004): MCP configs within a monolithic artifact do +not need separate MCP registry entries unless the publisher wants them +independently governed and reusable. 
Cross-registry MCP references are +for bundles where MCP servers are independently registered and managed. + +## Pull semantics details + +**Source availability.** The registry stores source pointers but does +not cache or proxy content. If a source is unreachable or the content +has been deleted, pull fails with an error that surfaces the +underlying failure from the source system (e.g., Git clone failure, +OCI pull 404, HTTP download error, MLflow artifact download error). +Source availability is the publisher's responsibility. For assembled +bundle pulls, if one member's source is unavailable, the entire pull +fails rather than producing a partial result. + +**Source authentication.** The registry server stores source pointers +but does not validate source accessibility at registration time and is +not involved in content transfer at pull time. Authentication to +external sources is handled entirely by the client environment: + +| Source type | Authentication mechanism | +|---|---| +| `git` | Standard Git credential resolution: SSH keys (`~/.ssh/`), Git credential helpers (`git-credential-manager`, `git-credential-store`), `.netrc`, and `GIT_SSH_COMMAND`. Private repos work if the caller's Git is configured to access them. | +| `oci` | OCI registry credential resolution: Docker config (`~/.docker/config.json`), registry-specific credential helpers, and container runtime auth. Private registries work if the caller has a valid login session. | +| `zip` | No authentication support. ZIP sources must be publicly accessible URLs. For private content, use `git` or `oci` source types instead. | +| `mlflow` | MLflow artifact storage authentication, using the same credentials as other MLflow API calls. | + +The registry does not store, proxy, or manage source credentials. +Pull failures due to authentication errors are surfaced to the caller +with the underlying error from the source system. + +`pull` is harness-agnostic. 
It downloads content but does not generate +harness-specific manifests or place files in harness-specific +directories. Harness-specific installation is covered in RFC-0009. + +## skill_context() span attributes + +The `skill_context()` context manager creates a span with the +following attributes: + +| Attribute | Value | Description | +|---|---|---| +| `mlflow.skill.name` | Skill name | Registry name of the active skill | +| `mlflow.skill.version` | Version string | Registered version | +| `mlflow.skill.workspace` | Workspace name | MLflow workspace (defaults to `"default"`) | + +These three attributes form the `{workspace, name, version}` +coordinates that link the span back to a specific skill version in +the registry. + +## SDK and CLI code examples + +### Register other capability types + +```python +# Register a subagent +mlflow.genai.skills.register_subagent( + name="security-auditor", + version="1.0.0", + description="Security specialist for auth and payment code", + source_type="git", + source="https://github.com/acme/agent-skills/tree/v1.0.0/security-auditor", +) + +# Register a hook +mlflow.genai.skills.register_hook( + name="pre-commit-scan", + version="1.0.0", + description="Runs security scan before tool commits", + source_type="git", + source="https://github.com/acme/agent-skills/tree/v1.0.0/pre-commit-scan", +) +``` + +### Create a skill bundle with cross-registry references + +```python +# A bundle version can include skills, subagents, hooks, and MCP servers +bundle_version = mlflow.genai.skills.create_skill_bundle_version( + name="pr-workflow", + version="1.0.0", + skills=[ + ("code-review", "1.0.0"), + ], + subagents=[ + ("security-auditor", "1.0.0"), + ], + # Reference MCP servers from the MCP registry (RFC-0004) + mcp_servers=[ + ("github-mcp", "2.0.0"), + ], +) +``` + +### Register skills from an OCI artifact with subpath + +```python +# Register individual skills that live inside a shared OCI image. 
+# The subpath identifies each skill's location within the image. +mlflow.genai.skills.register_skill( + name="code-review", + version="1.0.0", + source_type="oci", + source="ghcr.io/acme/agent-plugin:v1.0.0", + subpath="skills/code-review", +) + +mlflow.genai.skills.register_skill( + name="test-coverage", + version="2.1.0", + source_type="oci", + source="ghcr.io/acme/agent-plugin:v1.0.0", + subpath="skills/test-coverage", +) + +# Assembled bundle: members reference individually registered skills. +# Each member has its own source. No bundle-level source. +bundle_version = mlflow.genai.skills.create_skill_bundle_version( + name="pr-workflow", + version="1.0.0", + skills=[ + ("code-review", "1.0.0"), + ("test-coverage", "2.1.0"), + ], +) + +# Monolithic bundle: a single OCI image contains the complete plugin. +# Embedded member versions are registered without their own sources. +mlflow.genai.skills.register_skill( + name="embedded-review", + version="1.0.0", +) + +bundle_version = mlflow.genai.skills.create_skill_bundle_version( + name="pr-workflow-mono", + version="1.0.0", + source_type="oci", + source="ghcr.io/acme/agent-plugin:v1.0.0", + skills=[ + ("embedded-review", "1.0.0"), + ], +) +``` + +### Discover and consume skills + +```python +# Search for active skill versions +versions = mlflow.genai.skills.search_skill_versions( + name="code-review", + filter_string="status = 'active'", +) + +# Search for active skill bundles +bundles = mlflow.genai.skills.search_skill_bundles( + filter_string="status = 'active'", +) + +# Get a specific version +version = mlflow.genai.skills.get_skill_version( + name="code-review", + version="1.0.0", +) +# version.source_type == "git" +# version.source == "https://github.com/acme/agent-skills/tree/v1.0.0/code-review" + +# Resolve by alias +version = mlflow.genai.skills.get_skill_version_by_alias( + name="code-review", + alias="production", +) + +# Get a bundle version and its pinned members +bundle_version = 
mlflow.genai.skills.get_skill_bundle_version( + name="pr-workflow", + version="1.0.0", +) +# bundle_version.skills == [("code-review", "1.0.0"), ...] +# bundle_version.subagents == [("security-auditor", "1.0.0"), ...] +# bundle_version.mcp_servers == [("github-mcp", "2.0.0"), ...] + +# Resolve a bundle alias +bundle_version = mlflow.genai.skills.get_skill_bundle_version_by_alias( + name="pr-workflow", + alias="production", +) +``` + +CLI equivalents for these operations use `mlflow skills`, `mlflow +subagents`, `mlflow hooks`, and `mlflow skill-bundles` command groups. diff --git a/rfcs/0009-skill-harness-integration/0009-skill-harness-integration.md b/rfcs/0009-skill-harness-integration/0009-skill-harness-integration.md new file mode 100644 index 0000000..d866675 --- /dev/null +++ b/rfcs/0009-skill-harness-integration/0009-skill-harness-integration.md @@ -0,0 +1,530 @@ +--- +start_date: 2026-04-27 +mlflow_issue: https://github.com/mlflow/mlflow/issues/22833 +rfc_pr: https://github.com/mlflow/rfcs/pull/10 +--- + +# RFC: Skill Registry Harness Integration + +| Author(s) | Bill Murdock (Red Hat) | +| :--------------------- | :-- | +| **Date Last Modified** | 2026-06-12 | +| **AI Assistant(s)** | Claude Code (Opus 4.6) | + +# Summary + +Add harness-specific installation to the MLflow Skill Registry +(RFC-0008). Where RFC-0008 provides `mlflow skills pull` to fetch +registered content to a local directory, this RFC adds +`mlflow skills install` and `mlflow skills install-bundle` to generate +harness-specific manifests, place files in the correct directories, +and configure the agent harness to use the installed capabilities. + +This bridges the gap between "I found a skill bundle in the registry" +and "my agent harness can use it." 
+ +# Basic example + +## Install a skill bundle for Claude Code + +```bash +mlflow skills install-bundle --name pr-workflow --alias production \ + --harness claude-code +``` + +This resolves the `pr-workflow` skill bundle, pulls the bundle content +according to its registered mode, and generates: + +``` +.claude/plugins/pr-workflow/ + .claude-plugin/plugin.json # Generated manifest + skills/ + code-review/SKILL.md # Installed skill content + agents/ + security-auditor.md # Installed subagent content + .mcp.json # Generated from mcp-server members +``` + +## Install for other harnesses + +```bash +# Same command, different --harness: codex-cli, cursor, antigravity +mlflow skills install-bundle --name pr-workflow --alias production \ + --harness cursor + +# Install globally (user scope) instead of project scope +mlflow skills install-bundle --name pr-workflow --alias production \ + --harness claude-code --scope user +``` + +## Import an existing plugin as a skill bundle + +```bash +# Register an existing Claude Code plugin as a monolithic skill bundle +mlflow skills import --source https://github.com/acme/plugins/tree/v1.0.0/pr-workflow \ + --harness claude-code --bundle-name my-plugin --version 1.0.0 +``` + +This fetches the artifact, inspects it to identify skills, subagents, +hooks, and MCP servers, and registers a monolithic skill bundle version +with discovered skill, subagent, and hook members. The source must be +remotely accessible (git, OCI, ZIP, or MLflow artifact URI) so that the +registered bundle has a pullable source pointer. 
+ +## Python SDK + +```python +import mlflow + +mlflow.genai.skills.install_bundle( + name="pr-workflow", + alias="production", + harness="claude-code", + scope="project", # or "user" for global install +) + +# Import an existing harness-specific plugin into the registry +mlflow.genai.skills.import_bundle( + source="https://github.com/acme/plugins/tree/v1.0.0/pr-workflow", + harness="claude-code", + bundle_name="my-plugin", + version="1.0.0", +) +``` + +## Motivation + +### The problem + +RFC-0008 provides `pull` for fetching content to a local directory, +but each harness has its own directory layout, manifest format, and +discovery mechanism (see table below). Without harness-specific +installation, users must manually create manifests, place files in +the right directories, and configure discovery. This is error-prone +and discourages adoption. + +### The cross-harness landscape + +The following table summarizes the capability types and installation +conventions across major agent harnesses: + +| Harness | Skills | Agents | MCP | Hooks | Manifest | Install dir | +|---|---|---|---|---|---|---| +| Claude Code | SKILL.md | agent .md | .mcp.json | settings.json | plugin.json | `.claude/plugins/` | +| Codex CLI | SKILL.md | agent .md | .mcp.json | hooks | plugin.json | `.codex/plugins/` | +| Cursor | SKILL.md | agent .md | mcp.json | -- | -- | `.cursor/skills/`, `.cursor/agents/` | +| GitHub Copilot | skills/ | agents/ | .mcp.json | hooks/*.json | plugin.json | project | +| Lola | SKILL.md | agents/*.md | mcps.json | lola.yaml | (auto-discovered) | per-harness | +| OpenClaw | SKILL.md | -- | -- | plugin hooks | openclaw.plugin.json | `skills/` | +| Kilo Code | SKILL.md | custom modes | mcp.json | -- | -- | project | +| Antigravity | SKILL.md | -- | -- | -- | -- | `.agent/skills/` | +| OpenCode | .md/.ts | agent configs | config | JS events | -- | `.opencode/` | +| Continue | -- | config.yaml | mcpServers/ | -- | -- | `.continue/` | +| Windsurf | -- | -- | 
mcp_config.json | -- | -- | project | +| Amazon Q | -- | -- | mcp.json | -- | -- | `.amazonq/` | +| Goose | -- | -- | MCP only | -- | -- | config | +| Zed | -- | profiles | settings.json | -- | -- | config | + +Key insight: the SKILL.md file format is portable across harnesses. +Only the directory placement and manifest format differ. + +### Out of scope + +- Registry operations (covered in RFC-0008). +- Extending harness functionality (e.g., adding hook support). +- Automatic harness detection (follow-up). + +## Detailed design + +### Harness adapters + +Each supported harness has an adapter that knows how to: + +1. **Map member types to harness paths.** Given the bundle's member + types (skill, subagent, hook, mcp_server) and the install scope, + determine where each member's content should be placed. +2. **Declare install paths per scope.** Each adapter knows both the + project-level path (e.g., `.claude/plugins/`) and the user-level + global path (e.g., `~/.claude/plugins/`). The `scope` parameter + selects which one to use. +3. **Generate manifests.** Create harness-specific manifest files + (e.g., `plugin.json`, `.mcp.json`) from registry metadata. +4. **Handle unsupported types.** Skip member types the harness does + not support, with a warning by default. If `strict` mode is enabled, + fail the install instead of producing a partial harness artifact. +5. **Introspect existing bundles.** Given a harness-specific artifact + (e.g., a Claude Code plugin directory), identify the individual + capabilities it contains and their types. + +The adapter does not download content from sources. The MLflow client +handles all source fetching (Git clone, OCI pull, ZIP download, etc.) +via the same pull logic as `mlflow skills pull` (RFC-0008), then +passes pre-fetched local paths to the adapter. This keeps adapters +simple: they only need to know about directory layout and manifest +generation, not about source types. 
+ +```python +from abc import ABC, abstractmethod +from dataclasses import dataclass, field +from typing import Literal + + +@dataclass +class PulledMember: + name: str + kind: str # "skill", "subagent", "hook", "mcp_server" + local_path: str # pre-fetched content on local filesystem + version: str + metadata: dict[str, str] | None = None + + +@dataclass +class PulledBundle: + name: str + version: str + mode: Literal["assembled", "monolithic"] + # Assembled: pulled member content. Monolithic: registered embedded members. + members: list[PulledMember] = field(default_factory=list) + mcp_servers: list[tuple[str, dict]] = field(default_factory=list) + bundle_path: str | None = None # pulled monolithic bundle artifact + metadata: dict[str, str] | None = None + + +@dataclass +class IntrospectedMember: + name: str + kind: str # "skill", "subagent", "hook", "mcp_server" + source_path: str + description: str | None = None + metadata: dict[str, str] | None = None + + +@dataclass +class IntrospectedBundle: + name: str + description: str | None = None + members: list[IntrospectedMember] = field(default_factory=list) + + +class HarnessAdapter(ABC): + @abstractmethod + def install_skill( + self, + member: PulledMember, + scope: str = "project", # "project" or "user" + ) -> str: ... + + @abstractmethod + def install_skill_bundle( + self, + bundle: PulledBundle, + scope: str = "project", # "project" or "user" + ) -> str: ... + + @abstractmethod + def introspect_bundle( + self, source: str, + ) -> IntrospectedBundle: ... + + @abstractmethod + def supported_member_types(self) -> set[str]: ... +``` + +### Adapter summaries + +Each builtin adapter maps member types to harness-specific paths, +generates manifests, and skips unsupported types with warnings. See +[implementation-details.md: Adapter +summaries](implementation-details.md#adapter-summaries) for +per-adapter behavior (Claude Code / Codex CLI, Cursor, Antigravity, +and harness-agnostic bundle formats). 
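The skip-with-warning behavior, and the strict alternative described under "Harness adapters", might look like the following. This is a sketch under assumptions: `filter_supported` and its `strict` parameter are hypothetical names, and members are represented as plain `(kind, name)` tuples rather than the `PulledMember` dataclass so the block stands alone.

```python
import warnings


def filter_supported(
    members: list[tuple[str, str]],
    supported: set[str],
    strict: bool = False,
) -> list[tuple[str, str]]:
    """Hypothetical helper: keep only members the harness can install."""
    kept = []
    for kind, name in members:
        if kind in supported:
            kept.append((kind, name))
        elif strict:
            # Strict mode: refuse to produce a partial harness artifact.
            raise ValueError(f"harness does not support {kind} member {name!r}")
        else:
            warnings.warn(f"skipping unsupported {kind} member {name!r}")
    return kept
```

An adapter could pass the result of its own `supported_member_types()` as `supported`, so that a harness without hook support installs the remaining members and reports the skipped hook.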
+ +Detailed directory layouts, MCP config generation rules, and +hook handling behavior are in +[implementation-details.md](implementation-details.md). + +### Other harness adapters + +Additional adapters (OpenClaw, GitHub Copilot, Kilo Code, OpenCode, +Continue, etc.) follow the same pattern: map member types to paths, +generate manifests, skip unsupported types with warnings. + +New adapters can be contributed without changes to the registry or +the adapter interface. Adapters are registered via Python entrypoints +(group `mlflow.skill_harness_adapters`), so third-party adapters can +be installed via `pip install` without modifying MLflow core. MLflow +ships builtin adapters for Claude Code, Codex CLI, and Cursor; +additional harnesses are community-contributed. + +### Bundle import + +Installation takes registry metadata and produces a harness-specific +artifact. Bundle import is the reverse: it takes an existing artifact +in any supported format (e.g., a Claude Code plugin or a cross-harness +module), introspects it to discover individual capabilities, and +registers the artifact as a monolithic skill bundle version with member +references. + +#### Contract + +The import operation takes four inputs: + +- **source**: a reference to the artifact: git URL, OCI reference, + ZIP URL, or MLflow artifact URI. The import operation fetches the + artifact from the source before introspection. The source must be + a remotely accessible location so that the registered bundle version + has a source pointer other users can pull from. To import from a local + directory, first upload it to MLflow artifact storage + (`source_type="mlflow"`) or push it to a Git/OCI/ZIP source, then + import using that remote reference. +- **harness**: the harness format to interpret the artifact as (e.g., + `claude-code`, `cursor`). Required in the initial release. + Automatic detection is a follow-up feature. +- **bundle_name**: the name for the resulting skill bundle. 
If + omitted, the adapter derives a name from the artifact (e.g., from + `plugin.json` or the directory name). +- **version**: the semantic version for the resulting skill bundle + version. If omitted, the adapter may derive it from artifact metadata + when available. Import fails if neither the caller nor the artifact + provides a valid semantic version. + +The import operation: + +1. Calls the adapter's `introspect_bundle` method, which parses the + artifact and returns an `IntrospectedBundle` listing each + discovered member with its type, source path, and any metadata the + adapter can extract. +2. Creates or updates the parent skill bundle. +3. Creates a monolithic skill bundle version with the import source as + the bundle-level source pointer. +4. Registers discovered skills, subagents, and hooks as member versions + in the skill registry. These member versions may omit `source` + because their content lives inside the bundle-level artifact. +5. Adds the discovered member references to the bundle version and + records each member's `source_path` as the membership `member_subpath` inside + the bundle artifact. Returns the created bundle version plus an + introspection summary. + +Embedded MCP configs remain in the source artifact unless the import +can match them to existing MCP Registry entries. If matching MCP server +versions exist, the bundle may reference them through `mcp_servers`; +otherwise the embedded config remains bundle artifact content. + +#### Conflict handling + +When a bundle version with the same `(bundle_name, version)` already +exists, import reports the conflict and does not overwrite it. The +caller can resolve the conflict by choosing a different bundle name or +version. + +Import also never overwrites existing skill, subagent, or hook versions. +If a discovered member's `(type, name, version)` already exists, import +may reuse that version only when it is compatible with the discovered +member. 
For example, it can reuse a source-less embedded member version +that already belongs to the same bundle artifact. If the existing +version points to different content, has an incompatible source model, +or cannot be proven compatible, import reports a conflict and fails +rather than binding the new bundle to the wrong governed member. The +caller can resolve the conflict by changing the imported bundle version, +renaming the member, or registering the embedded member under a new +version. + +#### SDK + +```python +# Preview what an artifact contains (read-only, no registry writes) +# Introspect works on local paths or remote sources +preview = mlflow.genai.skills.introspect_bundle( + source="./my-claude-plugin", + harness="claude-code", +) +# preview.members lists discovered skills, subagents, hooks, MCP configs + +# Import the artifact as a monolithic bundle (source must be remotely accessible) +mlflow.genai.skills.import_bundle( + source="https://github.com/acme/plugins/tree/v1.0.0/pr-workflow", + harness="claude-code", + bundle_name="my-plugin", + version="1.0.0", +) +``` + +#### CLI + +```bash +# Preview what an artifact contains (read-only, works on local paths) +mlflow skills introspect --source ./my-claude-plugin \ + --harness claude-code + +# Import from git, OCI, ZIP, or MLflow artifact sources +mlflow skills import --source https://github.com/acme/plugins/tree/v1.0.0/pr-workflow \ + --harness claude-code --bundle-name my-plugin --version 1.0.0 +``` + +Import is a CLI and SDK operation only. There is no UI for import. +Import requires fetching artifacts from user-supplied URLs, which the +server should not do on behalf of clients. + +### Future marketplace integration + +Some harnesses (Claude Code, Codex CLI) support marketplace catalogs: +a JSON endpoint that lists available plugins so users can browse and +install them natively from within the harness. 
Marketplace catalog +generation is useful, but it is follow-up work outside the initial +release of this RFC. The initial installation path is the adapter-based +CLI/SDK flow (`mlflow skills install` / `install-bundle`). + +A future marketplace integration could expose published skill bundles +through a harness-specific catalog endpoint such as: + +``` +GET /ajax-api/3.0/mlflow/skill-bundles/marketplace.json?harness=claude-code +``` + +That endpoint would need to define the harness-specific response schema, +authentication behavior, packaging or redirect strategy, and how entries +map to monolithic versus assembled bundle versions. Those details are +deferred to the follow-up marketplace work in the adoption strategy. + +Until marketplace integration exists, the MLflow Skills page +(RFC-0008) serves as the browsing interface. Users search and filter +registered bundles in the MLflow UI, then copy the install command from +the bundle detail page: + +``` +mlflow skills install-bundle --name pr-workflow --alias production \ + --harness cursor +``` + +The bundle detail page in the MLflow UI displays a ready-to-copy +install command for each supported harness, reducing the manual steps +required. + +### Implementation details + +SDK function signatures (`install_skill`, `install_bundle`, +`import_bundle`) and CLI commands are in +[implementation-details.md](implementation-details.md). + +### Lock file + +A project can check in an `mlflow-skills.lock` file that records the +harness, registry URI, workspace, install scope, exact resolved +versions, source URIs, and content digests so that `mlflow skills +install` with no arguments reproduces the same local setup (analogous +to `package-lock.json` or `poetry.lock`). 
+ +```bash +# First install: resolves from registry and writes lock file +mlflow skills install-bundle --name pr-workflow --alias production \ + --harness claude-code --lock + +# Subsequent installs: reads lock file, no alias or version resolution needed +mlflow skills install + +# Update: re-resolves the explicit selector and updates lock file +mlflow skills install-bundle --name pr-workflow --alias production \ + --harness claude-code --lock --update +``` + +The lock file records resolved versions, not aliases, version ranges, +or other selectors. This ensures reproducible installs and avoids +stale, non-authoritative selector metadata. The `--update` flag uses +the explicit selector supplied to that command, such as `--alias +production`, to resolve a new target and write the new resolved version +to the lock file. + +Lock file replay still contacts the registry to verify version +status, so governance actions (deprecation, deletion) take effect +even for existing lock files. +By default, replay uses the registry URI and workspace recorded in the +lock file for verification. Implementations may provide an explicit +override for environment promotion, but should not silently resolve +against ambient registry configuration. + +Lock file format and SDK functions are in +[implementation-details.md](implementation-details.md). + +### Trace integration + +RFC-0008 defines `mlflow.skill_context()`, a context manager that +creates SKILL spans in MLflow traces (see RFC-0008, Trace +integration). The install commands can automate this: they write a +manifest with installed registry coordinates, and harnesses with hook +support (Claude Code, Codex CLI) can use pre/post tool-use hooks to +create SKILL spans automatically. + +For monolithic bundle installs, the manifest records the installed +bundle version and any registered embedded skill versions discovered +during import. 
Automatic per-skill SKILL spans require the local skill +name to resolve to a registered skill version in the manifest. Users of +harnesses without hook support can still use +`mlflow.skill_context()` manually in SDK-based agent code. + +Manifest format, hook configuration examples, and per-harness +instrumentation details are in +[implementation-details.md](implementation-details.md). + +## Drawbacks + +- **Adapter maintenance.** Each harness adapter must be maintained as + harness plugin formats evolve. This is ongoing work. +- **Incomplete coverage.** Not all harnesses support all capability + types. By default, installs skip unsupported types with warnings. + Users who need fail-fast behavior can use strict mode. Even with + warnings, users need to understand that the installed harness artifact + can be a subset of the governed bundle. +- **Manifest format drift.** Generated manifests may not cover all + features of a harness's native plugin format (e.g., Codex CLI's + `interface` block with branding, or OpenClaw's `requires` field). + +# Alternatives + +## Let users write their own install scripts + +Provide only `pull` (RFC-0008) and let users or third parties build +harness-specific tooling. + +Rejected because the gap between "pull" and "working in my harness" +is the main adoption barrier. A first-party install experience is +critical for driving adoption. + +## Delegate installation to an existing skill package manager + +Several open-source projects already handle skill installation: + +- **skills.sh** ([vercel-labs/skills](https://github.com/vercel-labs/skills)): + CLI for installing individual SKILL.md files. Supports 70+ harnesses. +- **Lola** ([LobsterTrap/lola](https://github.com/LobsterTrap/lola)): + Cross-harness package manager. Its "AI Context Module" format bundles + skills, subagents, commands, hooks, and MCP servers. +- **SkillHub** ([iflytek/skillhub](https://github.com/iflytek/skillhub)): + Self-hosted skill registry with CLI installation. 
Individual skills + only, 14 harnesses. + +We considered delegating installation to one of these tools rather +than implementing our own adapters. skills.sh and SkillHub operate on +individual skills in isolation and have no bundle concept, so they +cannot handle the general case of installing a skill bundle with +skills, subagents, hooks, and MCP server configurations together. +Lola is closer: its AI Context Module format supports all the member +types we need. However, delegating installation to the Lola CLI +would introduce a third-party runtime dependency for a relatively +narrow special case (Lola-format bundles targeting Lola-supported +harnesses) while still requiring our own implementation for the +general problem (any bundle format, any harness, with registry +governance and trace integration). Instead, we implement installation +ourselves via the adapter interface. The adapter interface is +extensible to harness-agnostic bundle formats (see "Harness-agnostic +bundle formats" above), so support for formats like Lola's can be +added as demand warrants without architectural changes. + +# Adoption strategy + +**Initial release:** Claude Code, Codex CLI, and Cursor adapters. +Bundle import. Install-time trace manifest and Claude Code trace +hooks. + +**Follow-up:** Marketplace catalog generation for Claude Code / +Codex CLI. Additional adapters based on demand (including +harness-agnostic bundle formats), automatic harness detection, +bi-directional sync (detect local plugins and register them). diff --git a/rfcs/0009-skill-harness-integration/implementation-details.md b/rfcs/0009-skill-harness-integration/implementation-details.md new file mode 100644 index 0000000..30cd57b --- /dev/null +++ b/rfcs/0009-skill-harness-integration/implementation-details.md @@ -0,0 +1,462 @@ +# RFC-0009: Harness Integration Implementation Details + +This document contains implementation-level specifications for +RFC-0009 (Skill Registry Harness Integration). 
It covers detailed +adapter directory layouts and manifest generation, MCP server config +generation, SDK interface and function signatures, +CLI commands, lock file SDK functions, and trace instrumentation +details. The main RFC covers the design rationale. + +## Claude Code / Codex CLI adapter details + +These two harnesses share nearly identical plugin formats. The adapter +generates: + +**`plugin.json`:** +```json +{ + "name": "pr-workflow", + "version": "1.0.0", + "description": "End-to-end pull request review workflow", + "author": { "name": "Generated by MLflow Skill Registry" } +} +``` + +**Directory layout:** +``` +{destination}/.claude/plugins/{bundle-name}/ + .claude-plugin/plugin.json + skills/{skill-name}/SKILL.md # skill members + agents/{agent-name}.md # subagent members + hooks/{hook-name}/ # hook member content + .mcp.json # mcp_server members, merged +``` + +For Codex CLI, the path uses `.codex/plugins/` instead. + +**MCP server config generation.** When a bundle references MCP servers +in its `mcp_servers` member list, the adapter generates `.mcp.json` +entries from MCP registry metadata. For each referenced server, the +adapter resolves the `MCPServerVersion` from the MCP registry +(RFC-0004) and looks for an `MCPAccessBinding` targeting that version +or alias. If a binding exists, the adapter uses its `endpoint_url` and +`transport_type` as the connection target. If multiple bindings exist +for the same server, the adapter uses the first binding targeting the +referenced version or alias. If no binding exists, the adapter falls +back to the connection details in `server_json` (e.g., `remotes[]`). + +Entries are merged into a single `.mcp.json` using server name as key: + +```json +{ + "mcpServers": { + "github-mcp": { ... }, + "jira-mcp": { ... 
} + } +} +``` + +**Embedded MCP configs.** When a bundle has a bundle-level source and +the artifact already contains a `.mcp.json`, those embedded configs +are used as-is for any MCP servers not in the `mcp_servers` member +list. If the same server name appears in both the embedded config and +the `mcp_servers` list, the registry-generated entry takes precedence +(the registry is the governed source of truth). + +**MCP server credentials.** The adapter generates connection config +but does not configure credentials, certificates, or authorization +headers. These are the user's responsibility. The adapter logs a +warning when it generates an entry for a server that uses +authenticated transport, so users know to complete the setup +manually. + +**Hook handling.** Hook member content is placed in the +`hooks/{hook-name}/` directory within the plugin. Since Claude Code +hooks are configured in `settings.json` (not discovered from +directories), the adapter prints the `settings.json` hook entries +needed to activate the installed hooks. Users can opt in with +`--install-hooks` to have the adapter merge these entries into +`settings.json` automatically. The adapter does not modify +`settings.json` by default for security reasons. + +## Cursor adapter details + +Cursor does not have a plugin bundle format. The adapter places +capabilities directly into Cursor's discovery directories: + +``` +{destination}/.cursor/skills/{skill-name}/SKILL.md # skill members +{destination}/.cursor/agents/{agent-name}.md # subagent members +``` + +For MCP servers, the adapter merges entries into the project's +`.cursor/mcp.json`, adding new servers without overwriting existing +ones. + +Hooks are skipped with a warning (Cursor does not support hooks). + +## Antigravity adapter details + +``` +{destination}/.agent/skills/{skill-name}/SKILL.md # skill members +``` + +Subagents, MCP servers, and hooks are skipped with a warning. 
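The two MCP merge policies described above (registry-generated entries win inside a Claude Code / Codex CLI plugin; existing entries win in Cursor's `.cursor/mcp.json`) can be sketched as follows. This is a minimal illustration, not the adapter implementation: the function names are hypothetical, the entry bodies are placeholders, and it assumes Cursor's file uses the same top-level `mcpServers` key as `.mcp.json`.

```python
def merge_plugin_mcp(embedded: dict, generated: dict) -> dict:
    """Plugin-style merge (hypothetical helper).

    Embedded entries are kept for servers that are not in the bundle's
    mcp_servers list; registry-generated entries override embedded ones
    with the same server name.
    """
    servers = dict(embedded.get("mcpServers", {}))
    servers.update(generated.get("mcpServers", {}))
    return {"mcpServers": servers}


def merge_cursor_mcp(existing: dict, generated: dict) -> tuple:
    """Cursor-style merge (hypothetical helper).

    Adds new servers without overwriting entries the project already
    has. Returns the merged config and the names left untouched.
    """
    servers = dict(existing.get("mcpServers", {}))
    skipped = []
    for name, entry in generated.get("mcpServers", {}).items():
        if name in servers:
            skipped.append(name)  # keep the user's existing entry
        else:
            servers[name] = entry
    merged = dict(existing)  # preserve any other top-level keys
    merged["mcpServers"] = servers
    return merged, skipped
```

An adapter following this sketch could report the `skipped` names as warnings so users can see which registry entries were not applied.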
+ +## SDK interface + +Installation is a client-side operation: the SDK resolves the skill or +bundle from the registry, pulls the registered content according to the +skill or bundle pull semantics, and writes harness-specific manifests +and files to the local filesystem. No server-side install endpoint is +needed. + +```python +def install_skill( + name: str | None = None, + harness: str = "claude-code", + scope: str = "project", + version: str | None = None, + alias: str | None = None, + lock: bool = False, + update: bool = False, + strict: bool = False, +) -> str: + """Install a single skill for a specific harness. Resolves + from the registry, pulls content, generates harness-specific + files, and places them in the correct directories. Scope is + 'project' (e.g., .claude/plugins/) or 'user' (e.g., + ~/.claude/plugins/) for global install. If lock is True, + writes resolved versions to mlflow-skills.lock. If update + is True, re-resolves the explicit version or alias selector + supplied to this call and updates the lock file. + If strict is True, fails when the target harness cannot install + all required content. + If name is omitted, replays from an existing lock file.""" + + +def install_bundle( + name: str | None = None, + harness: str = "claude-code", + scope: str = "project", + version: str | None = None, + alias: str | None = None, + lock: bool = False, + update: bool = False, + strict: bool = False, +) -> str: + """Install a skill bundle for a specific harness. Resolves + the bundle from the registry. For assembled bundles, pulls each + member and passes a PulledBundle with mode='assembled' to the + harness adapter. For monolithic bundles, pulls the bundle-level + artifact and passes mode='monolithic' with bundle_path set. + If lock is True, writes resolved versions to mlflow-skills.lock. + If update is True, re-resolves the explicit version or alias + selector supplied to this call and updates the lock file. 
If name + is omitted, replays from an existing lock file. If strict is True, + fails when the target harness would otherwise skip unsupported + member types.""" + + +def introspect_bundle( + source: str, + harness: str, +) -> IntrospectedBundle: + """Inspect a harness-specific artifact without modifying the + registry. Returns an IntrospectedBundle listing discovered + members with their types, names, and source paths. Works on + local paths or remote sources. Use this to preview what the + imported monolithic bundle artifact contains.""" + + +def import_bundle( + source: str, + harness: str, + bundle_name: str | None = None, + version: str | None = None, +) -> SkillBundleVersion: + """Import a harness-specific artifact into the registry. + Source must be remotely accessible (git URL, OCI reference, + ZIP URL, or MLflow artifact URI) so the monolithic bundle version + has a pullable source pointer. To import local content, first + upload to MLflow artifact storage, then import using the + artifact URI. The imported bundle version uses source as its + bundle-level source and references discovered skill, subagent, + and hook member versions whose own sources may be omitted. + If version is omitted, the adapter may derive it from artifact + metadata. Import fails if no valid semantic version is available.""" +``` + +Import is append-only with respect to governed versions. It must not +overwrite an existing bundle, skill, subagent, or hook version. If a +discovered member's `(type, name, version)` already exists, the +implementation may reuse it only when it is compatible with the +discovered embedded member. If compatibility cannot be established, +import fails with a conflict rather than binding the imported bundle to +an unrelated governed member. + +For monolithic imports, each `IntrospectedMember.source_path` is stored +as the bundle membership `member_subpath`, relative to the root of the imported +artifact. 
This lets install and trace-manifest generation locate +embedded members without giving those member versions independent +sources. + +## CLI + +```bash +# Install a single skill (project scope, the default) +mlflow skills install --name code-review --alias production \ + --harness claude-code + +# Install a skill bundle (separate command) +mlflow skills install-bundle --name pr-workflow --alias production \ + --harness claude-code + +# Install a skill bundle globally (user scope) +mlflow skills install-bundle --name pr-workflow --alias production \ + --harness claude-code --scope user + +# Fail instead of skipping member types unsupported by the harness +mlflow skills install-bundle --name pr-workflow --alias production \ + --harness cursor --strict + +# Preview what an artifact contains (read-only) +mlflow skills introspect --source ./my-claude-plugin \ + --harness claude-code + +# Import the artifact as a monolithic bundle (remote source required) +mlflow skills import \ + --source https://github.com/acme/plugins/tree/v1.0.0/pr-workflow \ + --harness claude-code --bundle-name my-plugin --version 1.0.0 + +# List supported harnesses +mlflow skills harnesses +``` + +## Lock file format + +```json +{ + "harness": "claude-code", + "registry_uri": "https://mlflow.example.com", + "workspace": "default", + "locked_at": "2026-05-17T21:00:00Z", + "entries": [ + { + "type": "bundle", + "name": "pr-workflow", + "version": "1.0.0", + "scope": "project", + "mode": "assembled", + "members": [ + { + "kind": "skill", + "name": "code-review", + "version": "1.0.0", + "source_type": "git", + "source": "https://github.com/acme/agent-skills/tree/v1.0.0/code-review", + "content_digest": "sha256:a3f2b8c..." + }, + { + "kind": "subagent", + "name": "security-auditor", + "version": "1.0.0", + "source_type": "git", + "source": "https://github.com/acme/agent-skills/tree/v1.0.0/security-auditor", + "content_digest": "sha256:d7e4a1b..." 
+ }, + { + "kind": "mcp_server", + "name": "github-mcp", + "version": "2.0.0", + "registry": "mcp" + } + ] + }, + { + "type": "bundle", + "name": "imported-plugin", + "version": "1.0.0", + "scope": "project", + "mode": "monolithic", + "source_type": "git", + "source": "https://github.com/acme/plugins/tree/v1.0.0/imported-plugin", + "content_digest": "sha256:f4c9d2e...", + "members": [ + { + "kind": "skill", + "name": "embedded-review", + "version": "1.0.0", + "member_subpath": "skills/embedded-review" + } + ] + } + ] +} +``` + +Bundle lock entries use `mode` to distinguish assembled and monolithic +bundle versions. Assembled entries record resolved member versions and +their sources. Monolithic entries record the resolved bundle version +and bundle-level source pointer. They may also include `members` for +embedded skill, subagent, and hook versions whose content is supplied by +the bundle artifact. For monolithic entries, `member_subpath` identifies +where the member lives inside that artifact. +Each entry records `scope` (`project` or `user`) so lock replay installs +to the same harness location as the original command. +The top-level `registry_uri` and `workspace` identify the registry +context used to resolve and verify entries during replay. + +## Lock file SDK + +```python +mlflow.genai.skills.install_bundle( + name="pr-workflow", + alias="production", + harness="claude-code", + lock=True, +) + +# Install from lock file +mlflow.genai.skills.install_bundle() +``` + +## Trace instrumentation details + +### Install-time manifest + +When `mlflow skills install` or `mlflow skills install-bundle` places +files for a harness, it also writes a manifest that records installed +registry coordinates. For individual skill installs and assembled +bundles, the manifest maps installed skill names to their +`SkillVersion` coordinates. 
For monolithic bundles, the manifest records +the installed `SkillBundleVersion` coordinates and any registered +embedded `SkillVersion` coordinates discovered during import. + +**`mlflow-skills-manifest.json`:** +```json +{ + "manifest_version": "1.0", + "skills": { + "code-review": { + "name": "code-review", + "version": "1.0.0", + "workspace": "default" + }, + "style-check": { + "name": "style-check", + "version": "2.0.0", + "workspace": "default" + } + }, + "bundles": { + "imported-plugin": { + "name": "imported-plugin", + "version": "1.0.0", + "workspace": "default", + "mode": "monolithic" + } + } +} +``` + +The `skills` section is keyed by the skill's local name (the name the +harness uses to invoke it). The `bundles` section is keyed by the +installed bundle's local name. Each value provides the +`{workspace, name, version}` coordinates that link back to the +registry. This file is used by automatic instrumentation to annotate +spans or invocation events with registry coordinates without requiring +a registry lookup at runtime. + +Monolithic bundle entries provide install provenance for the bundle as +a whole. Automatic SKILL spans require a local skill name to resolve to +a registered `SkillVersion` in the manifest. Imported monolithic bundles +should generally provide those entries for embedded skills discovered by +the adapter. + +### Automatic skill-span instrumentation challenge + +The manifest above gives automatic instrumentation enough information +to map a harness-local skill invocation back to registry coordinates. +The remaining implementation challenge is attaching that invocation to +the correct MLflow trace and parent span. + +This is non-trivial because some harness hook mechanisms run shell +commands in separate processes. A separate process cannot rely on an +MLflow thread-local trace context owned by the process that created the +trace. 
The implementation therefore must choose an explicit strategy +for discovering or propagating trace context before it can create +accurately nested SKILL spans. + +The recommended approach is **in-process autologger instrumentation**. +The Claude Code autologger, or another in-process harness integration, +observes skill tool invocations and calls `mlflow.skill_context()` in +the same process that owns the active trace. This avoids the +cross-process problem entirely and keeps parent-child span +relationships on MLflow's normal tracing path. + +For harnesses where in-process integration is not available, other +approaches may be feasible. These are not exhaustive; the implementer +may choose another approach if it satisfies the same trace-parenting +requirements. + +- **Explicit trace correlation for hook commands.** Harness hooks may + pass trace context to external commands through stdin, environment + variables, or a temporary context file. That context could include a + trace ID, parent span ID, or opaque MLflow correlation token. Hook + commands would use this context to create SKILL spans under the + correct parent and would need a way to preserve span lifecycle state + between start and end events. + +- **Invocation event annotation.** Instead of opening and closing live + spans from hook commands, hooks can emit timestamped skill invocation + events containing registry coordinates and any available trace + correlation data. MLflow can then attach those events to traces or + materialize derived SKILL spans during ingestion or display. + +- **Harness-native extension.** If a harness exposes an in-process + extension or plugin API for skill invocation events, the installer + can configure that extension to call the MLflow tracing SDK directly. + This has similar trace-context advantages to autologger + instrumentation while using the harness's native extension surface. 
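As one illustration of the invocation event annotation strategy above, a hook command could look up the invoked skill in `mlflow-skills-manifest.json` and emit a timestamped event. This is a sketch under stated assumptions: the event shape, the function name, and the `MLFLOW_SKILL_TRACE_ID` / `MLFLOW_SKILL_PARENT_SPAN_ID` environment variables are hypothetical and are not defined by this RFC or by MLflow.

```python
import os
import time
from typing import Mapping, Optional


def build_invocation_event(
    manifest: dict,
    local_skill_name: str,
    phase: str,
    environ: Mapping[str, str] = os.environ,
) -> Optional[dict]:
    """Build a skill invocation event from the install-time manifest.

    Returns None when the local skill name does not resolve to a
    registered skill version, so unregistered skills emit nothing.
    """
    coords = manifest.get("skills", {}).get(local_skill_name)
    if coords is None:
        return None
    return {
        "event": "skill_invocation",  # hypothetical event type
        "phase": phase,  # "start" or "end"
        "timestamp_ms": int(time.time() * 1000),
        "skill": {k: coords[k] for k in ("workspace", "name", "version")},
        # Hypothetical variables a harness hook might set for correlation.
        "trace_id": environ.get("MLFLOW_SKILL_TRACE_ID"),
        "parent_span_id": environ.get("MLFLOW_SKILL_PARENT_SPAN_ID"),
    }
```

Because the event carries registry coordinates and whatever correlation data is available, the decision about how to attach it to a trace can be deferred to ingestion or display.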
+ +For cases where automatic instrumentation is unavailable or not enabled, +developers can still call `mlflow.skill_context()` directly in +SDK-based agent code (see RFC-0008). + +### Other harnesses + +Trace integration depends on each harness exposing a hook, event, or +in-process extension mechanism for skill invocation. Harnesses that +support pre/post tool-use hooks (Codex CLI, GitHub Copilot) can use one +of the strategies above if they can provide enough trace correlation +data. Harnesses without a suitable integration point cannot be +automatically instrumented; users of those harnesses can still use +`mlflow.skill_context()` manually in SDK-based agent code. + +## Adapter summaries + +**Claude Code / Codex CLI:** generates a plugin directory under +`.claude/plugins/` (or `.codex/plugins/`) with `plugin.json`, skill +files, subagent files, merged `.mcp.json` from MCP registry metadata, +and hook entries. MCP server credentials are the user's +responsibility. Hooks require explicit user opt-in. + +**Cursor:** places skills and subagents in `.cursor/skills/` and +`.cursor/agents/`. Merges MCP entries into `.cursor/mcp.json`. Hooks +are skipped (unsupported). + +**Antigravity:** places skills in `.agent/skills/`. Subagents, MCP +servers, and hooks are skipped. + +**Harness-agnostic bundle formats.** The adapter interface is not +limited to harness-specific formats. Cross-harness bundle formats +that package skills, subagents, hooks, and MCP servers together are +also valid adapter targets. For example, Lola +([LobsterTrap/lola](https://github.com/LobsterTrap/lola)) defines +an "AI Context Module" format that bundles these capability types +using directory auto-discovery and targets multiple harnesses from +a single module. 
An adapter for a format like this would support +both directions: `install` generates the cross-harness format from +registry metadata, and `import` introspects an existing module before +registering it as a monolithic bundle source with member versions.
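
The adapter summaries above imply a per-harness capability table, and the `strict` flag decides whether unsupported member types are skipped with a warning or cause the install to fail. A minimal sketch of that decision follows. The table contents mirror the summaries, but the `codex-cli` and `antigravity` harness identifiers, the `hook` kind string, and the `plan_install` and `UnsupportedMemberError` names are assumptions made for illustration.

```python
import warnings

# Member kinds each adapter can place, per the adapter summaries.
SUPPORTED_KINDS = {
    "claude-code": {"skill", "subagent", "hook", "mcp_server"},
    "codex-cli": {"skill", "subagent", "hook", "mcp_server"},  # assumed id
    "cursor": {"skill", "subagent", "mcp_server"},
    "antigravity": {"skill"},  # assumed id
}


class UnsupportedMemberError(Exception):
    """Raised in strict mode when a harness cannot install a member."""


def plan_install(harness: str, members: list, strict: bool = False) -> tuple:
    """Split bundle members into installable and skipped lists.

    Default mode warns about each skipped member; strict mode fails
    before anything is written.
    """
    supported = SUPPORTED_KINDS[harness]
    installable = [m for m in members if m["kind"] in supported]
    skipped = [m for m in members if m["kind"] not in supported]
    if skipped and strict:
        names = ", ".join(f"{m['kind']}:{m['name']}" for m in skipped)
        raise UnsupportedMemberError(
            f"{harness} cannot install: {names}"
        )
    for m in skipped:
        warnings.warn(
            f"{harness} does not support {m['kind']} members; "
            f"skipping {m['name']}"
        )
    return installable, skipped
```

Computing the plan before writing any files keeps strict-mode failures free of partial installs, which is one reasonable reading of "fail instead of skipping".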