[arch] template extensions / add-ons mechanism for domain-specific capabilities #170

@rdwj

Summary

Domain-specific capabilities (FHIR for healthcare, OData and OPC UA for industrial, code-tools for software engineering, retail catalog tools, finance compliance helpers) do not belong in the main template, but they do deserve a first-class home rather than living in one-off forks. This issue opens the design discussion for an extension / add-on mechanism that lets domains contribute without bloating the core.

This is a tracking / discussion issue, not an implementation. The goal is to align on the shape before we start building extensions. It is a sibling discussion to the cross-agent platform service issue (#112).

Motivation

Every domain ends up needing the same shape:

  • A bundle of tools (@tool-decorated functions specific to the domain).
  • One or more domain-specific prompts.
  • Sometimes a memory backend or scoping convention.
  • Sometimes custom config (e.g. FHIR endpoint URLs, OPC UA topology).
  • Occasionally an MCP server bundled alongside.

Without a first-class extension model, every team forks the template, copies the same five tools into it, edits agent.yaml, and we end up with N divergent forks instead of N domain extensions composing on a shared core. The core then drifts because every fork's improvements have to be back-ported manually.

The extension model is the right answer. Getting the shape right early — before we have ten extensions making incompatible assumptions — is worth taking time on.

Open design space

This issue is for discussion, not decision. The questions below define the scope.

Layout

What is the physical shape of an extension?

  • Separate template repos under fips-agents/, cloned alongside the main one by fips-agents-cli. The CLI composes them at scaffold time.
  • Packaged Python distributions (fipsagents-ext-fhir, fipsagents-ext-code-tools) installed into the agent's image at build time, registered via entry points.
  • Subdirectories in a single extensions/ directory in this repo. Simpler, but couples extensions to this repo's release cadence.
  • Git submodules. Powerful but operationally fragile.

Discovery

How does an agent declare which extensions it uses?

extensions:
  - name: fhir
    version: ^1.0
  - name: audit-trail
    version: ^0.3
    config:
      retention_days: 90

Does the fips-agents CLI compose extensions at scaffold time, or does the framework discover them at runtime? Composing at scaffold time fits the immutable-image model better. Runtime discovery is more flexible but harder to audit.
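
If composition happens at scaffold time, the CLI would need to read and validate the declaration before building anything. A rough sketch of that step, assuming agent.yaml has already been parsed into a dict; ExtensionRef and parse_extensions are hypothetical names, not part of the current CLI.

```python
from dataclasses import dataclass, field


@dataclass
class ExtensionRef:
    """One entry from the `extensions:` list (hypothetical type)."""
    name: str
    version: str
    config: dict = field(default_factory=dict)


def parse_extensions(agent_config: dict) -> list[ExtensionRef]:
    """Validate the `extensions:` block of an already-parsed agent.yaml."""
    refs = []
    seen = set()
    for entry in agent_config.get("extensions", []):
        name = entry.get("name")
        version = entry.get("version")
        if not name or not version:
            raise ValueError(f"extension entry needs name and version: {entry!r}")
        if name in seen:
            raise ValueError(f"extension declared twice: {name!r}")
        seen.add(name)
        # `config` is optional; default to an empty mapping.
        refs.append(ExtensionRef(name, str(version), dict(entry.get("config") or {})))
    return refs
```

Failing fast here, before an image is built, is what makes the scaffold-time route easy to audit: the resolved list could be written out next to the image as a record of what was composed.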

Surface

What can an extension contribute?

  • Tools (decorated with @tool, registered into the existing tool registry).
  • Prompts (Markdown + YAML frontmatter, dropped into prompts/).
  • Skills (agentskills.io directories).
  • Memory backends (custom MemoryClientBase implementations).
  • Custom tracing emitters or attributes.
  • Config blocks (extension-scoped sections in agent.yaml).
  • Scaffolded HTTP routes (rare; usually an MCP server is the right home for this).
  • Helm chart fragments (sidecar containers, ConfigMaps, etc.).

The wider the surface, the more interesting extensions can be — and the more namespacing and versioning matter.
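
One way to picture the surface is as a manifest that each extension fills in. The sketch below is illustrative only: ExtensionManifest and its field names are assumptions, and it covers a subset of the contribution kinds listed above.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ExtensionManifest:
    """Hypothetical description of what one extension contributes."""
    name: str
    version: str
    tools: list[Callable] = field(default_factory=list)
    prompts: list[str] = field(default_factory=list)         # paths to prompt files
    skills: list[str] = field(default_factory=list)          # skill directories
    memory_backends: dict[str, type] = field(default_factory=dict)
    config_defaults: dict = field(default_factory=dict)      # extension-scoped config
    helm_fragments: list[str] = field(default_factory=list)  # chart snippets

    def surface(self) -> list[str]:
        """Names of the contribution kinds this extension actually uses."""
        kinds = ("tools", "prompts", "skills", "memory_backends",
                 "config_defaults", "helm_fragments")
        return [kind for kind in kinds if getattr(self, kind)]
```

A manifest like this would let the CLI or the framework report exactly which parts of the surface an extension touches, which is the information namespacing and versioning rules need.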

Versioning

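  • Are extension versions pinned per agent, or shared across agents as a common library?
  • What semver guarantees does an extension make about its surface?
  • How is compatibility with the fipsagents package version tracked (for example, a compatibility matrix)?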
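
Whatever the answers, something has to interpret constraints like ^1.0 and ^0.3. The sketch below assumes npm / Cargo-style caret semantics, which this issue does not specify; it ignores pre-release and build tags, and the function names are hypothetical.

```python
def caret_bounds(spec: str) -> tuple[tuple[int, ...], tuple[int, ...]]:
    """Return (inclusive lower, exclusive upper) for a caret range such as '^1.0'."""
    if not spec.startswith("^"):
        raise ValueError(f"not a caret range: {spec!r}")
    nums = [int(part) for part in spec[1:].split(".")]
    if not 1 <= len(nums) <= 3:
        raise ValueError(f"expected 1 to 3 numeric components: {spec!r}")
    lower = tuple(nums + [0] * (3 - len(nums)))
    # Bump the leftmost non-zero component; if every given component is zero,
    # bump the last one that was written (so '^0.0' allows 0.0.x).
    idx = next((i for i, n in enumerate(nums) if n != 0), len(nums) - 1)
    upper = lower[:idx] + (lower[idx] + 1,) + (0,) * (2 - idx)
    return lower, upper


def satisfies(version: str, spec: str) -> bool:
    """True if a plain 'MAJOR.MINOR.PATCH' version falls inside the caret range."""
    parts = tuple(int(part) for part in version.split("."))
    if len(parts) != 3:
        raise ValueError(f"expected MAJOR.MINOR.PATCH: {version!r}")
    lower, upper = caret_bounds(spec)
    return lower <= parts < upper
```

Under these assumed semantics, ^1.0 accepts any 1.x.y release, while ^0.3 accepts only 0.3.y, so pre-1.0 extensions get a much narrower compatibility window.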

Namespacing

  • Tool names: fhir.search_patient vs. search_patient?
  • Config keys: fhir: block vs. flat keys?
  • MCP server identifiers: who guarantees uniqueness?
  • Logger names: fipsagents.ext.fhir.*?
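
As one possible answer to the tool-name and logger questions, the sketch below prefixes every tool with its extension name and rejects collisions at registration time. It is a standalone illustration: NamespacedToolRegistry and extension_logger are hypothetical and are not the framework's existing tool registry.

```python
import logging
from typing import Callable


class NamespacedToolRegistry:
    """Hypothetical registry keyed by '<extension>.<tool>'."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable] = {}

    def register(self, extension: str, name: str, fn: Callable) -> str:
        qualified = f"{extension}.{name}"
        if qualified in self._tools:
            raise ValueError(f"tool already registered: {qualified!r}")
        self._tools[qualified] = fn
        return qualified

    def get(self, qualified: str) -> Callable:
        return self._tools[qualified]


def extension_logger(extension: str) -> logging.Logger:
    """Logger under the fipsagents.ext.<extension> hierarchy."""
    return logging.getLogger(f"fipsagents.ext.{extension}")
```

With qualified names, fhir.search_patient and a retail search_patient can coexist, at the cost of longer names in the tool list presented to the model.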

Deployment

  • Does the chart need to know about extensions, or are they fully baked into the image at scaffold time?
  • Do extensions ship their own ConfigMaps / Secrets / sidecars, or are they purely in-process?
  • How are extension-specific env vars surfaced in the chart?
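
For the env var question, one option is a deterministic mapping from extension-scoped config keys to prefixed variable names that a chart template could render. This is only a sketch of that idea; the FIPSAGENTS_EXT prefix and the function name are assumptions.

```python
import re


def extension_env_vars(extension: str, config: dict,
                       prefix: str = "FIPSAGENTS_EXT") -> dict[str, str]:
    """Map extension config keys to env var names such as
    FIPSAGENTS_EXT_<EXTENSION>_<KEY> (hypothetical naming scheme)."""

    def norm(text: str) -> str:
        # Collapse anything that is not alphanumeric into a single underscore.
        return re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_").upper()

    return {
        f"{prefix}_{norm(extension)}_{norm(key)}": str(value)
        for key, value in config.items()
    }
```

For the audit-trail entry in the declaration example, this would produce a single variable, FIPSAGENTS_EXT_AUDIT_TRAIL_RETENTION_DAYS, with the value "90".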

The first extension to validate the model

Picking a candidate to drive the design concretely:

  • Code-tools — glob, scoped read/write, AST navigation, sandboxed code execution. Real demand from teams building coding-adjacent agents. Naturally tests the "don't bake this into core" boundary that docs/responsibilities.md draws.
  • FHIR for healthcare — patient/encounter/observation tools, terminology services, deidentification utilities. Stresses the domain-specific config and namespacing parts of the model.
  • OData / OPC UA for industrial — different shape (more streaming, more topology) than FHIR. Stresses the streaming + sidecar parts.

A well-chosen first extension forces decisions on the harder corners.

Adjacent existing concepts

The framework already has plugin-shaped surfaces that an extension model should generalise rather than duplicate:

  • tools/ — single-tool plugin model with @tool decorator and auto-discovery.
  • skills/ — agentskills.io directory-per-skill convention.
  • MemoryClientBase — pluggable memory backend ABC.
  • SessionStore and TraceStore — pluggable persistence backend ABCs.

Extensions generalise these — one declaration brings several of them at once, with configuration, versioning, and lifecycle handled coherently.

Out of scope (for this discussion)

  • Picking the answer. The point of this issue is to align on questions, not decisions.
  • Building any specific extension. Each will be its own follow-on issue once the shape lands.
  • Cross-extension dependencies (extension A requires extension B). Possible follow-on once the basics work.

Related
