Spike: Ampere SDK audit for embedded-library consumers #492

@wow-miley

Context

Ampere's spark-based agent unification is complete (#482 through #489 landed), the CLI client is stable, and the SDK is approaching readiness for its first non-CLI consumer: an embedded-library use case where a host application constructs Ampere instances on-demand to power its own domain-specific agents.

This spike opens a new phase of Ampere work: maturing ampere-core from "library that powers the CLI" into "library that any JVM-based host application can embed to run custom agents." The audit is generic — the implementation epic that follows will produce a stable public API contract that any embedded consumer can build against.

Architectural inputs (not questions for the spike):

  • The first embedded consumer is a JVM backend, not a mobile binary; no self-modifying-code constraints apply to Ampere's API surface itself.
  • The embedded consumer constructs Ampere instances on-demand and disposes them when work completes. Per-instance scoping is a hard requirement; no persistent state across instances.
  • The embedded consumer implements its own CognitiveRelay routing using Ampere's defaults as a starting point. The audit confirms the seams support this without forking internals.
  • Event emission is the lifecycle. The audit confirms EventSerializerBus is consumable from outside ampere-core for state observation and metrics.
  • Embedded consumers define their own agent types with their own role sparks and tools. Ampere is the framework; consumer-domain agents are first-class extensions, not modifications to bundled agents.

Objective

Audit ampere-core's API surface against an embedded-library consumer's needs. Identify every place a CLI-shape assumption has leaked into types, scoping, or registration patterns. Prototype the minimum integration slice that proves an externally-defined agent (with consumer-domain tools, role spark, and event subscription) works end-to-end. Produce a ratified public API surface ready to become an implementation epic.

This is the full audit — not a first-use-case-prerequisites pass. Discovering surface gaps via "ad-hoc make-public-as-needed" creates the worst kind of API contract; promotions happen without holistic review, and rolling them back is harder than rolling them forward. The spike's outcome is a stable API contract any embedded consumer can build against indefinitely.

Expected Outcomes

  • Audit doc enumerating every CLI-shape assumption with file:line evidence
  • Per-instance isolation verdict: which currently-singleton/object/top-level-var types must become per-instance, with proposed refactor scope
  • Public API surface proposal: every type an embedded consumer touches, with visibility recommendation (public / @RestrictTo / stays internal)
  • Extension-seam verification: bundled-library and registration paths accept third-party tools, sparks, agent types
  • Event consumption verification: external consumers subscribe cleanly, with serialization shape documented for future cross-service fan-out
  • Prototype: minimum slice constructing an externally-defined agent using only the proposed-public API, running one PROPEL cycle, capturing emitted events from a simulated external consumer
  • Reflection doc of ≥1000 words
  • Sized implementation epic: ticket count, wave structure, parallel-vs-sequential dependencies

Technical Constraints

  • KMP, commonMain for the audit; prototype lands in commonTest or a temporary ampere-embedded-prototype module on the spike branch
  • No production code lands from this spike — prototype is deleted before the implementation epic begins
  • Spike must not change existing internal modifiers; the proposal is which should become public, not the actual flip
  • All audit findings cite file:line; assertions without evidence don't count
  • Spike branch only; no merge to main

Sequential Tasks

Task 1 — Per-instance isolation audit

The most failure-prone class of bug for embedded library consumers. If anything in ampere-core holds process-scoped state, two host-driven instances in the same JVM can leak into each other — a data corruption bug at minimum, a security bug at worst.

  • Enumerate every object declaration in ampere-core/src/commonMain
  • Enumerate every top-level var, and every top-level val that holds mutable state (e.g. a mutable collection)
  • Enumerate every companion object with state
  • For each, classify:
    • Safe (immutable constants, pure functions)
    • Per-instance required (mutable state that must be scoped to one Ampere instance — refactor target)
    • Process-wide intentional (genuinely process-scoped state with explicit justification, e.g. a JSON serializer config)
  • AmpereSpikeFlags is the canonical red flag — confirm its scope and identify siblings
  • Output: docs/spikes/2026-embedded-consumer-audit.md with classification table

Validation: Every object and top-level mutable in commonMain is classified. Per-instance-required count is the rough scope estimate for the refactor portion of the implementation epic. If count is >15, STOP and re-pitch — the refactor may need its own ticket separate from the public-API work.
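
To make the "Per-instance required" classification concrete, here is a minimal sketch of the leak this task is looking for. Every name in it (SharedFlags, LeakyInstance, InstanceFlags, IsolatedInstance) is a hypothetical stand-in, not a type from ampere-core; the point is only the difference between state held by an object declaration and state owned by an instance.

```kotlin
// Hypothetical stand-ins: none of these types exist in ampere-core.

// Process-scoped: every instance in the JVM shares this one map.
object SharedFlags {
    val enabled = mutableMapOf<String, Boolean>()
}

class LeakyInstance(val name: String) {
    fun enable(flag: String) { SharedFlags.enabled[flag] = true }
    fun isEnabled(flag: String): Boolean = SharedFlags.enabled[flag] == true
}

// Per-instance: the state is owned by, and discarded with, one instance.
class InstanceFlags {
    private val enabled = mutableMapOf<String, Boolean>()
    fun enable(flag: String) { enabled[flag] = true }
    fun isEnabled(flag: String): Boolean = enabled[flag] == true
}

class IsolatedInstance(val name: String, val flags: InstanceFlags = InstanceFlags())

fun main() {
    val a = LeakyInstance("a")
    val b = LeakyInstance("b")
    a.enable("verbose")
    println(b.isEnabled("verbose")) // true: a's flag leaked into b

    val c = IsolatedInstance("c")
    val d = IsolatedInstance("d")
    c.flags.enable("verbose")
    println(d.flags.isEnabled("verbose")) // false: d never saw c's flag
}
```

In the classification table, the first shape would be a refactor target and the second is the shape the refactor moves toward.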

Task 2 — Public API surface enumeration

Walk through an embedded consumer's expected usage and identify every ampere-core type, function, and property an external host has to touch:

  • Instance construction: how does a host create an Ampere instance configured for one execution context? What types does it pass in? Currently AgentFactory is the construction surface — is that the right shape, or does the host need an AmpereSession / AmpereContext wrapper?
  • Agent construction: how does a host build a SparkBasedAgent<HostState> for its custom agent? SparkBasedAgent itself is open, but is <S : AgentState> a public extension point? Are the factory builder patterns reachable?
  • Spark authoring: can a host ship its own .spark.md fixtures from its own resources? Does PhaseSparkLibrary accept additional source paths, or is DEFAULT_SPARKS compile-time-bound to Ampere's bundled list? (The library widening in #489, "Implement: JSON frontmatter for .spark.md + role spark registry", should make this possible; confirm the seam works for external consumers.)
  • Tool registration: can a host register its own tools without modifying ampere-core? The tool registry seam needs to be open to third-party tools with their own ParameterStrategy.
  • Role spark definition: can a host define its own role spark and have it participate in stack composition and capability narrowing?
  • CognitiveRelay configuration: the host implements CognitiveRelay routing. What does the configuration surface look like — does the host pass in RoutingRule lists, a full CognitiveRelay implementation, or extend a default?
  • Event subscription: how does an external consumer subscribe to the EventSerializerBus? Subscription API, filtering, lifecycle (auto-detach on instance disposal)?
  • Disposal: how does a host cleanly tear down an Ampere instance when its work completes?
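
One possible shape for the construction and disposal questions above, sketched under heavy assumptions: HostSession is a hypothetical wrapper invented for illustration, not the AmpereSession / AmpereContext the audit may or may not recommend, and events are plain strings to keep the sketch self-contained. It shows a scoped instance whose subscribers are detached when the host disposes it.

```kotlin
// Hypothetical sketch: HostSession and its members are illustrative names,
// not types from ampere-core.
class HostSession(val contextId: String) : AutoCloseable {
    private val listeners = mutableListOf<(String) -> Unit>()
    var isClosed = false
        private set

    fun subscribe(listener: (String) -> Unit) {
        check(!isClosed) { "session $contextId is already disposed" }
        listeners += listener
    }

    fun emit(event: String) {
        // After disposal the listener list is empty, so nothing is delivered.
        listeners.forEach { it(event) }
    }

    override fun close() {
        listeners.clear() // auto-detach every subscriber
        isClosed = true
    }
}

fun main() {
    val seen = mutableListOf<String>()
    val session = HostSession("request-1")
    session.use {
        it.subscribe { event -> seen += event }
        it.emit("started")
    }
    session.emit("late") // dropped: listeners were detached on close
    println(seen) // [started]
}
```

Tying teardown to AutoCloseable lets a JVM host scope an instance with `use`, which is one way to satisfy the "constructs on-demand and disposes when work completes" requirement without a separate teardown call.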

For each type the host touches, recommend visibility:

  • public — stable, contractually exposed surface
  • @RestrictTo(LIBRARY_GROUP) or equivalent — accessible to trusted consumers but not general SDK users
  • Stays internal — the host can reach its goal through a different seam; document the alternate path

Output: appended to audit doc. Table format: type | current visibility | proposed visibility | host's reason for touching it | alternate path if any.

Validation: Every host-required type has a visibility recommendation with justification. Miley reviews and may push back on individual proposals before prototype work begins.

Checkpoint: Miley reviews per-instance isolation audit + public API surface proposal before Task 3.

Task 3 — Third-party extension verification

Confirm the registration seams identified in Task 2 actually work for external extensions:

  • Spark library: can PhaseSparkLibrary accept an additional List<DeclarativeSparkSource> or a second resource-path list from outside ampere-core? If not, what's the smallest change?
  • Tool registration: can a host-defined tool with its own ParameterStrategy be registered with ToolExecutionEngine from outside the library? Currently tools are likely constructed inside agent factories — what's the third-party path?
  • Role spark: can a host define a declarative role spark (using the role-spark mapper from #489, "Implement: JSON frontmatter for .spark.md + role spark registry") and have a host-side factory resolve it from a host-supplied library?
  • Agent type: does AgentFactory need to know about every agent type, or can a host define a SparkBasedAgent<HostState> entirely outside the factory? If the factory is mandatory, that's a coupling problem — hosts shouldn't have to fork AgentFactory.

For each seam: prototype enough to confirm "it works as-is" or "minimum change is X." Don't ship the prototype; ship the verdict.

Output: appended to audit doc. Per-seam table: works / partial / blocked, with minimum change scoped for each non-working seam.

Validation: All four extension seams have a verdict backed by code. "Probably works" doesn't count — actually wire it.
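
As a reference point for what "works as-is" would mean for the tool seam, the sketch below shows a registry that starts from bundled defaults and accepts host-supplied entries without any change to the library side. HostTool and ToolRegistry are hypothetical stand-ins; the real ToolExecutionEngine and ParameterStrategy signatures are not assumed here.

```kotlin
// Hypothetical stand-ins; the real tool and registry types in ampere-core may differ.
fun interface HostTool {
    fun execute(args: Map<String, String>): String
}

class ToolRegistry(defaults: Map<String, HostTool> = emptyMap()) {
    // Copy the defaults so each registry owns its own mutable state.
    private val tools = defaults.toMutableMap()

    fun register(name: String, tool: HostTool) {
        require(name !in tools) { "tool '$name' is already registered" }
        tools[name] = tool
    }

    fun execute(name: String, args: Map<String, String>): String {
        val tool = tools[name] ?: error("unknown tool '$name'")
        return tool.execute(args)
    }
}

fun main() {
    val bundled = mapOf("echo" to HostTool { args -> args["text"].orEmpty() })
    val registry = ToolRegistry(bundled)

    // Host-side registration: no library code is modified.
    registry.register("fetch_forecast", HostTool { args -> "sunny in ${args["city"]}" })

    println(registry.execute("fetch_forecast", mapOf("city" to "Oslo"))) // sunny in Oslo
}
```

A seam would count as "partial" if registration like this is possible only through an internal type, and "blocked" if tools can only be constructed inside the library's own factories.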

Task 4 — Event consumption from outside the library

External consumers will subscribe to events for state observation, metrics, observability fan-out, and potentially cross-service event streaming:

  • Confirm EventSerializerBus subscription API is reachable from outside ampere-core
  • Confirm subscription supports filtering (a consumer may only care about lifecycle events, not every ProviderCallStartedEvent)
  • Document serialization shape: what does a SparkAppliedEvent look like when serialized for cross-service transmission? Does Ampere's bus assume in-process Kotlin consumers, or is JSON serialization round-trippable?
  • Confirm subscription auto-detaches on instance disposal (or document the manual-detach API)
  • Identify any events that should fire but currently don't (e.g., AmpereInstanceConstructed, AmpereInstanceDisposed — hosts need these for usage observability)

Output: appended to audit doc.

Validation: External-consumer pattern proven with a fixture: spike code outside ampere-core package subscribes, captures, and serializes a SparkAppliedEvent from one prototype PROPEL cycle.
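
The subscription properties this task checks (filtering and detach) can be pictured with a small in-process bus. This is a sketch under assumptions: SimpleBus, BusEvent, LifecycleEvent, ProviderCallEvent and Subscription are hypothetical names, and nothing here reflects the actual EventSerializerBus API, which the audit still has to document.

```kotlin
// Hypothetical stand-ins: these event and bus types are illustrative only.
sealed interface BusEvent { val agentId: String }
data class LifecycleEvent(override val agentId: String, val phase: String) : BusEvent
data class ProviderCallEvent(override val agentId: String, val model: String) : BusEvent

fun interface Subscription { fun cancel() }

class SimpleBus {
    private class Entry(val accepts: (BusEvent) -> Boolean, val handler: (BusEvent) -> Unit)
    private val entries = mutableListOf<Entry>()

    fun subscribe(
        filter: (BusEvent) -> Boolean = { true },
        handler: (BusEvent) -> Unit,
    ): Subscription {
        val entry = Entry(filter, handler)
        entries += entry
        return Subscription { entries -= entry } // manual-detach handle
    }

    fun publish(event: BusEvent) {
        // Iterate over a copy so a handler may cancel while being notified.
        entries.toList().filter { it.accepts(event) }.forEach { it.handler(event) }
    }
}

fun main() {
    val bus = SimpleBus()
    val captured = mutableListOf<BusEvent>()
    val subscription = bus.subscribe(filter = { it is LifecycleEvent }) { captured += it }

    bus.publish(LifecycleEvent("weather", "started"))
    bus.publish(ProviderCallEvent("weather", "fake-llm")) // filtered out
    subscription.cancel()
    bus.publish(LifecycleEvent("weather", "finished"))    // after detach: not delivered

    println(captured.size) // 1
}
```

The sketch covers only in-process delivery; whether the serialized form round-trips is a separate question the fixture has to answer against the real event classes.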

Task 5 — Prototype: external-consumer integration slice

Smallest end-to-end demonstration that the proposed API surface holds. The prototype uses a generic domain example (not tied to any specific host product) — a WeatherAgent that uses a ToolFetchForecast would do.

  • Spike-branch-only module or test package: ampere-embedded-prototype (deleted before implementation epic)
  • Define WeatherState : AgentState (an externally-defined state class)
  • Define one externally-defined tool: ToolFetchForecast with its own ParameterStrategy
  • Author role-weather.spark.md and weather-agent.spark.md declaratively
  • Construct SparkBasedAgent<WeatherState> using only the proposed-public API surface — if anything internal is needed, that's a gap to escalate
  • Run one PROPEL cycle against a fake LLM provider (custom-provider capture path)
  • Subscribe externally to the EventSerializerBus, capture all emitted events, serialize them
  • Assert: agent constructed without touching internals; PROPEL cycle completes; events captured externally; serialized event payload round-trips
  • If any step requires internal access, stop, escalate to Miley, update the Task 2 proposal — the gap is signal

Output: prototype source + appended audit-doc section showing the captured event trace.

Validation: Prototype works using only proposed-public API. Any internal reach is a documented gap, not a workaround.
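
For the fake-provider step, a capturing test double is usually enough. The sketch below assumes a single-method provider seam; LlmProvider, CapturingProvider and runOneStep are hypothetical names for illustration, and runOneStep stands in for a full PROPEL cycle rather than implementing one.

```kotlin
// Hypothetical provider seam; the real provider interface in ampere-core may differ.
fun interface LlmProvider {
    fun complete(prompt: String): String
}

// Test double that records every prompt and replies with canned text.
class CapturingProvider(private val reply: String) : LlmProvider {
    val prompts = mutableListOf<String>()

    override fun complete(prompt: String): String {
        prompts += prompt
        return reply
    }
}

// Stand-in for one agent step: build a prompt, call the provider, return the result.
fun runOneStep(provider: LlmProvider, city: String): String =
    provider.complete("Forecast for $city?")

fun main() {
    val provider = CapturingProvider(reply = "sunny")
    val result = runOneStep(provider, "Oslo")
    println(result)           // sunny
    println(provider.prompts) // [Forecast for Oslo?]
}
```

Recording the prompts lets the prototype assert on what the agent sent as well as on what came back, without any network dependency.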

Task 6 — Reflection doc

File: docs/spikes/2026-embedded-consumer-reflection.md, ≥1000 words.

Required bullets:

  • Per-instance isolation: where did CLI-shape assumptions hide? Patterns to watch for in future single-consumer-then-multi-consumer transitions.
  • Public API surface: which host-required types surprised you? Cases where an embedded consumer needed something Ampere never intended to expose — and what that says about the original CLI-shaped design.
  • Extension seams: where was the library "almost ready" vs "actively hostile to third parties"? The difference matters for ticket-sizing.
  • Event consumption: was the bus designed for external consumers, or just in-process ones? Implications for future serialization / cross-service work.
  • CognitiveRelay: did the audit confirm hosts can implement routing without forking Ampere, or did seams need widening? This is the question with the most architectural risk; it deserves a clear verdict.
  • The "host defines its own agents" thesis under load. Does the spark-based pattern actually let hosts define their own agents cleanly, or does the prototype expose framework assumptions that bake in the CLI's agent shape?
  • Sized implementation epic. Concrete ticket count + wave structure for the API ratification work. Identify parallel vs sequential dependencies. Identify which tickets could be parallelized with host-side adoption work happening simultaneously.
  • Deferred audit areas. What did the spike not cover that future embedded-consumer adoption work will surface? (E.g., performance under concurrent multi-instance load, memory profile for long-running backends, observability tooling integration.)

Validation: Doc exists, meets length, addresses all 8 bullets. Miley signs off.

Checkpoints

  • After Task 1 (isolation): per-instance scope estimate confirms refactor scale before public API work
  • After Task 2 (API surface): visibility proposals reviewed before prototyping commits to them
  • After Task 4 (event consumption): external-consumer pattern proven before the larger prototype
  • After Task 6 (reflection): final sign-off before implementation epic drafting

Scope additions surfaced during the spike → separate tickets, not inline. Be especially watchful for "this needs expect/actual reshaping for backend-only usage" — that's a sibling concern, not this ticket.

Out of Scope

  • Specific host-product domain modeling (handled in the host-side companion ticket)
  • Multi-instance state sharing or cross-instance learning (future phase; per-instance constraint stays for now)
  • Multi-tenancy concerns beyond per-instance isolation
  • Cross-service event streaming infrastructure (only verify serialization shape supports it; don't build it)
  • Performance optimization for concurrent multi-instance load (audit only; optimization is future work)
  • Actual public-API visibility flips (the implementation epic does this; this spike proposes)
  • Host-side adoption work (separate ticket on the host team after the implementation epic lands the public API)

Reference Patterns
