Spike: Ampere SDK audit for embedded-library consumers #492

## Context

Ampere's spark-based agent unification is complete (socket-link/ampere#482 through socket-link/ampere#489 landed), the CLI client is stable, and the SDK is approaching readiness for its first non-CLI consumer: an embedded-library use case where a host application constructs Ampere instances on demand to power its own domain-specific agents.

This spike opens a new phase of Ampere work: maturing `ampere-core` from "library that powers the CLI" into "library that any JVM-based host application can embed to run custom agents." The audit is generic — the implementation epic that follows will produce a stable public API contract that any embedded consumer can build against.

**Architectural inputs (not questions for the spike):**

* The first embedded consumer is a JVM backend, not a mobile binary; no self-modifying-code constraints apply to Ampere's API surface itself.
* The embedded consumer constructs Ampere instances on demand and disposes of them when work completes. Per-instance scoping is a hard requirement; no state persists across instances.
* The embedded consumer implements its own `CognitiveRelay` routing using Ampere's defaults as a starting point. The audit confirms the seams support this without forking internals.
* Event emission is the lifecycle. The audit confirms `EventSerializerBus` is consumable from outside `ampere-core` for state observation and metrics.
* Embedded consumers define their own agent types with their own role sparks and tools. Ampere is the framework; consumer-domain agents are first-class extensions, not modifications to bundled agents.

## Objective

Audit `ampere-core`'s API surface against an embedded-library consumer's needs. Identify every place a CLI-shape assumption has leaked into types, scoping, or registration patterns. Prototype the minimum integration slice that proves an externally-defined agent (with consumer-domain tools, role spark, and event subscription) works end-to-end. Produce a ratified public API surface ready to become an implementation epic.

This is the *full* audit — not a first-use-case-prerequisites pass. Discovering surface gaps via "ad-hoc make-public-as-needed" creates the worst kind of API contract; promotions happen without holistic review, and rolling them back is harder than rolling them forward. The spike's outcome is a stable API contract any embedded consumer can build against indefinitely.

## Expected Outcomes

* Audit doc enumerating every CLI-shape assumption with file:line evidence
* Per-instance isolation verdict: which currently-singleton/object/top-level-`var` types must become per-instance, with proposed refactor scope
* Public API surface proposal: every type an embedded consumer touches, with visibility recommendation (`public` / `@RestrictTo` / stays internal)
* Extension-seam verification: bundled-library and registration paths accept third-party tools, sparks, agent types
* Event consumption verification: external consumers subscribe cleanly, with serialization shape documented for future cross-service fan-out
* Prototype: minimum slice constructing an externally-defined agent using only the proposed-public API, running one PROPEL cycle, capturing emitted events from a simulated external consumer
* ≥1000 word reflection doc
* Sized implementation epic: ticket count, wave structure, parallel-vs-sequential dependencies

## Technical Constraints

* KMP, `commonMain` for the audit; prototype lands in `commonTest` or a temporary `ampere-embedded-prototype` module on the spike branch
* No production code lands from this spike — prototype is deleted before the implementation epic begins
* Spike must not change existing `internal` modifiers; the proposal is *which* should become public, not the actual flip
* All audit findings cite file:line; assertions without evidence don't count
* Spike branch only; no merge to main

## Sequential Tasks

### Task 1 — Per-instance isolation audit

Process-scoped state is the most failure-prone class of bug for embedded-library consumers. If anything in `ampere-core` holds process-scoped state, two host-driven instances in the same JVM can leak into each other: a data-corruption bug at minimum, a security bug at worst.
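To make the failure mode concrete, here is a minimal sketch of how a process-scoped `object` leaks state between two instances, and how constructor-owned state avoids it. `GlobalFlags`, `InstanceFlags`, and `FakeAmpere` are hypothetical names invented for this illustration; they are not Ampere types, and `AmpereSpikeFlags` may be shaped quite differently.

```kotlin
// Hypothetical illustration only: none of these types exist in ampere-core.

// Process-scoped: every caller in the JVM shares this one mutable map.
object GlobalFlags {
    val values = mutableMapOf<String, Boolean>()
}

// Per-instance: each owner gets its own map, created alongside the owner.
class InstanceFlags {
    val values = mutableMapOf<String, Boolean>()
}

// Stand-in for an embedded instance; it can write flags to either scope.
class FakeAmpere(val name: String) {
    val flags = InstanceFlags()

    fun enableVerboseGlobally() { GlobalFlags.values["verbose"] = true }
    fun enableVerboseLocally() { flags.values["verbose"] = true }
}

fun main() {
    val a = FakeAmpere("tenant-a")
    val b = FakeAmpere("tenant-b")

    a.enableVerboseGlobally()
    // b never set the flag, yet it observes a's write through the shared object.
    println("b sees global flag: ${GlobalFlags.values["verbose"]}") // true

    a.enableVerboseLocally()
    // With per-instance state, b's view is unaffected by a.
    println("b sees local flag: ${b.flags.values["verbose"]}") // null
}
```

The second case is the "Per-instance required" refactor target in miniature: the state moves from an `object` into something the instance constructs and owns.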

* Enumerate every `object` declaration in `ampere-core/src/commonMain`
* Enumerate every top-level `var`, and every top-level `val` that holds mutable state (e.g. a `MutableMap` or other mutable collection)
* Enumerate every `companion object` with state
* For each, classify:
  * **Safe** (immutable constants, pure functions)
  * **Per-instance required** (mutable state that must be scoped to one Ampere instance — refactor target)
  * **Process-wide intentional** (genuinely process-scoped state with explicit justification, e.g. a JSON serializer config)
* `AmpereSpikeFlags` is the canonical red flag — confirm its scope and identify siblings
* Output: `docs/spikes/2026-embedded-consumer-audit.md` with classification table
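The enumeration steps above could start from a rough text scan like the following sketch. It is only a first pass: a regex cannot parse Kotlin, so it will miss some declarations (for example, a `var` nested inside a flagged `object`) and every hit still needs manual classification. `Finding`, `scan`, and the pattern list are hypothetical helpers, not existing tooling in the repository.

```kotlin
// A rough first-pass scanner; hits are candidates for review, not verdicts.

data class Finding(val file: String, val line: Int, val kind: String, val text: String)

// Order matters: "companion object" is checked before plain "object".
// The var/val patterns anchor at column 0 as a heuristic for "top-level".
private val patterns = listOf(
    "companion object" to Regex("""\bcompanion\s+object\b"""),
    "object" to Regex("""^\s*((public|internal|private|data)\s+)*object\s+\w+"""),
    "top-level var" to Regex("""^((public|internal|private|lateinit)\s+)*var\s+\w+"""),
    "top-level val" to Regex("""^((public|internal|private|const)\s+)*val\s+\w+"""),
)

// Returns one finding per matching line, with 1-based line numbers for file:line citations.
fun scan(file: String, source: String): List<Finding> =
    source.lines().mapIndexedNotNull { index, line ->
        patterns.firstOrNull { (_, regex) -> regex.containsMatchIn(line) }
            ?.let { (kind, _) -> Finding(file, index + 1, kind, line.trim()) }
    }

fun main() {
    val sample = """
        package demo

        var callCount = 0
        val cache = mutableMapOf<String, String>()

        object Registry {
            val names = mutableListOf<String>()
        }

        class Widget {
            companion object {
                var created = 0
            }
        }
    """.trimIndent()

    // Prints four findings: lines 3, 4, 6, and 11 of the sample.
    scan("Sample.kt", sample).forEach {
        println("${it.file}:${it.line} [${it.kind}] ${it.text}")
    }
}
```

The `file:line` output format is chosen to match the evidence requirement in Technical Constraints; the Safe / Per-instance required / Process-wide intentional classification remains a human judgment.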

**Validation:** Every `object` and top-level mutable in `commonMain` is classified. Per-instance-required count is the rough scope estimate for the refactor portion of the implementation epic. If count is >15, STOP and re-pitch — the refactor may need its own ticket separate from the public-API work.

### Task 2 — Public API surface enumeration

Walk through an embedded consumer's expected usage and identify every `ampere-core` type, function, and property an external host has to touch:

* **Instance construction:** how does a host create an Ampere instance configured for one execution context? What types does it pass in? Currently `AgentFactory` is the construction surface — is that the right shape, or does the host need an `AmpereSession` / `AmpereContext` wrapper?
* **Agent construction:** how does a host build a `SparkBasedAgent<HostState>` for its custom agent? `SparkBasedAgent` itself is open, but is `<S : AgentState>` a public extension point? Are the factory builder patterns reachable?
* **Spark authoring:** can a host ship its own `.spark.md` fixtures from its own resources? Does `PhaseSparkLibrary` accept additional source paths, or is `DEFAULT_SPARKS` compile-time-bound to Ampere's bundled list? (Per socket-link/ampere#489's library widening, this should be possible — confirm the seam works for external consumers.)
* **Tool registration:** can a host register its own tools without modifying `ampere-core`? The tool registry seam needs to be open to third-party tools with their own `ParameterStrategy`.
* **Role spark definition:** can a host define its own role spark and have it participate in stack composition and capability narrowing?
* **CognitiveRelay configuration:** the host implements `CognitiveRelay` routing. What does the configuration surface look like: does the host pass in `RoutingRule` lists, a full `CognitiveRelay` implementation, or extend a default?
* **Event subscription:** how does an external consumer subscribe to the `EventSerializerBus`? Subscription API, filtering, lifecycle (auto-detach on instance disposal)?
* **Disposal:** how does a host cleanly tear down an Ampere instance when its work completes?

For each type the host touches, recommend visibility:

* `public` — stable, contractually exposed surface
* `@RestrictTo(LIBRARY_GROUP)` **or equivalent** — accessible to trusted consumers but not general SDK users
* **Stays** `internal` — the host can reach its goal through a different seam; document the alternate path

Output: appended to audit doc. Table format: type | current visibility | proposed visibility | host's reason for touching it | alternate path if any.

**Validation:** Every host-required type has a visibility recommendation with justification. Miley reviews and may push back on individual proposals before prototype work begins.

**Checkpoint: Miley reviews per-instance isolation audit + public API surface proposal before Task 3.**

### Task 3 — Third-party extension verification

Confirm the registration seams identified in Task 2 actually work for external extensions:

* **Spark library:** can `PhaseSparkLibrary` accept an additional `List<DeclarativeSparkSource>` or a second resource-path list from outside `ampere-core`? If not, what's the smallest change?
* **Tool registration:** can a host-defined tool with its own `ParameterStrategy` be registered with `ToolExecutionEngine` from outside the library? Currently tools are likely constructed inside agent factories — what's the third-party path?
* **Role spark:** can a host define a declarative role spark (per socket-link/ampere#489's role-spark mapper) and have a host-side factory resolve it from a host-supplied library?
* **Agent type:** does `AgentFactory` need to know about every agent type, or can a host define a `SparkBasedAgent<HostState>` entirely outside the factory? If the factory is mandatory, that's a coupling problem — hosts shouldn't have to fork `AgentFactory`.

For each seam: prototype enough to confirm "it works as-is" or "minimum change is X." Don't ship the prototype; ship the verdict.

Output: appended to audit doc. Per-seam table: works / partial / blocked, with minimum change scoped for each non-working seam.

**Validation:** All four extension seams have a verdict backed by code. "Probably works" doesn't count — actually wire it.
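As an illustration of what "works" versus "blocked" can look like for the spark-library seam, the sketch below contrasts a compile-time-bound library with one that accepts host-supplied sources at construction. All types here (`SketchSparkSource`, `SketchSparkLibrary`, `BundledOnlyLibrary`, `ExtensibleLibrary`) are hypothetical stand-ins; they only mimic the shape of the question and are not the real `PhaseSparkLibrary` or `DeclarativeSparkSource`.

```kotlin
// Hypothetical sketch: these types are stand-ins, not ampere-core API.

data class SketchSparkSource(val id: String, val markdown: String)

interface SketchSparkLibrary {
    fun find(id: String): SketchSparkSource?
}

// "Blocked" shape: the source list is fixed at compile time, so a host cannot add to it.
object BundledOnlyLibrary : SketchSparkLibrary {
    private val bundled = listOf(SketchSparkSource("plan", "# Plan phase"))
    override fun find(id: String) = bundled.firstOrNull { it.id == id }
}

// "Works" shape: bundled defaults plus whatever the host passes in at construction.
class ExtensibleLibrary(
    hostSources: List<SketchSparkSource> = emptyList(),
) : SketchSparkLibrary {
    // associateBy keeps the last entry per key, so a host source can override a default.
    private val byId: Map<String, SketchSparkSource> =
        (defaults + hostSources).associateBy { it.id }

    override fun find(id: String) = byId[id]

    companion object {
        // Immutable list: this companion state would classify as "Safe" in Task 1.
        val defaults = listOf(SketchSparkSource("plan", "# Plan phase"))
    }
}

fun main() {
    val hostSpark = SketchSparkSource("role-weather", "# Weather role")
    val library = ExtensibleLibrary(hostSources = listOf(hostSpark))

    println(BundledOnlyLibrary.find("role-weather")) // null: the closed shape blocks the host
    println(library.find("role-weather")?.id)        // role-weather
    println(library.find("plan")?.id)                // plan: defaults remain available
}
```

If the real seam resembles the first shape, the "minimum change" verdict is roughly the constructor parameter shown in the second; whether override-by-id is desirable is a design question for the audit, not something this sketch settles.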

### Task 4 — Event consumption from outside the library

External consumers will subscribe to events for state observation, metrics, observability fan-out, and potentially cross-service event streaming:

* Confirm `EventSerializerBus` subscription API is reachable from outside `ampere-core`
* Confirm subscription supports filtering (a consumer may only care about lifecycle events, not every `ProviderCallStartedEvent`)
* Document serialization shape: what does a `SparkAppliedEvent` look like when serialized for cross-service transmission? Does Ampere's bus assume in-process Kotlin consumers, or is JSON serialization round-trippable?
* Confirm subscription auto-detaches on instance disposal (or document the manual-detach API)
* Identify any events that should fire but currently don't (e.g., `AmpereInstanceConstructed`, `AmpereInstanceDisposed` — hosts need these for usage observability)

Output: appended to audit doc.

**Validation:** External-consumer pattern proven with a fixture: spike code outside `ampere-core` package subscribes, captures, and serializes a `SparkAppliedEvent` from one prototype PROPEL cycle.
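The subscription properties being verified here (filtering, manual detach, auto-detach on disposal) can be pictured with a small single-threaded sketch. `SketchBus`, `SketchEvent`, `LifecycleEvent`, `ProviderCallEvent`, and `Subscription` are hypothetical names; the real `EventSerializerBus` API is exactly what this task is meant to discover, so treat this as a picture of the desired behavior rather than a description of the current one.

```kotlin
// Hypothetical sketch: SketchBus is not EventSerializerBus, and the events are stand-ins.
// Not thread-safe; a real bus would need to address concurrency.

sealed interface SketchEvent { val instanceId: String }
data class LifecycleEvent(override val instanceId: String, val phase: String) : SketchEvent
data class ProviderCallEvent(override val instanceId: String, val model: String) : SketchEvent

// Handle returned to the subscriber so it can detach manually before disposal.
fun interface Subscription { fun cancel() }

class SketchBus : AutoCloseable {
    private class Entry(val accepts: (SketchEvent) -> Boolean, val handler: (SketchEvent) -> Unit)

    private val entries = mutableListOf<Entry>()
    private var closed = false

    fun subscribe(
        filter: (SketchEvent) -> Boolean = { true },
        handler: (SketchEvent) -> Unit,
    ): Subscription {
        check(!closed) { "bus already disposed" }
        val entry = Entry(filter, handler)
        entries.add(entry)
        return Subscription { entries.remove(entry) }
    }

    fun publish(event: SketchEvent) {
        if (closed) return
        // Iterate over a copy so a handler may cancel its own subscription mid-dispatch.
        entries.toList().filter { it.accepts(event) }.forEach { it.handler(event) }
    }

    // Disposal detaches every subscriber, so no handler outlives the instance.
    override fun close() {
        closed = true
        entries.clear()
    }
}

fun main() {
    val bus = SketchBus()
    val captured = mutableListOf<SketchEvent>()

    // The consumer only cares about lifecycle events, not provider chatter.
    bus.subscribe(filter = { it is LifecycleEvent }) { captured.add(it) }

    bus.publish(ProviderCallEvent("i-1", "fake-model"))
    bus.publish(LifecycleEvent("i-1", "constructed"))
    bus.close()
    bus.publish(LifecycleEvent("i-1", "ignored after disposal"))

    println(captured) // [LifecycleEvent(instanceId=i-1, phase=constructed)]
}
```

Scoping the bus to the instance, as this sketch does, also ties back to Task 1: a process-wide bus would deliver one instance's events to another instance's subscribers.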

### Task 5 — Prototype: external-consumer integration slice

Smallest end-to-end demonstration that the proposed API surface holds. The prototype uses a generic domain example (not tied to any specific host product) — a `WeatherAgent` that uses a `ToolFetchForecast` would do.

* Spike-branch-only module or test package: `ampere-embedded-prototype` (deleted before implementation epic)
* Define `WeatherState : AgentState` (an externally-defined state class)
* Define one externally-defined tool: `ToolFetchForecast` with its own `ParameterStrategy`
* Author `role-weather.spark.md` and `weather-agent.spark.md` declaratively
* Construct `SparkBasedAgent<WeatherState>` using only the proposed-public API surface — if anything `internal` is needed, that's a gap to escalate
* Run one PROPEL cycle against a fake LLM provider (custom-provider capture path)
* Subscribe externally to the `EventSerializerBus`, capture all emitted events, serialize them
* Assert: agent constructed without touching internals; PROPEL cycle completes; events captured externally; serialized event payload round-trips
* **If any step requires `internal` access**, stop, escalate to Miley, and update the Task 2 proposal; the gap is signal

Output: prototype source + appended audit-doc section showing the captured event trace.

**Validation:** Prototype works using only proposed-public API. Any `internal` reach is a documented gap, not a workaround.

### Task 6 — Reflection doc

File: `docs/spikes/2026-embedded-consumer-reflection.md`, ≥1000 words.

Required bullets:

* **Per-instance isolation: where did CLI-shape assumptions hide?** Patterns to watch for in future single-consumer-then-multi-consumer transitions.
* **Public API surface: which host-required types surprised you?** Cases where an embedded consumer needed something Ampere never intended to expose — and what that says about the original CLI-shaped design.
* **Extension seams: where was the library "almost ready" vs "actively hostile to third parties"?** The difference matters for ticket-sizing.
* **Event consumption: was the bus designed for external consumers, or just in-process ones?** Implications for future serialization / cross-service work.
* **CognitiveRelay: did the audit confirm hosts can implement routing without forking Ampere, or did seams need widening?** Most architectural-risk question; deserves a clear verdict.
* **The "host defines its own agents" thesis under load.** Does the spark-based pattern actually let hosts define their own agents cleanly, or does the prototype expose framework assumptions that bake in the CLI's agent shape?
* **Sized implementation epic.** Concrete ticket count + wave structure for the API ratification work. Identify parallel vs sequential dependencies. Identify which tickets could be parallelized with host-side adoption work happening simultaneously.
* **Deferred audit areas.** What did the spike *not* cover that future embedded-consumer adoption work will surface? (E.g., performance under concurrent multi-instance load, memory profile for long-running backends, observability tooling integration.)

**Validation:** Doc exists, meets length, addresses all 8 bullets. Miley signs off.

## Checkpoints

* After Task 1 (isolation): per-instance scope estimate confirms refactor scale before public API work
* After Task 2 (API surface): visibility proposals reviewed before prototyping commits to them
* After Task 4 (event consumption): external-consumer pattern proven before the larger prototype
* After Task 6 (reflection): final sign-off before implementation epic drafting

Scope additions surfaced during the spike → separate tickets, not inline. Be especially watchful for "this needs `expect`/`actual` reshaping for backend-only usage" — that's a sibling concern, not this ticket.

## Out of Scope

* Specific host-product domain modeling (handled in the host-side companion ticket)
* Multi-instance state sharing or cross-instance learning (future phase; per-instance constraint stays for now)
* Multi-tenancy concerns beyond per-instance isolation
* Cross-service event streaming infrastructure (only verify serialization shape supports it; don't build it)
* Performance optimization for concurrent multi-instance load (audit only; optimization is future work)
* Actual public-API visibility flips (the implementation epic does this; this spike proposes)
* Host-side adoption work (separate ticket on the host team after the implementation epic lands the public API)

## Reference Patterns

* `AmpereSpikeFlags` — canonical example of CLI-shape singleton that needs per-instance scoping
* `AgentFactory.kt` — current construction surface, audit subject
* `SparkBasedAgent.kt` — open class, role spark composition pattern that hosts extend
* `PhaseSparkLibrary` / `DefaultPhaseSparkLibrary` — bundled-source registration pattern (socket-link/ampere#489 widening is the relevant prior art)
* `EventSerializerBus` — subscription/serialization audit subject
* `CognitiveRelay` + `RoutingContext` + `RoutingRule` — hosts implement this; confirm the seams
* `Tool` + `ToolExecutionEngine` + `ParameterStrategy` — third-party tool registration audit subject
* Prior spike-then-implement pattern: socket-link/ampere#481 → socket-link/ampere#482, socket-link/ampere#484 → socket-link/ampere#486, socket-link/ampere#487 → socket-link/ampere#489
