From 275bba27610b46197a238c579e64a456868aea2a Mon Sep 17 00:00:00 2001 From: Yordis Prieto Date: Mon, 12 Jan 2026 15:40:19 -0500 Subject: [PATCH 01/11] docs(rfd): Add Elicitation specification for structured user input Signed-off-by: Yordis Prieto --- docs/rfds/elicitation.mdx | 403 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 403 insertions(+) create mode 100644 docs/rfds/elicitation.mdx diff --git a/docs/rfds/elicitation.mdx b/docs/rfds/elicitation.mdx new file mode 100644 index 00000000..f794d033 --- /dev/null +++ b/docs/rfds/elicitation.mdx @@ -0,0 +1,403 @@ +--- +title: "Elicitation: Structured User Input During Sessions" +--- + +Author(s): [@yordis](https://github.com/yordis) + +## Elevator pitch + +Add support for agents to request structured information from users during a session through a standardized elicitation mechanism, aligned with [MCP's elicitation feature](https://modelcontextprotocol.io/specification/2025-11-25/client/elicitation). This allows agents to ask follow-up questions, collect authentication credentials, gather preferences, and request required information without side-channel communication or ad-hoc client UI implementations. + +## Status quo + +Currently, agents have two limited mechanisms for gathering user input: + +1. **Session Config Options** (PR #210): Pre-declared, persistent configuration (model, mode, etc.) with default values required. These are available at session initialization and changes are broadcast to the client. + +2. **Unstructured text in turn responses**: Agents can include prompts in their responses, but clients have no standardized way to recognize auth requests, form inputs, or structured selections, leading to inconsistent UX across agents. + +However, there is no mechanism for agents to: + +- Request ad-hoc information during a turn (e.g., "Which of these approaches should I proceed with?" 
from PR #340) +- Ask for authentication credentials in a recognized, secure way (pain point from PR #330) +- Collect open-ended text input with validation constraints +- Handle decision points that weren't anticipated at session initialization +- Request sensitive information via out-of-band mechanisms (browser-based OAuth) + +The community has already identified the need for this: PR #340 explored a `session/select` mechanism but concluded that leveraging an MCP-like elicitation pattern would be more aligned with how clients will already support MCP servers. PR #330 recognized that authentication requests specifically need special handling separate from regular session data. + +This gap limits the richness of agent-client interaction and forces both agents and clients to implement ad-hoc solutions for structured user input. + +## What we propose to do about it + +We propose introducing an elicitation mechanism for agents to request structured information from users, aligned with [MCP's established elicitation patterns](https://modelcontextprotocol.io/specification/2025-11-25/client/elicitation). This addresses discussions from PR #340 about standardizing user selection flows and PR #330 about secure authentication handling. + +The mechanism would: + +1. **Use restricted JSON Schema** (as discussed in PR #210): Like MCP, constrain JSON Schema to a useful subset for `type`, `enum`, `minimum`, `maximum`, `minLength`, `maxLength`, `pattern`, `default`, and `description`. This aligns with how Session Config Options already think about schema. + +2. **Support multiple input modalities**: + - **Simple inputs**: text, number, boolean + - **Selections**: select (single), multiselect (multiple) with enum-based options + - **Sensitive inputs**: password, URL-mode for out-of-band OAuth flows (addressing PR #330 authentication pain points) + +3. 
**Work in turn context**: Elicitation requests are triggered when a turn ends with `stopReason: "elicitation_requested"`, allowing agents to ask questions naturally within the conversation flow. Agents send elicitation requests via a separate `session/elicitation` method (following the same request/response pattern as `session/request_permission`). Unlike Session Config Options (which are persistent), elicitation requests are transient and turn-specific. + +4. **Support client capability negotiation**: Clients declare what elicitation types they support (similar to the client capabilities pattern emerging in the protocol). Agents handle gracefully when clients don't support elicitation. + +5. **Provide rich context**: Agents can include title, description, detailed constraints, and examples—helping clients render consistent, helpful UI without custom implementations. + +6. **Enable out-of-band flows**: Support URL-mode elicitation (like MCP) for sensitive operations like authentication, where credentials bypass the agent entirely (addressing the core pain point in PR #330). + +## Shiny future + +Once implemented, agents can: + +- Ask users "Which approach would you prefer: A or B?" and receive a structured response +- Request text input: "What's the name for this function?" +- Collect multiple related pieces of information in a single request +- Guide users through decision trees with follow-up questions +- Provide rich context (descriptions, examples, constraints) for what they're asking for + +Clients can: + +- Present a consistent, standardized UI for elicitation across all agents +- Validate user input against constraints before sending to the agent +- Cache elicitation history and offer suggestions based on previous responses +- Provide keyboard shortcuts and accessibility features for common elicitation types + +## Implementation details and plan + +### Alignment with MCP + +This proposal follows MCP's established elicitation patterns. 
See [MCP Elicitation Specification](https://modelcontextprotocol.io/specification/2025-11-25/client/elicitation) for detailed guidance. ACP will use the same JSON Schema constraint approach, but adapted for our session/turn-based architecture. + +Key differences from MCP: +- MCP elicitation is tool-call-scoped; ACP elicitation is session/turn-scoped +- ACP must integrate with existing Session Config Options (which also use schema constraints) +- ACP should support out-of-band flows for sensitive data (authentication from PR #330) + +### Elicitation Request Structure + +When a turn ends with `stopReason: "elicitation_requested"`, the agent sends a separate elicitation request (following the same pattern as permission requests). Example 1 (User Selection - from PR #340): + +```json +{ + "elicitation": { + "id": "strategy-choice-42", + "type": "select", + "title": "Choose a Refactoring Strategy", + "description": "How would you like me to approach this refactoring?", + "schema": { + "type": "string", + "enum": ["conservative", "balanced", "aggressive"], + "default": "balanced" + }, + "options": [ + { + "value": "conservative", + "label": "Conservative", + "description": "Minimal changes, heavily tested approach" + }, + { + "value": "balanced", + "label": "Balanced (Recommended)", + "description": "Good balance of progress and safety" + }, + { + "value": "aggressive", + "label": "Aggressive", + "description": "Maximum optimization, requires review" + } + ] + } +} +``` + +Example 2 (Authentication Request - from PR #330, out-of-band OAuth): + +```json +{ + "elicitation": { + "id": "github-oauth-123", + "type": "url", + "title": "Authenticate with GitHub", + "description": "Please authorize this agent to access your GitHub repositories", + "schema": { + "type": "string", + "default": null + }, + "url": "https://github.com/login/oauth/authorize?client_id=...", + "returnValueFormat": "token" + } +} +``` + +Example 3 (Text Input with Constraints): + +```json +{ + 
"elicitation": { + "id": "function-name", + "type": "text", + "title": "Function Name", + "description": "What should this function be named?", + "schema": { + "type": "string", + "minLength": 1, + "maxLength": 64, + "pattern": "^[a-zA-Z_][a-zA-Z0-9_]*$", + "default": "processData" + } + } +} +``` + +### Input Types + +Following MCP's approach, we would start with these types. Clients should gracefully degrade unknown types to `text`: + +- `text` - Open-ended text input +- `number` - Numeric input +- `select` - Single-choice selection from a list +- `multiselect` - Multiple-choice selection +- `boolean` - Yes/no choice +- `password` - Masked text input (for sensitive credentials) +- `url` - URL-based out-of-band authentication (browser-opened flows like OAuth) + +### Restricted JSON Schema + +Aligning with MCP and building on [Session Config Options discussions](https://github.com/agentclientprotocol/agent-client-protocol/pull/210) about schema constraints, agents use a restricted JSON Schema subset: + +**Required fields:** +- `type` (string) - One of the input types above + +**Optional constraint fields:** +- `default` - Default value if user doesn't respond (agents should always provide this, even if `null`) +- `description` - Help text explaining what's being requested +- `enum` - Array of allowed values (for select/multiselect) +- `minLength`, `maxLength` - String length constraints +- `minimum`, `maximum` - Numeric range constraints +- `pattern` - Regex pattern for validation + +**Not supported** (to keep initial implementation simple): +- Complex nested objects/arrays +- `allOf`, `anyOf`, `oneOf` +- Conditional validation +- Custom formats + +This constraint list can expand in future versions based on community feedback. 
+ +### Turn Response with Elicitation Stop Reason + +When an agent reaches a decision point and needs structured user input, it ends the turn with `stopReason: "elicitation_requested"`: + +```json +{ + "jsonrpc": "2.0", + "id": 42, + "result": { + "content": [ + { + "type": "text", + "text": "I can refactor this code in several ways. Each approach has different tradeoffs. Which strategy would you prefer?" + } + ], + "stopReason": "elicitation_requested" + } +} +``` + +### Elicitation Request + +After the turn completes with `stopReason: "elicitation_requested"`, the agent immediately sends a separate `session/elicitation` request (following the same pattern as `session/request_permission`): + +```json +{ + "jsonrpc": "2.0", + "id": 43, + "method": "session/elicitation", + "params": { + "sessionId": "...", + "elicitation": { + "id": "refactor-strategy-001", + "type": "select", + "title": "Choose Refactoring Strategy", + "description": "How would you like me to approach this refactoring?", + "schema": { + "type": "string", + "enum": ["conservative", "balanced", "aggressive"], + "default": "balanced" + }, + "options": [ + { + "value": "conservative", + "label": "Conservative", + "description": "Minimal changes, heavily tested approach" + }, + { + "value": "balanced", + "label": "Balanced (Recommended)", + "description": "Good balance of progress and safety" + }, + { + "value": "aggressive", + "label": "Aggressive", + "description": "Maximum optimization, requires review" + } + ] + } + } +} +``` + +The client presents the elicitation UI to the user based on the input type and constraints. 
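
Reviewer sketch (not part of the patch): one way a client might turn the `session/elicitation` request above into something renderable, including the "degrade unknown types to `text`" rule from the Input Types section. The function name `plan_widget` and the shape of the returned dict are hypothetical; only the JSON field names come from the RFD.

```python
# Input types listed in the RFD's "Input Types" section.
KNOWN_TYPES = {"text", "number", "select", "multiselect", "boolean", "password", "url"}


def plan_widget(request: dict) -> dict:
    """Describe how a client could render a `session/elicitation` request.

    Hypothetical helper: the return shape is illustrative, not specified by the RFD.
    """
    elicitation = request["params"]["elicitation"]
    declared = elicitation.get("type", "text")
    # The RFD asks clients to degrade unknown types to plain text input.
    widget = declared if declared in KNOWN_TYPES else "text"
    schema = elicitation.get("schema", {})
    # Prefer the labelled `options`; fall back to the bare `enum` values.
    choices = [opt["value"] for opt in elicitation.get("options", [])]
    if not choices:
        choices = list(schema.get("enum", []))
    return {
        "elicitation_id": elicitation["id"],
        "widget": widget,
        "title": elicitation.get("title", ""),
        "choices": choices,
        "default": schema.get("default"),
    }
```

Keeping the fallback in one place means a client that later learns a new type only has to extend `KNOWN_TYPES`.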
+ +### User Response + +When the user responds to an elicitation request, the client sends a separate `session/elicitation` response: + +```json +{ + "jsonrpc": "2.0", + "id": 43, + "result": { + "elicitationResponse": { + "id": "refactor-strategy-001", + "value": "balanced" + } + } +} +``` + +The agent then continues processing with the user's input in the next turn or takes immediate action based on the response. + +### Client Capabilities + +Clients declare whether they support elicitation during the `initialize` phase via `ClientCapabilities`, following the same pattern as `fs` and `terminal` capabilities: + +```json +{ + "jsonrpc": "2.0", + "method": "initialize", + "params": { + "protocolVersion": "2025-11-25", + "clientCapabilities": { + "fs": { + "readTextFile": true, + "writeTextFile": true + }, + "terminal": true, + "elicitation": { + "supported": true, + "supportedTypes": ["text", "number", "select", "multiselect", "boolean"] + } + }, + "clientInfo": { + "name": "my-client", + "version": "1.0.0" + } + } +} +``` + +This tells the agent which elicitation input types the client can render. Agents must gracefully handle clients that don't include this field (assumed to have no elicitation support). + +### Backward Compatibility + +- If a client doesn't support elicitation, agents must provide a default value and continue +- Agents should not require elicitation responses to continue operating +- Clients that don't understand an elicitation type should treat it as requesting text input + +## Frequently asked questions + +### Can an agent request multiple pieces of information in one turn? + +For v1, we recommend a **single elicitation per turn**. This keeps the design simple and predictable for both clients and agents. It also follows the Session Config Options pattern of having agents send full state updates. + +If an agent needs to collect multiple pieces of information, it can: +1. Ask one question per turn (with sensible defaults) +2. 
Incorporate the user's response in the context for the next turn +3. Ask the next question in a subsequent turn + +This approach: +- Keeps client UI logic simple +- Allows agents to adapt follow-up questions based on previous answers +- Can be extended to array-based multi-elicitation in future versions if compelling use cases emerge + +### How does this differ from session config options? + +Excellent question from PR #210 discussions. Both use restricted JSON Schema, but serve different purposes: + +| Aspect | Session Config Options | Elicitation | +|--------|------------------------|-------------| +| **Lifecycle** | Persistent, pre-declared at session init | Transient, appears during turns | +| **Scope** | Session-wide configuration | Single turn/decision point | +| **Defaults** | Required (agents must have defaults) | Required (agents should always provide) | +| **State management** | Client maintains full state, broadcast on changes | Agent provides response in next turn | +| **Use cases** | Model selection, session mode, persistent settings | Authentication, step-by-step decisions, one-time questions | + +Session Config Options are great for "how should you run this session?" Elicitation is for "what should I do next?" + +### Why align with MCP's elicitation instead of creating something different? + +As identified in PR #340, clients will already implement MCP elicitation support for MCP servers. Aligning ACP's elicitation with MCP: +- Reduces client implementation burden +- Creates consistent UX across MCP and ACP agents +- Lets code be shared or reused +- Follows the protocol design principle of only constraining when necessary + +PR #340 specifically concluded: "I think we'd rather have an MCP elicitation story in general, and maybe offer the same interface outside of tool calls." + +### How does authentication flow work with URL-mode elicitation? 
+ +From PR #330: URL-mode elicitation allows agents to request authentication without exposing credentials to the protocol. While inspired by MCP's URL-mode elicitation, ACP's implementation focuses specifically on out-of-band credential handling: + +1. Agent sends elicitation request with `type: "url"` and OAuth authorization URL +2. Client opens URL in user's browser (out-of-band process) +3. User authenticates and grants permission in the browser +4. Browser returns token/credential to client (e.g., via redirect or callback) +5. Client includes token in next `session/turn` via `elicitationResponse` + +**Key guarantee**: Credentials never flow through the agent or LLM, addressing the core pain point from PR #330. + +The exact semantics of how tokens are returned from the browser and how `returnValueFormat` is handled will be specified in detail during the implementation phase of this RFD. + +### Can agents use elicitation for information required before responding? + +Yes. An agent can include an elicitation request in a turn response with a default value and continue, then incorporate the user's response into the next turn. This is how agents can guide users through multi-step workflows. + +### What if a user doesn't respond to an elicitation request? + +The agent's default value is used (which agents must always provide). If an agent truly requires user input and wants to block, it should fail the turn and let the client handle retry logic. + +### Should elicitation support complex nested data structures? + +For the initial version: no. We're focusing on simple types (strings, numbers, booleans, arrays of those). Complex nested structures can be added in future versions if use cases emerge. This keeps the initial scope manageable and lets us learn from real-world usage. + +### How should agents handle clients that don't support elicitation? 
+ +Agents should always design to gracefully degrade: +- Provide sensible default values +- Describe what they're requesting in turn content (text) +- Proceed with the defaults +- Clients declare capabilities so agents can make informed decisions + +### Can we extend this to replace the existing Permission-Request mechanism? + +Potentially, but that's out of scope for this RFD. PR #210 discussed that elicitation "could potentially even replace the Permission-Request mechanism" (Phil65), but that requires separate analysis of the permission request use cases and whether elicitation's constraints (no complex nesting, simpler lifecycle) are sufficient. + +### What about validating user input on the client side? + +Clients should validate user input against the provided JSON Schema **before** sending the response to the agent. This prevents invalid data from reaching the agent and provides immediate feedback to the user. + +If the agent requires additional validation beyond what's expressible in JSON Schema: +1. Agent validates the received value in the next turn +2. If validation fails, agent can fail the turn with an error +3. Client can then re-prompt the user (or fall back to the original default) + +For v1, we recommend starting with JSON Schema validation only. If more complex validation patterns emerge from real-world usage, a future RFD can specify additional validation mechanisms. + +## Revision history + +- 2026-01-12: Initial draft based on community discussions in PR #340 (user selection), PR #210 (session config alignment), and PR #330 (authentication use cases). Aligned with MCP elicitation patterns. 
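
Reviewer sketch (not part of the patch): the client-side validation FAQ above could be implemented roughly as follows for the restricted schema subset this revision lists (`type`, `enum`, `minLength`, `maxLength`, `minimum`, `maximum`, `pattern`). The function name `validate_value` and the error strings are hypothetical, and only the `string`, `number`, and `boolean` schema types are handled.

```python
import re


def validate_value(value, schema: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid.

    Hypothetical helper covering only the restricted subset named in the RFD.
    """
    errors = []
    expected = schema.get("type")
    if expected == "string":
        if not isinstance(value, str):
            return ["expected a string"]
        if "minLength" in schema and len(value) < schema["minLength"]:
            errors.append("shorter than minLength")
        if "maxLength" in schema and len(value) > schema["maxLength"]:
            errors.append("longer than maxLength")
        # JSON Schema patterns are unanchored, so use search rather than fullmatch.
        if "pattern" in schema and re.search(schema["pattern"], value) is None:
            errors.append("does not match pattern")
    elif expected == "number":
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return ["expected a number"]
        if "minimum" in schema and value < schema["minimum"]:
            errors.append("below minimum")
        if "maximum" in schema and value > schema["maximum"]:
            errors.append("above maximum")
    elif expected == "boolean":
        if not isinstance(value, bool):
            return ["expected a boolean"]
    if "enum" in schema and value not in schema["enum"]:
        errors.append("not one of the allowed enum values")
    return errors
```

Returning a list rather than raising lets the client show every problem at once before it sends anything to the agent.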
From 1502687f3d74c1c3912a9fdb1698cc9557be5bc0 Mon Sep 17 00:00:00 2001 From: Yordis Prieto Date: Thu, 5 Feb 2026 21:28:59 -0500 Subject: [PATCH 02/11] docs(rfd): Enhance elicitation specification to support structured client capabilities --- docs/rfds/elicitation.mdx | 63 ++++++++++++++++++++++++++++++++------- 1 file changed, 52 insertions(+), 11 deletions(-) diff --git a/docs/rfds/elicitation.mdx b/docs/rfds/elicitation.mdx index f794d033..b21c10ac 100644 --- a/docs/rfds/elicitation.mdx +++ b/docs/rfds/elicitation.mdx @@ -43,7 +43,7 @@ The mechanism would: 3. **Work in turn context**: Elicitation requests are triggered when a turn ends with `stopReason: "elicitation_requested"`, allowing agents to ask questions naturally within the conversation flow. Agents send elicitation requests via a separate `session/elicitation` method (following the same request/response pattern as `session/request_permission`). Unlike Session Config Options (which are persistent), elicitation requests are transient and turn-specific. -4. **Support client capability negotiation**: Clients declare what elicitation types they support (similar to the client capabilities pattern emerging in the protocol). Agents handle gracefully when clients don't support elicitation. +4. **Support client capability negotiation**: Clients declare elicitation support via a structured capability object that distinguishes between `form`-based and `url`-based elicitation (following MCP's capability model). This allows clients to support one or both modalities, enables agents to pass capabilities along to MCP servers, and handles graceful degradation when clients have limited elicitation support. 5. **Provide rich context**: Agents can include title, description, detailed constraints, and examples—helping clients render consistent, helpful UI without custom implementations. @@ -155,16 +155,21 @@ Example 3 (Text Input with Constraints): ### Input Types -Following MCP's approach, we would start with these types. 
Clients should gracefully degrade unknown types to `text`: +Following MCP's approach, we would start with these types, organized into two categories: +**Form-based types** (rendered inline by the client): - `text` - Open-ended text input - `number` - Numeric input - `select` - Single-choice selection from a list - `multiselect` - Multiple-choice selection - `boolean` - Yes/no choice - `password` - Masked text input (for sensitive credentials) + +**URL-based types** (out-of-band browser flows): - `url` - URL-based out-of-band authentication (browser-opened flows like OAuth) +This distinction is reflected in the client capabilities model, allowing clients to declare support for one or both modalities. Clients should gracefully degrade unknown form types to `text`. + ### Restricted JSON Schema Aligning with MCP and building on [Session Config Options discussions](https://github.com/agentclientprotocol/agent-client-protocol/pull/210) about schema constraints, agents use a restricted JSON Schema subset: @@ -274,7 +279,7 @@ The agent then continues processing with the user's input in the next turn or ta ### Client Capabilities -Clients declare whether they support elicitation during the `initialize` phase via `ClientCapabilities`, following the same pattern as `fs` and `terminal` capabilities: +Clients declare elicitation support during the `initialize` phase via `ClientCapabilities`, following MCP's capability model pattern. The capability distinguishes between `form`-based and `url`-based elicitation: ```json { @@ -289,8 +294,8 @@ Clients declare whether they support elicitation during the `initialize` phase v }, "terminal": true, "elicitation": { - "supported": true, - "supportedTypes": ["text", "number", "select", "multiselect", "boolean"] + "form": {}, + "url": {} } }, "clientInfo": { @@ -301,13 +306,47 @@ Clients declare whether they support elicitation during the `initialize` phase v } ``` -This tells the agent which elicitation input types the client can render. 
Agents must gracefully handle clients that don't include this field (assumed to have no elicitation support). +**Capability structure:** +- `elicitation.form` - Present if the client can render form-based input types (`text`, `number`, `select`, `multiselect`, `boolean`, `password`) +- `elicitation.url` - Present if the client can open URLs for out-of-band flows (OAuth, etc.) + +**Example: Headless client (no browser access):** +```json +"elicitation": { + "form": {} +} +``` + +**Example: Simple terminal with URL support only:** +```json +"elicitation": { + "url": {} +} +``` + +**Example: Full-featured client:** +```json +"elicitation": { + "form": {}, + "url": {} +} +``` + +This structure: +1. Allows clients to declare partial support based on their environment +2. Enables agents to pass capabilities along to MCP servers they connect to +3. Maps cleanly to MCP's elicitation capability model +4. Provides clear semantics for graceful degradation + +Agents must gracefully handle clients that don't include this field (assumed to have no elicitation support) or that only include one of `form` or `url`. ### Backward Compatibility -- If a client doesn't support elicitation, agents must provide a default value and continue +- If a client doesn't declare `elicitation` capabilities, agents must provide a default value and continue +- If a client only declares `elicitation.form`, agents must not send `url`-type elicitation requests (or provide defaults and continue) +- If a client only declares `elicitation.url`, agents must not send form-type elicitation requests (or provide defaults and continue) - Agents should not require elicitation responses to continue operating -- Clients that don't understand an elicitation type should treat it as requesting text input +- Clients that don't understand a specific form type should treat it as requesting text input ## Frequently asked questions @@ -378,10 +417,11 @@ For the initial version: no. 
We're focusing on simple types (strings, numbers, b ### How should agents handle clients that don't support elicitation? Agents should always design to gracefully degrade: -- Provide sensible default values -- Describe what they're requesting in turn content (text) +- Check `elicitation.form` and `elicitation.url` capabilities before sending requests +- If the required capability is missing, provide sensible default values +- Describe what they're requesting in turn content (text) as fallback - Proceed with the defaults -- Clients declare capabilities so agents can make informed decisions +- For agents connecting to MCP servers: pass the client's elicitation capabilities to the MCP server so it can also make informed decisions ### Can we extend this to replace the existing Permission-Request mechanism? @@ -400,4 +440,5 @@ For v1, we recommend starting with JSON Schema validation only. If more complex ## Revision history +- 2026-02-05: Updated capability model to distinguish between `form` and `url` elicitation types, following MCP's capability pattern. This enables partial support (form-only or url-only clients) and better mapping to MCP servers. - 2026-01-12: Initial draft based on community discussions in PR #340 (user selection), PR #210 (session config alignment), and PR #330 (authentication use cases). Aligned with MCP elicitation patterns. 
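
Reviewer sketch (not part of the patch): the backward-compatibility rules this revision adds for `elicitation.form` and `elicitation.url` could be checked on the agent side roughly like this. The function name `can_send` is hypothetical, and treating every non-`url` type as a form type is an assumption based on the "degrade unknown form types to `text`" rule.

```python
def can_send(client_capabilities: dict, elicitation_type: str) -> bool:
    """Decide whether an agent may send an elicitation of the given type.

    Hypothetical helper: `client_capabilities` is the `clientCapabilities`
    object from `initialize`.
    """
    elicitation = client_capabilities.get("elicitation")
    if not isinstance(elicitation, dict):
        # No `elicitation` field: the RFD says to assume no support at all.
        return False
    # Capabilities are declared as empty objects, e.g. {"form": {}}, so test
    # for key presence; `{}` is falsy and a truthiness check would be wrong.
    if elicitation_type == "url":
        return "url" in elicitation
    # Assumption: any other type is a form type.
    return "form" in elicitation
```

When this returns `False`, the RFD's guidance is for the agent to fall back to its default value and describe the request in the turn text instead.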
From 437a0bed3b92427c7c1d882ebba353ef1765a31f Mon Sep 17 00:00:00 2001 From: Yordis Prieto Date: Thu, 5 Feb 2026 21:37:13 -0500 Subject: [PATCH 03/11] docs(rfd): Update elicitation specification to reference draft and clarify structured input modes --- docs/rfds/elicitation.mdx | 503 +++++++++++++++++++++++++++----------- 1 file changed, 366 insertions(+), 137 deletions(-) diff --git a/docs/rfds/elicitation.mdx b/docs/rfds/elicitation.mdx index b21c10ac..d3640c6a 100644 --- a/docs/rfds/elicitation.mdx +++ b/docs/rfds/elicitation.mdx @@ -6,7 +6,7 @@ Author(s): [@yordis](https://github.com/yordis) ## Elevator pitch -Add support for agents to request structured information from users during a session through a standardized elicitation mechanism, aligned with [MCP's elicitation feature](https://modelcontextprotocol.io/specification/2025-11-25/client/elicitation). This allows agents to ask follow-up questions, collect authentication credentials, gather preferences, and request required information without side-channel communication or ad-hoc client UI implementations. +Add support for agents to request structured information from users during a session through a standardized elicitation mechanism, aligned with [MCP's elicitation feature](https://modelcontextprotocol.io/specification/draft/client/elicitation). This allows agents to ask follow-up questions, collect authentication credentials, gather preferences, and request required information without side-channel communication or ad-hoc client UI implementations. ## Status quo @@ -30,16 +30,15 @@ This gap limits the richness of agent-client interaction and forces both agents ## What we propose to do about it -We propose introducing an elicitation mechanism for agents to request structured information from users, aligned with [MCP's established elicitation patterns](https://modelcontextprotocol.io/specification/2025-11-25/client/elicitation). 
This addresses discussions from PR #340 about standardizing user selection flows and PR #330 about secure authentication handling. +We propose introducing an elicitation mechanism for agents to request structured information from users, aligned with [MCP's draft elicitation specification](https://modelcontextprotocol.io/specification/draft/client/elicitation). This addresses discussions from PR #340 about standardizing user selection flows and PR #330 about secure authentication handling. The mechanism would: -1. **Use restricted JSON Schema** (as discussed in PR #210): Like MCP, constrain JSON Schema to a useful subset for `type`, `enum`, `minimum`, `maximum`, `minLength`, `maxLength`, `pattern`, `default`, and `description`. This aligns with how Session Config Options already think about schema. +1. **Use restricted JSON Schema** (as discussed in PR #210): Like MCP, constrain JSON Schema to a useful subset—flat objects with primitive properties (`string`, `number`, `integer`, `boolean`) plus supported formats and enum values. Clients decide how to render UI based on the schema. -2. **Support multiple input modalities**: - - **Simple inputs**: text, number, boolean - - **Selections**: select (single), multiselect (multiple) with enum-based options - - **Sensitive inputs**: password, URL-mode for out-of-band OAuth flows (addressing PR #330 authentication pain points) +2. **Support two elicitation modes** (following [MCP SEP-1036](https://modelcontextprotocol.io/community/seps/1036-url-mode-elicitation-for-secure-out-of-band-intera)): + - **Form mode** (in-band): Structured data collection via JSON Schema forms + - **URL mode** (out-of-band): Browser-based flows for sensitive operations like OAuth (addressing PR #330 authentication pain points) 3. **Work in turn context**: Elicitation requests are triggered when a turn ends with `stopReason: "elicitation_requested"`, allowing agents to ask questions naturally within the conversation flow. 
Agents send elicitation requests via a separate `session/elicitation` method (following the same request/response pattern as `session/request_permission`). Unlike Session Config Options (which are persistent), elicitation requests are transient and turn-specific. @@ -70,128 +69,255 @@ Clients can: ### Alignment with MCP -This proposal follows MCP's established elicitation patterns. See [MCP Elicitation Specification](https://modelcontextprotocol.io/specification/2025-11-25/client/elicitation) for detailed guidance. ACP will use the same JSON Schema constraint approach, but adapted for our session/turn-based architecture. +This proposal follows MCP's draft elicitation specification. See [MCP Elicitation Specification](https://modelcontextprotocol.io/specification/draft/client/elicitation) for detailed guidance. ACP uses the same JSON Schema constraint approach and capability model, adapted for our session/turn-based architecture. Key differences from MCP: - MCP elicitation is tool-call-scoped; ACP elicitation is session/turn-scoped +- ACP uses `session/elicitation` method; MCP uses `elicitation/create` - ACP must integrate with existing Session Config Options (which also use schema constraints) -- ACP should support out-of-band flows for sensitive data (authentication from PR #330) +- ACP elicitation is triggered by `stopReason: "elicitation_requested"` in turn responses ### Elicitation Request Structure -When a turn ends with `stopReason: "elicitation_requested"`, the agent sends a separate elicitation request (following the same pattern as permission requests). Example 1 (User Selection - from PR #340): +When a turn ends with `stopReason: "elicitation_requested"`, the agent sends a separate elicitation request (following the same pattern as permission requests). 
+ +**Example 1: Form Mode - User Selection (from PR #340)** ```json { - "elicitation": { - "id": "strategy-choice-42", - "type": "select", - "title": "Choose a Refactoring Strategy", - "description": "How would you like me to approach this refactoring?", - "schema": { - "type": "string", - "enum": ["conservative", "balanced", "aggressive"], - "default": "balanced" - }, - "options": [ - { - "value": "conservative", - "label": "Conservative", - "description": "Minimal changes, heavily tested approach" - }, - { - "value": "balanced", - "label": "Balanced (Recommended)", - "description": "Good balance of progress and safety" - }, - { - "value": "aggressive", - "label": "Aggressive", - "description": "Maximum optimization, requires review" + "mode": "form", + "message": "How would you like me to approach this refactoring?", + "requestedSchema": { + "type": "object", + "properties": { + "strategy": { + "type": "string", + "title": "Refactoring Strategy", + "description": "Choose how aggressively to refactor", + "oneOf": [ + { "const": "conservative", "title": "Conservative - Minimal changes" }, + { "const": "balanced", "title": "Balanced (Recommended)" }, + { "const": "aggressive", "title": "Aggressive - Maximum optimization" } + ], + "default": "balanced" } - ] + }, + "required": ["strategy"] } } ``` -Example 2 (Authentication Request - from PR #330, out-of-band OAuth): +**Example 2: URL Mode - Authentication (from PR #330, out-of-band OAuth)** ```json { - "elicitation": { - "id": "github-oauth-123", - "type": "url", - "title": "Authenticate with GitHub", - "description": "Please authorize this agent to access your GitHub repositories", - "schema": { - "type": "string", - "default": null + "mode": "url", + "elicitationId": "github-oauth-123", + "url": "https://github.com/login/oauth/authorize?client_id=abc123&state=xyz789&scope=repo", + "message": "Please authorize access to your GitHub repositories to continue." 
+} +``` + +**Example 3: Form Mode - Text Input with Constraints** + +```json +{ + "mode": "form", + "message": "What should this function be named?", + "requestedSchema": { + "type": "object", + "properties": { + "name": { + "type": "string", + "title": "Function Name", + "description": "Must be a valid identifier", + "minLength": 1, + "maxLength": 64, + "pattern": "^[a-zA-Z_][a-zA-Z0-9_]*$", + "default": "processData" + } }, - "url": "https://github.com/login/oauth/authorize?client_id=...", - "returnValueFormat": "token" + "required": ["name"] } } ``` -Example 3 (Text Input with Constraints): +**Example 4: Form Mode - Multiple Fields** ```json { - "elicitation": { - "id": "function-name", - "type": "text", - "title": "Function Name", - "description": "What should this function be named?", - "schema": { - "type": "string", - "minLength": 1, - "maxLength": 64, - "pattern": "^[a-zA-Z_][a-zA-Z0-9_]*$", - "default": "processData" - } + "mode": "form", + "message": "Please provide configuration details", + "requestedSchema": { + "type": "object", + "properties": { + "name": { + "type": "string", + "title": "Project Name" + }, + "port": { + "type": "integer", + "title": "Port Number", + "minimum": 1024, + "maximum": 65535, + "default": 3000 + }, + "enableLogging": { + "type": "boolean", + "title": "Enable Logging", + "default": true + } + }, + "required": ["name"] } } ``` -### Input Types +### Elicitation Modes -Following MCP's approach, we would start with these types, organized into two categories: +Following MCP's approach (specifically [SEP-1036](https://modelcontextprotocol.io/community/seps/1036-url-mode-elicitation-for-secure-out-of-band-intera)), elicitation supports two modes: -**Form-based types** (rendered inline by the client): -- `text` - Open-ended text input -- `number` - Numeric input -- `select` - Single-choice selection from a list -- `multiselect` - Multiple-choice selection -- `boolean` - Yes/no choice -- `password` - Masked text input (for sensitive 
credentials) +**Form mode** (in-band): Servers request structured data from users using restricted JSON Schema. The client decides how to render the form UI based on the schema. -**URL-based types** (out-of-band browser flows): -- `url` - URL-based out-of-band authentication (browser-opened flows like OAuth) +**URL mode** (out-of-band): Servers direct users to external URLs for sensitive interactions that must not pass through the agent or client (OAuth flows, payments, credential collection, etc.). -This distinction is reflected in the client capabilities model, allowing clients to declare support for one or both modalities. Clients should gracefully degrade unknown form types to `text`. +This distinction is reflected in the client capabilities model, allowing clients to declare support for one or both modalities. ### Restricted JSON Schema -Aligning with MCP and building on [Session Config Options discussions](https://github.com/agentclientprotocol/agent-client-protocol/pull/210) about schema constraints, agents use a restricted JSON Schema subset: +Aligning with [MCP's draft elicitation specification](https://modelcontextprotocol.io/specification/draft/client/elicitation), form mode elicitation uses a restricted subset of JSON Schema. Schemas are limited to flat objects with primitive properties only—the client decides how to render appropriate input UI based on the schema. + +**Supported primitive types:** -**Required fields:** -- `type` (string) - One of the input types above +1. 
**String Schema** +```json +{ + "type": "string", + "title": "Display Name", + "description": "Description text", + "minLength": 3, + "maxLength": 50, + "pattern": "^[A-Za-z]+$", + "format": "email", + "default": "user@example.com" +} +``` +Supported formats: `email`, `uri`, `date`, `date-time` -**Optional constraint fields:** -- `default` - Default value if user doesn't respond (agents should always provide this, even if `null`) -- `description` - Help text explaining what's being requested -- `enum` - Array of allowed values (for select/multiselect) -- `minLength`, `maxLength` - String length constraints -- `minimum`, `maximum` - Numeric range constraints -- `pattern` - Regex pattern for validation +2. **Number Schema** +```json +{ + "type": "number", + "title": "Display Name", + "description": "Description text", + "minimum": 0, + "maximum": 100, + "default": 50 +} +``` +Also supports `"type": "integer"` for whole numbers. + +3. **Boolean Schema** +```json +{ + "type": "boolean", + "title": "Display Name", + "description": "Description text", + "default": false +} +``` + +4. 
**Enum Schema** (for selections) + +Single-select enum (without titles): +```json +{ + "type": "string", + "title": "Color Selection", + "description": "Choose your favorite color", + "enum": ["Red", "Green", "Blue"], + "default": "Red" +} +``` + +Single-select enum (with titles): +```json +{ + "type": "string", + "title": "Color Selection", + "description": "Choose your favorite color", + "oneOf": [ + { "const": "#FF0000", "title": "Red" }, + { "const": "#00FF00", "title": "Green" }, + { "const": "#0000FF", "title": "Blue" } + ], + "default": "#FF0000" +} +``` + +Multi-select enum (without titles): +```json +{ + "type": "array", + "title": "Color Selection", + "description": "Choose your favorite colors", + "minItems": 1, + "maxItems": 2, + "items": { + "type": "string", + "enum": ["Red", "Green", "Blue"] + }, + "default": ["Red", "Green"] +} +``` + +Multi-select enum (with titles): +```json +{ + "type": "array", + "title": "Color Selection", + "description": "Choose your favorite colors", + "minItems": 1, + "maxItems": 2, + "items": { + "anyOf": [ + { "const": "#FF0000", "title": "Red" }, + { "const": "#00FF00", "title": "Green" }, + { "const": "#0000FF", "title": "Blue" } + ] + }, + "default": ["#FF0000", "#00FF00"] +} +``` + +**Request schema structure:** +```json +"requestedSchema": { + "type": "object", + "properties": { + "propertyName": { + "type": "string", + "title": "Display Name", + "description": "Description of the property" + }, + "anotherProperty": { + "type": "number", + "minimum": 0, + "maximum": 100 + } + }, + "required": ["propertyName"] +} +``` -**Not supported** (to keep initial implementation simple): -- Complex nested objects/arrays -- `allOf`, `anyOf`, `oneOf` +**Not supported** (to simplify client implementation): +- Complex nested objects/arrays (beyond enum arrays) - Conditional validation -- Custom formats +- Custom formats beyond the supported list + +Clients use this schema to generate appropriate input forms, validate user input 
before sending, and provide better guidance to users. All primitive types support optional default values; clients SHOULD pre-populate form fields with these values. -This constraint list can expand in future versions based on community feedback. +**Security note:** Following MCP, servers MUST NOT use form mode elicitation to request sensitive information (passwords, API keys, credentials). Sensitive data collection MUST use URL mode elicitation, which bypasses the agent and client entirely. ### Turn Response with Elicitation Stop Reason @@ -217,6 +343,7 @@ When an agent reaches a decision point and needs structured user input, it ends After the turn completes with `stopReason: "elicitation_requested"`, the agent immediately sends a separate `session/elicitation` request (following the same pattern as `session/request_permission`): +**Form mode example:** ```json { "jsonrpc": "2.0", @@ -224,58 +351,142 @@ After the turn completes with `stopReason: "elicitation_requested"`, the agent i "method": "session/elicitation", "params": { "sessionId": "...", - "elicitation": { - "id": "refactor-strategy-001", - "type": "select", - "title": "Choose Refactoring Strategy", - "description": "How would you like me to approach this refactoring?", - "schema": { - "type": "string", - "enum": ["conservative", "balanced", "aggressive"], - "default": "balanced" - }, - "options": [ - { - "value": "conservative", - "label": "Conservative", - "description": "Minimal changes, heavily tested approach" - }, - { - "value": "balanced", - "label": "Balanced (Recommended)", - "description": "Good balance of progress and safety" - }, - { - "value": "aggressive", - "label": "Aggressive", - "description": "Maximum optimization, requires review" + "mode": "form", + "message": "How would you like me to approach this refactoring?", + "requestedSchema": { + "type": "object", + "properties": { + "strategy": { + "type": "string", + "title": "Refactoring Strategy", + "oneOf": [ + { "const": 
"conservative", "title": "Conservative" }, + { "const": "balanced", "title": "Balanced (Recommended)" }, + { "const": "aggressive", "title": "Aggressive" } + ], + "default": "balanced" } - ] + }, + "required": ["strategy"] } } } ``` -The client presents the elicitation UI to the user based on the input type and constraints. +**URL mode example:** +```json +{ + "jsonrpc": "2.0", + "id": 44, + "method": "session/elicitation", + "params": { + "sessionId": "...", + "mode": "url", + "elicitationId": "github-oauth-001", + "url": "https://github.com/login/oauth/authorize?client_id=abc123&state=xyz789", + "message": "Please authorize access to your GitHub repositories." + } +} +``` + +The client presents the elicitation UI to the user. For form mode, the client generates appropriate input UI based on the JSON Schema. For URL mode, the client opens the URL in a secure browser context. ### User Response -When the user responds to an elicitation request, the client sends a separate `session/elicitation` response: +Elicitation responses use a three-action model (following MCP) to clearly distinguish between different user actions: +**Accept** - User explicitly approved and submitted with data: ```json { "jsonrpc": "2.0", "id": 43, "result": { - "elicitationResponse": { - "id": "refactor-strategy-001", - "value": "balanced" + "action": "accept", + "content": { + "strategy": "balanced" } } } ``` -The agent then continues processing with the user's input in the next turn or takes immediate action based on the response. +**Decline** - User explicitly declined the request: +```json +{ + "jsonrpc": "2.0", + "id": 43, + "result": { + "action": "decline" + } +} +``` + +**Cancel** - User dismissed without making an explicit choice (closed dialog, pressed Escape, etc.): +```json +{ + "jsonrpc": "2.0", + "id": 43, + "result": { + "action": "cancel" + } +} +``` + +For URL mode elicitation, the response with `action: "accept"` indicates that the user consented to the interaction. 
It does not mean the interaction is complete—the interaction occurs out-of-band and the client is not aware of the outcome until the agent sends a completion notification. + +Agents should handle each state appropriately: +- **Accept**: Process the submitted data +- **Decline**: Handle explicit decline (e.g., use default, offer alternatives) +- **Cancel**: Handle dismissal (e.g., use default, prompt again later) + +### Completion Notifications for URL Mode + +Following MCP, agents MAY send a `notifications/elicitation/complete` notification when an out-of-band interaction started by URL mode elicitation is completed: + +```json +{ + "jsonrpc": "2.0", + "method": "notifications/elicitation/complete", + "params": { + "elicitationId": "github-oauth-001" + } +} +``` + +Agents sending notifications: +- MUST only send the notification to the client that initiated the elicitation request +- MUST include the `elicitationId` established in the original request + +Clients: +- MUST ignore notifications referencing unknown or already-completed IDs +- MAY use this notification to automatically retry requests, update UI, or continue an interaction +- SHOULD provide manual controls for the user to retry or cancel if the notification never arrives + +### URL Elicitation Required Error + +When a request cannot be processed until a URL mode elicitation is completed, the agent MAY return a `URLElicitationRequiredError` (code `-32042`). This allows clients to understand that a specific elicitation is required before retrying the original request. + +```json +{ + "jsonrpc": "2.0", + "id": 2, + "error": { + "code": -32042, + "message": "This request requires authorization.", + "data": { + "elicitations": [ + { + "mode": "url", + "elicitationId": "github-oauth-001", + "url": "https://agent.example.com/connect?elicitationId=github-oauth-001", + "message": "Authorization is required to access your GitHub repositories." 
+ } + ] + } + } +} +``` + +Any elicitations returned in the error MUST be URL mode elicitations with an `elicitationId`. Clients may automatically retry the failed request after receiving a completion notification. ### Client Capabilities @@ -307,8 +518,8 @@ Clients declare elicitation support during the `initialize` phase via `ClientCap ``` **Capability structure:** -- `elicitation.form` - Present if the client can render form-based input types (`text`, `number`, `select`, `multiselect`, `boolean`, `password`) -- `elicitation.url` - Present if the client can open URLs for out-of-band flows (OAuth, etc.) +- `elicitation.form` - Present if the client can render form UI from restricted JSON Schema (strings, numbers, integers, booleans, enums) +- `elicitation.url` - Present if the client can open URLs for out-of-band flows (OAuth, payments, credential collection) **Example: Headless client (no browser access):** ```json @@ -343,10 +554,10 @@ Agents must gracefully handle clients that don't include this field (assumed to ### Backward Compatibility - If a client doesn't declare `elicitation` capabilities, agents must provide a default value and continue -- If a client only declares `elicitation.form`, agents must not send `url`-type elicitation requests (or provide defaults and continue) -- If a client only declares `elicitation.url`, agents must not send form-type elicitation requests (or provide defaults and continue) +- If a client only declares `elicitation.form`, agents must not send URL-mode elicitation requests (or provide defaults and continue) +- If a client only declares `elicitation.url`, agents must not send form-mode elicitation requests (or provide defaults and continue) - Agents should not require elicitation responses to continue operating -- Clients that don't understand a specific form type should treat it as requesting text input +- Following MCP: an empty capability object (`"elicitation": {}`) is equivalent to declaring support for form mode only ## 
Frequently asked questions @@ -390,17 +601,34 @@ PR #340 specifically concluded: "I think we'd rather have an MCP elicitation sto ### How does authentication flow work with URL-mode elicitation? -From PR #330: URL-mode elicitation allows agents to request authentication without exposing credentials to the protocol. While inspired by MCP's URL-mode elicitation, ACP's implementation focuses specifically on out-of-band credential handling: - -1. Agent sends elicitation request with `type: "url"` and OAuth authorization URL -2. Client opens URL in user's browser (out-of-band process) -3. User authenticates and grants permission in the browser -4. Browser returns token/credential to client (e.g., via redirect or callback) -5. Client includes token in next `session/turn` via `elicitationResponse` - -**Key guarantee**: Credentials never flow through the agent or LLM, addressing the core pain point from PR #330. - -The exact semantics of how tokens are returned from the browser and how `returnValueFormat` is handled will be specified in detail during the implementation phase of this RFD. +From PR #330: URL-mode elicitation allows agents to request authentication without exposing credentials to the protocol. Following [MCP's draft elicitation specification](https://modelcontextprotocol.io/specification/draft/client/elicitation): + +1. Agent sends elicitation request with `mode: "url"`, an `elicitationId`, and a URL to the agent's own connect endpoint (not directly to the OAuth provider) +2. Client displays the URL to the user and requests consent to open it +3. Client responds with `action: "accept"` to indicate the user consented +4. User opens URL in their browser (out-of-band process) +5. Agent's connect page verifies the user identity matches the elicitation request +6. Agent redirects user to the OAuth provider's authorization endpoint +7. User authenticates and grants permission +8. OAuth provider redirects back to the agent's redirect_uri +9. 
Agent exchanges the authorization code for tokens and stores them bound to the user's identity +10. Agent sends a `notifications/elicitation/complete` notification to inform the client + +**Key guarantees**: +- Credentials never flow through the agent LLM or client +- The agent is responsible for securely storing third-party tokens +- The agent MUST verify user identity to prevent phishing attacks + +**Security requirements** (from MCP draft spec): +- Agents MUST NOT include sensitive information in the URL +- Agents MUST NOT provide pre-authenticated URLs +- Agents SHOULD use HTTPS URLs +- Clients MUST NOT open URLs without explicit user consent +- Clients MUST show the full URL to the user before consent +- Clients MUST open URLs in a secure context that prevents inspection (e.g., SFSafariViewController on iOS, not WKWebView) +- Clients SHOULD highlight the domain to mitigate subdomain spoofing + +**Phishing prevention**: The agent MUST verify that the user who started the elicitation request is the same user who completes the OAuth flow. This is typically done by checking session cookies against the user identity from the MCP authorization. ### Can agents use elicitation for information required before responding? @@ -418,7 +646,7 @@ For the initial version: no. We're focusing on simple types (strings, numbers, b Agents should always design to gracefully degrade: - Check `elicitation.form` and `elicitation.url` capabilities before sending requests -- If the required capability is missing, provide sensible default values +- If the required mode is not supported, provide sensible default values - Describe what they're requesting in turn content (text) as fallback - Proceed with the defaults - For agents connecting to MCP servers: pass the client's elicitation capabilities to the MCP server so it can also make informed decisions @@ -440,5 +668,6 @@ For v1, we recommend starting with JSON Schema validation only. 
If more complex ## Revision history -- 2026-02-05: Updated capability model to distinguish between `form` and `url` elicitation types, following MCP's capability pattern. This enables partial support (form-only or url-only clients) and better mapping to MCP servers. +- 2026-02-05: Major revision to align with MCP draft elicitation specification. Updated enum schema to use `oneOf`/`anyOf` with `const`/`title` instead of `enumNames`. Added multi-select array support. Added `pattern` field for strings. Added URLElicitationRequiredError (-32042) section. Added completion notifications section. Expanded security considerations including phishing prevention. Updated all examples to match MCP draft spec format. +- 2026-02-05: Initial MCP alignment. Removed explicit "input types" in favor of restricted JSON Schema (client decides rendering). Added `mode` field (`form`/`url`). Updated capability model to use `form`/`url` sub-objects per MCP SEP-1036. Added three-action response model (`accept`/`decline`/`cancel`). Removed `password` type (MCP prohibits sensitive data in form mode). - 2026-01-12: Initial draft based on community discussions in PR #340 (user selection), PR #210 (session config alignment), and PR #330 (authentication use cases). Aligned with MCP elicitation patterns. From 64db5208c32cb8244c1accd46f89acf6f6d51bd0 Mon Sep 17 00:00:00 2001 From: Yordis Prieto Date: Thu, 5 Feb 2026 21:38:20 -0500 Subject: [PATCH 04/11] docs(rfd): Clarify elicitation schema restrictions to align with MCP design principles --- docs/rfds/elicitation.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/rfds/elicitation.mdx b/docs/rfds/elicitation.mdx index d3640c6a..ef1d9162 100644 --- a/docs/rfds/elicitation.mdx +++ b/docs/rfds/elicitation.mdx @@ -640,7 +640,7 @@ The agent's default value is used (which agents must always provide). If an agen ### Should elicitation support complex nested data structures? -For the initial version: no. 
We're focusing on simple types (strings, numbers, booleans, arrays of those). Complex nested structures can be added in future versions if use cases emerge. This keeps the initial scope manageable and lets us learn from real-world usage. +We follow MCP's design here. MCP intentionally restricts elicitation schemas to flat objects with primitive properties to simplify client implementation and user experience. Complex nested structures, arrays of objects (beyond enum arrays), and advanced JSON Schema features are explicitly not supported. If MCP expands this in the future, ACP would follow suit. ### How should agents handle clients that don't support elicitation? From 68b78c3fca288b46ada5fe1241ce462bd099a0fe Mon Sep 17 00:00:00 2001 From: Yordis Prieto Date: Thu, 5 Feb 2026 21:39:05 -0500 Subject: [PATCH 05/11] docs(rfd): Recommend separation of permission requests and elicitation mechanisms for clarity and security --- docs/rfds/elicitation.mdx | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/docs/rfds/elicitation.mdx b/docs/rfds/elicitation.mdx index ef1d9162..ffe3801f 100644 --- a/docs/rfds/elicitation.mdx +++ b/docs/rfds/elicitation.mdx @@ -653,7 +653,13 @@ Agents should always design to gracefully degrade: ### Can we extend this to replace the existing Permission-Request mechanism? -Potentially, but that's out of scope for this RFD. PR #210 discussed that elicitation "could potentially even replace the Permission-Request mechanism" (Phil65), but that requires separate analysis of the permission request use cases and whether elicitation's constraints (no complex nesting, simpler lifecycle) are sufficient. +We recommend keeping them separate. Permission requests are fundamentally security decisions—allowing a tool call to proceed is distinct from the model asking for clarification or collecting user preferences. 
Keeping these separate allows clients to: + +- Offer a consistent, recognizable UX for security-sensitive decisions (permissions) +- Clearly distinguish "the agent needs approval to do something" from "the agent needs information to continue" +- Apply different policies (e.g., "always allow file reads" vs. per-request elicitation responses) + +This is the same reasoning behind keeping authentication flows (URL mode) distinct from data collection (form mode). While we may reuse some types between these mechanisms, conflating the features would blur important security boundaries. ### What about validating user input on the client side? From 1102b40dd32dbce4df30fcbfb3517d6eeaedcad1 Mon Sep 17 00:00:00 2001 From: Yordis Prieto Date: Thu, 5 Feb 2026 21:39:51 -0500 Subject: [PATCH 06/11] docs(rfd): Clarify handling of user responses to elicitation requests and cancellation logic --- docs/rfds/elicitation.mdx | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/docs/rfds/elicitation.mdx b/docs/rfds/elicitation.mdx index ffe3801f..cd5bbf23 100644 --- a/docs/rfds/elicitation.mdx +++ b/docs/rfds/elicitation.mdx @@ -636,7 +636,9 @@ Yes. An agent can include an elicitation request in a turn response with a defau ### What if a user doesn't respond to an elicitation request? -The agent's default value is used (which agents must always provide). If an agent truly requires user input and wants to block, it should fail the turn and let the client handle retry logic. +Elicitation requests require a response. If the user dismisses the elicitation without making an explicit choice (closes the dialog, presses Escape, etc.), the client responds with `action: "cancel"`. The agent then decides how to proceed—it may use a default value, prompt again later, or fail the turn. 
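The accept/decline/cancel handling described above can be sketched on the agent side roughly as follows. This is a minimal illustration, not part of the RFD: the `ElicitationResult` and `NextStep` types and the `decideNextStep` helper are hypothetical names, and the chosen policy (fall back to defaults on decline, ask again later on cancel) is only one of the options the text allows.

```typescript
// Hypothetical result shape mirroring the three-action model in this RFD.
type ElicitationResult =
  | { action: "accept"; content?: Record<string, unknown> }
  | { action: "decline" }
  | { action: "cancel" };

// Hypothetical description of what the agent does next.
type NextStep =
  | { kind: "proceed"; values: Record<string, unknown> }
  | { kind: "useDefaults"; values: Record<string, unknown> }
  | { kind: "askLater" };

// One possible policy: merge submitted data over defaults on accept,
// fall back to defaults on decline, and defer the question on cancel.
function decideNextStep(
  result: ElicitationResult,
  defaults: Record<string, unknown>
): NextStep {
  switch (result.action) {
    case "accept":
      return { kind: "proceed", values: { ...defaults, ...(result.content ?? {}) } };
    case "decline":
      return { kind: "useDefaults", values: defaults };
    case "cancel":
      return { kind: "askLater" };
  }
}
```

An agent that truly cannot continue without the input could instead map `cancel` to failing the turn; the sketch only shows that the three actions stay distinguishable.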
+ +This ties into the broader request cancellation work: elicitation requests can be cancelled like any other request, and the `cancel` action provides a clear signal that the user chose not to engage rather than explicitly declining. ### Should elicitation support complex nested data structures? From c0cf9b0bbd2bf485516ea4ff820fe777ce9809b6 Mon Sep 17 00:00:00 2001 From: Yordis Prieto Date: Thu, 5 Feb 2026 21:40:29 -0500 Subject: [PATCH 07/11] docs(rfd): Expand explanation of elicitation request/response pattern for agent workflows --- docs/rfds/elicitation.mdx | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/docs/rfds/elicitation.mdx b/docs/rfds/elicitation.mdx index cd5bbf23..262b249d 100644 --- a/docs/rfds/elicitation.mdx +++ b/docs/rfds/elicitation.mdx @@ -632,7 +632,13 @@ From PR #330: URL-mode elicitation allows agents to request authentication witho ### Can agents use elicitation for information required before responding? -Yes. An agent can include an elicitation request in a turn response with a default value and continue, then incorporate the user's response into the next turn. This is how agents can guide users through multi-step workflows. +Yes. By modeling elicitation as a request/response pattern (like MCP's `elicitation/create`), the agent controls its own flow. The agent can: + +- Send an elicitation request and wait for the response before proceeding +- Continue with other work while waiting for user input +- Chain multiple elicitations as needed for multi-step workflows + +This flexibility is why elicitation is modeled as a separate request/response rather than being tightly coupled to turns. ### What if a user doesn't respond to an elicitation request? 
From 4fab19024f00b4ee7ea59d6c891752d7b2ef7d99 Mon Sep 17 00:00:00 2001 From: Yordis Prieto Date: Thu, 5 Feb 2026 21:41:13 -0500 Subject: [PATCH 08/11] docs(rfd): Update defaults description in elicitation schema to clarify optionality of fields --- docs/rfds/elicitation.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/rfds/elicitation.mdx b/docs/rfds/elicitation.mdx index 262b249d..c784d5b9 100644 --- a/docs/rfds/elicitation.mdx +++ b/docs/rfds/elicitation.mdx @@ -583,7 +583,7 @@ Excellent question from PR #210 discussions. Both use restricted JSON Schema, bu |--------|------------------------|-------------| | **Lifecycle** | Persistent, pre-declared at session init | Transient, appears during turns | | **Scope** | Session-wide configuration | Single turn/decision point | -| **Defaults** | Required (agents must have defaults) | Required (agents should always provide) | +| **Defaults** | Required (agents must have defaults) | Optional (schema's `required` array determines mandatory fields) | | **State management** | Client maintains full state, broadcast on changes | Agent provides response in next turn | | **Use cases** | Model selection, session mode, persistent settings | Authentication, step-by-step decisions, one-time questions | From b9c183af8a4de0142371bbb0b38e31109f00b49d Mon Sep 17 00:00:00 2001 From: Yordis Prieto Date: Thu, 5 Feb 2026 21:42:20 -0500 Subject: [PATCH 09/11] docs(rfd): Revise elicitation FAQ to clarify multi-field requests and response handling --- docs/rfds/elicitation.mdx | 22 ++++++++-------------- 1 file changed, 8 insertions(+), 14 deletions(-) diff --git a/docs/rfds/elicitation.mdx b/docs/rfds/elicitation.mdx index c784d5b9..a7ea0942 100644 --- a/docs/rfds/elicitation.mdx +++ b/docs/rfds/elicitation.mdx @@ -561,19 +561,13 @@ Agents must gracefully handle clients that don't include this field (assumed to ## Frequently asked questions -### Can an agent request multiple pieces of information in one turn? 
+### Can an agent request multiple pieces of information at once? -For v1, we recommend a **single elicitation per turn**. This keeps the design simple and predictable for both clients and agents. It also follows the Session Config Options pattern of having agents send full state updates. +Yes—a single form mode elicitation request can include multiple fields in its `requestedSchema`. The schema is an object with multiple properties, and the client renders a form with all requested fields. -If an agent needs to collect multiple pieces of information, it can: -1. Ask one question per turn (with sensible defaults) -2. Incorporate the user's response in the context for the next turn -3. Ask the next question in a subsequent turn +For sequential information gathering, agents can send multiple elicitation requests and wait for each response before proceeding. This allows agents to adapt follow-up questions based on previous answers. -This approach: -- Keeps client UI logic simple -- Allows agents to adapt follow-up questions based on previous answers -- Can be extended to array-based multi-elicitation in future versions if compelling use cases emerge +The request/response model gives agents flexibility: they control when to send elicitation requests and whether to wait for responses or continue with other work. ### How does this differ from session config options? @@ -581,11 +575,11 @@ Excellent question from PR #210 discussions. 
Both use restricted JSON Schema, bu | Aspect | Session Config Options | Elicitation | |--------|------------------------|-------------| -| **Lifecycle** | Persistent, pre-declared at session init | Transient, appears during turns | -| **Scope** | Session-wide configuration | Single turn/decision point | +| **Lifecycle** | Persistent, pre-declared at session init | Transient, request/response | +| **Scope** | Session-wide configuration | Single decision point or data collection | | **Defaults** | Required (agents must have defaults) | Optional (schema's `required` array determines mandatory fields) | -| **State management** | Client maintains full state, broadcast on changes | Agent provides response in next turn | -| **Use cases** | Model selection, session mode, persistent settings | Authentication, step-by-step decisions, one-time questions | +| **State management** | Client maintains full state, broadcast on changes | Agent receives response and decides how to proceed | +| **Use cases** | Model selection, session mode, persistent settings | Authentication, clarifying questions, one-time data collection | Session Config Options are great for "how should you run this session?" Elicitation is for "what should I do next?" From 6f6879f467fb38b099387a0ba5c413fb8415cdf8 Mon Sep 17 00:00:00 2001 From: Yordis Prieto Date: Thu, 5 Feb 2026 21:46:05 -0500 Subject: [PATCH 10/11] docs(rfd): Remove turn-based framing from elicitation specification Elicitation is a request/response pattern - agents send requests when they need information and control their own flow. Removed stopReason and turn lifecycle dependencies. 
--- docs/rfds/elicitation.mdx | 29 ++++------------------------- 1 file changed, 4 insertions(+), 25 deletions(-) diff --git a/docs/rfds/elicitation.mdx b/docs/rfds/elicitation.mdx index a7ea0942..d2d73648 100644 --- a/docs/rfds/elicitation.mdx +++ b/docs/rfds/elicitation.mdx @@ -40,7 +40,7 @@ The mechanism would: - **Form mode** (in-band): Structured data collection via JSON Schema forms - **URL mode** (out-of-band): Browser-based flows for sensitive operations like OAuth (addressing PR #330 authentication pain points) -3. **Work in turn context**: Elicitation requests are triggered when a turn ends with `stopReason: "elicitation_requested"`, allowing agents to ask questions naturally within the conversation flow. Agents send elicitation requests via a separate `session/elicitation` method (following the same request/response pattern as `session/request_permission`). Unlike Session Config Options (which are persistent), elicitation requests are transient and turn-specific. +3. **Request/response pattern**: Agents send elicitation requests via a `session/elicitation` method and receive responses. The agent controls when to send requests and whether to wait for responses before proceeding. Unlike Session Config Options (which are persistent), elicitation requests are transient. 4. **Support client capability negotiation**: Clients declare elicitation support via a structured capability object that distinguishes between `form`-based and `url`-based elicitation (following MCP's capability model). This allows clients to support one or both modalities, enables agents to pass capabilities along to MCP servers, and handles graceful degradation when clients have limited elicitation support. @@ -72,14 +72,13 @@ Clients can: This proposal follows MCP's draft elicitation specification. See [MCP Elicitation Specification](https://modelcontextprotocol.io/specification/draft/client/elicitation) for detailed guidance. 
ACP uses the same JSON Schema constraint approach and capability model, adapted for our session/turn-based architecture.

 Key differences from MCP:
-- MCP elicitation is tool-call-scoped; ACP elicitation is session/turn-scoped
+- MCP elicitation is tool-call-scoped; ACP elicitation is session-scoped
 - ACP uses `session/elicitation` method; MCP uses `elicitation/create`
 - ACP must integrate with existing Session Config Options (which also use schema constraints)
-- ACP elicitation is triggered by `stopReason: "elicitation_requested"` in turn responses

 ### Elicitation Request Structure

-When a turn ends with `stopReason: "elicitation_requested"`, the agent sends a separate elicitation request (following the same pattern as permission requests).
+Agents send elicitation requests when they need information from the user. This is a request/response pattern—the agent sends the request and waits for the client's response.

 **Example 1: Form Mode - User Selection (from PR #340)**

@@ -319,29 +318,9 @@ Clients use this schema to generate appropriate input forms, validate user input

 **Security note:** Following MCP, servers MUST NOT use form mode elicitation to request sensitive information (passwords, API keys, credentials). Sensitive data collection MUST use URL mode elicitation, which bypasses the agent and client entirely.

-### Turn Response with Elicitation Stop Reason
-
-When an agent reaches a decision point and needs structured user input, it ends the turn with `stopReason: "elicitation_requested"`:
-
-```json
-{
-  "jsonrpc": "2.0",
-  "id": 42,
-  "result": {
-    "content": [
-      {
-        "type": "text",
-        "text": "I can refactor this code in several ways. Each approach has different tradeoffs. Which strategy would you prefer?"
-      }
-    ],
-    "stopReason": "elicitation_requested"
-  }
-}
-```
-
 ### Elicitation Request

-After the turn completes with `stopReason: "elicitation_requested"`, the agent immediately sends a separate `session/elicitation` request (following the same pattern as `session/request_permission`):
+The agent sends a `session/elicitation` request when it needs information from the user:

 **Form mode example:**
 ```json

From 8f78b5243bf9bbba413c0061e638147a444497b0 Mon Sep 17 00:00:00 2001
From: Yordis Prieto
Date: Thu, 5 Feb 2026 22:52:13 -0500
Subject: [PATCH 11/11] docs(rfd): Update elicitation specification to include normative requirements and message flow diagrams

---
 docs/rfds/elicitation.mdx | 135 +++++++++++++++++++++++++++++++++++---
 1 file changed, 125 insertions(+), 10 deletions(-)

diff --git a/docs/rfds/elicitation.mdx b/docs/rfds/elicitation.mdx
index d2d73648..5f23d7fe 100644
--- a/docs/rfds/elicitation.mdx
+++ b/docs/rfds/elicitation.mdx
@@ -112,7 +112,7 @@ Agents send elicitation requests when they need information from the user. This
 {
   "mode": "url",
   "elicitationId": "github-oauth-123",
-  "url": "https://github.com/login/oauth/authorize?client_id=abc123&state=xyz789&scope=repo",
+  "url": "https://agent.example.com/connect?elicitationId=github-oauth-123",
   "message": "Please authorize access to your GitHub repositories to continue."
 }
 ```
@@ -182,6 +182,12 @@ Following MCP's approach (specifically [SEP-1036](https://modelcontextprotocol.i

 This distinction is reflected in the client capabilities model, allowing clients to declare support for one or both modalities.

+**Normative requirements:**
+- Clients declaring the `elicitation` capability MUST support at least one mode (`form` or `url`).
+- Agents MUST NOT send elicitation requests with modes that are not supported by the client.
+- For URL mode, the `url` parameter MUST contain a valid URL.
+- Agents MUST NOT return the `URLElicitationRequiredError` (code `-32042`) except when URL mode elicitation is required.
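
The mode-related requirements above, together with the `-32602` response named in the patch's Error Handling section, can be sketched as a small check. This is a minimal illustration in Python and is not part of the patch: the helper names `supported_modes` and `reject_if_unsupported` are hypothetical, the capability shape (an `elicitation` object with optional `form` and `url` keys) follows the RFD's description, and treating a missing `elicitation` field as "no support" is an assumption.

```python
from typing import Optional, Set

INVALID_PARAMS = -32602  # JSON-RPC "Invalid params", as cited in the RFD

KNOWN_MODES = ("form", "url")


def supported_modes(client_capabilities: dict) -> Set[str]:
    """Return the elicitation modes a client has declared (hypothetical helper)."""
    elicitation = client_capabilities.get("elicitation")
    if elicitation is None:
        # Assumption: no `elicitation` field means no elicitation support.
        return set()
    if not elicitation:
        # Per the RFD, an empty capability object means form mode only.
        return {"form"}
    return {mode for mode in KNOWN_MODES if mode in elicitation}


def reject_if_unsupported(params: dict, client_capabilities: dict) -> Optional[dict]:
    """Client-side check: return a JSON-RPC error object, or None if the mode is allowed."""
    mode = params.get("mode")
    if mode in supported_modes(client_capabilities):
        return None
    return {
        "code": INVALID_PARAMS,
        "message": f"Unsupported elicitation mode: {mode!r}",
    }
```

An agent could call `supported_modes` before sending a request so that it never emits an unsupported mode, while a client could use `reject_if_unsupported` as a last line of defense.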
+
 ### Restricted JSON Schema

 Aligning with [MCP's draft elicitation specification](https://modelcontextprotocol.io/specification/draft/client/elicitation), form mode elicitation uses a restricted subset of JSON Schema. Schemas are limited to flat objects with primitive properties only—the client decides how to render appropriate input UI based on the schema.
@@ -362,7 +368,7 @@ The agent sends a `session/elicitation` request when it needs information from t
     "sessionId": "...",
     "mode": "url",
     "elicitationId": "github-oauth-001",
-    "url": "https://github.com/login/oauth/authorize?client_id=abc123&state=xyz789",
+    "url": "https://agent.example.com/connect?elicitationId=github-oauth-001",
     "message": "Please authorize access to your GitHub repositories."
   }
 }
@@ -417,6 +423,81 @@ Agents should handle each state appropriately:
 - **Decline**: Handle explicit decline (e.g., use default, offer alternatives)
 - **Cancel**: Handle dismissal (e.g., use default, prompt again later)

+### Message Flow
+
+#### Form Mode Flow
+
+```mermaid
+sequenceDiagram
+    participant User
+    participant Client
+    participant Agent
+
+    Note over Agent: Agent initiates elicitation
+    Agent->>Client: session/elicitation (mode: form)
+
+    Note over User,Client: Present elicitation UI
+    User-->>Client: Provide requested information
+
+    Note over Agent,Client: Complete request
+    Client->>Agent: Return user response
+
+    Note over Agent: Continue processing with new information
+```
+
+#### URL Mode Flow
+
+```mermaid
+sequenceDiagram
+    participant UserAgent as User Agent (Browser)
+    participant User
+    participant Client
+    participant Agent
+
+    Note over Agent: Agent initiates elicitation
+    Agent->>Client: session/elicitation (mode: url)
+
+    Client->>User: Present consent to open URL
+    User-->>Client: Provide consent
+
+    Client->>UserAgent: Open URL
+    Client->>Agent: Accept response
+
+    Note over User,UserAgent: User interaction
+    UserAgent-->>Agent: Interaction complete
+    Agent-->>Client: notifications/elicitation/complete (optional)
+
+    Note over Agent: Continue processing with new information
+```
+
+#### URL Mode With Elicitation Required Error Flow
+
+```mermaid
+sequenceDiagram
+    participant UserAgent as User Agent (Browser)
+    participant User
+    participant Client
+    participant Agent
+
+    Client->>Agent: Request (e.g., tool call)
+
+    Note over Agent: Agent needs authorization
+    Agent->>Client: URLElicitationRequiredError
+    Note over Client: Client notes the original request can be retried after elicitation
+
+    Client->>User: Present consent to open URL
+    User-->>Client: Provide consent
+
+    Client->>UserAgent: Open URL
+
+    Note over User,UserAgent: User interaction
+
+    UserAgent-->>Agent: Interaction complete
+    Agent-->>Client: notifications/elicitation/complete (optional)
+
+    Client->>Agent: Retry original request (optional)
+```
+
 ### Completion Notifications for URL Mode

 Following MCP, agents MAY send a `notifications/elicitation/complete` notification when an out-of-band interaction started by URL mode elicitation is completed:
@@ -467,6 +548,16 @@ When a request cannot be processed until a URL mode elicitation is completed, th

 Any elicitations returned in the error MUST be URL mode elicitations with an `elicitationId`. Clients may automatically retry the failed request after receiving a completion notification.

+### Error Handling
+
+Agents MUST return standard JSON-RPC errors for common failure cases:
+
+- When a request cannot be processed until a URL mode elicitation is completed: `-32042` (`URLElicitationRequiredError`)
+
+Clients MUST return standard JSON-RPC errors for common failure cases:
+
+- When the agent sends a `session/elicitation` request with a mode not declared in client capabilities: `-32602` (Invalid params)
+
 ### Client Capabilities

 Clients declare elicitation support during the `initialize` phase via `ClientCapabilities`, following MCP's capability model pattern.
The capability distinguishes between `form`-based and `url`-based elicitation:
@@ -538,6 +629,21 @@ Agents must gracefully handle clients that don't include this field (assumed to
 - Agents should not require elicitation responses to continue operating
 - Following MCP: an empty capability object (`"elicitation": {}`) is equivalent to declaring support for form mode only

+### Statefulness
+
+Most practical uses of elicitation require that the agent maintain state about users:
+
+- Whether required information has been collected (e.g., the user's display name via form mode elicitation)
+- Status of resource access (e.g., API keys or a payment flow via URL mode elicitation)
+
+Agents implementing elicitation MUST securely associate this state with individual users. Specifically:
+
+- State MUST NOT be associated with session IDs alone
+- State storage MUST be protected against unauthorized access
+- For remote agents, user identification MUST be derived from credentials acquired during authorization when possible (e.g., `sub` claim)
+
+Agents MUST NOT rely on client-provided user identification without agent-side verification, as this can be forged.
+
 ## Frequently asked questions

 ### Can an agent request multiple pieces of information at once?
@@ -593,13 +699,21 @@ From PR #330: URL-mode elicitation allows agents to request authentication witho
 - The agent MUST verify user identity to prevent phishing attacks

 **Security requirements** (from MCP draft spec):
-- Agents MUST NOT include sensitive information in the URL
-- Agents MUST NOT provide pre-authenticated URLs
-- Agents SHOULD use HTTPS URLs
-- Clients MUST NOT open URLs without explicit user consent
-- Clients MUST show the full URL to the user before consent
-- Clients MUST open URLs in a secure context that prevents inspection (e.g., SFSafariViewController on iOS, not WKWebView)
-- Clients SHOULD highlight the domain to mitigate subdomain spoofing
+
+Agents requesting URL mode elicitation:
+- MUST NOT include sensitive information about the end-user (credentials, PII, etc.) in the URL
+- MUST NOT provide a URL which is pre-authenticated to access a protected resource
+- SHOULD NOT include URLs intended to be clickable in any field of a form mode elicitation request
+- SHOULD use HTTPS URLs for non-development environments
+
+Clients implementing URL mode elicitation:
+- MUST NOT automatically pre-fetch the URL or any of its metadata
+- MUST NOT open the URL without explicit consent from the user
+- MUST show the full URL to the user for examination before consent
+- MUST open the URL in a secure manner that does not enable the client or LLM to inspect the content or user inputs (e.g., SFSafariViewController on iOS, not WKWebView)
+- SHOULD highlight the domain of the URL to mitigate subdomain spoofing
+- SHOULD have warnings for ambiguous/suspicious URIs (e.g., containing Punycode)
+- SHOULD NOT render URLs as clickable in any field of an elicitation request, except for the `url` field in a URL mode elicitation request (with the restrictions detailed above)

 **Phishing prevention**: The agent MUST verify that the user who started the elicitation request is the same user who completes the OAuth flow.
This is typically done by checking session cookies against the user identity from the MCP authorization.
@@ -644,7 +758,7 @@ This is the same reasoning behind keeping authentication flows (URL mode) distin

 ### What about validating user input on the client side?

-Clients should validate user input against the provided JSON Schema **before** sending the response to the agent. This prevents invalid data from reaching the agent and provides immediate feedback to the user.
+Clients SHOULD validate user input against the provided JSON Schema **before** sending the response to the agent. This prevents invalid data from reaching the agent and provides immediate feedback to the user. Agents SHOULD also validate received data matches the requested schema, as defense-in-depth against malformed or malicious responses.

 If the agent requires additional validation beyond what's expressible in JSON Schema:
 1. Agent validates the received value in the next turn
@@ -655,6 +769,7 @@ For v1, we recommend starting with JSON Schema validation only. If more complex

 ## Revision history

+- 2026-02-06: Spec alignment review. Fixed OAuth URL examples to use agent connect endpoints (not direct OAuth provider URLs) per MCP phishing prevention guidance. Added normative requirements section (MUST support at least one mode, MUST NOT send unsupported modes, url MUST be valid). Added Error Handling section with `-32042` and `-32602` error codes. Added message flow diagrams (form mode, URL mode, URL mode with error). Expanded safe URL handling requirements (pre-fetch prohibition, Punycode warnings, non-clickable URLs in form fields). Added server-side schema validation SHOULD requirement. Added Statefulness subsection with normative requirements for state association and user identification.
 - 2026-02-05: Major revision to align with MCP draft elicitation specification. Updated enum schema to use `oneOf`/`anyOf` with `const`/`title` instead of `enumNames`. Added multi-select array support. Added `pattern` field for strings. Added URLElicitationRequiredError (-32042) section. Added completion notifications section. Expanded security considerations including phishing prevention. Updated all examples to match MCP draft spec format.
 - 2026-02-05: Initial MCP alignment. Removed explicit "input types" in favor of restricted JSON Schema (client decides rendering). Added `mode` field (`form`/`url`). Updated capability model to use `form`/`url` sub-objects per MCP SEP-1036. Added three-action response model (`accept`/`decline`/`cancel`). Removed `password` type (MCP prohibits sensitive data in form mode).
 - 2026-01-12: Initial draft based on community discussions in PR #340 (user selection), PR #210 (session config alignment), and PR #330 (authentication use cases). Aligned with MCP elicitation patterns.