RFD: Agent-to-Client Logging #392
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,178 @@ | ||
| --- | ||
| title: "Agent-to-Client Logging" | ||
| --- | ||
|
|
||
| Author(s): [@chazcb](https://github.com/chazcb) | ||
|
|
||
| ## Elevator pitch | ||
|
|
||
| > What are you proposing to change? | ||
|
|
||
| Introduce a capability-gated `log` notification (agent → client) so agents can share diagnostic messages without polluting conversation history. | ||
|
|
||
| ## Status quo | ||
|
|
||
| > How do things work today and what problems does this cause? Why would we change things? | ||
|
|
||
| Today, agents have limited ways to inform clients about status that might impact their experience. The two options are: | ||
|
|
||
| 1. **JSON-RPC errors**: Terminate the request immediately with an informative error message the client can display to the user | ||
| 2. **`session/update`**: Update conversation history with diagnostic information in an `agent_message_chunk` or another chat-history notification | ||
|
|
||
| But neither option works when: | ||
|
|
||
| - There's no active JSON-RPC request to attach an error response to | ||
| - We don't want to fail the request (e.g., retries, rate limiting, fallback selection) | ||
| - There's no session yet (diagnostics after `initialize` but before `session/new`) | ||
| - We don't want to put diagnostics in chat history, force clients to filter non-chat content, or fake chat content just to send diagnostic logs | ||
|
|
||
| Without a way to surface these situations, users can be left confused when their ACP connection or session seems to stall or behave unexpectedly. | ||
|
|
||
| ## What we propose to do about it | ||
|
|
||
| > What are you proposing to improve the situation? | ||
|
|
||
| Add a `log` JSON-RPC notification that is explicitly capability-gated. Clients opt in via `clientCapabilities.logging`; agents only send logs to clients that declare the capability. Clients can optionally specify a minimum log level. | ||
|
|
||
| ```json | ||
| { | ||
| "method": "initialize", | ||
| "params": { | ||
| "clientCapabilities": { | ||
| "logging": { | ||
| "level": "warning" | ||
| } | ||
| } | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| If `level` is omitted, agents should default to `info`. Agents MUST NOT send logs below the client's requested level. | ||
|
|
||
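The gating and level rules above can be sketched as follows. This is a minimal illustration, not part of the proposal: the helper names `min_level` and `should_send` are hypothetical, and it assumes `clientCapabilities.logging` is always an object, as in the example.

```python
# Level names come from the RFD; everything else here is a hypothetical sketch.
LEVELS = ["debug", "info", "notice", "warning", "error", "critical", "alert", "emergency"]
SEVERITY = {name: rank for rank, name in enumerate(LEVELS)}  # higher rank = more severe
DEFAULT_LEVEL = "info"


def min_level(initialize_params):
    """Return the client's minimum level, or None if logging was not declared."""
    caps = initialize_params.get("clientCapabilities") or {}
    logging_cap = caps.get("logging")
    if logging_cap is None:
        return None  # capability absent: the agent must not send `log` at all
    return logging_cap.get("level", DEFAULT_LEVEL)


def should_send(initialize_params, level):
    """True only if the client opted in and `level` meets its threshold."""
    threshold = min_level(initialize_params)
    if threshold is None:
        return False
    return SEVERITY[level] >= SEVERITY[threshold]
```

An agent would call `should_send` with the stored `initialize` params before emitting each log. Unknown level names raise `KeyError` in this sketch; a real implementation would need to decide how to treat them.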
| ### Method | ||
|
|
||
| ```json | ||
| { | ||
| "jsonrpc": "2.0", | ||
| "method": "log", | ||
| "params": { | ||
| "level": "warning", | ||
| "message": "Backing model rate limited, retrying in 5 seconds...", | ||
| "sessionId": "abc-123", | ||
| "logger": "model", | ||
| "timestamp": "2025-01-21T10:30:00Z", | ||
| "data": { "model": "claude-3", "retryIn": 5 } | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| ### Fields | ||
|
|
||
| | Field | Type | Required | Description | | ||
| |-------|------|----------|-------------| | ||
| | `level` | `LogLevel` | Yes | RFC 5424 severity: `debug`, `info`, `notice`, `warning`, `error`, `critical`, `alert`, `emergency` | | ||
| | `message` | `string` | Yes | Human-readable summary safe for display | | ||
| | `sessionId` | `SessionId` | No | Omit for connection-wide messages | | ||
| | `logger` | `string` | No | Component name (e.g., "model", "auth") | | ||
| | `timestamp` | `string` | No | ISO 8601 timestamp if provided | | ||
| | `data` | `object` | No | Opaque context (clients must not depend on structure) | | ||
| | `_meta` | `object` | No | Extensibility metadata | | ||
|
|
||
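As an illustration of the field table, here is one way an agent might assemble the notification. The `LogParams` class and its `to_notification` method are hypothetical; only the field names and the `log` method name come from the proposal, and `_meta` is left out for brevity.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogParams:
    # Field names follow the RFD's table; the class itself is a hypothetical sketch.
    level: str
    message: str
    sessionId: Optional[str] = None  # omit for connection-wide messages
    logger: Optional[str] = None
    timestamp: Optional[str] = None
    data: Optional[dict] = None

    def to_notification(self) -> dict:
        """Build a JSON-RPC 2.0 notification (no `id`), dropping unset optional fields."""
        params = {k: v for k, v in vars(self).items() if v is not None}
        return {"jsonrpc": "2.0", "method": "log", "params": params}


note = LogParams(
    level="warning",
    message="Backing model rate limited, retrying in 5 seconds...",
    logger="model",
)
print(json.dumps(note.to_notification()))
```

Dropping unset fields rather than sending `null` keeps a connection-wide log distinguishable by the absence of `sessionId`, which matches the "omit" wording in the table.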
| ### Semantics | ||
|
|
||
| - **Capability-gated**: Agents MUST NOT send `log` notifications to clients that did not declare `clientCapabilities.logging`. | ||
| - **Level filtering**: Agents MUST NOT send logs below the client's requested level (default: `info`). | ||
| - **Informational only**: Clients MAY display logs but MUST NOT treat them as protocol-affecting or control flow signals. | ||
| - **Best-effort delivery**: Logs are not reliable transport and are not replayed on reconnect. | ||
| - **Session optional**: `sessionId` is optional; logs that omit it are connection-wide. | ||
| - **Manageable volume**: Implementations should keep volume low and user-relevant. | ||
|
|
||
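On the client side, the "informational only" and "session optional" rules might look like the sketch below. The `LogPane` class is hypothetical; it only stores logs for display, groups them by `sessionId` (with `None` standing for connection-wide), and silently drops anything malformed instead of raising.

```python
from collections import defaultdict

CONNECTION = None  # key used for logs that carry no sessionId


class LogPane:
    """Hypothetical client-side sink: stores logs for display, nothing else."""

    def __init__(self):
        self.by_session = defaultdict(list)

    def handle(self, notification):
        """Record a `log` notification; never raise or alter protocol state."""
        if notification.get("method") != "log":
            return False
        params = notification.get("params") or {}
        level, message = params.get("level"), params.get("message")
        if not isinstance(level, str) or not isinstance(message, str):
            return False  # malformed log: drop it, since delivery is best-effort anyway
        self.by_session[params.get("sessionId", CONNECTION)].append((level, message))
        return True
```

The boolean return value is only a convenience for the caller's own bookkeeping; nothing in the sketch feeds back into request handling.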
| ### Method naming | ||
|
|
||
| `log` follows ACP's convention for connection-level operations (e.g., `initialize`, `authenticate`) rather than introducing a new namespace. | ||
|
|
||
| ## Alternatives considered | ||
|
|
||
| ### Add a notification type to `session/update` | ||
|
|
||
| Extend `session/update` with a new notification type for diagnostics. This keeps diagnostics within the existing session machinery but has drawbacks: it requires a session (can't send connection-wide diagnostics), risks polluting chat history unless clients explicitly filter, and overloads `session/update` with non-conversation concerns. Additionally, ACP specifies that session history is replayed on `session/load`, but diagnostic logs are transient and shouldn't be replayed—they're not part of the conversation. | ||
|
|
||
| ### Structured `status` notification | ||
|
|
||
| Instead of general-purpose logging, define a more structured `status` notification explicitly for lightweight status info—similar to Claude Code's interim status messages ("Thinking...", "Searching files..."). This would be scoped to either the current session or the agent/connection level. | ||
|
|
||
| **Tradeoffs**: More constrained semantics could be clearer for clients, but less flexible. Logging with severity levels is a well-understood pattern; inventing a new "status" abstraction may not add value over `log` with `level: info`. | ||
|
Member
Makes sense. I do feel like this more structured status would be super helpful, but understand that you are trying to get at a different use case here. I think the clients would benefit a lot from these more interactive status messages, but you are proposing something else here.
Contributor
Author
Yeah this proposal is really about debug and / or error logging, specifically for developers trying to get a better sense of why something went wrong or what is happening under the hood. I would imagine some UIs might provide the "show logs" pane so end users can also debug, but this is really about getting the info to the client and not about structuring dynamic UIs.
Member
Cool yeah then we are aligned here. Basically allows a client to offer some sort of log UI. It does now beg the question for me though 😄 :
Member
(this can totally be a "we do this later" thing, just a thought I had) |
||
|
|
||
| ### Explicit progress or heartbeat notification | ||
|
|
||
| Define a `progress` or `heartbeat` notification specifically for long-running operations, with structured fields like `percentComplete`, `estimatedTimeRemaining`, etc. | ||
|
|
||
| **Tradeoffs**: Progress is better suited to `session/update` since it's about task state. Heartbeats could be useful but solve a different problem (connection liveness) than diagnostics. A `log` notification can express "retrying in 5s" without requiring structured progress semantics. | ||
|
|
||
| ### Transport-level mechanisms | ||
|
|
||
| Use HTTP headers, WebSocket ping payloads, or other transport-level channels for status. | ||
|
|
||
| **Tradeoffs**: ACP is transport-agnostic. Relying on transport-specific mechanisms would fragment implementations and lose capability negotiation. | ||
|
|
||
| ## Shiny future | ||
|
|
||
| > How will things play out once this feature exists? | ||
|
|
||
| - **Clear connection feedback**: Clients can surface warnings (rate limits, retries, fallbacks) so users understand what's happening. | ||
|
Member
Regarding this: I think for clients to give a meaningful, user-facing status update, we may want to go with the more structured status approach? With the current proposal, speaking from the client perspective, I don't know that we would surface these as they would potentially be too noisy/unpredictable.
Contributor
Author
Agree this should get removed from proposal language and proposal should focus more around logging for client visibility into internals, rather than for UIs.
Member
Perfect thanks! |
||
| - **No more mysterious stalls**: Users see why things are slow (retries, rate limits) rather than assuming a hang. | ||
| - **Better developer experience**: Diagnostics are visible without requiring OTEL or external logging. | ||
| - **No compatibility risk**: Capability gating means legacy clients are unaffected. | ||
|
|
||
| ## Implementation details and plan | ||
|
|
||
| > Tell me more about your implementation. What is your detailed implementation plan? | ||
|
|
||
| 1. **Schema**: Add a `LogLevel` enum and a `LogNotification` params schema with the fields above. | ||
| 2. **Capabilities**: Add `clientCapabilities.logging` with optional `level` field for minimum severity filtering. | ||
| 3. **Protocol**: Add `log` to method tables and route it through notification handling. | ||
| 4. **Docs**: Update protocol docs and examples to show capability negotiation and sample logs. | ||
|
|
||
| ## Frequently asked questions | ||
|
|
||
| > What questions have arisen over the course of authoring this document or during subsequent discussions? | ||
|
|
||
| ### Why not use `session/update`? | ||
|
|
||
| `session/update` represents conversation state. Logs are diagnostic metadata and should not appear in chat history or require clients to filter out non-conversation content. `session/update` also can't represent connection-wide issues because it requires `sessionId`. | ||
|
|
||
| ### Why not send diagnostics as agent text messages? | ||
|
|
||
| Agent messages are persistent conversation content. Logs are ephemeral status and should not be reloaded or forked with the session. Agent text also lacks severity levels and would require ad-hoc parsing to separate real answers from diagnostics. | ||
|
|
||
| ### Why not use a separate channel (stderr, SSE side channel, etc.)? | ||
|
|
||
| ACP is transport-agnostic. A protocol-level log works uniformly across stdio, WebSocket, and HTTP, reuses capability negotiation, and allows optional session scoping without inventing a parallel channel. | ||
|
|
||
| ### Why not return errors on `prompt()`? | ||
|
|
||
| JSON-RPC errors terminate the request. Many conditions (rate limiting, retries, fallback selection) are non-fatal and should not end the run. Logs allow notification without aborting. | ||
|
|
||
| ### Are logs ordered relative to other notifications? | ||
|
|
||
| No strict guarantees. Implementations may keep logs ordered with other notifications for readability, but clients must treat them as best-effort informational messages. | ||
|
|
||
| ### Are logs replayed after reconnect? | ||
|
|
||
| No. Logs are not part of session state and are not replayed. | ||
|
|
||
| ### How does this relate to Agent Telemetry Export? | ||
|
|
||
| They are complementary: `log` is low-volume, user-facing diagnostics in-band; OTEL, as currently proposed, is for high-volume, developer/ops telemetry out-of-band. See `/docs/rfds/agent-telemetry-export`. | ||
|
|
||
| ### Is this a breaking change? | ||
|
|
||
| No. It is opt-in via capability negotiation; older clients won't receive notifications they don't understand. | ||
|
|
||
| ### Why RFC 5424 log levels instead of error/warning/info? | ||
|
|
||
| RFC 5424 is widely used and aligns with MCP and common logging libraries. Clients can map to simpler categories in their UI. | ||
|
|
||
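For a client that only distinguishes three categories, the mapping mentioned above could be as small as this. The bucketing is an assumption made for illustration; the RFD does not prescribe one, and `ui_category` is a hypothetical helper.

```python
# Hypothetical three-way bucketing; the RFD does not prescribe any mapping.
UI_CATEGORY = {
    "debug": "info",
    "info": "info",
    "notice": "info",
    "warning": "warning",
    "error": "error",
    "critical": "error",
    "alert": "error",
    "emergency": "error",
}


def ui_category(level):
    """Collapse an RFC 5424 level name to info/warning/error; unknown names fall back to info."""
    return UI_CATEGORY.get(level, "info")
```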
| ## Revision history | ||
|
|
||
| - **2025-01-21**: Initial draft | ||
I was going to say as a notification we could just add this, but having some sort of filter totally makes sense here.
Another option would be the default filter level is "error" and the client can opt-in to allowing more?
This also gives me some pause as it feels like we are mixing configuration with capabilities, so I wonder if this needs to live somewhere else, but just a gut reaction
Hmm, agree on mixing config and capabilities. At the same time, I don't like needing to add new methods for setting logging levels, especially when you want to init a session with a specific logging level.
For our debug logging, it's important that we can get logs immediately after initialization. Other ways to accomplish that:
Thoughts on those options?