Skip to content

[codex-analytics] replay protocol-native item timing#20239

Closed
rhan-oai wants to merge 2 commits intorhan/app-server-item-timingfrom
pr20239
Closed

[codex-analytics] replay protocol-native item timing#20239
rhan-oai wants to merge 2 commits intorhan/app-server-item-timingfrom
pr20239

Conversation

@rhan-oai
Copy link
Copy Markdown
Collaborator

@rhan-oai rhan-oai commented Apr 29, 2026

Why

Once protocol-native lifecycle timing is available on the public app-server surface, replay consumers need to preserve it instead of rebuilding partial legacy-only events. Keeping that adapter layer separate makes it clear which changes are about consumption rather than production or wire shape.

What changed

  • Carries native timing through the exec JSONL bridge.
  • Replays timestamped app-server items into the TUI legacy event path for command execution, collaboration, web search, image generation, and patch flows.
  • Updates TUI and exec tests so replayed items retain the new lifecycle fields.

Verification

  • cargo test -p codex-exec event_processor_with_json_output
  • cargo test -p codex-tui --no-run

Stack created with Sapling. Best reviewed with ReviewStack.

@rhan-oai rhan-oai force-pushed the pr20239 branch 4 times, most recently from 93b611d to a5f2318 Compare April 29, 2026 18:53
rhan-oai added a commit that referenced this pull request Apr 29, 2026
## Why

Codex analytics needs a typed seam for app-server-originated
request/response traffic so future tool-approval analytics can consume
those facts without adding bespoke callsite tracking each time. Server
responses arrive as JSON-RPC `id + result` payloads, so analytics has to
reconstruct the matching typed response from the original typed request
while that request context still exists in app-server.

This also puts analytics on the app-server outbound path, which needs to
avoid keeping the runtime alive during shutdown. The final ownership fix
keeps the normal strong auth-manager retention in analytics and makes
the external-auth refresh bridge hold a weak back-reference to
`OutgoingMessageSender`, breaking the runtime cycle at the bridge
boundary instead of exposing retention policy through the analytics
client API.

## What changed

- Adds typed `ServerRequest` and `ServerResponse` analytics facts, plus
`AnalyticsEventsClient::track_server_request` and
`track_server_response`.
- Renames the existing client-side facts to `ClientRequest` and
`ClientResponse` so reducers can distinguish client-to-server traffic
from server-to-client traffic.
- Adds `ServerRequest::response_from_result`, allowing a stored typed
request to decode the matching typed server response from a raw JSON-RPC
result payload.
- Threads `AnalyticsEventsClient` through `OutgoingMessageSender` and
records targeted server requests, replayed targeted requests, and
matching targeted responses with the responding connection id needed for
correlation.
- Intentionally leaves broadcast server requests/responses out of
analytics for now because the current model is per connection, while
broadcasts fan one logical request out across multiple connections.
- Breaks the app-server shutdown cycle by storing
`Weak<OutgoingMessageSender>` in `ExternalAuthRefreshBridge` and
upgrading it only when an external-auth refresh is actually requested.
- Keeps reducer ingestion of the new server-side facts as no-ops for
now; this PR is plumbing for later tool-approval analytics work.

## Verification

- `cargo test -p codex-analytics`
- `cargo test -p codex-app-server outgoing_message::tests::`
- Covers typed-response reconstruction plus the targeted, replayed,
broadcast-exclusion, and response-attribution analytics paths.

## Follow-up

This PR intentionally stops at ingestion plumbing, so `ServerRequest`
and `ServerResponse` facts are still reducer no-ops. Once a follow-up PR
adds real downstream analytics output for those facts:

- replace the temporary pre-reducer observation seam with reducer tests
for the emitted event shape;
- add end-to-end coverage in `app-server/tests/suite/v2/analytics.rs`
for the real app-server workflow and captured analytics payload;
- remove the temporary sender-level observer tests added here in favor
of the real-output coverage above.

---

[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17088).
* #18748
* #18747
* #17090
* #17089
* #20241
* #20239
* __->__ #17088
Base automatically changed from pr17088 to main April 29, 2026 19:56
@rhan-oai rhan-oai force-pushed the pr20239 branch 5 times, most recently from e238b73 to 51e71ee Compare April 29, 2026 22:54
@rhan-oai rhan-oai force-pushed the pr20239 branch 2 times, most recently from 5210475 to d5bedde Compare April 30, 2026 00:08
@rhan-oai rhan-oai force-pushed the pr20239 branch 2 times, most recently from 4dfe7b0 to e256fae Compare April 30, 2026 01:24
@rhan-oai rhan-oai changed the base branch from main to rhan/thread-analytics-state April 30, 2026 01:41
@rhan-oai rhan-oai force-pushed the rhan/thread-analytics-state branch from 3fc7386 to d877e6a Compare April 30, 2026 20:58
@rhan-oai rhan-oai force-pushed the pr20239 branch 4 times, most recently from 3ba9aa2 to effe7c6 Compare April 30, 2026 22:28
@rhan-oai rhan-oai changed the title [codex-analytics] add protocol-native item timestamps [codex-analytics] replay protocol-native item timing Apr 30, 2026
@rhan-oai rhan-oai changed the base branch from rhan/thread-analytics-state to rhan/app-server-item-timing April 30, 2026 22:29
@rhan-oai rhan-oai force-pushed the rhan/app-server-item-timing branch from 65ef7bc to 84e66f0 Compare April 30, 2026 22:57
@rhan-oai rhan-oai force-pushed the rhan/app-server-item-timing branch from 84e66f0 to 0cf1a11 Compare April 30, 2026 23:09
@rhan-oai rhan-oai force-pushed the rhan/app-server-item-timing branch from 0cf1a11 to e15f9f5 Compare April 30, 2026 23:32
@rhan-oai rhan-oai force-pushed the rhan/app-server-item-timing branch from e15f9f5 to e77df6b Compare May 1, 2026 00:17
rhan-oai added a commit that referenced this pull request May 1, 2026
## Why

Several analytics event families need the same per-thread attribution
state: the app-server client/runtime associated with a thread and, for
lifecycle-oriented events, the thread metadata captured during
initialization. Keeping connection ids and lifecycle metadata in
separate maps made each consumer rebuild the same thread context and
made subagent attribution harder to resolve consistently.

## What changed

- Replaces the separate thread connection and metadata maps with one
reducer-owned `threads` map.
- Routes guardian, compaction, turn-steer, and turn analytics through
shared thread-state lookups while preserving turn-origin attribution for
turn events and request-origin attribution for steer events.
- Lets newly observed spawned subagent threads inherit their parent
thread connection so later thread-scoped analytics can resolve through
the same state model.
- Adds regression coverage for standalone `SubAgentThreadStarted`
publication plus the `SubAgentSource::ThreadSpawn` parent fallback
through a thread-scoped consumer that depends on inherited connection
state.

## Verification

- `cargo test -p codex-analytics`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20300).
* #18748
* #18747
* #17090
* #17089
* #20239
* #20515
* #20514
* __->__ #20300
@rhan-oai rhan-oai force-pushed the rhan/app-server-item-timing branch 3 times, most recently from 087ce59 to a862056 Compare May 1, 2026 04:49
@rhan-oai rhan-oai force-pushed the rhan/app-server-item-timing branch from a862056 to 158fe58 Compare May 1, 2026 06:14
@rhan-oai rhan-oai force-pushed the rhan/app-server-item-timing branch from 158fe58 to 27e04d5 Compare May 1, 2026 15:28
@rhan-oai
Copy link
Copy Markdown
Collaborator Author

rhan-oai commented May 1, 2026

Superseded by the collapsed two-PR timing split; the remaining compile-required changes now live in #20515.

@rhan-oai rhan-oai closed this May 1, 2026
manoelcalixto added a commit to Electivus/electivus-codex that referenced this pull request May 3, 2026
* [codex-analytics] ingest server requests and responses (openai#17088)

## Why

Codex analytics needs a typed seam for app-server-originated
request/response traffic so future tool-approval analytics can consume
those facts without adding bespoke callsite tracking each time. Server
responses arrive as JSON-RPC `id + result` payloads, so analytics has to
reconstruct the matching typed response from the original typed request
while that request context still exists in app-server.

This also puts analytics on the app-server outbound path, which needs to
avoid keeping the runtime alive during shutdown. The final ownership fix
keeps the normal strong auth-manager retention in analytics and makes
the external-auth refresh bridge hold a weak back-reference to
`OutgoingMessageSender`, breaking the runtime cycle at the bridge
boundary instead of exposing retention policy through the analytics
client API.

## What changed

- Adds typed `ServerRequest` and `ServerResponse` analytics facts, plus
`AnalyticsEventsClient::track_server_request` and
`track_server_response`.
- Renames the existing client-side facts to `ClientRequest` and
`ClientResponse` so reducers can distinguish client-to-server traffic
from server-to-client traffic.
- Adds `ServerRequest::response_from_result`, allowing a stored typed
request to decode the matching typed server response from a raw JSON-RPC
result payload.
- Threads `AnalyticsEventsClient` through `OutgoingMessageSender` and
records targeted server requests, replayed targeted requests, and
matching targeted responses with the responding connection id needed for
correlation.
- Intentionally leaves broadcast server requests/responses out of
analytics for now because the current model is per connection, while
broadcasts fan one logical request out across multiple connections.
- Breaks the app-server shutdown cycle by storing
`Weak<OutgoingMessageSender>` in `ExternalAuthRefreshBridge` and
upgrading it only when an external-auth refresh is actually requested.
- Keeps reducer ingestion of the new server-side facts as no-ops for
now; this PR is plumbing for later tool-approval analytics work.

## Verification

- `cargo test -p codex-analytics`
- `cargo test -p codex-app-server outgoing_message::tests::`
- Covers typed-response reconstruction plus the targeted, replayed,
broadcast-exclusion, and response-attribution analytics paths.

## Follow-up

This PR intentionally stops at ingestion plumbing, so `ServerRequest`
and `ServerResponse` facts are still reducer no-ops. Once a follow-up PR
adds real downstream analytics output for those facts:

- replace the temporary pre-reducer observation seam with reducer tests
for the emitted event shape;
- add end-to-end coverage in `app-server/tests/suite/v2/analytics.rs`
for the real app-server workflow and captured analytics payload;
- remove the temporary sender-level observer tests added here in favor
of the real-output coverage above.

---

[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17088).
* openai#18748
* openai#18747
* openai#17090
* openai#17089
* openai#20241
* openai#20239
* __->__ openai#17088

* [tool_suggest] Improve tool_suggest triggering conditions. (openai#20091)

## Summary
- Tighten `tool_suggest` guidance so it prefers explicit plugin install
requests, while still allowing a connector install when the relevant
plugin is already installed and a needed connector from that plugin is
missing.
- Tell the model not to call `tool_suggest` in parallel with other
tools.

## Testing
- `cargo test -p codex-tools tool_suggest`
- `cargo test -p codex-core tool_suggest`

* app-server: fix outgoing sender test setup (openai#20258)

## Why

[openai#17088](openai#17088) changed
`OutgoingMessageSender::new` to require an `AnalyticsEventsClient`, but
one `command_exec` test added earlier on `main` still called the old
one-argument constructor. That leaves current `main` failing to compile
in Bazel and argument-comment-lint jobs.

## What changed

- Pass `AnalyticsEventsClient::disabled()` to the missed
`OutgoingMessageSender::new` test call site in `command_exec.rs`.

## Verification

- `cargo test -p codex-app-server
timeout_or_cancellation_reports_cancellation_without_timeout_exit_code`

* [app-server] type client response payloads (openai#20050)

## Why

`pr17088` adds typed server-originated request/response plumbing, but
successful client responses are still erased into bare JSON-RPC `result`
values before app-server can make any typed decision about them.

This precursor PR keeps successful client responses typed until the
outgoing response seam. It is intentionally limited to
protocol/app-server plumbing so the analytics behavior change can review
separately on top.

## What changed

- Add `ClientResponsePayload` as the pre-serialization client response
body type.
- Route app-server successful response paths through the typed payload
seam while preserving existing handler-local analytics behavior.
- Keep `InterruptConversation` JSON-RPC-only because it has no
`ClientResponse` variant.
- Move the new payload conversion tests into a dedicated protocol test
module.

## Verification

- `cargo check -p codex-app-server`
- `cargo test -p codex-app-server-protocol`

* Require remote plugin detail before uninstall (openai#19966)

## Summary
- Fetch remote plugin detail before sending the uninstall request.
- Use the detail response to derive the marketplace namespace and plugin
name for cache cleanup.
- Stop the uninstall before the backend POST if detail lookup fails, so
backend state and local cache state do not diverge.

## Testing
- `just fmt`
- `cargo test -p codex-app-server plugin_uninstall`
- `cargo test -p codex-core-plugins`
- `git diff --check`

* [app-server] centralize client response analytics (openai#20059)

## Why

The precursor PR keeps successful client responses typed until
app-server's outgoing response seam. This follow-up uses that seam to
move successful client-response analytics out of individual handlers and
into the shared sender path, while keeping filtering decisions inside
`codex-analytics`.

## What changed

- Emit successful client-response analytics centrally from
`OutgoingMessageSender::send_response`.
- Remove duplicate handler-local response tracking for the current
thread/turn lifecycle responses.
- Keep analytics ingestion selective inside `AnalyticsEventsClient`, so
unrelated client traffic is ignored before cloning or boxing.
- Collapse client-response analytics facts onto one typed path and
normalize payloads in the reducer.
- Add direct client-filter coverage plus sender-level coverage for the
centralized forwarding path.

## Verification

- `cargo test -p codex-analytics`
- `cargo test -p codex-app-server outgoing_message::tests --lib`

* Fallback login callback port when default is busy (openai#19334)

## Summary
- Keep the preferred ChatGPT login callback port `1455` first.
- Preserve the existing `/cancel` recovery for stale Codex login
servers.
- Fall back to the registered localhost callback port `1457` when `1455`
remains unavailable.

## Why
Cursor and Codex Desktop both use the ChatGPT account login callback
server. On Windows, Cursor can already be listening on `127.0.0.1:1455`
/ `[::1]:1455`, causing Codex Desktop sign-in to fail with:

`Local callback port 1455 is already in use on this machine.`

Codex already attempted to cancel a stale Codex login server on that
port, but if the listener does not release the port, the old behavior
was to fail. The new behavior falls back to `1457`, which matches the
fixed redirect URI being registered server-side in
`openai/openai#863817`. This keeps the OAuth `redirect_uri` inside
Hydra's exact allow-list instead of choosing an arbitrary ephemeral
port.

## Validation
- `just fmt`
- `cargo test -p codex-login`
- `git diff --check HEAD~1..HEAD`

* [apps] Add apps MCP path override (openai#20231)

Summary

- Add `[features.apps_mcp_path_override]` config with a `path` field for
overriding only the built-in apps MCP path.
- Keep existing host/base URL derivation unchanged and append the
configured path after that base.
- Regenerate the config schema with the custom feature-config case.

Test Plan

- Not run for latest revision; only `just fmt` and `just
write-config-schema` were run.
- Earlier revision: `cargo test -p codex-features`
- Earlier revision: `cargo test -p codex-mcp`

* docs: discourage `#[async_trait]` and `#[allow(async_fn_in_trait)]` (openai#20242)

## Why

We have run into two avoidable problems when introducing async trait
APIs in Rust:

- `#[async_trait]` has caused materially worse build times in this
repository.
- `#[allow(async_fn_in_trait)]` makes it too easy to ship a public trait
without spelling out whether the returned future is `Send`, which hides
an important part of the trait contract.

We already have a good example of the preferred alternative in
[openai#16630](openai#16630) /
[`3c7f013f9735`](openai@3c7f013f9735),
but that guidance currently lives only as prior art in the codebase.
This PR documents the rule in `AGENTS.md` so contributors are more
likely to follow the native RPITIT pattern before these two shortcuts
spread further.

## What Changed

- added Rust guidance in `AGENTS.md` discouraging both `#[async_trait]`
and `#[allow(async_fn_in_trait)]`
- pointed contributors to the native RPITIT pattern with explicit `Send`
bounds on the returned future
- clarified that implementations may still use `async fn` when they
satisfy that trait contract

## Verification

- docs-only change; no tests run

* Escape turn metadata headers as ASCII JSON (openai#19620)

## Why

`x-codex-turn-metadata` is sent as an HTTP/WebSocket header, but Codex
was serializing the metadata JSON with raw UTF-8 string contents. When a
workspace path contains non-ASCII characters, common HTTP stacks can
reject or corrupt that header before the request reaches the provider.

Fixes openai#17468. Also addresses the duplicate WebSocket report in openai#19581.

## What changed

- Added `codex_utils_string::to_ascii_json_string`, a shared helper that
serializes JSON normally while escaping non-ASCII string content as
`\uXXXX`.
- Switched turn metadata header serialization, including merged
Responses API client metadata, to use the ASCII-safe JSON helper.
- Added coverage for non-ASCII workspace paths and non-ASCII client
metadata while preserving the same parsed JSON values.

## Verification

- `cargo test -p codex-utils-string`
- `cargo test -p codex-core turn_metadata`
- `just bazel-lock-check`

* [mcp] Fix plugin MCP approval policy. (openai#19537)

Plugin MCP servers are loaded from plugin manifests rather than
top-level `[mcp_servers]`, so their tool approval preferences need to be
stored and applied through the owning plugin config. Without this,
choosing "Always allow" for a plugin MCP tool could write a preference
that was not reliably used on later tool calls.

## Summary
- Add plugin-scoped MCP policy config under
`plugins.<plugin>.mcp_servers`, including server enablement, tool
allow/deny lists, server defaults, and per-tool approval modes.
- Overlay plugin MCP policy onto manifest-provided server configs when
plugins are loaded.
- Route persistent "Always allow" writes for plugin MCP tools back to
the owning `plugins.<plugin>.mcp_servers.<server>.tools.<tool>` config
entry.
- Reload user config after persisting an approval and make the plugin
load cache config-aware so stale plugin MCP policy is not reused after
`config.toml` changes.
- Regenerate the config schema and add coverage for plugin MCP policy
loading, approval lookup, persistence, and stale-cache prevention.

## Testing
- `cargo test -p codex-config`
- `cargo test -p codex-core-plugins`
- `cargo test -p codex-core --lib plugin_mcp`

* Add agent graph store interface (openai#19229)

## Summary

Persisted subagent parent/child topology currently leaks through
`StateRuntime`'s SQLite-specific thread-spawn helpers. This PR
introduces a narrow `AgentGraphStore` boundary so follow-up work can
route graph operations through a local or remote store without coupling
orchestration code directly to the state DB graph API.

## Changes

- Adds the new `codex-agent-graph-store` crate.
- Defines a flat `AgentGraphStore` trait for the v1 graph surface:
upsert edge, set edge status, list direct children, and list
descendants.
- Adds public graph types for `ThreadSpawnEdgeStatus`,
`AgentGraphStoreError`, and `AgentGraphStoreResult`.
- Implements `LocalAgentGraphStore` on top of an existing
`codex_state::StateRuntime`, preserving today's SQLite-backed
`thread_spawn_edges` behavior.
- Registers the crate in Cargo/Bazel metadata.

This PR only adds the local contract and implementation; call-site
migration and the remote gRPC store are left to the follow-up PRs in the
stack.

## Testing

- `cargo test -p codex-agent-graph-store`

The new unit tests cover local parity with the existing `StateRuntime`
graph methods, `Open`/`Closed` filtering, status updates, and stable
breadth-first descendant ordering.

---------

Co-authored-by: rhan-oai <rhan@openai.com>
Co-authored-by: Matthew Zeng <mzeng@openai.com>
Co-authored-by: sayan-oai <sayan@openai.com>
Co-authored-by: xli-oai <xli@openai.com>
Co-authored-by: Alex Daley <adaley@openai.com>
Co-authored-by: Michael Bolin <mbolin@openai.com>
Co-authored-by: Eric Traut <etraut@openai.com>
Co-authored-by: Rasmus Rygaard <rasmus@openai.com>
Co-authored-by: Calixto <manoel.calixto.neto@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant