[codex-analytics] add item lifecycle timing#20514
Conversation
5c272b8 to
c6e0ea5
Compare
7a61984 to
88c1fb0
Compare
## Why Several analytics event families need the same per-thread attribution state: the app-server client/runtime associated with a thread and, for lifecycle-oriented events, the thread metadata captured during initialization. Keeping connection ids and lifecycle metadata in separate maps made each consumer rebuild the same thread context and made subagent attribution harder to resolve consistently. ## What changed - Replaces the separate thread connection and metadata maps with one reducer-owned `threads` map. - Routes guardian, compaction, turn-steer, and turn analytics through shared thread-state lookups while preserving turn-origin attribution for turn events and request-origin attribution for steer events. - Lets newly observed spawned subagent threads inherit their parent thread connection so later thread-scoped analytics can resolve through the same state model. - Adds regression coverage for standalone `SubAgentThreadStarted` publication plus the `SubAgentSource::ThreadSpawn` parent fallback through a thread-scoped consumer that depends on inherited connection state. ## Verification - `cargo test -p codex-analytics` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20300). * #18748 * #18747 * #17090 * #17089 * #20239 * #20515 * #20514 * __->__ #20300
d55b667 to
42cdca0
Compare
53976e4 to
96a73c3
Compare
92bf80c to
2d55681
Compare
2d55681 to
db23f16
Compare
73ff402 to
24af98e
Compare
24af98e to
4c982ac
Compare
owenlin0
left a comment
There was a problem hiding this comment.
let's see if we can make these timestamps non-optional. first at the core layer, then at app-server (if possible). want to give it a shot? this would be much nicer instead of keeping around optional forever 🙏
| pub item: TurnItem, | ||
| #[serde(default, skip_serializing_if = "Option::is_none")] | ||
| #[ts(optional)] | ||
| pub started_at_ms: Option<i64>, |
There was a problem hiding this comment.
let's make this non-optional. it's safe because we never persisted ItemStartedEvent in rollouts :)
| pub item: TurnItem, | ||
| #[serde(default, skip_serializing_if = "Option::is_none")] | ||
| #[ts(optional)] | ||
| pub completed_at_ms: Option<i64>, |
There was a problem hiding this comment.
Let's make this non-optional too. Can we do something like this?
fn unknown_completed_at_ms() -> i64 {
0
}
pub struct ItemCompletedEvent {
// other fields
// ...
#[serde(default = "unknown_completed_at_ms")]
pub completed_at_ms: i64,
}
I think we need this since we read EventMsg::ItemCompleted(TurnItem::Plan(_)) from rollout files, so we want to keep that deserializing properly with a serde(default).
There was a problem hiding this comment.
and i think it's totally fine to return 0 since app-server only exposes item/completed notification while streaming a thread live. APIs that return thread history doesn't use this.
fbff229 to
9250609
Compare
9f8032e to
6189df0
Compare
| pub thread_id: String, | ||
| pub turn_id: String, | ||
| /// Unix timestamp (in milliseconds) when this item lifecycle started. | ||
| pub started_at_ms: i64, |
There was a problem hiding this comment.
let's make sure to add the #[ts(type = "number")] annotation for this field
| pub thread_id: String, | ||
| pub turn_id: String, | ||
| /// Unix timestamp (in milliseconds) when this item lifecycle completed. | ||
| pub completed_at_ms: i64, |
| pub turn_id: String, | ||
| pub item: TurnItem, | ||
| // Old rollout files may not have this field. Default to 0 so they still | ||
| // deserialize after tightening the core event contract. |
There was a problem hiding this comment.
nit: can we call out ItemCompleted for PlanItem in particular? that's the only one that ever got persisted
owenlin0
left a comment
There was a problem hiding this comment.
small comments but overall lgtm!
a011365 to
22edd59
Compare
22edd59 to
7fd7c90
Compare
Why
Tool families already disagree on what their existing
durationfields mean, so lifecycle latency should live on the shared item envelope instead of being inferred from per-tool execution fields. Carrying that envelope through app-server notifications gives downstream consumers one reusable timing signal without pretending every tool has the same execution semantics.What changed
started_at_msto coreItemStartedEventvalues andcompleted_at_msto coreItemCompletedEventvalues.startedAtMson app-serveritem/startednotifications andcompletedAtMsonitem/completednotifications.Verification
cargo check -p codex-protocol -p codex-core -p codex-app-server-protocol -p codex-app-server -p codex-tui -p codex-exec -p codex-app-server-clientcargo test -p codex-core --test all web_search_item_is_emittedcargo test -p codex-core --test all image_generation_call_event_is_emittedcargo test -p codex-app-server-protocolStack created with Sapling. Best reviewed with ReviewStack.