SCAL-308282 Track polling and first response latency by kanwarbajwa · Pull Request #143 · thoughtspot/mcp-server

kanwarbajwa · 2026-05-07T20:53:07Z

Summary

Instrument the analysis polling bridge so we can measure backend readiness, client polling behavior, and sessions that never get polled.

What Changed

add shared analysis metric helpers for session counters, poll wait, first buffered update, first poll delay, first non-empty response, never-polled sessions, and storage errors
persist timing metadata in the ConversationStorageServer Durable Object and expose it through the storage client so one-shot lifecycle metrics are emitted once per response
record request-scoped analysis counters and poll-wait histograms in the MCP server analysis tool methods
add focused specs for the conversation storage timing state and the analysis tool metrics flow

Notes

first-buffered, first-poll, first-non-empty, and never-polled metrics are emitted from the storage Durable Object because it owns the cross-request conversation state
local design docs were updated for our internal reference but are not part of this commit

gemini-code-assist

Code Review

This pull request introduces comprehensive metrics tracking for analysis sessions, including session creation, message delivery, and polling performance. It adds a metrics state to the ConversationStorageServer to track timing events like the first buffered update and first non-empty response, and updates the MCPServer to record these metrics during tool execution. Feedback identifies an opportunity to refactor duplicated message retrieval logic in the storage server into a private helper method to improve maintainability.

Instrument the analysis polling bridge so we can measure backend readiness, client polling behavior, and sessions that never get polled. - add shared analysis metric helpers for session counters, poll wait, first buffered update, first poll delay, first non-empty response, never-polled sessions, and storage errors - persist timing metadata in the ConversationStorageServer Durable Object and expose it through the storage client so one-shot lifecycle metrics are emitted once per response - record request-scoped analysis counters and poll-wait histograms in the MCP server analysis tool methods - add focused specs for the conversation storage timing state and the analysis tool metrics flow - first-buffered, first-poll, first-non-empty, and never-polled metrics are emitted from the storage Durable Object because it owns the cross-request conversation state - local design docs were updated for our internal reference but are not part of this commit - npm run lint - npm test -- --coverage.enabled=false

Rifdhan · 2026-05-12T21:44:02Z

 					return new Response("Not Found", { status: 404 });
 			}
 		} catch (err) {
+			await this.recordStorageErrorMetric(this.toStorageOperation(operation));


Nit: prefer to do this after the console.error line, for two reasons:

Logging error in console is always reliable and we should do it first before other more complex error handling which can introduce further failure modes resulting in non-logging

This is async, so I want to log the error immediately first and foremost

Rifdhan · 2026-05-12T21:48:03Z

+		}
+	}
+
+	private toStorageOperation(operation: string): StreamStorageOperation {


Can be static

Rifdhan · 2026-05-12T21:48:50Z

+			case "messages":
+				return STREAM_STORAGE_OPERATIONS.messages;
+			default:
+				return STREAM_STORAGE_OPERATIONS.unknown;


It looks like we are never recording STREAM_STORAGE_OPERATIONS.alarm, can we remove it?

Rifdhan · 2026-05-12T21:50:52Z

- * conversationId, to ensure no users can access each other's conversations.
+ * The route path uses the conversation id. Isolation between users happens at the
+ * Durable Object id level by hashing the access token and combining it with the
+ * conversation id when constructing the stub name.


Why did we change these header comments? We are still using the storageId concept from my last change, there should be no change here.

Rifdhan · 2026-05-12T21:51:59Z

+					const body =
+						((await request
+							.json()
+							.catch(() => ({}))) as Partial<ConversationMetricsState>) ?? {};


Let's update the top-level header comment of this class to mention that we are passing metrics data in the request body

Rifdhan · 2026-05-12T21:53:36Z

+		const [existingIsDone, metricsState] = await Promise.all([
+			this.state.storage.get<boolean>(IS_DONE_KEY),
+			this.getConversationMetricsState(),
+		]);


Let's just call getConversationStateSnapshot() here and do away with getConversationMetricsState(). It feels unclean to be fetching 2 different keys in separate transactions.

gemini-code-assist Bot reviewed May 7, 2026

View reviewed changes

Comment thread src/servers/conversation-storage-server.ts Outdated

kanwarbajwa force-pushed the SCAL-308282_FirstVisibleResponseLatency branch 3 times, most recently from 85306dd to aa8366a Compare May 8, 2026 20:18

Rifdhan reviewed May 8, 2026

View reviewed changes

Comment thread src/metrics/runtime/analysis-metrics.ts Outdated

Comment thread src/servers/mcp-server.ts Outdated

Comment thread src/servers/conversation-storage-server.ts Outdated

kanwarbajwa force-pushed the SCAL-308282_FirstVisibleResponseLatency branch 2 times, most recently from 2b05c6e to c06d6bc Compare May 12, 2026 02:37

kanwarbajwa force-pushed the SCAL-308282_FirstVisibleResponseLatency branch from c06d6bc to 4e1020f Compare May 12, 2026 03:53

Rifdhan reviewed May 12, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

SCAL-308282 Track polling and first response latency#143

SCAL-308282 Track polling and first response latency#143
kanwarbajwa wants to merge 1 commit into
mainfrom
SCAL-308282_FirstVisibleResponseLatency

kanwarbajwa commented May 7, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Rifdhan May 12, 2026

Uh oh!

Rifdhan May 12, 2026

Uh oh!

Rifdhan May 12, 2026

Uh oh!

Rifdhan May 12, 2026

Uh oh!

Rifdhan May 12, 2026

Uh oh!

Rifdhan May 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

kanwarbajwa commented May 7, 2026

Summary

What Changed

Notes

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Rifdhan May 12, 2026

Choose a reason for hiding this comment

Uh oh!

Rifdhan May 12, 2026

Choose a reason for hiding this comment

Uh oh!

Rifdhan May 12, 2026

Choose a reason for hiding this comment

Uh oh!

Rifdhan May 12, 2026

Choose a reason for hiding this comment

Uh oh!

Rifdhan May 12, 2026

Choose a reason for hiding this comment

Uh oh!

Rifdhan May 12, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants