From 811bddd3b971f4483936a2df8390bd79ec2cc576 Mon Sep 17 00:00:00 2001 From: Martyn Davies Date: Fri, 15 May 2026 17:30:05 +0200 Subject: [PATCH] docs(analytics): add Analytics section and sidebar entry Adds 11 reference pages covering Zuplo Analytics: overview, access and entitlements, shared controls, seven tab pages (Requests, Origins, Consumers, Agents, AI Gateway, MCP Gateway, MCP Server), and reference pages for the metrics glossary and URL parameters. Wires the section into the sidebar between Monetization and Observability. Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/analytics/access-and-entitlements.md | 71 +++++++++++ docs/analytics/overview.md | 65 ++++++++++ docs/analytics/reference/metrics-glossary.md | 92 ++++++++++++++ docs/analytics/reference/url-parameters.md | 66 ++++++++++ docs/analytics/shared-controls.md | 122 +++++++++++++++++++ docs/analytics/tabs/agents.md | 88 +++++++++++++ docs/analytics/tabs/ai-gateway.md | 67 ++++++++++ docs/analytics/tabs/consumers.md | 73 +++++++++++ docs/analytics/tabs/mcp-gateway.md | 78 ++++++++++++ docs/analytics/tabs/mcp-server.md | 73 +++++++++++ docs/analytics/tabs/origins.md | 82 +++++++++++++ docs/analytics/tabs/requests.md | 96 +++++++++++++++ sidebar.ts | 31 +++++ 13 files changed, 1004 insertions(+) create mode 100644 docs/analytics/access-and-entitlements.md create mode 100644 docs/analytics/overview.md create mode 100644 docs/analytics/reference/metrics-glossary.md create mode 100644 docs/analytics/reference/url-parameters.md create mode 100644 docs/analytics/shared-controls.md create mode 100644 docs/analytics/tabs/agents.md create mode 100644 docs/analytics/tabs/ai-gateway.md create mode 100644 docs/analytics/tabs/consumers.md create mode 100644 docs/analytics/tabs/mcp-gateway.md create mode 100644 docs/analytics/tabs/mcp-server.md create mode 100644 docs/analytics/tabs/origins.md create mode 100644 docs/analytics/tabs/requests.md diff --git a/docs/analytics/access-and-entitlements.md 
b/docs/analytics/access-and-entitlements.md new file mode 100644 index 00000000..20f77d5a --- /dev/null +++ b/docs/analytics/access-and-entitlements.md @@ -0,0 +1,71 @@ +--- +title: "Access and Entitlements" +sidebar_label: "Access & Entitlements" +--- + +## When to use this + +- Confirm whether your account can see advanced analytics. +- Find out how many days of history you have access to. +- Understand the trial banner or the demo mode link. + +## Plan requirements + +Advanced analytics must be enabled on your account. Without it, the Analytics +page shows an upsell view with a **Contact Sales** call-to-action and no charts. + +## Free trial + +New accounts with advanced analytics enabled get an automatic free trial. The +trial: + +- Runs for the same number of days as your account's retention window. +- Shows a banner across the top of the Analytics page: "You're on a {N}-day + preview of Advanced Analytics, {N} days left." +- Includes two calls to action: **View demo →** (loads the dashboard with sample + data) and **Contact sales**. + +Accounts on the legacy analytics version are not eligible for the trial. They +continue to use the previous experience. + +:::note + +The trial banner notes that the charts may look sparse if your account hasn't +yet generated much traffic. Use **View demo →** to see what a fully populated +dashboard looks like. + +::: + +## Data retention + +Each account has an analytics history window measured in days. The window +controls: + +- How far back you can scroll using the time-range picker. +- Which presets in the picker are available. Presets longer than your window are + locked with an **Upgrade for [preset]** tooltip. +- The maximum start and end values when you pick a custom range. + +If you need a longer window, contact your Zuplo account team. + +## Demo mode + +Append `?demo=true` to the Analytics URL, or click **View demo →** in the trial +banner, to switch into demo mode.
In demo mode: + +- Charts and tables are populated with synthetic sample data. +- A persistent banner reads: "You're viewing the Advanced Analytics demo with + sample data. Your real analytics aren't shown here." + +Remove the `demo` parameter from the URL to return to your real data. + +## Scope: account vs project + +- **Account scope** aggregates across every project in the account. The Requests + tab adds **Project Name** and **Deployment Name** as breakdowns; click a + project name to drill into project scope. +- **Project scope** filters to a single project and adds an **Environment** + selector (Working Copy, Production, Preview, Other) in the top bar. + +See [Shared controls](./shared-controls.md) for how scope affects filters and +breakdowns. diff --git a/docs/analytics/overview.md b/docs/analytics/overview.md new file mode 100644 index 00000000..15ecab4f --- /dev/null +++ b/docs/analytics/overview.md @@ -0,0 +1,65 @@ +--- +title: "Analytics" +sidebar_label: "Overview" +--- + +Zuplo Analytics is the dashboard inside the Zuplo portal that shows how traffic +moves through your gateway: request volume, latency, errors, who's calling you, +and (when relevant) AI gateway and MCP gateway activity. It's the page you open +when something looks off in production, when you're auditing spend, or when +you're answering "is anyone actually using this endpoint?" + +## When to use this + +- Investigate a latency spike or error surge across all projects in your + account, or inside a single project. +- Identify which API consumers, AI agents, or upstream origins drive the most + traffic or errors. +- Track AI gateway token usage and cost, or MCP gateway and server activity. + +## How to access + +Open **Analytics** in the Zuplo portal sidebar. The page works at two scopes: + +- **Account scope**: aggregates across every project in your account. Open + [Account Analytics](https://portal.zuplo.com/+/account/analytics). 
+- **Project scope**: open a project, then click **Analytics**. Filters to one + project and adds an **Environment** selector. + +## What's in this section + +- [Access and entitlements](./access-and-entitlements.md): plans, free trial, + demo mode, retention. +- [Shared controls](./shared-controls.md): time range, filters, environment + selector, banners, URL state. +- Tabs: + - [Requests](./tabs/requests.md): overall traffic, latency, errors. + - [Origins](./tabs/origins.md): backend performance. + - [Consumers](./tabs/consumers.md): per-consumer breakdowns. + - [Agents](./tabs/agents.md): classified AI agent traffic. + - [AI Gateway](./tabs/ai-gateway.md): LLM request volume, tokens, cost. + - [MCP Gateway](./tabs/mcp-gateway.md): virtual server routing, capability + invocations, upstream health. + - [MCP Server](./tabs/mcp-server.md): tool calls, resources, prompts on + Zuplo-hosted MCP servers. +- Reference: + - [Metrics glossary](./reference/metrics-glossary.md): every KPI and + percentile defined once. + - [URL parameters](./reference/url-parameters.md): permalink reference. + +## Tab visibility + +You'll see a subset of tabs depending on your plan and project setup: + +| Tab | When it appears | +| ----------- | --------------------------------------------------------------------- | +| Requests | All accounts with advanced analytics enabled. | +| Origins | The project uses managed-edge origins. | +| Consumers | All accounts with advanced analytics enabled. | +| Agents | All accounts with advanced analytics enabled. | +| AI Gateway | The project type is **ai**. | +| MCP Gateway | The project type is **standard** and an MCP gateway is in use. | +| MCP Server | The project type is **standard** and the project hosts an MCP server. | + +If you don't see Analytics at all, your account likely doesn't have advanced +analytics enabled. See [Access and entitlements](./access-and-entitlements.md). 
diff --git a/docs/analytics/reference/metrics-glossary.md b/docs/analytics/reference/metrics-glossary.md new file mode 100644 index 00000000..031b4c75 --- /dev/null +++ b/docs/analytics/reference/metrics-glossary.md @@ -0,0 +1,92 @@ +--- +title: "Metrics Glossary" +sidebar_label: "Metrics Glossary" +--- + +This page defines every term used in the Analytics dashboards once. KPI tables +on tab pages link here for depth. + +## HTTP status classes + +| Class | Meaning | +| ----- | ----------------------------------------------------------------------------------------------- | +| 2xx | Success. | +| 3xx | Redirection. | +| 4xx | Client error. The caller sent something the gateway or backend rejected. | +| 5xx | Server error. The gateway, an upstream origin, or an MCP backend failed to fulfill the request. | + +## Error rates + +**Client error rate.** 4xx count divided by total requests in the window, +expressed as a percentage. + +**Server error rate.** 5xx count divided by total requests in the window. + +**Request-weighted average.** When aggregating a rate across many entities +(consumers, agents, origins), each entity's rate is weighted by its request +count. A consumer with 100,000 requests at a 1% error rate contributes more than +a consumer with 100 requests at a 50% error rate. Use the request-weighted +figure to answer "what does the average request experience look like?"; use a +simple unweighted average to answer "what does the average consumer experience +look like?" + +## Latency + +**Avg latency.** Arithmetic mean response time. Sensitive to outliers. + +**P50 (median) latency.** Half of requests completed within this time. + +**P95 latency.** 95% of requests completed within this time. The other 5% took +longer. P95 is the standard tail-latency metric. + +**P99 latency.** 99% of requests completed within this time. Useful for spotting +outlier behavior that P95 may smooth over. + +**Latency distribution histogram.** Bands at P10, P50, P90, P95, P99. 
Clicking a +band on the Requests tab filters to requests in that duration range. + +## Active edge instances + +Distinct gateway worker instances actively serving traffic in each interval. A +rough indicator of how widely your traffic is distributed. + +## Active sessions (MCP Server) + +Distinct MCP sessions, estimated using HyperLogLog. The figure is approximate +but monotonic within a single time window. Accurate enough for trend analysis, +not for exact session counting. + +## Failure origin + +Classifies an error by where it originated: + +| Origin | Meaning | +| -------- | ---------------------------------------------------------- | +| gateway | The Zuplo gateway returned the error. | +| upstream | A backend origin or MCP server returned the error. | +| client | The client sent something invalid that caused the failure. | + +## Outcome class + +Used on MCP Gateway events: + +| Class | Meaning | +| ----------------- | -------------------------------------------------------------------- | +| success | Event completed normally. | +| application_error | Event failed due to an application-layer issue (e.g. invalid input). | +| gateway_error | The gateway itself returned an error. | +| upstream_error | An upstream MCP server returned an error. | + +## Tokens (AI Gateway) + +| Type | Meaning | +| ---------- | --------------------------------------------------------- | +| Prompt | Tokens in the request the gateway forwarded to the model. | +| Completion | Tokens in the model's response. | +| Embedding | Tokens consumed by embedding requests. | + +## Estimated cost (AI Gateway) + +Computed from token usage × the model's published pricing. Does not include +discounts, credits, or provider-side rounding. Use it for trend analysis, not +invoice reconciliation. 
diff --git a/docs/analytics/reference/url-parameters.md b/docs/analytics/reference/url-parameters.md new file mode 100644 index 00000000..7634a371 --- /dev/null +++ b/docs/analytics/reference/url-parameters.md @@ -0,0 +1,66 @@ +--- +title: "URL Parameters" +sidebar_label: "URL Parameters" +--- + +Every Analytics control persists to the URL. Copy the address bar to share any +view. + +## When to use this + +- Build a permalink to a specific time window, filter set, or demo view. +- Embed an Analytics link in a runbook, postmortem, or dashboard. +- Understand what each query parameter does. + +## Parameters + +| Parameter | Example | Effect | +| -------------- | ------------------------------------------------------ | ----------------------------------------------------------------------------------------- | +| `time` | `?time=7d` | Apply a preset. Values: `1h`, `6h`, `24h`, `3d`, `7d`, `14d`, `28d`, `60d`, `90d`. | +| `start`, `end` | `?start=2026-05-01T00:00:00Z&end=2026-05-15T00:00:00Z` | Custom range as ISO-8601 datetimes. Overrides `time` when both are present. | +| `filter` | `?filter=httpStatus:class:5xx` | Add a filter as `<field>:<mode>:<value>`. Repeat the parameter for multiple filters. | +| `demo` | `?demo=true` | Demo mode (sample data instead of your real analytics). | +| `preview` | `?preview=1` | Legacy preview mode. | + +## Match modes for `filter` + +| Mode | Meaning | Example | +| ---------- | --------------------- | ---------------------------------- | +| equals | Exact match. | `filter=httpMethod:equals:POST` | +| contains | Substring match. | `filter=route:contains:/v1/users` | +| in | Comma-separated list. | `filter=httpStatus:in:500,502,503` | +| not | Negation of equals. | `filter=country:not:US` | +| class | HTTP status class. | `filter=httpStatus:class:5xx` | +| startsWith | String prefix. | `filter=route:startsWith:/v1/` | +| endsWith | String suffix.
| `filter=route:endsWith:.json` | + +## Permalink examples + +Last 7 days of 5xx errors on a specific route: + +``` +?time=7d&filter=httpStatus:class:5xx&filter=route:startsWith:/v1/users +``` + +Custom range with two filters: + +``` +?start=2026-05-01T00:00:00Z&end=2026-05-08T00:00:00Z&filter=country:equals:US&filter=httpMethod:equals:POST +``` + +Open the demo: + +``` +?demo=true +``` + +## Sharing + +The recipient sees the same view, provided they have access to the project or +account. + +## See also + +- [Shared controls](../shared-controls.md): what each control does in the UI. +- [Metrics glossary](./metrics-glossary.md): definitions for the fields you can + filter on. diff --git a/docs/analytics/shared-controls.md b/docs/analytics/shared-controls.md new file mode 100644 index 00000000..645d1427 --- /dev/null +++ b/docs/analytics/shared-controls.md @@ -0,0 +1,122 @@ +--- +title: "Shared Controls" +sidebar_label: "Shared Controls" +--- + +Every Analytics tab uses the same set of controls at the top of the page: a time +range picker, a filter bar, and (at project scope) an environment selector. +State persists to the URL so you can share or bookmark any view. + +## When to use this + +- Narrow a tab to a time window, environment, or set of filter values. +- Build a shareable link to a specific view. +- Understand what each banner across the top of the page means. + +## Time range + +The time range picker controls every chart, table, and KPI on the active tab. + +**Presets.** Last 1h, 6h, 24h, 3d, 7d, 14d, 28d, 60d, 90d. + +**Custom range.** Use the datetime-local inputs for **Start** and **End**. Both +fields are clamped to your account's retention window. + +**Locked presets.** Presets longer than your retention window show an **Upgrade +for [preset]** tooltip. See +[Access and entitlements](./access-and-entitlements.md). + +## Filters + +Filters render as removable pills in a sticky bar at the top of the tab. 
Add a +filter from any breakdown table by clicking a value, or build one manually. + +**Match modes.** Each filter uses one of: + +| Mode | Meaning | +| ---------- | ----------------------------------- | +| equals | Exact match. | +| contains | Substring match. | +| in | Value is in a comma-separated list. | +| not | Negation of equals. | +| class | HTTP status class (e.g. `5xx`). | +| startsWith | String prefix. | +| endsWith | String suffix. | + +**Clearing.** Remove a single pill with its **×**, or click **Clear all +filters** to reset. + +**Disabled fields.** Some fields are grayed out on tabs where they don't apply. +For example, `originHost` is unavailable on Requests, Consumers, and Agents; +`userSub` is unavailable on Origins. + +## Environment selector + +The environment selector appears only at project scope. It's a dropdown grouped +as: + +- **Working Copy** +- **Production** +- **Preview** +- **Other** + +Each environment shows a request count next to its name. The active selection +appears as a blue pill in the top bar. + +## Account vs project scope + +See +[Access and entitlements](./access-and-entitlements.md#scope-account-vs-project) +for how scope affects available breakdowns and the environment selector. + +## URL state and permalinks + +Every control persists to the URL. To share a view, copy the address bar. +There's no separate share button. + +| Parameter | Example | Effect | +| -------------- | ------------------------------------------------------ | ------------------------------------------------------- | +| `time` | `?time=7d` | Apply a preset. | +| `start`, `end` | `?start=2026-05-01T00:00:00Z&end=2026-05-15T00:00:00Z` | Custom range. Overrides `time`. | +| `filter` | `?filter=httpStatus:class:5xx` | Add a filter. Repeat the parameter for multiple values. | +| `demo` | `?demo=true` | Demo mode (sample data). | +| `preview` | `?preview=1` | Legacy preview mode. 
| + +See [URL parameters](./reference/url-parameters.md) for the full reference. + +## Refresh + +A spinning loader appears in the sticky bar while data refetches, and a +semi-transparent **Updating…** overlay covers the content area. There's no +manual refresh button and no auto-refresh interval. Change a control to trigger +a refetch. + +## Banners + +Banners appear at the top of the page in this priority order: + +1. **Preview banner**: when `preview=1` is set. Indicates legacy preview mode. +2. **Demo banner**: when `demo=true` is set. Reminds you sample data is shown + instead of your real analytics. +3. **Trial banner**: for new accounts with advanced analytics. Shows days + remaining and offers **View demo →** and **Contact Sales**. + +## Loading and empty states + +Each tab uses a shape-aware skeleton while the first request is in flight. The +product analytics tabs (AI Gateway, MCP Gateway, MCP Server) suppress that +skeleton briefly to avoid flashing when data is already cached. Empty states on +those tabs include a short description and a "Read the … docs" link to the +relevant product section. + +## Status colors + +The same color palette is used across every chart that breaks down by HTTP +status class: + +| Class | Color | +| ----- | ----- | +| 2xx | Green | +| 3xx | Blue | +| 4xx | Amber | +| 5xx | Red | diff --git a/docs/analytics/tabs/agents.md b/docs/analytics/tabs/agents.md new file mode 100644 index 00000000..a1359dda --- /dev/null +++ b/docs/analytics/tabs/agents.md @@ -0,0 +1,88 @@ +--- +title: "Agents" +sidebar_label: "Agents" +--- + +The **Agents** tab isolates AI agent traffic: requests classified as coming from +ChatGPT, Claude.ai, Cursor, GPTBot, and similar clients. It's a focused view; +browsers, webhooks, and generic SDK callers are excluded. + +## When to use this + +- See which AI agents are calling your API and how much volume they generate. +- Catch agent-specific error patterns. 
For example, one agent that fails CORS or + receives 4xx responses more often than the others. +- Compare latency experience across agents. + +## Summary KPIs + +| Name | What it measures | +| ----------------- | ---------------------------------------------------------------------------- | +| **Requests** | Total agent-classified requests. Excludes browsers, webhooks, generic SDKs. | +| **Client Errors** | Request-weighted 4xx rate across agents. | +| **Server Errors** | Request-weighted 5xx rate. Secondary: count of agents with at least one 5xx. | +| **Agents** | Distinct classified agents seen in the window. | +| **Total Errors** | Combined 4xx + 5xx count. Secondary: agents affected. | + +## Charts + +**Request Volume.** Stacked bars by status class. Granularity is always hourly +on this tab. + +**Agent Error Rates.** 4xx and 5xx over time. _What to look for:_ divergence +between agents is the headline signal. If Cursor shows a 12% 4xx rate while +ChatGPT sits at 2%, the issue is almost certainly specific to how Cursor calls +your endpoint. + +**Agent Latency Over Time.** P50, P95, P99 lines. + +## Agent table + +| Column | Notes | +| --------------- | -------------------------------- | +| Agent | Classified agent name. | +| Requests | Count with an inline volume bar. | +| Client Errors % | 4xx percentage. | +| Server Errors % | 5xx percentage. | +| Avg / P95 / P99 | Latency percentiles. | +| 4xx sparkline | Inline trend over the window. | +| 5xx sparkline | Inline trend over the window. | + +Searchable and sortable on any column. Click a row to filter the tab to that +agent. **Show more** loads the next 50. + +## Classified agents + +The classifier currently recognizes: ChatGPT, Claude.ai, Cursor, Claude Code, +GPTBot, Perplexity, Cline, Continue, OpenAI SDK, Anthropic SDK, Google AI, +Common Crawl. The list expands over time. + +Unclassified traffic is excluded from the Agents tab. + +:::warning + +Agent charts use a dedicated hourly rollup.
Filtering other tabs by agent isn't +supported. Use the Agents tab to drill into an individual agent. + +::: + +## Filters + +The filter bar applies. `originHost` is not applicable here. See +[Shared controls](../shared-controls.md#filters). + +## Troubleshooting + +**The Agents tab is empty.** Either no classified agents called your gateway in +the window, or your retention window doesn't yet include any agent traffic. Try +the demo with **View demo →** in the trial banner to see what a populated tab +looks like. See [Access and entitlements](../access-and-entitlements.md). + +**I see a known agent in my logs but not here.** The classifier is conservative; +it labels traffic that clearly matches a known agent fingerprint. Generic SDK +traffic that doesn't identify itself is excluded. If you believe an agent should +be classified, send the User-Agent string to your Zuplo contact. + +**An agent shows zero requests but appears in the table.** Filters on the rest +of the tab may be excluding its traffic for the current window. Clear filters to +verify. diff --git a/docs/analytics/tabs/ai-gateway.md b/docs/analytics/tabs/ai-gateway.md new file mode 100644 index 00000000..1ea93eb4 --- /dev/null +++ b/docs/analytics/tabs/ai-gateway.md @@ -0,0 +1,67 @@ +--- +title: "AI Gateway" +sidebar_label: "AI Gateway" +--- + +The **AI Gateway** tab shows LLM traffic flowing through Zuplo's AI Gateway: +request volume, token usage, estimated cost, model and provider distribution, +latency, cache effectiveness, and blocked-request reasons. It's visible when the +project type is **ai**. + +## When to use this + +- Audit AI spend by model or provider. +- Compare cache hit rate before and after enabling caching. +- Investigate why requests are being blocked by your guardrails. + +## Summary KPIs + +| Name | What it measures | +| ------------------ | ---------------------------------------------------------- | +| **Total Requests** | All AI gateway requests in the window. 
| +| **Total Tokens** | Sum across requests. Secondary: prompt / completion split. | +| **Estimated Cost** | Computed from model pricing × token usage. | +| **Median Latency** | P50 across all AI gateway requests. | + +## Charts + +**Request Time Series.** Three series in one chart: requests, tokens, and cost +over the window. + +**Model Usage.** Stacked bars by model with a sidebar legend showing top models +by share. Click a model in the legend to highlight it; the others fade. + +**Token Breakdown.** A donut split of prompt / completion / embedding tokens, +plus a time series of the same. + +**Provider Breakdown.** A donut and time series by provider, plus a +top-providers list. + +**Latency Distribution.** Histogram of P10, P50, P90, P95, P99. + +**Latency Over Time.** P50, P95, P99 lines. + +**Cache Hit Rate.** Hits vs misses over time, with a summary hit rate. _What to +look for:_ a stable hit rate above your target after enabling caching means +semantic caching is working as configured. + +**Blocked Requests.** Donut and time series by block reason type. Useful when +guardrails or quota policies are doing meaningful work. + +## Filters + +The filter bar applies. See [Shared controls](../shared-controls.md#filters). + +## Troubleshooting + +**The AI Gateway tab is empty.** No AI Gateway traffic has been recorded in the +selected window. Start proxying requests through the AI Gateway and the charts +populate automatically. + +**Estimated cost doesn't match my provider bill.** Estimated cost is computed +from token usage and published pricing. It excludes discounts and credits. See +[Metrics glossary](../reference/metrics-glossary.md#estimated-cost-ai-gateway). + +**Cache hit rate is 0%.** Either caching isn't enabled on the route, or every +request was unique enough that no entry matched. Check your AI Gateway cache +configuration. 
diff --git a/docs/analytics/tabs/consumers.md b/docs/analytics/tabs/consumers.md new file mode 100644 index 00000000..f982a90d --- /dev/null +++ b/docs/analytics/tabs/consumers.md @@ -0,0 +1,73 @@ +--- +title: "Consumers" +sidebar_label: "Consumers" +--- + +The **Consumers** tab breaks traffic down by API consumer: anyone calling your +gateway, whether authenticated or anonymous. Use it to see who your noisiest +callers are, who's hitting errors, and which consumers experience the slowest +latency. + +## When to use this + +- Find the top API consumers by request volume. +- Identify which consumer is responsible for a 4xx or 5xx surge. +- Compare latency experience across consumers (for example, paid vs free tier). + +## Summary KPIs + +| Name | What it measures | +| ----------------- | ----------------------------------------------------------------------------------------------------------------------------------------- | +| **Requests** | Total requests across all consumers in the window. | +| **Client Errors** | Request-weighted 4xx rate across consumers (high-traffic consumers count more). See [Metrics glossary](../reference/metrics-glossary.md). | +| **Server Errors** | Request-weighted 5xx rate. Secondary: count of consumers with at least one 5xx. | +| **Consumers** | Distinct consumers (authenticated plus anonymous). | +| **Total Errors** | Combined 4xx + 5xx count. Secondary: consumers affected. | + +## Charts + +**Request Volume.** Stacked bars by status class. The chart title updates to +reflect the active consumer filter so you can tell at a glance whether you're +looking at one consumer or all of them. + +**Consumer Error Rates.** 4xx and 5xx over time. _What to look for:_ a sustained +4xx rate from one consumer usually points to a broken integration on their side. + +**Consumer Latency Over Time.** P50, P95, P99 lines. 
+ +## Consumer table + +| Column | Notes | +| --------------- | ------------------------------------------------------------------- | +| User | Consumer identity. Anonymous requests show **Anonymous · No auth**. | +| Requests | Count with an inline volume bar. | +| Client Errors % | 4xx percentage. | +| Server Errors % | 5xx percentage. | +| Avg / P95 / P99 | Latency percentiles. | +| 4xx sparkline | Inline trend over the window. | +| 5xx sparkline | Inline trend over the window. | + +The table is searchable and sortable on any column (default: requests +descending). Clicking a row filters the entire tab to that consumer. **Show +more** loads the next 50. + +## Filters + +The filter bar applies. `originHost` is not applicable on this tab. See +[Shared controls](../shared-controls.md#filters). + +## Troubleshooting + +**Everything is showing as Anonymous.** If your gateway isn't authenticating +requests, or your auth policy isn't attaching a consumer identity, every request +falls into the **Anonymous · No auth** bucket. Check your API key or JWT policy +configuration. + +**I clicked a row but the charts didn't change.** A row click adds a consumer +filter pill. If you don't see the pill in the sticky bar, your click landed on a +non-row element. Try clicking the user cell directly. + +**The 5xx rate here is higher than on Requests.** The Consumers KPI is +request-weighted across consumers, while the Requests KPI is a flat rate over +all requests. They diverge when high-error consumers are a small share of total +volume. See [Metrics glossary](../reference/metrics-glossary.md). 
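The two averaging styles described in the Metrics glossary, request-weighted and simple unweighted, can be sketched with the glossary's own example numbers. The `ConsumerStats` type and both helper functions are illustrative names chosen for this sketch; they are not part of any Zuplo API and say nothing about how the dashboard computes its KPIs internally.

```python
from dataclasses import dataclass


@dataclass
class ConsumerStats:
    name: str
    requests: int
    error_rate: float  # fraction between 0 and 1


def request_weighted_rate(stats: list[ConsumerStats]) -> float:
    """Weight each consumer's rate by its request count."""
    total_requests = sum(s.requests for s in stats)
    return sum(s.requests * s.error_rate for s in stats) / total_requests


def unweighted_rate(stats: list[ConsumerStats]) -> float:
    """Treat every consumer equally, regardless of traffic volume."""
    return sum(s.error_rate for s in stats) / len(stats)


# The glossary's example: one large, healthy consumer and one small, failing one.
consumers = [
    ConsumerStats("high-volume", requests=100_000, error_rate=0.01),
    ConsumerStats("low-volume", requests=100, error_rate=0.50),
]

weighted = request_weighted_rate(consumers)
unweighted = unweighted_rate(consumers)
print(f"request-weighted: {weighted:.2%}, unweighted: {unweighted:.2%}")
```

With these inputs the request-weighted figure comes out at roughly 1.05%, while the unweighted figure is 25.5%. The gap shows why a small, failing consumer barely moves the request-weighted rate but dominates the per-consumer average.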
diff --git a/docs/analytics/tabs/mcp-gateway.md b/docs/analytics/tabs/mcp-gateway.md new file mode 100644 index 00000000..5ee1ae9a --- /dev/null +++ b/docs/analytics/tabs/mcp-gateway.md @@ -0,0 +1,78 @@ +--- +title: "MCP Gateway" +sidebar_label: "MCP Gateway" +--- + +The **MCP Gateway** tab shows server-side traffic through Zuplo's MCP gateway: +OAuth flows, auth and policy decisions, virtual-server routing, capability +invocations, and upstream MCP server health. It's visible when the project type +is **standard** and an MCP gateway is in use. + +## When to use this + +- See which virtual servers and capabilities are being exercised, and by whom. +- Track auth and policy decision outcomes. +- Identify whether failures originate in the gateway, the upstream, or the + client. + +## MCP Gateway vs MCP Server + +This tab is about traffic _to_ an MCP fleet via Zuplo's gateway. If you're +looking for what happened _inside_ an MCP server you host on Zuplo (tool calls, +JSON-RPC methods), see the [MCP Server](./mcp-server.md) tab. Some accounts see +both tabs; some see only one. + +## Summary KPIs + +| Name | What it measures | +| ------------------- | ---------------------------------------------------------------------------------------------- | +| **Events** | Total MCP Gateway events in the window. | +| **Success Rate** | Share of events with outcome = success. Secondary: success / error split. | +| **p95 Latency** | Total P95. Secondary: gateway-vs-upstream split, useful for telling where time is being spent. | +| **Failure Origins** | Sum of gateway + upstream + client failure counts. | + +See [Metrics glossary](../reference/metrics-glossary.md) for the failure-origin +and outcome-class definitions. + +## Charts + +**Events Time Series.** Stacked by top event types. + +**Event Family Donut.** Distribution across families: `mcp_request`, +`capability_invocation`, `auth_event`, `upstream_request`, `policy_decision`, +`control_plane_audit`. 
+ +**Latency Split.** Total, gateway, and upstream P50 / P95 / P99 over time. _What +to look for:_ a P95 driven entirely by the upstream slice points to a slow MCP +backend; a gateway-heavy P95 points to policy or auth overhead. + +## Breakdown tables + +| Table | Columns | +| -------------------- | ------------------------------------------------------- | +| Top Capabilities | Capability, Type, Calls, Errors (count + %), P95. | +| Top Virtual Servers | Virtual Server, Events, Errors. | +| Top Upstream Servers | Upstream, Events, Errors, P95 upstream latency. | +| Top Clients | Client, Kind (from the `initialize` handshake), Events. | +| MCP Methods | Method, Events. | +| Upstream Auth Modes | Auth Mode, Events. | +| Failure Origins | Origin layer (gateway / upstream / client), Errors. | +| Top Reason Codes | Class, Code, Events, Errors. | + +## Filters + +The filter bar applies. See [Shared controls](../shared-controls.md#filters). + +## Troubleshooting + +**The MCP Gateway tab is empty.** No MCP Gateway events have been recorded in +the selected window. Once a client connects and invokes a capability, the +dashboard populates. + +**I don't see this tab.** Visibility requires project type **standard** and an +MCP gateway in use. If you're hosting an MCP server on Zuplo instead, look for +the [MCP Server](./mcp-server.md) tab. + +**Errors show but Failure Origins is empty.** Failure origins are classified +server-side from event metadata. Events without a clear origin classification +are counted in Errors but not in any of the gateway / upstream / client buckets. 
diff --git a/docs/analytics/tabs/mcp-server.md b/docs/analytics/tabs/mcp-server.md new file mode 100644 index 00000000..06b7131c --- /dev/null +++ b/docs/analytics/tabs/mcp-server.md @@ -0,0 +1,73 @@ +--- +title: "MCP Server" +sidebar_label: "MCP Server" +--- + +The **MCP Server** tab shows what happens inside MCP servers hosted on Zuplo: +tool invocations, resource reads, prompt gets, JSON-RPC method usage, transport +mix, and per-tool latency. It's visible when the project type is **standard** +and the project hosts an MCP server. + +## When to use this + +- Find the slowest or most-called tools. +- See which transport (stdio, HTTP, SSE) and which clients dominate traffic. +- Investigate JSON-RPC error codes returned to clients. + +## MCP Server vs MCP Gateway + +This tab is about activity inside MCP servers you host on Zuplo. If you're +looking for the server-side picture of traffic flowing through Zuplo's MCP +gateway (auth, routing, upstream health), see the +[MCP Gateway](./mcp-gateway.md) tab. + +## Summary KPIs + +| Name | What it measures | +| ------------------- | ------------------------------------------------------------------------------------------------------------------------- | +| **Tool Calls** | Total tool invocations in the window. Secondary: resource reads and prompt gets. | +| **Active Sessions** | Distinct MCP sessions (approximate, estimated via HyperLogLog). See [Metrics glossary](../reference/metrics-glossary.md). | +| **Error Rate** | Share of tool calls returning an application, gateway, or upstream error. | +| **p95 Latency** | P95 across all tool calls. | + +## Charts + +**Calls Time Series.** Four series in one chart: tool calls, resource reads, +prompt gets, and session starts. + +**Three donuts in a row.** + +- **JSON-RPC Methods**: distribution across the methods clients invoke. +- **Transport**: stdio / http / sse split. +- **Clients**: top clients by name from the `initialize` handshake. 
+ +**Latency Percentiles Card and Latency Time Series.** Summary card plus P50 / +P95 / P99 over time. + +## Tables + +**Top Tools.** Tool name, Calls, Errors (count + %), and P50 / P95 / P99 +latency. This is the fastest way to find a slow or noisy tool. + +**Three list panels.** + +- **Top Resources Read**: URI, optional name, reads. +- **Top Prompts**: prompt, gets. +- **JSON-RPC Error Codes**: label, count. + +## Filters + +The filter bar applies. See [Shared controls](../shared-controls.md#filters). + +## Troubleshooting + +**The MCP Server tab is empty.** No MCP Server traffic has been recorded in the +selected window. Invoke a tool from a client and the dashboard populates. + +**Active sessions count looks too round.** Active sessions are estimated with +HyperLogLog. The estimate is accurate at scale, but it is approximate and may +not exactly match a count of unique session IDs. + +**I don't see this tab.** Visibility requires project type **standard** and an +MCP server hosted by the project. If you're consuming an MCP fleet through +Zuplo's gateway instead, look for the [MCP Gateway](./mcp-gateway.md) tab. diff --git a/docs/analytics/tabs/origins.md b/docs/analytics/tabs/origins.md new file mode 100644 index 00000000..f6f115c6 --- /dev/null +++ b/docs/analytics/tabs/origins.md @@ -0,0 +1,82 @@ +--- +title: "Origins" +sidebar_label: "Origins" +--- + +The **Origins** tab shows backend performance: how each upstream host you proxy +to is performing in terms of volume, error rate, and latency. It's visible when +the project uses managed-edge origins. + +## When to use this + +- Identify which backend is slow or returning errors. +- Compare the latency contribution of DNS, TCP, TLS, and application time. +- Audit traffic distribution across direct origins and service tunnels.
+ +## Summary metrics + +The header strip shows totals derived from the time series: + +| Name | What it measures | +| -------------------- | ------------------------------------------------------------------------------------------------------------------------------------------- | +| Total requests | All requests served against any origin in the window. | +| 4xx rate | Client error rate across all origins. | +| 5xx rate | Server error rate across all origins. | +| Weighted avg latency | Origin response time weighted by request count, so high-traffic origins dominate. See [Metrics glossary](../reference/metrics-glossary.md). | + +## Charts + +**Backend Request Time Series.** Stacked bars by status class, aggregated across +origins by default. Apply a host filter to scope to one origin. + +**Backend Latency.** Average and P95 over time. _What to look for:_ a P95 climb +while the average stays flat usually points to a few slow origins or routes +inside an otherwise healthy fleet. + +**Backend Error Rate.** 4xx and 5xx rates over time. + +**Request Lifecycle.** Stacked time spent in each phase of an origin request: +**DNS time**, **TCP time**, **TLS time**, and **application time**. A high TLS +slice indicates handshake overhead; a high application slice indicates the +origin is slow. + +## Tables + +Two tables sit side by side. + +### Direct Origins + +| Column | Notes | +| --------------- | -------------------------------- | +| Host | The origin hostname. | +| Requests | Count with an inline volume bar. | +| Client Errors | 4xx percentage. | +| Server Errors | 5xx percentage. | +| Avg / P95 / P99 | Average, P95, and P99 latency. | +| 4xx sparkline | Inline trend over the window. | +| 5xx sparkline | Inline trend over the window. | + +Clicking a row toggles a host filter. Click again to remove it. + +### Service Tunnels + +Same columns and behavior as Direct Origins, scoped to tunnel-routed origins. +The table is hidden when no tunnel traffic is present.
+ +## Filters + +The filter bar applies, with one exception: `userSub` is not applicable on this +tab. See [Shared controls](../shared-controls.md#filters). + +## Troubleshooting + +**The Origins tab isn't visible.** It appears only when the project uses +managed-edge origins. If your project routes traffic differently, the tab is +hidden. + +**Service Tunnels table is missing.** That table only renders when at least one +origin is reached over a service tunnel. + +**A 5xx spike on one origin doesn't match the Requests tab.** If you've filtered +the Requests tab to a different route or status class, totals won't match. Clear +filters or compare with the same filters applied on both tabs. diff --git a/docs/analytics/tabs/requests.md b/docs/analytics/tabs/requests.md new file mode 100644 index 00000000..44086418 --- /dev/null +++ b/docs/analytics/tabs/requests.md @@ -0,0 +1,96 @@ +--- +title: "Requests" +sidebar_label: "Requests" +--- + +The **Requests** tab is the default Analytics overview: every request through +your gateway in the selected time window, with charts and breakdowns for volume, +latency, and errors. + +## When to use this + +- Spot-check overall traffic and error rate across a project or the whole + account. +- Investigate a spike in 4xx or 5xx responses. +- Drill from a route, status code, or geographic breakdown into the underlying + requests. + +## Summary KPIs + +| Name | What it measures | When it's useful | +| ----------------- | ------------------------------------------------------------- | ----------------------------------------- | +| **Requests** | Total request count. Secondary value: successful (2xx) count. | Quick health check on volume and success. | +| **Client Errors** | 4xx rate (4xx ÷ total). Secondary value: raw 4xx count. | Spot bad-input or auth issues. | +| **Server Errors** | 5xx rate (5xx ÷ total). Secondary value: raw 5xx count. | Spot gateway or upstream failures. | +| **Avg Latency** | Mean response time. 
Secondary value: min to max. | Detect broad latency regressions. | +| **Consumers** | Distinct API consumers (authenticated + anonymous). | Gauge active audience. | + +See [Metrics glossary](../reference/metrics-glossary.md) for how rates and +percentiles are computed. + +## Charts + +**Request Time Series.** Stacked bars per interval, broken down by status class +(2xx / 3xx / 4xx / 5xx). Drag to select a region to zoom; the time-range picker +updates to match. + +**Request Locations Map.** A world map with a heatmap of request volume by +location. Shown only when geolocation data is present. + +**Latency Over Time.** P50, P95, and P99 lines. _What to look for:_ a widening +gap between P50 and P95 typically signals a tail-latency problem affecting a +subset of requests. + +**Error Rate.** 4xx and 5xx rates plotted over time. + +**Latency Distribution.** A histogram of P10, P50, P90, P95, and P99 buckets. +Click a band to filter the rest of the tab to requests in that duration range. + +**Active Instances.** Distinct active edge instances over time. A rough +indicator of how widely your traffic is distributed across gateway workers. + +## Breakdowns + +Each breakdown shows the top 10 values by request count. Click **Show more** to +load the next 50. + +**Primary breakdowns:** + +- **HTTP Method** +- **HTTP Status** +- **Route Path** + +**Account scope only:** + +- **Project Name**: click to drill into project-scope analytics. +- **Deployment Name**: click to drill into a specific deployment. + +**Secondary breakdowns:** + +- **Country**, **City**, **Colo** +- **User Sub** +- **Client IP** +- **AS Organization** + +Clicking any value applies an `equals` filter for that field. + +## Filters + +The full filter bar applies. `originHost` is not applicable on this tab. See +[Shared controls](../shared-controls.md#filters) for match modes and the filter +pill UI.
+ +## Troubleshooting + +**The map is missing.** The Request Locations Map only renders when geolocation +data is present in the time window. Short windows for low-traffic projects may +not include any geolocated requests. + +**Show more doesn't load anything.** You may already be viewing every value for +that breakdown. Top-10 plus 50 covers up to 60 distinct values; beyond that, +narrow the time range or add a filter. + +**My charts look sparse.** If your account is new, the trial banner across the +top calls this out. Click **View demo →** in the banner to see what a fully +populated dashboard looks like. See +[Access and entitlements](../access-and-entitlements.md). diff --git a/sidebar.ts b/sidebar.ts index 4d2a0311..85b47173 100644 --- a/sidebar.ts +++ b/sidebar.ts @@ -669,6 +669,37 @@ export const documentation: Navigation = [ "articles/rename-or-move-project", ], }, + { + type: "category", + label: "Analytics", + icon: "chart-line", + items: [ + "analytics/overview", + "analytics/access-and-entitlements", + "analytics/shared-controls", + { + type: "category", + label: "Tabs", + items: [ + "analytics/tabs/requests", + "analytics/tabs/origins", + "analytics/tabs/consumers", + "analytics/tabs/agents", + "analytics/tabs/ai-gateway", + "analytics/tabs/mcp-gateway", + "analytics/tabs/mcp-server", + ], + }, + { + type: "category", + label: "Reference", + items: [ + "analytics/reference/metrics-glossary", + "analytics/reference/url-parameters", + ], + }, + ], + }, { type: "category", label: "Observability",