
P1: Expose Prometheus metrics at /metrics (#173) #191

Open: dkijania wants to merge 1 commit into main from feat/metrics


@dkijania

Contributor

What & why

Part of the production-readiness epic (#163). Refs #173.

There was no metrics endpoint — only optional Jaeger tracing — so request rate, errors, latency, and process health were invisible in production.

Changes

  • New /metrics endpoint (prom-client) via a Yoga plugin.
  • RED HTTP metrics: http_requests_total{method,route,status}, http_request_duration_seconds histogram, http_requests_in_flight gauge.
  • Node process metrics: CPU, memory, event loop lag, GC.
  • Route labels normalised to a known set (/, /healthcheck, /readiness, /metrics; anything else → other) to bound label cardinality; the /metrics scrape is not self-counted.
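The route normalisation and self-scrape exclusion described above could look roughly like the sketch below. The names `KNOWN_ROUTES`, `normalizeRoute`, and `shouldRecord` are hypothetical and not taken from this PR's diff. The sketch assumes the input is a request path with an optional query string and that matching is exact, so a trailing slash would fall into `other`.

```typescript
// Hypothetical helper names; the identifiers used in this PR may differ.
// The known set mirrors the routes listed in the PR description.
const KNOWN_ROUTES: ReadonlySet<string> = new Set([
  "/",
  "/healthcheck",
  "/readiness",
  "/metrics",
]);

// Map a raw request path to a bounded label value.
// Anything outside the known set collapses to "other", so arbitrary
// client-supplied paths cannot create new time series.
function normalizeRoute(rawPath: string): string {
  // Drop the query string and fragment; treat an empty path as "/".
  const path = rawPath.split(/[?#]/, 1)[0] || "/";
  return KNOWN_ROUTES.has(path) ? path : "other";
}

// The scrape itself is not counted, so polling /metrics does not
// inflate the request metrics it reports.
function shouldRecord(route: string): boolean {
  return route !== "/metrics";
}
```

With this shape the `route` label can take at most five values, whatever paths clients send.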

New dependency: prom-client.

On DB pool saturation

The issue also lists DB pool-saturation gauges. postgres.js exposes no pool-introspection API, so those need a query-instrumentation pass rather than a simple read of pool state. I scoped that out of this PR and left #173 open for it (this PR is Refs, not Closes) — happy to follow up with an instrumentation approach if you want it.
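As a starting point for that follow-up, one possible shape for a query-instrumentation pass is sketched here. It is an assumption about the approach, not code from this PR: `QueryStats` and `instrumentQuery` are hypothetical names, and the wrapper is written against a generic async function rather than the postgres.js API. The in-flight count includes queries still waiting for a connection, so it measures demand on the pool and not connections actually held.

```typescript
// Hypothetical counters; in practice these would back Prometheus gauges/counters.
interface QueryStats {
  inFlight: number; // queries started but not yet settled
  started: number; // total queries started
  failed: number; // total queries that rejected
}

// Wrap any async query function so each call updates the counters.
function instrumentQuery<A extends unknown[], R>(
  run: (...args: A) => Promise<R>,
  stats: QueryStats,
): (...args: A) => Promise<R> {
  return async (...args: A): Promise<R> => {
    stats.inFlight += 1;
    stats.started += 1;
    try {
      return await run(...args);
    } catch (err) {
      stats.failed += 1;
      throw err; // preserve the original failure for the caller
    } finally {
      stats.inFlight -= 1; // runs on both success and failure
    }
  };
}
```

Comparing `inFlight` against the configured pool size would give a rough saturation signal: values that stay at or above the pool size suggest queries are queueing.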

Testing

  • npm run build — clean
  • npm run test:unit — all pass; tests cover the exposition format, per-route/status counting (asserts a request count of 2), and exclusion of the scrape endpoint
  • npm run lint / npx prettier --debug-check . — clean

🤖 Generated with Claude Code

There was no metrics endpoint — only optional Jaeger tracing — so request rate,
errors, latency, and process health were invisible in production.

Add a prom-client registry served at `/metrics`:

- RED HTTP metrics via a Yoga plugin: `http_requests_total{method,route,status}`,
  `http_request_duration_seconds` histogram, and `http_requests_in_flight`.
- Standard Node process metrics (CPU, memory, event loop, GC).
- Route labels are normalised to a known set (`other` otherwise) to bound
  cardinality; the `/metrics` scrape is not self-counted.

DB pool-saturation gauges are intentionally out of scope here: postgres.js
exposes no pool-introspection API, so that needs a query-instrumentation pass —
tracked as a follow-up.

Unit tests cover the exposition format, per-route/status counting, and that the
scrape endpoint is excluded.

Refs #173.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QSuak9smCHbp4N17xjjLF6
@dkijania added the production-readiness (Work toward making the API production-ready / publicly available) and P1 (Strongly recommended before GA) labels on Jun 28, 2026
