
P1: Graceful shutdown — drain, flush traces, uncaught handlers (#170) #188

Open
dkijania wants to merge 1 commit into main from feat/graceful-shutdown


What & why

Part of the production-readiness epic (#163). Closes #170.

Shutdown previously called server.close() then process.exit(0) from the close event. It didn't bound how long draining could take, never flushed OpenTelemetry spans (losing the tail of traces on every deploy), and had no handlers for uncaughtException / unhandledRejection.

Changes

  • New src/server/graceful-shutdown.ts — a small, unit-tested createGracefulShutdown orchestrator.
  • Entry point wires it up:
    • Drain in-flight requests via server.close(), then run teardown steps (flush the tracer provider, close the Postgres pool) in order.
    • A hard SHUTDOWN_TIMEOUT_MS deadline (default 10s) forces exit if draining/teardown hangs; the process exits at most once.
    • The handler is idempotent — a second signal is ignored.
    • SIGINT/SIGTERM/SIGQUIT plus uncaughtException/unhandledRejection all route through it.
  • buildPlugins now returns the tracer provider so the entry point can flush it.
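
The behaviour listed above could be sketched roughly as follows. This is an illustration, not the code in src/server/graceful-shutdown.ts: the option names (`drain`, `steps`, `timeoutMs`, `exit`, `log`), the `TeardownStep` shape, and the optional `exitCode` argument are all assumptions. Treating a failed drain the same way as a failed teardown step (log and continue) is also an assumption.

```typescript
// Hypothetical shapes; the real module may use different names.
type TeardownStep = { name: string; run: () => Promise<void> | void };

interface GracefulShutdownOptions {
  drain: () => Promise<void>;          // e.g. a promise wrapper around server.close()
  steps: TeardownStep[];               // run in order after draining
  timeoutMs: number;                   // hard deadline for drain + teardown
  exit: (code: number) => void;        // injected so tests can observe the exit
  log?: (message: string, error?: unknown) => void;
}

// A timer-backed promise plus a way to cancel the timer once work finishes.
function deadline(ms: number): { promise: Promise<"timeout">; cancel: () => void } {
  let cancel: () => void = () => {};
  const promise = new Promise<"timeout">((resolve) => {
    const timer = setTimeout(() => resolve("timeout"), ms);
    cancel = () => clearTimeout(timer);
  });
  return { promise, cancel };
}

function createGracefulShutdown(
  opts: GracefulShutdownOptions,
): (reason: string, exitCode?: number) => Promise<void> {
  const log = opts.log ?? (() => {});
  let started = false;
  let exited = false;

  // Guarantees the exit hook is called at most once.
  const exitOnce = (code: number): void => {
    if (exited) return;
    exited = true;
    opts.exit(code);
  };

  return async function shutdown(reason: string, exitCode: number = 0): Promise<void> {
    // Idempotent: a second signal while shutting down is ignored.
    if (started) {
      log(`shutdown already in progress, ignoring ${reason}`);
      return;
    }
    started = true;
    log(`shutting down: ${reason}`);

    const limit = deadline(opts.timeoutMs);

    // Drain first, then each teardown step in order. A failing step is
    // logged and does not prevent the remaining steps from running.
    const work = (async (): Promise<"done"> => {
      const all: TeardownStep[] = [{ name: "drain", run: opts.drain }, ...opts.steps];
      for (const step of all) {
        try {
          await step.run();
        } catch (error) {
          log(`step ${step.name} failed`, error);
        }
      }
      return "done";
    })();

    const outcome = await Promise.race([work, limit.promise]);
    limit.cancel();

    if (outcome === "timeout") {
      log(`timed out after ${opts.timeoutMs}ms, forcing exit`);
      exitOnce(1); // forced exit when draining or teardown hangs
    } else {
      exitOnce(exitCode);
    }
  };
}
```

In an entry point, the returned function would presumably be registered for each signal and process-level error event, for example `process.on("SIGTERM", () => void shutdown("SIGTERM"))`, with `exit` set to `process.exit` and `timeoutMs` read from SHUTDOWN_TIMEOUT_MS. Injecting `exit` rather than calling `process.exit` directly is what lets a unit test assert on the exit code without ending the test process.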
| Env var | Default | Meaning |
| --- | --- | --- |
| SHUTDOWN_TIMEOUT_MS | 10000 | Max ms to drain before forcing exit |
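
One way the variable might be read is sketched below. The helper name `parseShutdownTimeoutMs` is hypothetical, and falling back to the default on a missing, empty, non-integer, or non-positive value is an assumption. The PR only states the default of 10000.

```typescript
// Default taken from the table above.
const DEFAULT_SHUTDOWN_TIMEOUT_MS = 10_000;

// Hypothetical helper: returns a positive integer number of milliseconds,
// falling back to the default when the raw value is missing or unusable.
function parseShutdownTimeoutMs(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_SHUTDOWN_TIMEOUT_MS;
  }
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    return DEFAULT_SHUTDOWN_TIMEOUT_MS;
  }
  return value;
}
```

A caller would pass `process.env.SHUTDOWN_TIMEOUT_MS` and hand the result to the shutdown orchestrator as its deadline.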

Testing

  • npm run build — clean
  • npm run test:unit — all pass; new tests cover ordering, idempotency, a failing teardown step (logged, doesn't abort the rest), and the timeout-forces-exit path (injected exit hook, asserts exit code 1 fires exactly once)
  • npm run lint — clean
  • npx prettier --debug-check . — exit 0

🤖 Generated with Claude Code

Shutdown previously called `server.close()` then `process.exit(0)` from the
close event. It didn't bound how long draining could take, never flushed
OpenTelemetry spans (losing the tail of traces on deploy), and had no handlers
for uncaughtException / unhandledRejection.

Add a small, unit-tested `createGracefulShutdown` orchestrator and wire it into
the entry point:

- Drain in-flight requests via `server.close()`, then run teardown steps
  (flush the tracer provider, close the Postgres pool) in order.
- A hard `SHUTDOWN_TIMEOUT_MS` deadline (default 10s) forces exit if draining or
  teardown hangs; the process exits at most once.
- The handler is idempotent, so a second signal is ignored.
- SIGINT/SIGTERM/SIGQUIT plus uncaughtException/unhandledRejection all route
  through it.

`buildPlugins` now returns the tracer provider so the entry point can flush it.
Unit tests cover ordering, idempotency, a failing teardown step, and the
timeout-forces-exit path with an injected exit hook.

Closes #170.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QSuak9smCHbp4N17xjjLF6
dkijania added the labels production-readiness (Work toward making the API production-ready / publicly available) and P1 (Strongly recommended before GA) on Jun 28, 2026.