Skip to content

fix: register SIGINT/SIGTERM handlers in HTTP transport#318

Open
danilofuchs wants to merge 2 commits into
bytebase:mainfrom
danilofuchs:fix/graceful-shutdown-http
Open

fix: register SIGINT/SIGTERM handlers in HTTP transport#318
danilofuchs wants to merge 2 commits into
bytebase:mainfrom
danilofuchs:fix/graceful-shutdown-http

Conversation

@danilofuchs
Copy link
Copy Markdown

@danilofuchs danilofuchs commented May 21, 2026

Summary

The HTTP transport branch in src/server.ts had no SIGINT/SIGTERM handlers, so when an orchestrator (Docker, Kubernetes, ECS, etc.) sent SIGTERM to ask the server to shut down, the process either skipped connector teardown entirely or hung until the orchestrator's stop-timeout window expired and SIGKILL landed (exit code 137).

Concretely, this caused:

  • DB pools / SSH tunnels in ConnectorManager to be torn down ungracefully
  • setTimeout-based IAM refresh timers to keep the event loop alive past SIGTERM
  • The TOML config watcher (chokidar) to keep watching until forced-kill
  • ECS / Kubernetes deploys to surface as 137 exits instead of clean rollovers

The STDIO branch already had a SIGINT/SIGTERM handler. This change extracts it into a shared installShutdownHandlers helper and wires it into both transport branches.

@danilofuchs danilofuchs requested a review from tianzhou as a code owner May 21, 2026 16:00
Copilot AI review requested due to automatic review settings May 21, 2026 16:00
The HTTP transport had no signal handlers at all, so SIGTERM from
orchestrators (Docker/Kubernetes/ECS) either bypassed connector teardown
or hung the process until the orchestrator sent SIGKILL (exit 137). In
either case the running server skipped clean disconnects for DB pools,
SSH tunnels, and IAM refresh timers in the ConnectorManager.

Extract the existing STDIO shutdown into a transport-agnostic helper
that closes the active transport (HTTP server or stdio), calls
connectorManager.disconnect(), stops the config watcher, and falls back
to a forced exit after 25s so a stuck cleanup can't outlast typical
orchestrator stop-timeouts. Wire it up from both transport branches.

Add an integration test that asserts the server exits with code 0
within a few seconds of SIGTERM or SIGINT in HTTP mode — without the
fix the process is killed by signal (exit code null), with the fix the
handler runs and exits cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@danilofuchs danilofuchs force-pushed the fix/graceful-shutdown-http branch from 50c43e1 to faa7ada Compare May 21, 2026 16:04
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves shutdown behavior by adding SIGINT/SIGTERM handling to the HTTP transport path (matching the existing STDIO behavior) so orchestrators can terminate the server cleanly without forced SIGKILL, and adds an integration test to prevent regressions.

Changes:

  • Extracted a shared installShutdownHandlers helper and wired it into both HTTP and STDIO transports.
  • Added a 25s forced-exit fallback during shutdown to avoid exceeding common orchestrator stop timeouts.
  • Added an integration test that spawns the HTTP server, sends SIGTERM/SIGINT, and asserts clean exit.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
src/server.ts Introduces shared shutdown handler and installs it for HTTP + STDIO transports.
src/tests/graceful-shutdown-integration.test.ts Adds integration coverage to validate clean termination on SIGTERM/SIGINT for HTTP mode.
Comments suppressed due to low confidence (1)

src/tests/graceful-shutdown-integration.test.ts:98

  • Same issue as above: proc.killed is not a reliable indicator that the child has exited, so SIGKILL may be skipped and the server process can leak if shutdown hangs. Use proc.exitCode/signalCode or the timeout result to decide whether to send SIGKILL.
    } finally {
      if (!proc.killed) proc.kill('SIGKILL');
      if (fs.existsSync(dbPath)) fs.unlinkSync(dbPath);

Comment thread src/server.ts Outdated
Comment on lines +181 to +189
clearTimeout(forceExit);
process.exit(0);
}
};

process.on("SIGINT", () => shutdown("SIGINT"));
process.on("SIGTERM", () => shutdown("SIGTERM"));
return shutdown;
};
Comment thread src/server.ts Outdated
Comment on lines +181 to +189
clearTimeout(forceExit);
process.exit(0);
}
};

process.on("SIGINT", () => shutdown("SIGINT"));
process.on("SIGTERM", () => shutdown("SIGTERM"));
return shutdown;
};
Comment thread src/__tests__/graceful-shutdown-integration.test.ts
Comment on lines +81 to +83
await waitForServerReady(`http://localhost:${testPort}`);
proc.kill('SIGINT');
const result = await waitForExit(proc, 10_000);
- Make each shutdown step best-effort so a failing transport close still
  runs connector disconnect and config-watcher cleanup.
- Track shutdown errors and exit non-zero on failure so orchestrators /
  CI can observe partial cleanup.
- Test: pick an ephemeral port at runtime instead of a fixed 3099 to
  avoid collisions in parallel runs or busy CI runners.
- Test: use proc.exitCode / signalCode instead of proc.killed to decide
  whether to force-kill the child; proc.killed flips true on the first
  signal even if the process is still alive, which could leak it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants