fix(tcpclient, iostream): close SSLIOStream on TLS handshake timeout to prevent fd leak by armorbreak001 · Pull Request #3636 · tornadoweb/tornado

armorbreak001 · 2026-06-14T17:19:45Z

Problem (tornado#3614)

When TCPClient.connect() is called with both ssl_options and timeout, a TLS handshake timeout causes a permanent file descriptor leak.

Root Cause

start_tls() transfers socket ownership in three steps:

1. Extract raw socket from original IOStream → self.socket = None
2. Wrap in SSL → create new SSLIOStream (local variable only)
3. Return Future[SSLIOStream] that resolves when handshake completes

When gen.with_timeout() fires:

1. Caller gets TimeoutError
2. Original stream's .socket is already None → close() is no-op
3. The SSLIOStream (holding the real fd) is only reachable through the inner Future
4. gen.with_timeout explicitly does not cancel the wrapped Future
5. SSLIOStream stays on IOLoop forever → fd leaked permanently
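
The sequence above can be reproduced in miniature with plain asyncio. This is a rough model under stated assumptions, not Tornado's code: `FakeStream`, `FakeSSLStream`, `connect` and the `_created` registry are hypothetical names, and `asyncio.shield()` inside `asyncio.wait_for()` stands in for the non-cancelling behaviour attributed to gen.with_timeout.

```python
import asyncio

_created = []  # demo-only registry so the orphaned stream can be inspected


class FakeSSLStream:
    """Hypothetical stand-in for SSLIOStream; `closed` models the fd."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


class FakeStream:
    """Hypothetical stand-in for the original IOStream."""

    def __init__(self) -> None:
        self.socket = object()  # pretend this is the real socket

    def close(self) -> None:
        # Once self.socket is None there is nothing left to release here.
        self.socket = None

    def start_tls(self) -> "asyncio.Future":
        self.socket = None            # step 1: give up the socket
        ssl_stream = FakeSSLStream()  # step 2: held in a local variable only
        _created.append(ssl_stream)
        # step 3: a future that never resolves, i.e. a stalled handshake
        return asyncio.get_running_loop().create_future()


async def connect(stream: FakeStream, timeout: float) -> None:
    fut = stream.start_tls()
    try:
        # shield() keeps the inner future alive when the timeout fires.
        await asyncio.wait_for(asyncio.shield(fut), timeout)
    except asyncio.TimeoutError:
        stream.close()  # cannot reach the new stream from here
        raise


def run_demo() -> tuple:
    async def main() -> tuple:
        _created.clear()
        stream = FakeStream()
        try:
            await connect(stream, 0.01)
            timed_out = False
        except asyncio.TimeoutError:
            timed_out = True
        return timed_out, _created[0].closed

    return asyncio.run(main())
```

In this model the caller sees the timeout while the stand-in SSL stream is never closed, which is the shape of the leak described above.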

Impact

Every timed-out TLS connection leaks one file descriptor. Long-running services that retry connections (e.g. HTTP clients with short timeouts) will eventually hit the OS fd limit (EMFILE / "too many open files").

Fix

Two files, minimal change:

`tornado/iostream.py`

In start_tls(): store the SSLIOStream as self._tls_stream before returning the future.

self._tls_stream = ssl_stream  # Keep reference for timeout cleanup

`tornado/tcpclient.py`

In connect(): on TimeoutError during start_tls(), close _tls_stream.

except gen.TimeoutError:
    tls_stream = getattr(stream, "_tls_stream", None)
    if tls_stream is not None:
        tls_stream.close()
        stream._tls_stream = None
    raise
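
To see how the two changes fit together, here is a minimal asyncio model of the same idea. It is a sketch under assumptions, not the patched Tornado code: `FakeStream`, `FakeSSLStream` and `connect` are hypothetical, and `asyncio.shield()` plus `asyncio.wait_for()` only approximates a timeout that leaves the inner future running.

```python
import asyncio


class FakeSSLStream:
    """Hypothetical stand-in for SSLIOStream; `closed` models the fd."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


class FakeStream:
    """Hypothetical stand-in for IOStream with the extra reference."""

    def __init__(self) -> None:
        self.socket = object()
        self._tls_stream = None

    def start_tls(self) -> "asyncio.Future":
        self.socket = None
        ssl_stream = FakeSSLStream()
        self._tls_stream = ssl_stream  # keep a reference for timeout cleanup
        # A future that never resolves models a stalled handshake.
        return asyncio.get_running_loop().create_future()


async def connect(stream: FakeStream, timeout: float) -> None:
    try:
        await asyncio.wait_for(asyncio.shield(stream.start_tls()), timeout)
    except asyncio.TimeoutError:
        # Same cleanup shape as the snippet above.
        tls_stream = getattr(stream, "_tls_stream", None)
        if tls_stream is not None:
            tls_stream.close()
            stream._tls_stream = None
        raise


def run_demo() -> tuple:
    async def main() -> tuple:
        stream = FakeStream()
        task = asyncio.ensure_future(connect(stream, 0.01))
        await asyncio.sleep(0)        # let connect() reach start_tls()
        orphan = stream._tls_stream   # grab a handle before cleanup clears it
        try:
            await task
            timed_out = False
        except asyncio.TimeoutError:
            timed_out = True
        return timed_out, orphan.closed, stream._tls_stream is None

    return asyncio.run(main())
```

The model only shows that the stored reference makes the new stream reachable from the timeout handler. It does not address what a still-running handshake coroutine observes afterwards.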

Testing

New test: TestTLSHandshakeTimeoutCleanup.test_tls_handshake_timeout_closes_stream

1. Opens TCP server socket (no TLS accept)
2. Calls TCPClient.connect() with SSL + 0.01s timeout
3. Expects TimeoutError
4. Verifies no fd leaked via /proc/self/fd count
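
The fd-count check in the last step might look roughly like the helper below. This is an illustrative sketch, not the test from this PR: `count_open_fds` is a hypothetical name, and it returns None where /proc/self/fd does not exist so that a caller can skip the check on non-Linux platforms.

```python
import os
import socket
from typing import Optional


def count_open_fds() -> Optional[int]:
    """Return the number of open fds, or None if /proc/self/fd is unavailable."""
    try:
        # The directory handle used for the listing is counted on every call,
        # so before/after comparisons stay consistent.
        return len(os.listdir("/proc/self/fd"))
    except OSError:
        return None


def socket_changes_fd_count() -> Optional[bool]:
    """Open and close a socket, reporting whether the count tracks it."""
    before = count_open_fds()
    if before is None:
        return None  # platform guard: nothing to verify here
    sock = socket.socket()
    during = count_open_fds()
    sock.close()
    after = count_open_fds()
    return during == before + 1 and after == before
```

A count-based check like this can be disturbed by unrelated fds opened elsewhere in the process between the two measurements, so it is best run in a quiet test process.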

All existing tests pass:

tcpclient_test.py: ✅ 27 passed, 4 skipped
iostream_test.py: ✅ 142 passed, 61 skipped

Fixes #3614

…to prevent fd leak (tornado#3614)

When TCPClient.connect() times out during the TLS handshake phase, start_tls() has already transferred socket ownership to a new SSLIOStream and set the original stream's socket to None. The caller's TimeoutError handler could only call stream.close(), which is a no-op since the socket is already gone. Meanwhile the SSLIOStream (holding the real fd) remains registered on the IOLoop forever, leaking the file descriptor.

Fix:

- In IOStream.start_tls(): store the new SSLIOStream as self._tls_stream so callers can reach it after timeout
- In TCPClient.connect(): on gen.with_timeout TimeoutError during start_tls(), close stream._tls_stream to release the fd
- Add regression test: TestTLSHandshakeTimeoutCleanup verifies no file descriptor leak after TLS handshake timeout

This is a targeted fix with minimal API surface change. The _tls_stream attribute is only set during start_tls() and cleared after cleanup.

bdarnell

This is intended to replace #3615, right? Please update existing PRs instead of starting new ones. And make sure that at least the checks run in the "quick" ci config pass before pushing changes.

bdarnell · 2026-06-23T20:24:07Z

+                    # start_tls() transferred socket ownership to a new
+                    # SSLIOStream (self._tls_stream). Close it to prevent
+                    # a file descriptor leak. See tornado#3614.
+                    tls_stream = getattr(stream, "_tls_stream", None)


This feels dirty - we don't cancel the coroutine, so it could wake back up and find that its internal state (i.e. self._tls_stream) has been modified.

I think in this case it's all more or less harmless (except for the logging that is currently failing CI) because we don't rely on self._tls_stream after setting it and closing the socket will cause it to generate an internal StreamClosedError, but it's difficult to reason about.

I think it would probably be better to push awareness of the timeout into start_tls() so that we don't handle the timeout at a different level from other errors. Alternately, cancelling the task might do the job more cleanly for us (or at least it's supposed to. Tornado was originally built with non-cancelable futures and I'm not sure how much cancellation works now that we've picked that feature up from asyncio).
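
The cancellation alternative mentioned above can be illustrated with plain asyncio, where `asyncio.wait_for()` cancels the awaited coroutine on timeout. This is a generic sketch with hypothetical names (`FakeSSLStream`, `handshake`, `start_tls_with_timeout`); it says nothing about how well cancellation behaves inside Tornado, which the comment leaves open.

```python
import asyncio


class FakeSSLStream:
    """Hypothetical stand-in for SSLIOStream; `closed` models the fd."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


async def handshake(ssl_stream: FakeSSLStream) -> FakeSSLStream:
    """Hypothetical handshake that never completes on its own."""
    try:
        await asyncio.sleep(3600)
        return ssl_stream
    except asyncio.CancelledError:
        # Cleanup happens at the level that owns the stream, alongside
        # any other error handling it already does.
        ssl_stream.close()
        raise


async def start_tls_with_timeout(
    ssl_stream: FakeSSLStream, timeout: float
) -> FakeSSLStream:
    # No shield(): wait_for() cancels handshake() when the timeout fires.
    return await asyncio.wait_for(handshake(ssl_stream), timeout)


def run_demo() -> tuple:
    async def main() -> tuple:
        ssl_stream = FakeSSLStream()
        try:
            await start_tls_with_timeout(ssl_stream, 0.01)
            timed_out = False
        except asyncio.TimeoutError:
            timed_out = True
        return timed_out, ssl_stream.closed

    return asyncio.run(main())
```

Here the caller never touches the handshake's internal state. The coroutine is woken with a cancellation and releases its own resource before the timeout error reaches the caller.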

bdarnell · 2026-06-24T01:07:52Z

+        ctx.check_hostname = False
+        ctx.verify_mode = _ssl.CERT_NONE
+
+        # Count open fds before (Linux /proc/self/fd)


This is platform-specific and at least needs to be guarded.

Is counting open FDs the best way to do this? This might be a time when mocks would be most appropriate, but that seems messy too.

bdarnell · 2026-06-24T01:07:58Z

+
+    @gen_test
+    def test_tls_handshake_timeout_closes_stream(self) -> None:
+        import ssl as _ssl


armorbreak001 · 2026-06-24T01:28:55Z

Closing in favor of #3615, which has been updated with the improved approach (tls_future.cancel() + _tls_stream cleanup + platform guard for test). Thanks for the review @bdarnell!

bdarnell reviewed Jun 24, 2026


armorbreak001 closed this Jun 24, 2026

armorbreak001 mentioned this pull request Jun 24, 2026

fix(tcpclient): close SSLIOStream on TLS handshake timeout to prevent socket leak #3615

Closed

armorbreak001 wants to merge 1 commit into tornadoweb:master from armorbreak001:fix/tls-handshake-timeout-socket-leak
