-
Notifications
You must be signed in to change notification settings - Fork 5.5k
fix(tcpclient, iostream): close SSLIOStream on TLS handshake timeout to prevent fd leak #3636
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -13,6 +13,7 @@ | |
| # License for the specific language governing permissions and limitations | ||
| # under the License. | ||
| import getpass | ||
| import os | ||
| import socket | ||
| import typing | ||
| import unittest | ||
|
|
@@ -428,3 +429,63 @@ def test_two_family_timeout_after_connect_timeout(self): | |
| self.assertEqual(len(conn.streams), 1) | ||
| self.assert_connector_streams_closed(conn) | ||
| self.assertRaises(TimeoutError, future.result) | ||
|
|
||
|
|
||
| class TestTLSHandshakeTimeoutCleanup(AsyncTestCase): | ||
| """Regression test: TLS handshake timeout must not leak file descriptors. | ||
|
|
||
| When TCPClient.connect() times out during the TLS handshake phase, | ||
| start_tls() has already transferred socket ownership to a new SSLIOStream. | ||
| The timeout handler must close that SSLIOStream to prevent an fd leak. | ||
| See tornado#3614. | ||
| """ | ||
|
|
||
| @gen_test | ||
| def test_tls_handshake_timeout_closes_stream(self) -> None: | ||
| import ssl as _ssl | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why? |
||
|
|
||
| # Set up a server socket that accepts TCP but never completes TLS | ||
| server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) | ||
| server_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1) | ||
| server_sock.bind(("127.0.0.1", 0)) | ||
| port = server_sock.getsockname()[1] | ||
| server_sock.listen(1) | ||
|
|
||
| ctx = _ssl.SSLContext(_ssl.PROTOCOL_TLS_CLIENT) | ||
| ctx.check_hostname = False | ||
| ctx.verify_mode = _ssl.CERT_NONE | ||
|
|
||
| # Count open fds before (Linux /proc/self/fd) | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is platform-specific and at least needs to be guarded. Is counting open FDs the best way to do this? This might be a time when mocks would be most appropriate, but that seems messy too. |
||
| try: | ||
| fds_before = len(os.listdir("/proc/self/fd")) | ||
| except OSError: | ||
| fds_before = 0 | ||
|
|
||
| client = TCPClient() | ||
| error_raised = False | ||
| try: | ||
| yield client.connect( | ||
| "127.0.0.1", | ||
| port, | ||
| ssl_options=ctx, | ||
| timeout=0.01, | ||
| ) | ||
| except TimeoutError: | ||
| error_raised = True | ||
|
|
||
| self.assertTrue(error_raised, "Expected TimeoutError from TLS handshake timeout") | ||
|
|
||
| server_sock.close() | ||
|
|
||
| # Let IOLoop process cleanup callbacks | ||
| from tornado.gen import moment | ||
| yield moment | ||
|
|
||
| # Verify no fd leaked | ||
| if fds_before > 0: | ||
| fds_after = len(os.listdir("/proc/self/fd")) | ||
| self.assertLessEqual( | ||
| fds_after, fds_before + 2, | ||
| "File descriptor leaked after TLS handshake timeout " | ||
| f"(before={fds_before}, after={fds_after})", | ||
| ) | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels dirty - we don't cancel the coroutine, so it could wake back up and find that its internal state (i.e.
self._tls_stream) has been modified.I think in this case it's all more or less harmless (except for the logging that is currently failing CI) because we don't rely on
self._tls_streamafter setting it and closing the socket will cause it to generate an internal StreamClosedError, but it's difficult to reason about.I think it would probably be better to push awareness of the timeout into
start_tls()so that we don't handle the timeout at a different level from other errors. Alternately, cancelling the task might do the job more cleanly for us (or at least it's supposed to. Tornado was originally built with non-cancelable futures and I'm not sure how much cancellation works now that we've picked that feature up from asyncio).