-
Notifications
You must be signed in to change notification settings - Fork 65
Fix connection contention by reusing aiohttp ClientSession #592
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
Co-authored-by: cgillum <2704139+cgillum@users.noreply.github.com>
Co-authored-by: cgillum <2704139+cgillum@users.noreply.github.com>
Co-authored-by: cgillum <2704139+cgillum@users.noreply.github.com>
…sion locks Co-authored-by: cgillum <2704139+cgillum@users.noreply.github.com>
Co-authored-by: cgillum <2704139+cgillum@users.noreply.github.com>
andystaples
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved with one tiny nit comment
| async def _close_session() -> None: | ||
| """Close the shared ClientSession if it exists. | ||
| This function should be called during worker shutdown. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: This comment says we should call _close_session on worker shutdown, but it is only called by _handle_request_error. Can we add the call during shutdown or update the comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a way to hook into the shutdown? If not, I'll just update the comment since process shutdown should clean up all resources anyways.
Fix connection contention issue in http_utils.py - reuse ClientSession ✅
Summary
Fixed a critical connection contention issue where the Python Durable Functions SDK was creating a new aiohttp.ClientSession for every HTTP request to the internal RPC endpoint. This anti-pattern was causing intermittent ConnectionTimeoutErrors (~30s) under concurrent load, as observed in production (ICM 695094479).
Plan:
Implementation Details:
_client_sessionwith async lock for thread-safe initialization_handle_request_error()helper with session lock and try/finally for safe cleanup_close_session()with session lock and try/finally for worker shutdown cleanupKey considerations addressed:
Test Coverage:
Security Summary:
No vulnerabilities found during CodeQL security analysis. The changes are safe and improve the reliability of the SDK under concurrent load.
Original prompt
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.