
Fix connection contention issue in http_utils.py - reuse ClientSession #591

@cgillum

Summary

The Python Durable Functions SDK creates a new aiohttp.ClientSession for every HTTP request to the internal RPC endpoint. This is an anti-pattern that can cause connection contention and timeouts under concurrent load.

Problem

In http_utils.py, the post_async_request function creates a new session for each request:

    async def post_async_request(url: str, data: Any = None, ...) -> List[Union[int, Any]]:
        async with aiohttp.ClientSession() as session:  # New session per request
            # ...

The aiohttp documentation explicitly warns against this pattern:

"Don't create a session per request. Most likely you need a session per application which performs all requests together."

Impact

During a production investigation (ICM 695094479), we observed intermittent ConnectionTimeoutError (~30s) when calling client.start_new() to start orchestrations. The error occurs in aiohttp/connector.py:_wrap_create_connection, indicating TCP connection establishment failures.

Under concurrent load (bursts of 6-9 simultaneous requests), multiple requests compete to establish new TCP connections instead of reusing pooled connections from a shared session.

Proposed Fix

Modify http_utils.py to reuse a single ClientSession with configurable timeout and connection pooling.

Considerations

  1. Concurrency safety - May need an async lock (e.g. asyncio.Lock) when initializing the session, so that concurrent coroutines do not each create their own
  2. Session lifecycle - Need to handle session cleanup on worker shutdown
  3. Connection limits - The TCPConnector limits should be tuned appropriately
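The first two considerations can be sketched with a small holder object, assuming session creation has to await something (otherwise a lock is unnecessary on a single event loop). `SharedSessionHolder` and its `factory` argument are hypothetical names introduced here for illustration. The holder works with any object that exposes an async `close()`, which includes `aiohttp.ClientSession`.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional


class SharedSessionHolder:
    """Lazily create one session-like object and close it on shutdown.

    Hypothetical helper: `factory` is an async callable returning an object
    with an async close() method, such as an aiohttp.ClientSession.
    """

    def __init__(self, factory: Callable[[], Awaitable[Any]]) -> None:
        self._factory = factory
        self._session: Optional[Any] = None
        self._lock: Optional[asyncio.Lock] = None

    async def get(self) -> Any:
        # Fast path: session already exists, no locking needed.
        if self._session is not None:
            return self._session
        # Create the lock lazily so it is built inside the running loop.
        if self._lock is None:
            self._lock = asyncio.Lock()
        async with self._lock:
            # Re-check after acquiring: another coroutine may have won.
            if self._session is None:
                self._session = await self._factory()
        return self._session

    async def close(self) -> None:
        # Detach first so later get() calls build a fresh session.
        session, self._session = self._session, None
        if session is not None:
            await session.close()
```

In the SDK, the factory could be an `async def` that builds an `aiohttp.ClientSession` around an `aiohttp.TCPConnector`, and `close()` would be awaited from whatever shutdown hook the worker provides. For the third consideration, `TCPConnector` accepts `limit` (default 100) and `limit_per_host` (default 0, meaning unlimited). Suitable values for the RPC endpoint would need to be measured.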

Additional Context

  • The 30s timeout matches the default sock_connect timeout in recent aiohttp releases
  • Similar issues have been reported in the aiohttp community
