Implement real-time rate limiting for local connectors #2

@trahulkumar

Problem

The current [ProcurementScout] iterates through 20+ local government sources sequentially. This naturally spaces out requests between different domains, but an individual connector that paginates (looping through multiple pages of results) can still hit a single server with rapid-fire requests. That risks triggering a Web Application Firewall (WAF) or getting our crawler's IP banned by smaller municipal servers.

Proposed Solution

Implement a centralized RateLimiter utility or decorator that enforces a mandatory "politeness delay" between HTTP requests to the same host.

Technical Specifications

  • Algorithm: Token bucket or a simple "sleep-until" mechanism.
  • Default Policy: 1 request per 2.0 seconds per domain.
  • Handling 429s: If a server responds with HTTP 429 Too Many Requests, the connector should respect the Retry-After header or apply exponential backoff.
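
To make the spec above concrete, here is a minimal sketch of the "sleep-until" variant plus a helper for choosing the retry delay after a 429. The names HostRateLimiter and backoff_delay, the injectable clock/sleep parameters, and the 60-second backoff cap are assumptions for illustration, not existing code in the repo. Only the numeric (seconds) form of Retry-After is handled; the HTTP-date form falls through to exponential backoff.

```python
import threading
import time
from urllib.parse import urlsplit


class HostRateLimiter:
    """Sleep-until limiter: at most one request per `min_interval` seconds per host."""

    def __init__(self, min_interval=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock  # injectable so the limiter can be tested without real waits
        self._sleep = sleep
        self._next_allowed = {}  # host -> earliest time the next request may start
        self._lock = threading.Lock()

    def wait(self, url):
        """Block until a request to the URL's host is allowed; return seconds waited."""
        host = urlsplit(url).netloc.lower()
        with self._lock:
            now = self._clock()
            start = max(now, self._next_allowed.get(host, now))
            # Reserve the slot before sleeping so concurrent callers queue up behind it.
            self._next_allowed[host] = start + self.min_interval
        delay = start - now
        if delay > 0:
            self._sleep(delay)
        return delay


def backoff_delay(attempt, retry_after=None, base=2.0, cap=60.0):
    """Seconds to wait before retrying after a 429 (attempt is 0-based)."""
    if retry_after is not None:
        try:
            return max(float(retry_after), 0.0)  # numeric Retry-After wins
        except ValueError:
            pass  # HTTP-date Retry-After is not parsed in this sketch
    return min(base * (2 ** attempt), cap)  # capped exponential backoff
```

A connector would call limiter.wait(url) immediately before each HTTP request, and on a 429 sleep for backoff_delay(attempt, response_headers.get("Retry-After")) before retrying. A token bucket would additionally allow short bursts; the sleep-until form is simpler and matches the 1-request-per-2-seconds default.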

Tasks

  • Create govsignal/utils/rate_limiter.py.
  • Implement a @rate_limit(calls=1, period=2) decorator.
  • Apply this decorator to the _fetch_data methods in [govsignal/local_connectors.py].
  • Update [config.yaml] to allow per-source rate limit overrides.
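
One possible shape for the decorator and the per-source override lookup is sketched below. The signature follows the @rate_limit(calls=1, period=2) task above. Everything else is hypothetical: the "rate_limits" config key, the policy_for helper, and the clock/sleep parameters (added only to make testing easy). The config is shown as the dict it would be after config.yaml is loaded.

```python
import functools
import threading
import time

DEFAULT_POLICY = {"calls": 1, "period": 2.0}


def rate_limit(calls=1, period=2.0, clock=time.monotonic, sleep=time.sleep):
    """Space out calls to the wrapped function: one every `period / calls` seconds."""
    min_interval = period / calls

    def decorator(func):
        lock = threading.Lock()
        next_allowed = [None]  # earliest time the next call may start

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with lock:
                now = clock()
                start = now if next_allowed[0] is None else max(now, next_allowed[0])
                next_allowed[0] = start + min_interval
            if start > now:
                sleep(start - now)
            return func(*args, **kwargs)

        return wrapper

    return decorator


def policy_for(source_name, config):
    """Merge a per-source override (hypothetical `rate_limits` key) over the default."""
    overrides = config.get("rate_limits", {}).get(source_name, {})
    return {**DEFAULT_POLICY, **overrides}
```

Note that this version spaces calls evenly instead of allowing bursts, and that the state lives on the decorated function, so every instance of a connector class sharing one _fetch_data method would share one limit. If two connectors can hit the same host, keying the state by host is the safer design.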
