Skip to content

Transparent 429 retry with Retry-After + exponential backoff#75

Open
DblD wants to merge 1 commit into
cliviu74:masterfrom
DblD:feat/429-retry-with-backoff
Open

Transparent 429 retry with Retry-After + exponential backoff#75
DblD wants to merge 1 commit into
cliviu74:masterfrom
DblD:feat/429-retry-with-backoff

Conversation

@DblD
Copy link
Copy Markdown
Contributor

@DblD DblD commented May 3, 2026

Summary

Adds transparent retry on 429 responses, addressing the rate-limit problem flagged in #68 and the README warning. Long-running consumers (notably the Home Assistant integration) currently lose connectivity on the first 429 and stay offline until manually reloaded — this PR makes the library survive normal rate-limit weather without any caller-side changes.

How

A single requests.Session is mounted on Wallbox.__init__ with an HTTPAdapter whose max_retries is a urllib3.Retry configured for:

  • status_forcelist=(429,) — see scoping note below
  • allowed_methods=GET, POST, PUT, DELETE — see note below
  • connect=0, read=0 — network errors are not retried (see scoping note)
  • respect_retry_after_header=True — honors the server's Retry-After
  • backoff_factor=1.0 — exponential fallback when Retry-After is absent (~1s, 2s, 4s)
  • raise_on_status=False — preserves the existing requests.HTTPError contract once retries are exhausted

Every requests.get/post/put call site in wallbox.py is mechanically swapped for self._session.get/post/put. No method signatures change.

Note on allowed_methods including DELETE

The library does not currently issue any DELETE requests, but DELETE is kept in allowed_methods for two reasons:

  1. HTTP-spec idempotency: DELETE is idempotent by definition, so it sits in the same retry-safety family as GET and PUT. urllib3's own default allowed_methods includes DELETE for exactly this reason.
  2. Abstraction completeness: _session is the canonical retry-enabled HTTP session for the wallbox API. Enumerating it by HTTP-spec property (idempotent verbs + the POST we explicitly opted into) reads more cleanly than enumerating by current-usage. If someone adds a DELETE-using method later, retries fire automatically rather than silently regressing.

This is a small judgment call — happy to tighten to ("GET", "POST", "PUT") if you prefer the whitelist to track actual usage strictly. Either is defensible.

Scoping decisions

429 only — 5xx is intentionally not in the forcelist. Two POST endpoints in this lib are not idempotent: restartCharger (action 3) and updateFirmware (action 5). Retrying these on a transient 5xx after the device has already accepted the request could fire the action twice, with updateFirmware carrying the same risk the README already warns about ("can brick your charger"). 429 is always safe to retry — by definition the server has not processed the request.

Auth failures (401/403) are excluded so bad credentials surface immediately rather than burning the retry budget.

Network errors are also excluded (connect=0, read=0). urllib3.Retry would otherwise retry connection errors, DNS failures, and read timeouts by default. Silently retrying these is a behaviour change beyond this PR's stated scope and would mask real configuration errors behind extra wait time. The retry budget is reserved for 429 only.

Backwards compatibility and behaviour

The lib's failure semantics, post-PR:

Outcome Caller sees
Successful call decoded JSON / status code
429, retry succeeds decoded JSON (retry happens transparently)
429, retries exhausted requests.HTTPError (status 429)
401 / 403 (auth) requests.HTTPError immediately
Other 4xx requests.HTTPError immediately
5xx requests.HTTPError immediately
Network error requests.ConnectionError / Timeout immediately

Two new constructor kwargs make the budget tunable for callers who want different behaviour (named in camelCase to match requestGetTimeout / jwtTokenDrift):

w = Wallbox(user, password, maxRetries=5, backoffFactor=2.0)

Observability

The lib itself does not emit log records — preserving parity with the rest of the codebase. urllib3 logs each retry at WARNING level on the urllib3.connectionpool logger, so callers who want visibility can configure that logger:

import logging
logging.getLogger("urllib3.connectionpool").setLevel(logging.WARNING)

In Home Assistant this surfaces in the integration's debug log when the user enables debug logging. Adding lib-native logging is plausible as a follow-up if you'd prefer that, but it felt out of scope for this PR.

Dependency floor

A urllib3>=1.26.0 floor is added to setup.py install_requires. The Retry(allowed_methods=...) kwarg this PR uses was renamed from method_whitelist in urllib3 1.26.0; without the floor, an end-user installing on the existing requests>=2.22.0 lower bound could land on urllib3 1.25.x and hit a TypeError at construction time. Anyone with a modern requests already satisfies >=1.26.0.

Tests

The repo had no test scaffolding, so this PR adds one in three layers:

Layer Purpose Tests Tool
TestRetryConfiguration adapter and Retry config sanity 9 inspection
TestExistingErrorContract non-retryable HTTPError contract preserved 3 requests-mock
TestRetryBehaviour end-to-end retries through the real HTTP path 5 pytest-httpserver

The behavioural layer matters because requests-mock swaps in its own adapter and bypasses the urllib3 Retry machinery. pytest-httpserver spins up a real localhost server so requests travel through the full urllib3 Retry path and the retry behaviour is actually exercised. The behavioural layer covers:

  • 429 → 200 triggers exactly one retry and returns the decoded payload
  • 429 forever exhausts the retry budget and raises requests.HTTPError (not RetryError)
  • A Retry-After: 1 header is honored (timing-bounded assertion)
  • A 401 is not retried (single request, immediate HTTPError)
  • A connection refused is not retried (timing-bounded — fast fail)

Each behavioural test was verified before commit by mutating the relevant Retry config (e.g. dropping 429 from the forcelist, flipping raise_on_status, disabling respect_retry_after_header, adding 401 to the forcelist, or removing connect=0/read=0) to confirm the test fails for the expected reason. The connect=0, read=0 change specifically followed strict red-green TDD: the test failed naturally before the fix was applied, then passed after.

$ pytest tests/ -v
=========== 17 passed in 1.55s ===========

requirements-dev.txt is added with pytest, requests-mock, and pytest-httpserver on top of the runtime requirements. Test code uses Python's standard snake_case convention for test method names and local variables (pytest discovery norm); production code is uniformly camelCase.

Tested locally

Smoke-tested against the live Wallbox API (Home Assistant context) — normal calls behave exactly as before, no regressions visible. I do not have a reliable way to force-trigger a 429 from the live API on demand, so the retry path itself is exercised through the test suite rather than against production.

Out of scope (happy to file separately)

  • Switching getChargerStatus to the v4 endpoint that @jorisdrenth documented in Wallbox API has changed #68 — different concern, larger surface
  • A local minimum-poll-interval cache as a defensive layer
  • Async support
  • Retrying transient 5xx for the safe, idempotent verbs only — would need a second adapter
  • Lib-native logging (e.g. structured retry events) — separate design conversation

Refs: #68

@DblD DblD marked this pull request as draft May 3, 2026 20:36
@DblD DblD force-pushed the feat/429-retry-with-backoff branch from b8e1d06 to be7921d Compare May 3, 2026 20:50
@DblD DblD changed the title Transparent 429/5xx retry with Retry-After + exponential backoff Transparent 429 retry with Retry-After + exponential backoff May 3, 2026
@DblD DblD force-pushed the feat/429-retry-with-backoff branch 2 times, most recently from c532b8f to 350de33 Compare May 3, 2026 21:05
@DblD DblD marked this pull request as ready for review May 3, 2026 21:07
Copy link
Copy Markdown
Owner

@cliviu74 cliviu74 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor changes requested.

Comment thread wallbox/wallbox.py Outdated
)
session = requests.Session()
adapter = HTTPAdapter(max_retries=retry)
session.mount("http://", adapter)
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any point in using adapter for http:// schema? All trafic is https for the Wallbox api?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: dropped the http:// mount from the production session. The lib only ever talks to api.wall-box.com / user-api.wall-box.com (both https), so retry-arming a scheme we never use in production was dead code — agreed.

The pytest-httpserver test infrastructure was the only consumer of the http:// mount, since the test server runs plaintext localhost. To preserve full behavioural test coverage I considered two approaches:

  1. Switch tests to https with a self-signed cert (via pytest-httpserver's SSL support + trustme or verify=False)
  2. Mirror the retry adapter onto http:// inside the test session only, leaving production https-only

Went with (2) — two lines in the test fixture (and one in test_connection_error_is_not_retried, which builds its Wallbox manually):

w._session.mount("http://", w._session.get_adapter("https://example.com/"))

Reasoning: (1) requires generating self-signed certs and either disabling cert verification or trusting the cert chain — adds test infrastructure and a new dev dependency to solve what amounts to a localhost test setup detail. (2) keeps the test setup self-contained, doesn't add deps, and the small mirror line documents itself with a comment explaining why pytest-httpserver requires it.

Happy to switch to (1) if you'd rather not have any test-side mount mirroring.

Comment thread wallbox/wallbox.py Outdated
self._session = self._build_session(maxRetries, backoffFactor)

@staticmethod
def _build_session(maxRetries, backoffFactor):
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use Camel Case for _build_session (_buildSession)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed: _buildSession. Snake_case slipped through — same pattern as usedRefreshToken on #74. Consistent with the rest of the codebase now.

@DblD DblD force-pushed the feat/429-retry-with-backoff branch from 350de33 to a68a0ce Compare May 9, 2026 06:33
The Wallbox API enforces rate limits on third-party clients (per the
maintainer's note on cliviu74#68: 3 GET / 5 non-GET per minute). When the limit
is hit the server returns 429 with a Retry-After header. Today the
library raises HTTPError on the first 429, which knocks long-running
consumers (e.g. the Home Assistant integration) offline until the
integration is manually reloaded.

This change introduces a single requests.Session, mounted with an
HTTPAdapter that uses urllib3.Retry to handle 429 responses. The
Retry-After header is honored when present; otherwise exponential
backoff is applied. raise_on_status is disabled so callers continue to
receive requests.HTTPError once retries are exhausted, preserving the
existing exception contract — no behaviour change for callers, only
added resilience.

Scoping decisions:

  * 429 only. Transient 5xx is deliberately not in the forcelist
    because two POST endpoints (restartCharger, updateFirmware) are
    not idempotent — a transient 5xx after the device has already
    accepted the action could otherwise trigger a duplicate. 429 is
    always safe to retry: the request was not processed by definition.

  * Auth failures (401/403) are excluded so bad credentials surface
    immediately rather than burning the retry budget.

  * Network errors (connect=0, read=0) are not retried either. The
    retry budget is for 429 only — silently retrying network failures
    would be a behaviour change beyond this PR's stated scope and
    would mask real configuration errors behind extra wait time.

  * The retry adapter is mounted on https:// only, since both
    baseUrl and authUrl point at the Wallbox API which is https
    only. The test suite mirrors the adapter onto http:// in the
    test session, because pytest-httpserver runs plaintext localhost.

Defaults: maxRetries=3, backoffFactor=1.0. Both are configurable via
new constructor kwargs (named in camelCase to match the existing style
of requestGetTimeout and jwtTokenDrift). The internal builder helper
follows the same convention (_buildSession).

Behaviour and observability are documented in README so callers know
exactly what they will see for each failure mode. The library itself
does not emit log records; callers who want retry visibility can
configure the urllib3.connectionpool logger at WARNING level (urllib3
already logs each retry there).

A test suite is added under tests/ in three layers:

  * TestRetryConfiguration — adapter and Retry config (9 tests)
  * TestExistingErrorContract — non-retryable HTTPError contract is
    preserved (3 tests, requests-mock)
  * TestRetryBehaviour — end-to-end against a localhost HTTP server
    so the full urllib3 Retry path is actually exercised (5 tests,
    pytest-httpserver)

Each behavioural test was verified by mutating the relevant Retry
config to break it and confirming the test caught the regression
before being committed. The connect/read disabling specifically
followed strict red-green TDD: the test failed naturally without the
fix, then passed after it was applied.

A urllib3>=1.26.0 floor is added to install_requires because this
change uses the allowed_methods kwarg (renamed from method_whitelist
in urllib3 1.26.0). Without the floor, a user installing on the
existing requests>=2.22.0 lower bound could land on urllib3 1.25.x and
hit a TypeError at construction time.

Refs: cliviu74#68 ("Wallbox API has changed"), README "WARNING - Ratelimits"
@DblD DblD force-pushed the feat/429-retry-with-backoff branch from a68a0ce to 5908a60 Compare May 9, 2026 06:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants