Fix(llm): retry transient failures and make concurrency configurable by jhamze7 · Pull Request #29 · NVIDIA/SkillSpector

jhamze7 · 2026-06-12T03:30:14Z

This pull request addresses the issue of LLM calls immediately failing due to a lack of retry logic.

Changes:

llm_utils.py:

Created helper functions for the sync and async batch runners that retry LLM calls 4 times by default (this number is configurable with the max_attempts argument) with exponential backoff timing. These functions string match the error message to keep the implementation universal across different providers.

llm_analyzer_base.py:

Wrapped the LLM calls in the helper functions created in llm_utils.py. Also set max_concurrency to the SKILLSPECTOR_MAX_CONCURRENCY environment variable, with the value defaulting to 5 if the variable is empty.

.env.example:

Created a SKILLSPECTOR_MAX_CONCURRENCY environment variable. This change was also noted in docs/DEVELOPMENT.md

test_llm_analyzer_base.py:

Added sync and async tests verifying that LLM calls are retried when a simulated 429 error occurs. The tests also verify that processing succeeds after a transient failure and raises an exception once the retry limit is exhausted.

Closes #10

…ment variable

CharmingGroot · 2026-06-12T04:54:17Z

Verified on a self-hosted OpenAI-compatible backend (vLLM serving Qwen/Qwen3.6-35B-A3B-FP8).

Configurable concurrency works: scanning the repo's tests/fixtures/malicious_skill/ fixture with LLM analysis completes the full pass at SKILLSPECTOR_MAX_CONCURRENCY=5 (default) and at 3. Being able to tune concurrency down is what a single-instance backend needs.
Retry classifier behaves as written: exercising retry_llm_call_sync directly — 429 and timeout errors are retried (4 attempts with backoff), non-retryable errors raise immediately. Matches the PR's stated scope.

LGTM for the #10 retry/backoff + configurable concurrency goal.

jhamze7 added 4 commits June 11, 2026 22:41

Created retry tests for sync and async calls

4336c9b

Updated documentation to include SKILLSPECTOR_MAX_CONCURRENCY environ…

11492ad

…ment variable

fix(llm): retry transient failures and make concurrency configurable

97bbfb1

Increased default max retries to 4

619d522

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix(llm): retry transient failures and make concurrency configurable#29

Fix(llm): retry transient failures and make concurrency configurable#29
jhamze7 wants to merge 4 commits into
NVIDIA:mainfrom
jhamze7:fix-llm-retry-backoff

jhamze7 commented Jun 12, 2026

Uh oh!

CharmingGroot commented Jun 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

jhamze7 commented Jun 12, 2026

Changes:

llm_utils.py:

llm_analyzer_base.py:

.env.example:

test_llm_analyzer_base.py:

Uh oh!

CharmingGroot commented Jun 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants