Make embedding and reranker batch sizes tunable via env vars (#77)

Merged: justincasher merged 1 commit into main from justin/env-var-batch-sizes on Apr 13, 2026

@justincasher (Owner)

Summary

  • EmbeddingClient: add batch_size constructor param; falls back to LEAN_EXPLORE_EMBEDDING_BATCH_SIZE env var, then to the existing default of 8.
  • RerankerClient: add batch_size constructor param that sets the default used by rerank(); falls back to LEAN_EXPLORE_RERANKER_BATCH_SIZE, then to the device-specific defaults (16 on CUDA, 32 on CPU).
  • Lets deployments trade throughput against memory without code changes: raise the batch size on large-VRAM hardware for better throughput, or lower it on 8 GB VRAM machines to reduce memory use.
  • Also fixes two pre-existing UP038 lint violations in scripts/generate_docs_data.py that were blocking any commit to main.
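
The fallback order described in the first two bullets (explicit constructor argument, then environment variable, then built-in default) could be sketched roughly as follows. This is only an illustration, not the code from the PR: the helper `resolve_batch_size` and the constants `EMBEDDING_DEFAULT` and `RERANKER_DEFAULTS` are hypothetical names, and treating an empty variable as unset is an assumption. Only the environment variable names and the default values (8, 16 on CUDA, 32 on CPU) come from the summary above.

```python
import os

# Default values as stated in the PR summary; the constant names are hypothetical.
EMBEDDING_DEFAULT = 8
RERANKER_DEFAULTS = {"cuda": 16, "cpu": 32}


def resolve_batch_size(explicit, env_var, default):
    """Hypothetical helper: explicit argument, then env var, then default."""
    # 1. An explicit constructor argument wins.
    if explicit is not None:
        return explicit
    # 2. Otherwise use the environment variable if it is set.
    #    Assumption: an empty value is treated the same as unset.
    raw = os.environ.get(env_var)
    if raw:
        # int() raises ValueError on non-numeric input; how the real
        # clients handle invalid values is not stated in the PR.
        return int(raw)
    # 3. Otherwise fall back to the built-in default.
    return default
```

Under these assumptions, a deployment that exports `LEAN_EXPLORE_EMBEDDING_BATCH_SIZE=32` would get 32 from `resolve_batch_size(None, "LEAN_EXPLORE_EMBEDDING_BATCH_SIZE", EMBEDDING_DEFAULT)`, while passing an explicit value such as 4 would take precedence over the variable. For the reranker, the default argument would be chosen per device, for example `RERANKER_DEFAULTS["cuda"]`.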

@justincasher merged commit b5dc202 into main on Apr 13, 2026
2 checks passed