(retriever) fix pip install race condition by persisting build timestamp #1480

Open
edknv wants to merge 2 commits into NVIDIA:main from edknv:edwardk/retriever-version

Collaborator

@edknv edknv commented Mar 3, 2026

Description

Running pip install nemo_retriever/ intermittently fails with the following:

Building wheels for collected packages: nemo-retriever
  Building wheel for nemo-retriever (pyproject.toml) ... done
  Created wheel for nemo-retriever: filename=nemo_retriever-2026.3.3.dev20260303195249-py3-none-any.whl size=329711 sha256=1b718aa50dd5a2a8c1e0829696d1cac7d5e9474cc6695acbc12ea73bae64fc13
  Stored in directory: /tmp/pip-ephem-wheel-cache-wpw7ycf1/wheels/f6/e1/6f/b3b2202ae726599eec555b4a433d70c1ce29ff63335a28423b
  WARNING: Built wheel for nemo-retriever is invalid: Wheel has unexpected file name: expected '2026.3.3.dev20260303195245', got '2026.3.3.dev20260303195249'
Failed to build nemo-retriever
error: failed-wheel-build-for-install

× Failed to build installable wheels for some pyproject.toml based projects
╰─> nemo-retriever

This happens because pip's PEP 517 build process invokes the setuptools backend in two separate subprocesses:

  1. prepare_metadata_for_build_wheel — computes metadata (including version)
  2. build_wheel — builds the actual wheel

Since lru_cache doesn't persist across processes, each subprocess runs the datetime.now() fallback independently and gets a different timestamp (…195245 vs. …195249 in the log above), causing the version mismatch.
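The effect can be reproduced outside of pip. The following is a minimal sketch, not the project's actual version code: `build_stamp` is a hypothetical stand-in for the cached helper, and the two fresh interpreters stand in for the two hooks listed above. Microseconds are added to the format only so the difference shows up without waiting a full second.

```python
import subprocess
import sys

# Hypothetical stand-in for the version helper; not the project's actual code.
CHILD = r"""
from datetime import datetime
from functools import lru_cache

@lru_cache(maxsize=1)
def build_stamp():
    # Microseconds are included only so the demo differs without sleeping.
    return datetime.now().strftime("%Y%m%d%H%M%S%f")

first = build_stamp()
second = build_stamp()
assert first == second  # cached within this one process
print(first)
"""


def stamp_from_fresh_process() -> str:
    """Run the helper in a new interpreter, the way a frontend runs each hook."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


metadata_stamp = stamp_from_fresh_process()  # stands in for prepare_metadata_for_build_wheel
wheel_stamp = stamp_from_fresh_process()     # stands in for build_wheel
print(metadata_stamp, wheel_stamp, metadata_stamp != wheel_stamp)
```

Within each child process the cache works as expected, but the second interpreter starts with an empty cache and computes a new value.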

This PR fixes the issue by persisting the computed timestamp to an ephemeral .build_stamp file so the second subprocess reuses the same value.
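A rough sketch of the idea, under stated assumptions: the function name `get_build_stamp`, the freshness window `MAX_AGE_SECONDS`, and the age-based expiry are all hypothetical and are not taken from the diff. Only the `.build_stamp` file name and the seconds-resolution timestamp format come from the description and log above.

```python
import tempfile
import time
from datetime import datetime
from pathlib import Path

MAX_AGE_SECONDS = 60  # assumed freshness window; not taken from the PR


def get_build_stamp(project_dir: Path, max_age: float = MAX_AGE_SECONDS) -> str:
    """Return a timestamp shared by every process building from project_dir."""
    stamp_file = project_dir / ".build_stamp"
    try:
        # Reuse the persisted value only if it was written recently,
        # so a stale file from an earlier build is ignored.
        if time.time() - stamp_file.stat().st_mtime <= max_age:
            cached = stamp_file.read_text().strip()
            if cached:
                return cached
    except FileNotFoundError:
        pass  # first hook of this build: nothing persisted yet

    stamp = datetime.now().strftime("%Y%m%d%H%M%S")
    stamp_file.write_text(stamp)
    return stamp


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    metadata_stamp = get_build_stamp(project)  # first hook writes the file
    wheel_stamp = get_build_stamp(project)     # second hook reads it back
    print(metadata_stamp == wheel_stamp)
```

The expiry check is one possible way to keep the file ephemeral; deleting it once the wheel has been built would be another. The sketch does not guard against two builds writing the file at the same moment.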

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.
  • If adjusting docker-compose.yaml environment variables, have you ensured those are mirrored in the Helm values.yaml file?

@edknv edknv requested a review from a team as a code owner March 3, 2026 20:14
@edknv edknv requested a review from drobison00 March 3, 2026 20:14
@edknv edknv force-pushed the edwardk/retriever-version branch from a3e318c to 9a9f9ed Compare March 3, 2026 20:22
@edknv edknv changed the title from "Fix pip install race condition by persisting build timestamp" to "(retriever) fix pip install race condition by persisting build timestamp" Mar 3, 2026
@edknv edknv requested a review from jdye64 March 3, 2026 20:42