Skip to content

Exempt first cibuildwheel build from ccache-hit enforcement#877

Open
IvanaGyro wants to merge 1 commit into
masterfrom
claude/festive-mayer-QsPK6
Open

Exempt first cibuildwheel build from ccache-hit enforcement#877
IvanaGyro wants to merge 1 commit into
masterfrom
claude/festive-mayer-QsPK6

Conversation

@IvanaGyro
Copy link
Copy Markdown
Collaborator

Problem

tools/validate_ccache_stats.py runs as the CIBW_TEST_COMMAND, i.e. once per Python-version build in a cibuildwheel run, and zeroes the ccache counters at the end so each build observes fresh per-build stats. The first Python version compiles the C++ sources from a cold cache, so it legitimately reports zero cache hits — but the script unconditionally raises SystemExit whenever no hits are detected, failing that first build. Only the later Python versions can reuse the object files the first build produced.

Change

  • tools/validate_ccache_stats.py: Track the first build of a run with a marker file at $CCACHE_DIR/.cytnx_first_wheel_build_done. CCACHE_DIR is the only storage shared across the per-version build containers (via the bind mount), so it can carry this signal. When the marker is absent, the build is the first of the run: hit enforcement is skipped and the marker is created. When present, zero hits still fails as before.
  • .github/workflows/release_pypi.yml: Remove the marker once per run, after the cache is restored and before cibuildwheel is invoked. This guarantees a marker that was persisted inside the cached CCACHE_DIR from a previous run cannot mask a genuine cache-miss failure on a later run.

Behavior

Build Cache hits Result
1st Python version 0 pass (marker absent → cold cache expected)
later Python version 0 fail (marker present → real regression)
any Python version > 0 pass

The per-run reset means the exemption applies to exactly one build per cibuildwheel invocation. macOS is covered too, since CCACHE_DIR there points at $HOME/.ccache.

Co-authored-by: Claude noreply@anthropic.com


Generated by Claude Code

The first python-version wheel in a cibuildwheel run compiles the C++
sources from a cold cache, so zero ccache hits is expected there; only
later versions reuse those objects and must hit the cache.

Track the first build with a marker file in CCACHE_DIR (shared across the
per-version build containers via the bind mount) and skip hit enforcement
when it is absent. The release_pypi workflow removes the marker once per
run, after the cache is restored and before cibuildwheel, so a marker
carried over inside the cached CCACHE_DIR cannot mask a genuine failure.

Co-authored-by: Claude <noreply@anthropic.com>
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a mechanism to skip ccache hit enforcement during the first build of a cibuildwheel run, as it starts with a cold cache. It uses a marker file inside the CCACHE_DIR to identify the first build. Feedback on this change highlights a potential issue if CCACHE_DIR is configured as a relative path, which could lead to incorrect path resolution during cibuildwheel test execution. It is recommended to resolve CCACHE_DIR relative to the repository root to ensure robustness.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread tools/validate_ccache_stats.py
@codecov
Copy link
Copy Markdown

codecov Bot commented Jun 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 29.49%. Comparing base (ee9d856) to head (ee1d3b0).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #877   +/-   ##
=======================================
  Coverage   29.49%   29.49%           
=======================================
  Files         241      241           
  Lines       35512    35512           
  Branches    14777    14777           
=======================================
  Hits        10475    10475           
+ Misses      17784    17783    -1     
- Partials     7253     7254    +1     
Flag Coverage Δ
cpp 29.09% <ø> (ø)
python 52.71% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
C++ backend 30.74% <ø> (ø)
Python bindings 17.09% <ø> (ø)
Python package 52.71% <ø> (ø)
see 1 file with indirect coverage changes

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ee9d856...ee1d3b0. Read the comment docs.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@IvanaGyro IvanaGyro requested a review from pcchen June 3, 2026 11:26
@pcchen
Copy link
Copy Markdown
Collaborator

pcchen commented Jun 7, 2026

Review: PR #877 — "Exempt first cibuildwheel build from ccache-hit enforcement"

Overview

Targeted fix for the recurring BuildWheel-ubuntu-24.04 cold-cache failure. The first Python-version build in a cibuildwheel run compiles everything from scratch, so zero cache hits is expected and correct — but the script was unconditionally failing. This PR introduces a per-run marker file to distinguish the first build from a genuine cache regression.


Correctness

Marker logic in validate_ccache_stats.py:

ccache_dir = os.getenv('CCACHE_DIR')
first_build_marker = ... / '.cytnx_first_wheel_build_done' if ccache_dir else None
is_first_build = first_build_marker is not None and not first_build_marker.exists()
if first_build_marker is not None:
    first_build_marker.parent.mkdir(parents=True, exist_ok=True)
    first_build_marker.touch()
  • Marker is created unconditionally (regardless of hit count) when it doesn't exist. This is correct — once the first build completes (hit or miss), subsequent builds must enforce cache use. ✅
  • When CCACHE_DIR is unset, is_first_build = False — enforcement is always on for non-ccache environments. ✅
  • On Linux the marker lives at /host_ccache/.cytnx_first_wheel_build_done (inside the container); on macOS at $HOME/.ccache/.cytnx_first_wheel_build_done. Both are in the shared directory that persists across Python-version builds. ✅

Workflow cleanup in release_pypi.yml:

mkdir -p "${HOME}/.ccache"
rm -f "${HOME}/.ccache/.cytnx_first_wheel_build_done"

This runs in the "Set Ccache Directory" step — after actions/cache restores the cached ccache, before cibuildwheel launches. A stale marker carried in the cached CCACHE_DIR from a prior run is cleared here, ensuring the per-run exemption only fires once per actual cibuildwheel invocation. ✅

The mkdir -p guard is correct: on a cache-miss run the directory may not exist yet. ✅

Behavior table is accurate:

Build Hits Result
1st (marker absent) 0 ✅ pass — cold cache expected
2nd+ (marker present) 0 ❌ fail — real regression
any > 0 ✅ pass

No issues

The fix is minimal, well-scoped, and correctly handles all the relevant cases (Linux container bind-mount, macOS, stale marker from cached ccache, missing CCACHE_DIR). The existing ccache --zero-stats call still runs unconditionally before the marker logic, so per-build stats remain accurate.

Ready to merge.

Posted by Claude Code on behalf of @pcchen

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants