Description of the bug:
With Bazel 9.1.0, a build using --experimental_remote_cache_chunking, a local --disk_cache, and top-level output downloads can report success while materializing a large output file as a single CDC chunk instead of the full blob.
The corrupted output has the size and contents of one CDC chunk, not the original file. For executable outputs this can result in Exec format error because the file no longer starts with the expected executable header.
- Bazel 9.1.0
- --experimental_remote_cache_chunking
- --disk_cache=...
- --remote_download_outputs=toplevel, or any code path that materializes an output file through Bazel's path-backed output stream
- Remote cache/server that advertises CDC SplitBlob / SpliceBlob support
A large executable output is downloaded from a remote cache hit and Bazel reports success, but the file in bazel-bin/... contains only one CDC chunk. Disabling --experimental_remote_cache_chunking avoids the issue.
Expected behavior
The top-level output should be the full reassembled blob matching the digest recorded in the ActionResult.
If chunk reassembly produces the wrong bytes, Bazel should fail the download rather than reporting a successful build with a corrupted output.
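The check described above could look roughly like the following sketch. It is illustrative only and is not Bazel code: the class and method names are hypothetical, SHA-256 is assumed as the digest function, and the blob is reassembled in memory for brevity, whereas a real implementation would hash while streaming.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

// Hypothetical sketch, not Bazel's implementation.
final class ReassemblyCheckSketch {
  /** Returns the lowercase hex SHA-256 of the given bytes. */
  static String sha256Hex(byte[] data) {
    try {
      byte[] hash = MessageDigest.getInstance("SHA-256").digest(data);
      StringBuilder hex = new StringBuilder(hash.length * 2);
      for (byte b : hash) {
        hex.append(String.format("%02x", b & 0xff));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-256 is unavailable", e);
    }
  }

  /**
   * Concatenates the chunks in order and fails unless the result matches the
   * expected hash and size (the two fields of a REAPI-style digest).
   */
  static byte[] reassembleAndVerify(
      List<byte[]> chunks, String expectedHash, long expectedSize) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (byte[] chunk : chunks) {
      out.write(chunk, 0, chunk.length);
    }
    byte[] blob = out.toByteArray();
    if (blob.length != expectedSize || !sha256Hex(blob).equals(expectedHash)) {
      throw new IllegalStateException(
          "reassembled blob does not match the expected digest");
    }
    return blob;
  }

  public static void main(String[] args) {
    byte[] full = "hello world".getBytes(StandardCharsets.UTF_8);
    List<byte[]> chunks = List.of(
        "hello ".getBytes(StandardCharsets.UTF_8),
        "world".getBytes(StandardCharsets.UTF_8));
    String expected = sha256Hex(full);

    // All chunks in order: verification passes.
    reassembleAndVerify(chunks, expected, full.length);

    // Only the last chunk, as in the corrupted output: verification fails.
    try {
      reassembleAndVerify(chunks.subList(1, 2), expected, full.length);
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

With a check of this kind, an output holding only the last chunk would fail on both size and hash, so the download would be reported as an error rather than as a successful build.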
Root cause
Bazel 9.1.0's chunked download path reassembles a blob by passing the final caller-provided output stream into each per-chunk download:
    for (Digest chunkDigest : chunkDigests) {
      getFromFuture(combinedCache.downloadBlob(context, chunkDigest, out));
    }
For top-level output materialization, that out eventually wraps a LazyFileOutputStream for the final output path. ReportingOutputStream and LazyFileOutputStream implement MaybePathBacked, so the local disk cache can see the final output path.
DiskCacheClient.download(...) has a path-backed fast path: when the output stream exposes a path, it copies the cached CAS object directly to that path. That is correct for whole-blob downloads, but it is wrong when the CAS object being downloaded is an individual CDC chunk and the path is the final parent blob output.
As a result, each chunk download can replace the final output file with that chunk. After the last chunk download, the output path contains only one chunk.
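The overwrite can be reproduced in isolation with a toy model. The sketch below is not Bazel code: PathBacked, FileBackedStream, PathHidingStream, and download are hypothetical stand-ins for MaybePathBacked, LazyFileOutputStream, a wrapper stream, and the disk cache fast path. Hiding the path from the per-chunk download is shown only as one possible way to avoid the problem; it is not necessarily what the actual fix does.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model of the bug, not Bazel's implementation.
final class ChunkOverwriteSketch {
  /** Stand-in for a stream that can reveal the file it writes to. */
  interface PathBacked {
    Path backingPath();
  }

  /** Appends written bytes to a file and exposes that file's path. */
  static final class FileBackedStream extends OutputStream implements PathBacked {
    private final Path path;

    FileBackedStream(Path path) {
      this.path = path;
    }

    @Override
    public Path backingPath() {
      return path;
    }

    @Override
    public void write(int b) throws IOException {
      write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
      Files.write(path, Arrays.copyOfRange(b, off, off + len),
          StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
  }

  /** Forwards bytes to a delegate without exposing the delegate's path. */
  static final class PathHidingStream extends OutputStream {
    private final OutputStream delegate;

    PathHidingStream(OutputStream delegate) {
      this.delegate = delegate;
    }

    @Override
    public void write(int b) throws IOException {
      delegate.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
      delegate.write(b, off, len);
    }
  }

  /** Toy cache download: path-backed streams get the file replaced wholesale. */
  static void download(byte[] cachedObject, OutputStream out) throws IOException {
    if (out instanceof PathBacked) {
      // Fast path: the default Files.write options truncate the existing file.
      Files.write(((PathBacked) out).backingPath(), cachedObject);
    } else {
      out.write(cachedObject);
    }
  }

  /** Downloads every chunk into one output file and returns the file's bytes. */
  static byte[] reassemble(List<byte[]> chunks, boolean hidePath) {
    try {
      Path target = Files.createTempFile("blob", ".bin");
      try {
        OutputStream fileOut = new FileBackedStream(target);
        OutputStream out = hidePath ? new PathHidingStream(fileOut) : fileOut;
        for (byte[] chunk : chunks) {
          download(chunk, out);
        }
        return Files.readAllBytes(target);
      } finally {
        Files.deleteIfExists(target);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    List<byte[]> chunks = List.of(
        "AAAA".getBytes(StandardCharsets.UTF_8),
        "BBBB".getBytes(StandardCharsets.UTF_8),
        "CC".getBytes(StandardCharsets.UTF_8));
    System.out.println(new String(reassemble(chunks, false), StandardCharsets.UTF_8));
    System.out.println(new String(reassemble(chunks, true), StandardCharsets.UTF_8));
  }
}
```

When the path is visible, every call to download replaces the file, so only the last chunk survives. When the path is hidden, each chunk is appended through the stream and the file ends up holding all chunks in order.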
Which category does this issue belong to?
No response
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
- Bazel 9.1.0
- --experimental_remote_cache_chunking
- --disk_cache=...
- --remote_download_outputs=toplevel, or any code path that materializes an output file through Bazel's path-backed output stream
- Remote cache/server that advertises CDC SplitBlob / SpliceBlob support
- Access a binary from bazel-bin after a build
Which operating system are you running Bazel on?
Linux x86
What is the output of bazel info release?
release 9.1.0
Resolved by #29614