Support bounded parallel chunk transfers#29341
Conversation
|
FYI @tjgq I think this was a follow-up from the original PR |
|
@bazel-io fork 9.2.0 |
sluongng
left a comment
There was a problem hiding this comment.
I'm a bit skeptical of the current approach, so I will skip on reading the window filling implementation right now. (though Codex does suggest there is a problem there)
Each invocation may have multiple actions/spawns running in parallel, each creates multiple blob uploads/downloads, and some of those blobs are chunked blobs. Adding parallelism on the blob level feels like a local optimization, and the new flag does not offer strong control over the total parallelism of the invocation.
I wonder if we need something higher-level that lets us effectively enforce a global parallelism for uploads and downloads🤔
8969051 to
33689c3
Compare
There was a problem hiding this comment.
Pull request overview
Enables bounded parallelism for content-defined chunk (CDC) blob uploads/downloads, improving throughput while avoiding unbounded per-blob fanout.
Changes:
- Implement sliding-window style, per-blob bounded concurrency for chunk uploads in
ChunkedBlobUploader. - Implement sliding-window style, per-blob bounded concurrency for chunk downloads (including in-flight dedup) in
ChunkedBlobDownloader. - Add/expand unit tests for window refill, cancellation, and failure propagation; add a JMH benchmark binary target.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
src/main/java/com/google/devtools/build/lib/remote/ChunkedBlobUploader.java |
Adds bounded in-flight chunk upload window and cancellation on failure. |
src/main/java/com/google/devtools/build/lib/remote/ChunkedBlobDownloader.java |
Adds bounded in-flight chunk download window with reassembly and in-flight dedup. |
src/test/java/com/google/devtools/build/lib/remote/ChunkedBlobUploaderTest.java |
Adds tests for window refill, cancellation, and failure handling for parallel uploads. |
src/test/java/com/google/devtools/build/lib/remote/ChunkedBlobDownloaderTest.java |
Updates tests for new download API and adds parallel-window behavior tests. |
src/test/java/com/google/devtools/build/lib/remote/ChunkedTransferBenchmark.java |
Introduces a JMH benchmark for chunked upload/download with latency + jitter. |
src/test/java/com/google/devtools/build/lib/remote/BUILD |
Adds a java_opt_binary target to run the new benchmark. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Updated this in the follow-up direction you suggested: I removed the flag/plumbing and kept only a small hardcoded per-blob window of 32 as a guard against huge single-blob fanout. The actual global active RPC limit is still the shared gRPC channel pool ( |
33689c3 to
8363b37
Compare
8363b37 to
976a25f
Compare
c64cbdb to
d846665
Compare
There was a problem hiding this comment.
My main point of contention with this approach is that files are buffered in memory in their entirety while being transferred, which could easily cause us to run out of heap. I'm concerned that this might make chunked transfers useless in practice for builds with lots of parallelism.
I think this is easy to fix for the upload side: instead of passing a byte[] around, pass a (path, offset) and delay the opening/reading of the InputStream until the upload starts. (That does increase the number of open descriptors, but as long as they're still bounded by MAX_IN_FLIGHT_CHUNK_UPLOADS, it might be ok.)
The download side unfortunately isn't as easy. We could try to write a chunk out as soon as it completes instead of waiting for all of them the next one in order, but that doesn't help much if the downloads complete roughly at the same time (which is what we'd expect if the transport is completely fair).
Any thoughts?
d846665 to
2a2987d
Compare
I like this. I adjusted the upload path to lazily matierialize the chunk. But for download we can't really do this. I adjusted the lookahead to match the concurrency, but realistically this would be rare that these take up a lot of memory, since we should immediately consume the memory once the next chunk in order is available. It's a tad messier |
2a2987d to
cc1c408
Compare
tjgq
left a comment
There was a problem hiding this comment.
Thanks - I'm happy with the current state; we can revisit the downloader if anyone reports problems. I only have a couple of nits, will approve once they've been addressed.
cc1c408 to
5e2051f
Compare
|
@bazel-io fork 9.2.0 |
@iancha1992 Let me know if there's issues importing this one here. They may be a merge conflicts |
|
@tyler-french We're currently on it. |
|
@bazel-io fork 9.2.0 |
|
@bazel-io fork 9.1.1 |
|
Hmm... During import, certain tests are somehow failing for RBE (https://buildkite.com/bazel/google-bazel-presubmit/builds/102948/list?jid=019e4143-51bb-4115-9a99-73ded8407a81&tab=artifacts). We recently upgraded RBE VMs to Ubuntu 24.04 from 20.04. Could you rebase on top of HEAD and see if you can trigger the same failures? |
5e2051f to
16f48f9
Compare
@Wyverald I think there's some underlying permissions issues with 24.04 because of unprivileged user namespace restrictions. These test failure do not seem like they could be caused by anything in this PR, probably just because they were previously cache hits which are now needing to be re-run |
|
EDIT: seems to still be failing 😞 |
|
Thanks for the investigation! @coeuvre to take a look as well. |
|
It might be related to a recent upgrade of the rbe worker pool. Still investigating. In the meanwhile, the affected tests on RBE have been disabled by bca5e06. |
16f48f9 to
cb1a9c2
Compare
|
@tyler-french Sure, this will be cherry-picked as soon as it's merged to master. |
Description
For
--experimental_remote_cache_chunkingimplemented in #28437This PR enables parallel uploads and downloads for chunked files to improve performance. Transport-level concurrency is still bounded by the existing remote cache gRPC connection/concurrency limits. The chunk transfer managers add a fixed per-blob window of 16 to prevent a single large blob from fanning out too aggressively.
To avoid issues with batches, uploads and downloads use simple sliding-window style transfer managers. This PR also fixes #29544 by properly copying the data to
outRELNOTES: CDC chunk uploads and downloads can now happen in parallel within a large blob.
Benchmarking
Benchmarks were rerun on 2026-05-12 with the JMH target
//src/test/java/com/google/devtools/build/lib/remote:ChunkedTransferBenchmark. The parent baseline was measured by running the same benchmark harness against the parent commit.With the synthetic benchmark of network delays and simulated jitter, the current branch is about 11-13x faster than the parent baseline for these cases. As usual, this synthetic benchmark is not a substitute for real remote-cache measurements.
After Change:
Before Change:
Big File:
Resolves: #29544