[9.1.1] Support bounded parallel chunk transfers (https://github.com/bazelbui… #29614

Open: iancha1992 wants to merge 1 commit into bazelbuild:release-9.1.1 from iancha1992:cp29586

@iancha1992
Member

…ld/bazel/pull/29341)

This change builds on `--experimental_remote_cache_chunking`, implemented in #28437.

This PR enables parallel uploads and downloads of the chunks that make up a chunked file, to improve performance. Transport-level concurrency is still bounded by the existing remote cache gRPC connection and concurrency limits. On top of that, the chunk transfer managers apply a fixed per-blob window of 16, which prevents a single large blob from fanning out too aggressively.

To avoid issues with batches, uploads and downloads use simple sliding-window-style transfer managers.
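
The sliding-window idea can be illustrated with a small sketch. The Python below is not Bazel's implementation (which is written in Java): `transfer_blob`, `transfer_chunk`, and `demo` are hypothetical names, and the only value taken from this PR is the window size of 16. It assumes that "sliding window" means a new chunk transfer starts as soon as any in-flight transfer finishes, rather than waiting for a whole batch to drain.

```python
import asyncio

WINDOW = 16  # per-blob window size stated in the PR description


async def transfer_blob(chunks, transfer_chunk, window=WINDOW):
    """Transfer all chunks of one blob with at most `window` in flight.

    `transfer_chunk` is a hypothetical coroutine function standing in for
    a single chunk upload or download.
    """
    slots = asyncio.Semaphore(window)

    async def run_one(chunk):
        # A slot frees up as soon as any in-flight chunk completes, so the
        # next chunk starts right away instead of waiting for a full batch.
        async with slots:
            return await transfer_chunk(chunk)

    # gather() returns results in the same order as `chunks`.
    return await asyncio.gather(*(run_one(c) for c in chunks))


async def demo(chunk_count=32):
    """Run a fake transfer and report the peak number of in-flight chunks."""
    in_flight = 0
    peak = 0

    async def fake_transfer(chunk):
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.001)  # stand-in for network latency
        in_flight -= 1
        return chunk

    results = await transfer_blob(list(range(chunk_count)), fake_transfer)
    return results, peak


if __name__ == "__main__":
    _, observed_peak = asyncio.run(demo())
    print("peak in-flight chunks:", observed_peak)
```

In this sketch the semaphore only caps fan-out for one blob. Any global limit, such as the gRPC concurrency limits mentioned above, would have to be enforced separately.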

RELNOTES: CDC chunk uploads and downloads can now happen in parallel within a large blob.

Benchmarks were rerun on 2026-05-12 with the JMH target //src/test/java/com/google/devtools/build/lib/remote:ChunkedTransferBenchmark. The parent baseline was measured by running the same benchmark harness against the parent commit.

In this synthetic benchmark, which simulates network delay and jitter, the current branch is about 11-13x faster than the parent baseline for the cases below. As usual, this synthetic benchmark is not a substitute for real remote-cache measurements.
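
As a rough sanity check on that range, here is a back-of-envelope model. It rests on assumptions that are not confirmed by this PR: that each chunk transfer costs about `delayMillis` (25 ms in the benchmark parameters), that the parent commit transfers chunks one at a time, and that the windowed path finishes in waves of 16. Jitter and scheduling overhead are ignored, and the function names are hypothetical.

```python
import math

WINDOW = 16  # per-blob window from the PR description


def sequential_ms(chunk_count, delay_ms):
    # One chunk at a time: every chunk pays the full simulated delay.
    return chunk_count * delay_ms


def windowed_ms(chunk_count, delay_ms, window=WINDOW):
    # Idealized lower bound: chunks complete in ceil(n / window) waves.
    return math.ceil(chunk_count / window) * delay_ms


for chunk_count in (32, 512):
    seq = sequential_ms(chunk_count, 25)
    win = windowed_ms(chunk_count, 25)
    print(f"{chunk_count} chunks: sequential {seq} ms, windowed {win} ms")
```

Under these assumptions the sequential estimates are 800 ms and 12800 ms, close to the parent-baseline download scores of about 812 ms and 12849 ms. The idealized windowed estimates (50 ms and 800 ms) are lower than the measured scores, which is consistent with the observed speedup sitting below the ideal 16x.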

After Change:
```
Benchmark                                 (avgChunkSizeBytes)  (chunkCount)  (chunkSizeBytes)  (delayMillis)  (fileSizeBytes)  (jitterMillis)  (schedulerThreads)  Mode  Cnt   Score   Error  Units
ChunkedTransferBenchmark.downloadChunked                  N/A            32              1024             25              N/A              10                   1  avgt    3  67.963 ± 2.039  ms/op
ChunkedTransferBenchmark.downloadChunked                  N/A            32              1024             25              N/A              10                   2  avgt    3  67.925 ± 2.257  ms/op
ChunkedTransferBenchmark.downloadChunked                  N/A            32              1024             25              N/A              10                   4  avgt    3  67.915 ± 2.497  ms/op
ChunkedTransferBenchmark.downloadChunked                  N/A            32              1024             25              N/A              10                   8  avgt    3  67.946 ± 2.863  ms/op
ChunkedTransferBenchmark.uploadChunked                   1024           N/A               N/A             25            32768              10                   1  avgt    3  61.357 ± 7.385  ms/op
ChunkedTransferBenchmark.uploadChunked                   1024           N/A               N/A             25            32768              10                   2  avgt    3  61.373 ± 7.920  ms/op
ChunkedTransferBenchmark.uploadChunked                   1024           N/A               N/A             25            32768              10                   4  avgt    3  61.326 ± 8.220  ms/op
ChunkedTransferBenchmark.uploadChunked                   1024           N/A               N/A             25            32768              10                   8  avgt    3  61.393 ± 8.376  ms/op
```

Before Change:
```
Benchmark                                 (avgChunkSizeBytes)  (chunkCount)  (chunkSizeBytes)  (delayMillis)  (fileSizeBytes)  (jitterMillis)  (schedulerThreads)  Mode  Cnt    Score     Error  Units
ChunkedTransferBenchmark.downloadChunked                  N/A            32              1024             25              N/A              10                   1  avgt    3  812.081 ± 463.666  ms/op
ChunkedTransferBenchmark.downloadChunked                  N/A            32              1024             25              N/A              10                   2  avgt    3  812.570 ± 442.021  ms/op
ChunkedTransferBenchmark.downloadChunked                  N/A            32              1024             25              N/A              10                   4  avgt    3  811.883 ± 459.534  ms/op
ChunkedTransferBenchmark.downloadChunked                  N/A            32              1024             25              N/A              10                   8  avgt    3  812.371 ± 461.231  ms/op
ChunkedTransferBenchmark.uploadChunked                   1024           N/A               N/A             25            32768              10                   1  avgt    3  740.734 ± 389.653  ms/op
ChunkedTransferBenchmark.uploadChunked                   1024           N/A               N/A             25            32768              10                   2  avgt    3  742.434 ± 412.117  ms/op
ChunkedTransferBenchmark.uploadChunked                   1024           N/A               N/A             25            32768              10                   4  avgt    3  742.483 ± 395.466  ms/op
ChunkedTransferBenchmark.uploadChunked                   1024           N/A               N/A             25            32768              10                   8  avgt    3  742.509 ± 397.122  ms/op
```

Big File:
```
CURRENT BRANCH (512 MiB)

Benchmark                                 (avgChunkSizeBytes)  (chunkCount)  (chunkSizeBytes)  (delayMillis)  (fileSizeBytes)  (jitterMillis)  (schedulerThreads)  Mode  Cnt     Score     Error  Units
ChunkedTransferBenchmark.downloadChunked                  N/A           512           1048576             25              N/A              10                   8  avgt    3   991.101 ± 173.890  ms/op
ChunkedTransferBenchmark.uploadChunked                1048576           N/A               N/A             25        536870912              10                   8  avgt    3  1087.504 ± 123.857  ms/op
```
```
PARENT BASELINE (512 MiB)

Benchmark                                 (avgChunkSizeBytes)  (chunkCount)  (chunkSizeBytes)  (delayMillis)  (fileSizeBytes)  (jitterMillis)  (schedulerThreads)  Mode  Cnt      Score      Error  Units
ChunkedTransferBenchmark.downloadChunked                  N/A           512           1048576             25              N/A              10                   8  avgt    3  12849.136 ± 1733.063  ms/op
ChunkedTransferBenchmark.uploadChunked                1048576           N/A               N/A             25        536870912              10                   8  avgt    3  11888.109 ± 2591.883  ms/op
```
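
The "about 11-13x" figure can be reproduced from the scores above. The snippet below is a small worked calculation: the numbers are copied from the tables (the `schedulerThreads=1` rows for the small cases), while the labels and the `speedups` helper are illustrative names rather than part of the benchmark.

```python
# (label, parent baseline ms/op, current branch ms/op), copied from the tables
RESULTS = [
    ("download, 32 x 1 KiB chunks", 812.081, 67.963),
    ("upload, 32 KiB file", 740.734, 61.357),
    ("download, 512 x 1 MiB chunks", 12849.136, 991.101),
    ("upload, 512 MiB file", 11888.109, 1087.504),
]


def speedups(results):
    """Return the before/after ratio for each benchmark case."""
    return {label: before / after for label, before, after in results}


for label, ratio in speedups(RESULTS).items():
    print(f"{label}: {ratio:.1f}x")
```

The parent-baseline error bounds are wide (for example ± 463.666 ms/op on a score of 812.081 ms/op, from 3 iterations), so these ratios are best read as approximate.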

Closes #29341.

PiperOrigin-RevId: 919144974
Change-Id: Iaa0ca8971bd21c879f21c708327b5ddd837ecf1f

Description

Motivation

Build API Changes

No

Checklist

  • I have added tests for the new use cases (if any).
  • I have updated the documentation (if applicable).

Release Notes

RELNOTES: None

Commit 084958b

@iancha1992 requested a review from a team as a code owner May 21, 2026 18:34
@github-actions (bot) added the labels team-Remote-Exec (Issues and PRs for the Execution (Remote) team) and awaiting-review (PR is awaiting review from an assigned reviewer) May 21, 2026
@iancha1992 requested a review from tjgq May 21, 2026 18:34
@iancha1992 enabled auto-merge May 21, 2026 18:37