Overview
This report covers test failures observed in committed-code gradle-check runs (Timer and Post Merge Action builds against main) during the 24-hour window ending 2026-04-30. Nine distinct failing tests were identified across seven builds. Historical failure data is aggregated across all build types (including PR builds) to show the true flake rate.
None of the failures reproduced deterministically with the original seed on a local dev desktop, which is consistent with timing-sensitive flaky tests whose seeds control randomization but not thread scheduling, GC pauses, or network timing.
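To make that caveat concrete, the following minimal Java sketch (not OpenSearch code; the seed constant is borrowed from finding 1 purely for flavor) shows why a fixed seed pins randomized inputs but not outcomes that depend on thread scheduling:

```java
import java.util.Random;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SeedVsScheduling {
    public static void main(String[] args) throws InterruptedException {
        Random random = new Random(0x2259C86F44A4690DL); // fixed "suite seed"
        ExecutorService pool = Executors.newFixedThreadPool(4);
        ConcurrentLinkedQueue<Integer> completionOrder = new ConcurrentLinkedQueue<>();
        for (int i = 0; i < 8; i++) {
            int value = random.nextInt(100);               // identical sequence on every run
            pool.submit(() -> completionOrder.add(value)); // order decided by the scheduler
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        // The set of values is reproducible from the seed; their completion
        // order is not, which is exactly the gap a scheduling race lives in.
        System.out.println(completionOrder);
    }
}
```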
Summary Table
Sorted by total unique builds affected (all-time, all build types):
| Test | Builds Affected (Total) | First Seen | Apr 2026 Builds | Trend | Reproduced Locally? |
| --- | --- | --- | --- | --- | --- |
| IndicesRequestCacheIT.testDeleteAndCreateSameIndexShardOnSameNode | 257 | 2024-05 | 12 | ⚠️ Worsening (resurgence after long quiet period) | No |
| IndexActionIT.testAutoGenerateIdNoDuplicates | 240 | 2024-03 | 21 | ⚠️ Worsening (steady climb since 2025-04) | No |
| FullRollingRestartIT.testFullRollingRestart | 232 | 2024-10 | 21 | ⚠️ Worsening (burst pattern, active since 2026-02) | No |
| RemoteStoreStatsIT.testNonZeroPrimaryStatsOnNewlyCreatedIndexWithZeroDocs | 190 | 2024-03 | 7 | ➡️ Stable (chronic, low-level) | No |
| AzureBlobStoreRepositoryTests.testMultipleSnapshotAndRollback | 158 | 2024-03 | 4 | ➡️ Stable (chronic, low-level) | N/A (requires Docker fixture) |
| FullRollingRestartIT.testFullRollingRestart_withNoRecoveryPayloadAndSource | 113 | 2024-10 | 9 | ⚠️ Worsening (tracks testFullRollingRestart pattern) | No |
| EhcacheDiskCacheManagerTests.testCreateAndCloseCacheConcurrently | 21 | 2025-07 | 7 | ⚠️ Worsening (spike in Apr 2026) | No |
| WarmIndexSegmentReplicationIT.testRestartPrimary_NoReplicas | 21 | 2024-04 | 3 | ➡️ Stable (low-frequency, long-lived) | No |
| WarmIndexSegmentReplicationIT.testReplicaHasDiffFilesThanPrimary | 16 | 2024-07 | 3 | ➡️ Stable (low-frequency) | No |
Detailed Findings
1. IndicesRequestCacheIT.testDeleteAndCreateSameIndexShardOnSameNode
- Recent build: #75416
- Error: Suite timeout reached
- Seed: 2259C86F44A4690D (suite-level only — individual test seed not available due to timeout)
- Local reproduction: Did not reproduce
- Pattern: Massive burst in May–Jun 2024 (120+119 builds), then nearly silent until Mar 2026 (3 builds) and a sharp resurgence in Apr 2026 (12 builds). The Apr 2026 uptick coincides with the CI runner migration to m7a.8xlarge — faster CPUs may be amplifying a latent timing issue. The failure mode (suite timeout) suggests the test or a co-located test in the suite hangs intermittently.
- Monthly: 2024-05:120, 2024-06:119, 2024-07:1, 2024-12:1, 2025-02:1, 2026-03:3, 2026-04:12
2. IndexActionIT.testAutoGenerateIdNoDuplicates
- Recent builds: #75437, #75422
- Error: Count mismatch — the actual document count does not match the expected count (e.g., "Count is 79 but 73 was expected")
- Seeds: DF7FD6BB7B77DE60:EC27A2EA45D12891, 29883286046413C0:1AD046D73AC2E531
- Local reproduction: Did not reproduce with either seed
- Pattern: Chronic flaky test present since Mar 2024. Significant escalation starting Apr 2025 (46 builds), and it has remained elevated since. The test runs with the cluster.indices.replication.strategy=SEGMENT parameterization. The count mismatch suggests a race between auto-generated ID indexing and the subsequent count verification under segment replication; a sketch of a polling-based check follows this finding.
- Monthly: 2024-03:1, 2024-04:1, 2024-06:3, 2024-07:3, 2024-09:3, 2024-10:10, 2024-11:4, 2024-12:2, 2025-01:6, 2025-02:4, 2025-03:7, 2025-04:46, 2025-05:12, 2025-06:11, 2025-07:20, 2025-08:13, 2025-09:9, 2025-10:9, 2025-11:8, 2025-12:10, 2026-01:7, 2026-02:11, 2026-03:19, 2026-04:21
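Where such a race is confirmed, the usual remedy is to poll rather than assert once. A hypothetical sketch, assuming the standard OpenSearchIntegTestCase helpers (assertBusy, client()); the method and index names are illustrative, not the actual test code:

```java
import java.util.concurrent.TimeUnit;

// Inside a subclass of OpenSearchIntegTestCase (illustrative only):
private void assertHitCountEventually(String index, long expected) throws Exception {
    // Retry until replicas catch up instead of asserting immediately;
    // under SEGMENT replication the visible count can briefly lag indexing.
    assertBusy(() -> {
        long actual = client().prepareSearch(index)
            .setSize(0)
            .setTrackTotalHits(true)
            .get()
            .getHits()
            .getTotalHits().value;
        assertEquals(expected, actual);
    }, 30, TimeUnit.SECONDS);
}
```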
3. FullRollingRestartIT.testFullRollingRestart
- Recent build: #75436
- Error: replica shards haven't caught up with primary expected:<24> but was:<19>
- Seed: A69557A7C9948C52:A7D259FE9CC06338
- Local reproduction: Did not reproduce
- Pattern: Burst pattern — isolated hit in Oct 2024, then a massive spike in Jul 2025 (105 builds), tapering through Aug 2025, quiet Sep 2025–Jan 2026, then a resurgence in Feb 2026 (35) continuing through Apr 2026 (21). Runs with cluster.indices.replication.strategy=SEGMENT. The assertion failure indicates replicas lag behind the primary during rolling restart, a timing-sensitive condition; a stats-based wait is sketched after this finding.
- Monthly: 2024-10:1, 2025-04:1, 2025-07:105, 2025-08:43, 2025-09:1, 2026-02:35, 2026-03:25, 2026-04:21
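The same polling idea applies to the rolling-restart lag, this time via shard stats. A hedged sketch, again assuming OpenSearchIntegTestCase helpers and a one-replica-per-shard layout; the helper name is hypothetical:

```java
import java.util.concurrent.TimeUnit;

import org.opensearch.action.admin.indices.stats.IndicesStatsResponse;
import org.opensearch.action.admin.indices.stats.ShardStats;
import org.opensearch.index.shard.DocsStats;

// Illustrative only: after each node restart, wait until replicas report the
// same document totals as the primaries rather than asserting in one shot.
private void awaitReplicasCaughtUp(String index) throws Exception {
    assertBusy(() -> {
        IndicesStatsResponse stats = client().admin().indices().prepareStats(index).get();
        long primaryDocs = 0, replicaDocs = 0;
        for (ShardStats shard : stats.getShards()) {
            DocsStats docs = shard.getStats().getDocs();
            assertNotNull("docs stats not ready yet", docs); // retried by assertBusy
            if (shard.getShardRouting().primary()) {
                primaryDocs += docs.getCount();
            } else {
                replicaDocs += docs.getCount();
            }
        }
        // With exactly one replica per shard, the totals converge to equality.
        assertEquals(primaryDocs, replicaDocs);
    }, 60, TimeUnit.SECONDS);
}
```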
4. RemoteStoreStatsIT.testNonZeroPrimaryStatsOnNewlyCreatedIndexWithZeroDocs
- Recent build: #75412
- Error: java.lang.AssertionError (bare assertion, no message)
- Seed: 6BE1A9E35D5F1D77:A6167C3F1B40CAB1
- Local reproduction: Did not reproduce
- Pattern: One of the longest-running flaky tests, present since Mar 2024. Peaked Aug–Sep 2024 (19 builds/month), then declined to a low chronic rate of 2–9 builds/month. Stable and persistent — not worsening but not improving either.
- Monthly: 2024-03:1, 2024-04:8, 2024-05:14, 2024-06:5, 2024-07:13, 2024-08:19, 2024-09:19, 2024-10:16, 2024-11:9, 2024-12:8, 2025-01:16, 2025-02:5, 2025-03:3, 2025-04:2, 2025-05:4, 2025-06:3, 2025-07:3, 2025-08:7, 2025-09:3, 2025-10:2, 2025-11:3, 2025-12:2, 2026-01:7, 2026-02:2, 2026-03:9, 2026-04:7
5. AzureBlobStoreRepositoryTests.testMultipleSnapshotAndRollback
- Recent build: #75422
- Error: RepositoryVerificationException: path is not accessible on cluster-manager node
- Seed: Not available in error output
- Local reproduction: Could not attempt — requires Docker compose for Azure fixture
- Pattern: Chronic since Mar 2024. Peaked Aug–Sep 2024 (18 and 22 builds) and Dec 2025 (15), otherwise steady at roughly 1–11 builds/month. The error suggests an infrastructure/fixture issue rather than a code logic bug.
- Monthly: 2024-03:1, 2024-04:6, 2024-05:3, 2024-06:6, 2024-07:4, 2024-08:18, 2024-09:22, 2024-10:8, 2024-11:5, 2024-12:11, 2025-01:3, 2025-02:4, 2025-03:3, 2025-04:7, 2025-05:6, 2025-06:2, 2025-07:1, 2025-08:1, 2025-09:6, 2025-10:6, 2025-11:5, 2025-12:15, 2026-01:4, 2026-02:4, 2026-03:3, 2026-04:4
6. FullRollingRestartIT.testFullRollingRestart_withNoRecoveryPayloadAndSource
- Recent build: #75424
- Error: replica shards haven't caught up with primary expected:<16> but was:<13>
- Seed: ECF4F5EFE2ED5E0C:DFBD9FAF4BA03F54
- Local reproduction: Did not reproduce
- Pattern: Tracks testFullRollingRestart almost exactly — same class, same root cause (replica lag under SEGMENT replication during rolling restart), same burst timeline: Jul 2025 spike (47), Feb 2026 resurgence (17), continuing in Apr 2026 (9).
- Monthly: 2024-10:1, 2025-04:1, 2025-07:47, 2025-08:24, 2025-09:1, 2026-02:17, 2026-03:13, 2026-04:9
7. EhcacheDiskCacheManagerTests.testCreateAndCloseCacheConcurrently
- Recent build: #75418
- Error: Suite timeout reached
- Seed: E8DBFF5D8F617ACF (suite-level only)
- Local reproduction: Did not reproduce
- Pattern: Newer flaky test, first seen Jul 2025. Low frequency (1–4 builds/month) until the Apr 2026 spike to 7 builds. The Apr 2026 increase coincides with the m7a.8xlarge CI runner migration. The suite timeout suggests the concurrent cache create/close test may deadlock or hang under certain thread interleavings; an illustrative deadlock shape is sketched after this finding.
- Monthly: 2025-07:1, 2025-08:2, 2025-09:1, 2025-10:2, 2025-11:4, 2025-12:2, 2026-01:2, 2026-04:7
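A suite timeout with no assertion message is consistent with a lock-ordering deadlock. The following self-contained Java demo shows the suspected shape in general terms; it is not the Ehcache manager code, just two threads taking the same pair of locks in opposite order:

```java
public class CreateCloseDeadlock {
    private static final Object managerLock = new Object();
    private static final Object cacheLock = new Object();

    private static void pause() {
        try {
            Thread.sleep(10); // widen the race window for the demo
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        Thread create = new Thread(() -> {
            synchronized (managerLock) {          // create: manager -> cache
                pause();
                synchronized (cacheLock) { /* register the new cache */ }
            }
        });
        Thread close = new Thread(() -> {
            synchronized (cacheLock) {            // close: cache -> manager
                pause();
                synchronized (managerLock) { /* deregister the cache */ }
            }
        });
        create.start();
        close.start();
        // With unlucky interleaving both threads block forever; in CI this
        // surfaces as a suite timeout rather than a test assertion failure.
    }
}
```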
8. WarmIndexSegmentReplicationIT.testRestartPrimary_NoReplicas
- Recent build: #75437
- Error: FileCache did not initialise correctly
- Seed: DF7FD6BB7B77DE60:A56DA7FD5AE1A066
- Local reproduction: Did not reproduce
- Pattern: Low-frequency, long-lived flaky test since Apr 2024. Sporadic 1–3 builds/month with no clear trend. The FileCache initialization error suggests a race condition during primary restart in warm index segment replication.
- Monthly: 2024-04:2, 2024-05:1, 2024-07:2, 2024-09:2, 2024-12:1, 2025-06:1, 2025-08:3, 2025-09:1, 2025-12:2, 2026-02:3, 2026-04:3
9. WarmIndexSegmentReplicationIT.testReplicaHasDiffFilesThanPrimary
- Recent build: #75424
- Error: Unexpected ShardFailures — refresh operation failed with RemoteTransportException
- Seed: ECF4F5EFE2ED5E0C:1B18569E388D817B
- Local reproduction: Did not reproduce
- Pattern: Lowest frequency of all tests in this report (16 total builds since Jul 2024). Sporadic 1–3 builds/month. The shard refresh failure during segment replication suggests a transient network/transport issue in the test cluster.
- Monthly: 2024-07:2, 2024-09:2, 2024-12:1, 2025-03:1, 2025-07:1, 2025-08:3, 2025-12:1, 2026-01:2, 2026-04:3
Notes
- CI runner migration: Around 2026-04-15, gradle-check runners moved from m5.8xlarge to m7a.8xlarge. Several tests show Apr 2026 upticks that may be attributable to faster CPUs amplifying latent timing races (notably IndicesRequestCacheIT and EhcacheDiskCacheManagerTests).
- Seed non-determinism: Every local reproduction attempt (8 of the 9 tests; the Azure test requires a Docker fixture) passed with the recorded seed, confirming these failures depend on factors beyond RandomizedRunner seed control (thread scheduling, GC timing, network ordering). This is expected for integration tests, especially those involving segment replication and rolling restarts.
- SEGMENT replication: 4 of 9 tests run with the cluster.indices.replication.strategy=SEGMENT parameterization. Segment replication timing sensitivity is a recurring theme; a sketch of the parameterization pattern follows these notes.
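For reference, this kind of parameterization usually enters OpenSearch integration tests through RandomizedRunner's @ParametersFactory. A hedged sketch of the common shape (the concrete base class and constructor wiring vary per test, so treat the details as assumptions):

```java
import java.util.Arrays;
import java.util.Collection;

import com.carrotsearch.randomizedtesting.annotations.ParametersFactory;
import org.opensearch.common.settings.Settings;

// Each Object[] becomes a full run of the test class, so one flaky test can
// appear under multiple parameterizations in gradle-check output.
@ParametersFactory
public static Collection<Object[]> parameters() {
    return Arrays.asList(
        new Object[] { Settings.builder().put("cluster.indices.replication.strategy", "DOCUMENT").build() },
        new Object[] { Settings.builder().put("cluster.indices.replication.strategy", "SEGMENT").build() }
    );
}
```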