Flaky test report: committed-code failures on 2026-05-09
Summary
Analysis of gradle-check failures against committed code (Timer and Post Merge Action builds targeting main) in the 24 hours ending 2026-05-09T10:00Z. Found 31 failure records across 7 distinct builds, representing 10 distinct failing tests (excluding class-level classMethod duplicates). Build 76364 had a systemic failure affecting all qa:smoke-test-http tests due to a RestCancellableNodeClient channel tracking issue.
1. ClusterDisruptionIT (classMethod)
Error: SpanData validation failed for validator AllSpansAreEndedProperly: spans from dispatchedShardOperationOnPrimary not ended during cluster disruption
Seed: CB09D2F627911882
Reproduced locally: ❌ No (passed with seed)
First seen: 2024-04-05
Total unique builds affected: 103
Pattern: Chronic low-rate flake, stable at ~3-7 builds/month since inception. Slight uptick in Apr 2026 (5 builds) but within historical range. This is a telemetry validation issue during disruption scenarios where spans are not properly ended when nodes are disrupted mid-operation.
2. RemoteSplitIndexIT.testCreateSplitIndex
Pattern: Chronic flake with a major spike in Nov 2025 (42 builds). Otherwise stable at 1-6 builds/month. The Nov 2025 spike suggests a temporary regression that was later fixed. Currently at baseline rate (~1-2/month).
3. ShardIndexingPressureSettingsIT.testShardIndexingPressureEnforcedEnabledDisabledSetting
Error: expected:<0> but was:<2> in waitForTwoOutstandingRequests
Seed: 8410C0C2683BE2F0
Reproduced locally: ❌ No (passed with seed)
First seen: 2024-03-26
Total unique builds affected: 198
Pattern: Chronic high-rate flake. Historically 5-18 builds/month. Notable worsening trend: Apr 2026 had 10 builds, May 2026 already has 11 builds (9 days in). This is a timing-sensitive test that uses assertBusy to wait for outstanding requests, suggesting a race condition in indexing pressure tracking.
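To make the suspected failure mode concrete, here is a minimal sketch of a poll-until-pass wait. It is not the OpenSearch test framework's assertBusy implementation; `BusyWaitSketch`, `awaitAssertion`, and `drainsWithin` are hypothetical names, and the counter only stands in for outstanding requests. The point is that the same assertion passes or fails depending purely on whether the release happens before the deadline.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a poll-until-pass helper; not the OpenSearch test framework's code. */
final class BusyWaitSketch {

    /** Re-runs the check with a growing sleep until it passes or the timeout elapses. */
    static void awaitAssertion(Runnable check, long timeout, TimeUnit unit) {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        long sleepMillis = 1;
        while (true) {
            try {
                check.run();
                return; // the assertion held, so the wait is over
            } catch (AssertionError e) {
                if (System.nanoTime() - deadline >= 0) {
                    throw e; // deadline passed: surface the last failure
                }
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new AssertionError("interrupted while waiting", e);
            }
            sleepMillis = Math.min(sleepMillis * 2, 50);
        }
    }

    static void assertEquals(int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError("expected:<" + expected + "> but was:<" + actual + ">");
        }
    }

    /** Returns true if a counter that starts at 2 reaches 0 before the timeout. */
    static boolean drainsWithin(long releaseDelayMillis, long timeoutMillis) {
        AtomicInteger outstanding = new AtomicInteger(2);
        Thread releaser = new Thread(() -> {
            try {
                Thread.sleep(releaseDelayMillis); // simulated in-flight work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            outstanding.set(0);
        });
        releaser.start();
        try {
            awaitAssertion(() -> assertEquals(0, outstanding.get()), timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (AssertionError e) {
            return false;
        } finally {
            releaser.interrupt();
            try {
                releaser.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("fast release passes: " + drainsWithin(20, 2000));
        System.out.println("slow release passes: " + drainsWithin(2000, 100));
    }
}
```

In this sketch the slow case fails with the same message shape as the reported error (expected:<0> but was:<2>). That is consistent with a timing race, but it does not by itself confirm one in the real test.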
4. NRTReplicationEngineTests.testAcquireLastIndexCommit
Error: expected:<2> but was:<1> at NRTReplicationEngineTests.java:81
Seed: CF6EC8EA3A74B5A6
Reproduced locally: ✅ Yes (deterministic with seed)
First seen: 2025-10-13
Total unique builds affected: 14
Pattern: Relatively new flake, first seen Oct 2025. Low rate (1-3 builds/month) but worsening — Apr 2026 had 3 builds, May 2026 already has 3 builds (9 days in). The deterministic reproduction with seed suggests a test-logic bug rather than a timing issue.
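The reasoning behind "deterministic with seed" can be sketched as follows. This assumes a plain java.util.Random as a stand-in; the real tests derive their randomness from the test framework's own seeded generator, and `SeedReplaySketch`, `parseSeed`, and `drawDocCount` are hypothetical names. If every random choice in a test flows from the seed, replaying the seed replays the same inputs, so a failure that recurs on replay points at the test's logic for those inputs rather than at scheduling or environment.

```java
import java.util.Random;

/** Hypothetical illustration of seed replay; not the test framework's actual RNG. */
final class SeedReplaySketch {

    /** Parses a 16-hex-digit seed string, as printed in the report, into a long. */
    static long parseSeed(String hex) {
        return Long.parseUnsignedLong(hex, 16);
    }

    /** Hypothetical randomized test input: a document count in the range [1, 10]. */
    static int drawDocCount(long seed) {
        return 1 + new Random(seed).nextInt(10);
    }

    public static void main(String[] args) {
        long seed = parseSeed("CF6EC8EA3A74B5A6");
        // The same seed always yields the same draw, so this prints true.
        System.out.println(drawDocCount(seed) == drawDocCount(seed));
    }
}
```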
5. IngestFromKafkaIT.testAllActiveOffsetBasedLag
Error: java.lang.AssertionError (assertion on lag metrics)
Seeds: CF6EC8EA3A74B5A6, AE215E3F25600C15
Reproduced locally: ❌ No (passed with seed)
First seen: 2025-10-15
Total unique builds affected: 31
Pattern: New and rapidly worsening. Dormant from Nov 2025–Feb 2026, then exploded: Mar 2026 (8 builds), Apr 2026 (13 builds), May 2026 (8 builds in 9 days). This is the most actively worsening test in this report. The Kafka integration tests use embedded Kafka which is sensitive to timing.
7. SharedClusterSnapshotRestoreIT.testSnapshotFileFailureDuringSnapshot
Pattern: Chronic flake, stable at 2-6 builds/month. The uptick in Apr 2026 (10 builds) correlates with the mid-April CI runner migration to m7a.8xlarge, so the test may be sensitive to CPU speed.
8. IndexingIT.testIndexingWithSegRep
Reproduced locally: ⚠️ Could not run (requires JAVA21_HOME for BWC build)
First seen: 2024-03-25
Total unique builds affected: 257
Pattern: Chronic high-rate flake, the most-affected test in this report by total build count. Consistently 4-29 builds/month. Recent months show elevated rates (Feb 2026: 16, Mar 2026: 18, May 2026: 10 in 9 days). This is a rolling-upgrade test that exercises segment replication across versions.
9. SearchRestCancellationIT (multiple methods): RestCancellableNodeClient channel leak
Error: 1 channels still being tracked in RestCancellableNodeClient while there should be none expected:<0> but was:<1>
Seed: 770632E0BB388172
Reproduced locally: ❌ No (passed with seed)
First seen: 2024-03-26
Total unique builds affected: 427
Pattern: Chronic high-rate flake affecting the entire qa:smoke-test-http test suite. This is a teardown-time assertion that a REST channel was not properly cleaned up. In build 76364, it caused all HTTP integration tests to fail (14+ test methods across 7 test classes). Historically 5-47 builds/month. Notable spike in Nov 2025 (47 builds). Currently elevated: Apr 2026 (23 builds), May 2026 (25 builds in 9 days). Worsening.
10. DetailedErrorsDisabledIT (same root cause as #9)
Note: Same root cause as SearchRestCancellationIT. All HTTP tests in build 76364 failed with this same assertion. Listed separately because it's a different test class, but the fix would be the same.
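The teardown assertion described for #9 and #10 can be pictured with a simplified model of channel tracking. This is a hypothetical sketch, not the RestCancellableNodeClient source: `ChannelTrackerSketch` and its nested `Channel` are invented for illustration, and the assumption that tracking ends only via a close callback is ours. It shows why one channel that never fires its close callback leaves a non-zero count behind.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Simplified, hypothetical model of per-channel tracking. */
final class ChannelTrackerSketch {

    /** Minimal channel with a single close callback. */
    static final class Channel {
        private Runnable onClose = () -> {};

        void addCloseListener(Runnable listener) {
            this.onClose = listener;
        }

        void close() {
            onClose.run();
        }
    }

    private final Set<Channel> tracked = ConcurrentHashMap.newKeySet();

    /** Starts tracking and relies on the close callback to stop tracking. */
    void track(Channel channel) {
        tracked.add(channel);
        channel.addCloseListener(() -> tracked.remove(channel));
    }

    int trackedCount() {
        return tracked.size();
    }

    public static void main(String[] args) {
        ChannelTrackerSketch tracker = new ChannelTrackerSketch();
        Channel closed = new Channel();
        Channel leaked = new Channel(); // never closed in this run
        tracker.track(closed);
        tracker.track(leaked);
        closed.close();
        // Teardown-style check: anything still tracked here counts as a leak.
        System.out.println(tracker.trackedCount() + " channels still being tracked");
    }
}
```

Because a check like this runs at teardown, a leaked entry can be reported against whichever test happens to be running, which would fit the report's observation that one leak failed every HTTP test class in build 76364.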
Notes
Build 76364 had a systemic failure: a single RestCancellableNodeClient channel leak caused all 7 HTTP test classes (14+ methods) to fail. This is a single root cause manifesting across many tests.
NRTReplicationEngineTests.testAcquireLastIndexCommit is the only test that reproduced deterministically with its seed, suggesting a test-logic bug rather than a timing/environmental issue.
IngestFromKafkaIT.testAllActiveOffsetBasedLag is the most rapidly worsening test: it went from dormant (Nov 2025 to Feb 2026) to 8-13 affected builds/month from Mar 2026 onward.
The April 2026 CI runner migration (m5.8xlarge → m7a.8xlarge) correlates with upticks in SharedClusterSnapshotRestoreIT and ShardIndexingPressureSettingsIT.
IndexingIT.testIndexingWithSegRep could not be reproduced locally because the rolling-upgrade test requires JAVA21_HOME for building the BWC distribution.
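Several entries above quote May 2026 counts for only the first 9 days, which makes them hard to compare with full-month figures. One rough way to compare them is a naive linear projection onto a 31-day month, sketched below. `MonthlyRateSketch` and `projectMonthly` are hypothetical names, the counts are the ones quoted in this report, and the projection assumes failures arrive at a constant daily rate, which a bursty flake may not do.

```java
import java.util.Locale;

/** Naive linear projection of partial-month build counts; an illustration, not a forecast. */
final class MonthlyRateSketch {

    /** Scales a count observed over daysElapsed to a month of daysInMonth days. */
    static double projectMonthly(int buildsSoFar, int daysElapsed, int daysInMonth) {
        if (daysElapsed <= 0 || daysElapsed > daysInMonth) {
            throw new IllegalArgumentException("daysElapsed must be in [1, daysInMonth]");
        }
        return buildsSoFar * (double) daysInMonth / daysElapsed;
    }

    public static void main(String[] args) {
        // May 2026 counts after 9 days, as quoted in the report.
        String[] tests = {
            "SearchRestCancellationIT",
            "ShardIndexingPressureSettingsIT",
            "IndexingIT.testIndexingWithSegRep",
            "IngestFromKafkaIT.testAllActiveOffsetBasedLag",
            "NRTReplicationEngineTests.testAcquireLastIndexCommit"
        };
        int[] buildsInNineDays = {25, 11, 10, 8, 3};
        for (int i = 0; i < tests.length; i++) {
            double projected = projectMonthly(buildsInNineDays[i], 9, 31);
            System.out.println(String.format(Locale.ROOT,
                "%s: %d builds in 9 days, about %.1f per 31-day month",
                tests[i], buildsInNineDays[i], projected));
        }
    }
}
```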
Failing Tests
1. ClusterDisruptionIT (classMethod)
Error: SpanData validation failed for validator AllSpansAreEndedProperly: spans from dispatchedShardOperationOnPrimary not ended during cluster disruption
Seed: CB09D2F627911882
2. RemoteSplitIndexIT.testCreateSplitIndex
Error: expected:<0> but was:<67>
Seed: 26B8D7F0BC427F51
3. ShardIndexingPressureSettingsIT.testShardIndexingPressureEnforcedEnabledDisabledSetting
Error: expected:<0> but was:<2> in waitForTwoOutstandingRequests
Seed: 8410C0C2683BE2F0
4. NRTReplicationEngineTests.testAcquireLastIndexCommit
Error: expected:<2> but was:<1> at NRTReplicationEngineTests.java:81
Seed: CF6EC8EA3A74B5A6
5. IngestFromKafkaIT.testAllActiveOffsetBasedLag
Error: java.lang.AssertionError (assertion on lag metrics)
Seeds: CF6EC8EA3A74B5A6, AE215E3F25600C15
6. IngestFromKafkaIT.testCloseIndex
Error: ConditionTimeoutException: Condition was not fulfilled within 1 minutes
Seed: 84563A7AEEF383D2
7. SharedClusterSnapshotRestoreIT.testSnapshotFileFailureDuringSnapshot
Error: Expected: <0L> but: was <1L>
Seed: 5FD3E28C78CB69CC
8. IndexingIT.testIndexingWithSegRep
Error: expected:<0> but was:<1>
Seed: 3DD511F96246C162
9. SearchRestCancellationIT (multiple methods): RestCancellableNodeClient channel leak
Error: 1 channels still being tracked in RestCancellableNodeClient while there should be none expected:<0> but was:<1>
Seed: 770632E0BB388172
10. DetailedErrorsDisabledIT (same root cause as #9)
Error: Same RestCancellableNodeClient channel tracking assertion as #9
Seed: 770632E0BB388172