fix: stop DestinationMigrationCoordinator cycling on unrelated cluster events #2163
Open
thecodingshrimp wants to merge 1 commit into
…Changed() to stop migration cycle on unrelated cluster events
Problem
`DestinationMigrationCoordinator.clusterChanged()` has no index-scope filter. Any cluster state event resets `finishFlag` and reschedules the migration job, even events completely unrelated to the scheduled-jobs index (e.g., datastream mapping updates from security-analytics). On a cluster where destinations have already been migrated, this causes an infinite Reset→Cancel→Perform cycle that generates continuous log noise.
Root cause
The `clusterChanged` callback resets `finishFlag = false` and reschedules unconditionally. Once migration completes and `finishFlag = true`, the next unrelated cluster event immediately restarts the cycle.
Fix
Add the same index-scope guard used by `JobSweeper`:
```kotlin
if (!event.indexRoutingTableChanged(ScheduledJob.SCHEDULED_JOBS_INDEX)) return
```
This ensures the coordinator only acts on routing-table changes to `.opensearch-alerting-scheduled-jobs`, the only index whose state is actually relevant to destination migration.
Testing
- […] `finishFlag` or the migration scheduler
- […] `.opensearch-alerting-scheduled-jobs` routing changes
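To illustrate the behaviour this change targets, here is a minimal, self-contained model of the callback with and without the guard. It is only a sketch and not the coordinator's actual code: `FakeClusterEvent` and `CoordinatorModel` are hypothetical stand-ins, the index name is copied from this description, and a counter replaces the real migration scheduler.

```kotlin
// Index name as given in this PR description (assumed here, not read from the codebase).
const val SCHEDULED_JOBS_INDEX = ".opensearch-alerting-scheduled-jobs"

// Hypothetical stand-in for the cluster event: it only knows which
// indices had routing-table changes.
class FakeClusterEvent(private val changedIndices: Set<String>) {
    fun indexRoutingTableChanged(index: String): Boolean = index in changedIndices
}

// Hypothetical, simplified model of the coordinator. It starts in the
// "migration already finished" state and counts reschedules instead of
// scheduling a real job.
class CoordinatorModel(private val guarded: Boolean) {
    var finishFlag = true
        private set
    var reschedules = 0
        private set

    fun clusterChanged(event: FakeClusterEvent) {
        // The guard from this PR: ignore events that do not touch the scheduled-jobs index.
        if (guarded && !event.indexRoutingTableChanged(SCHEDULED_JOBS_INDEX)) return
        finishFlag = false
        reschedules++
    }
}

fun main() {
    val unrelated = FakeClusterEvent(setOf(".some-other-index"))
    val related = FakeClusterEvent(setOf(SCHEDULED_JOBS_INDEX))

    // Without the guard, every unrelated event restarts the cycle.
    val unguarded = CoordinatorModel(guarded = false)
    repeat(3) { unguarded.clusterChanged(unrelated) }
    check(!unguarded.finishFlag && unguarded.reschedules == 3)

    // With the guard, unrelated events leave the finished state untouched.
    val guarded = CoordinatorModel(guarded = true)
    repeat(3) { guarded.clusterChanged(unrelated) }
    check(guarded.finishFlag && guarded.reschedules == 0)

    // A routing change on the scheduled-jobs index still triggers a reschedule.
    guarded.clusterChanged(related)
    check(!guarded.finishFlag && guarded.reschedules == 1)
}
```

A test against the real coordinator would presumably need a mocked cluster event rather than this stand-in; the model only shows which events should and should not reset the state.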