What is the bug?
In DocLevelMonitorQueries.kt (~line 500), the condition controlling when to re-fetch the write index for the query index alias uses != instead of ==, causing the re-fetch path to fire on every monitor execution rather than only in the backwards-compatibility case.
// BROKEN (current): fires ALWAYS because the stored concrete index name
// (e.g. ".opensearch-sap-pre-packaged-rules-queries-000001") always differs from the alias name
if (targetQueryIndex != monitor.dataSources.queryIndex && monitor.deleteQueryIndexInEveryRun == true)
// CORRECT: fires only when the alias name itself was mistakenly stored in metadata (legacy case)
if (targetQueryIndex == monitor.dataSources.queryIndex && monitor.deleteQueryIndexInEveryRun == true)
Every time the condition fires unnecessarily, getWriteIndexNameForAlias is called and metadata is rewritten, triggering a delete+recreate of the backing query index. This generates 6–10 MergeSchedulerConfig log lines per node per cycle.
How can one reproduce the bug?
- Deploy OpenSearch with the Security Analytics plugin and enable chained findings monitors (which set deleteQueryIndexInEveryRun=true).
- Run any doc-level monitor backed by the default query index alias.
- Observe log output: each monitor execution produces a burst of MergeSchedulerConfig log entries from the query index being deleted and recreated.
- At scale (3 master nodes, many detectors), this produces 23,000+ log lines/minute.
What is the expected behavior?
The re-fetch of the write index name from the alias should only occur in the backwards-compatibility case — when the metadata stored the alias name itself (not the concrete backing index name). Under normal operation the stored concrete index name differs from the alias, so the condition should evaluate to false and no delete+recreate should occur.
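To make the expected behavior concrete, here is a minimal, self-contained sketch that compares the two predicates in isolation. It is not the plugin code: the `Monitor` and `DataSources` classes are simplified hypothetical stand-ins that model only the two fields the condition reads, the helper names `shouldRefetchBroken` and `shouldRefetchFixed` are made up for illustration, and the alias name is assumed to be the concrete index name without its `-000001` suffix.

```kotlin
// Hypothetical stand-ins for the real monitor types; only the fields read by the condition are modelled.
data class DataSources(val queryIndex: String)
data class Monitor(val dataSources: DataSources, val deleteQueryIndexInEveryRun: Boolean?)

// Inverted check described in this report: true whenever the stored name differs from the alias.
fun shouldRefetchBroken(targetQueryIndex: String, monitor: Monitor): Boolean =
    targetQueryIndex != monitor.dataSources.queryIndex && monitor.deleteQueryIndexInEveryRun == true

// Proposed check: true only when the alias name itself was stored in metadata (legacy case).
fun shouldRefetchFixed(targetQueryIndex: String, monitor: Monitor): Boolean =
    targetQueryIndex == monitor.dataSources.queryIndex && monitor.deleteQueryIndexInEveryRun == true

fun main() {
    // Assumed alias name (not stated in the report); the concrete name is the example from the report.
    val alias = ".opensearch-sap-pre-packaged-rules-queries"
    val concrete = ".opensearch-sap-pre-packaged-rules-queries-000001"
    val monitor = Monitor(DataSources(queryIndex = alias), deleteQueryIndexInEveryRun = true)

    // Normal operation: metadata holds the concrete backing index name.
    println("normal, broken check: ${shouldRefetchBroken(concrete, monitor)}") // true  (re-fetch on every run)
    println("normal, fixed check:  ${shouldRefetchFixed(concrete, monitor)}")  // false (no re-fetch)

    // Legacy case: metadata holds the alias name itself.
    println("legacy, broken check: ${shouldRefetchBroken(alias, monitor)}")    // false (legacy case missed)
    println("legacy, fixed check:  ${shouldRefetchFixed(alias, monitor)}")     // true  (re-fetch as intended)
}
```

Under these assumptions the inverted check is wrong in both directions: it fires in the normal case and skips the legacy case it was presumably written for.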
What is your host/environment?
- OS: Linux
- Version: OpenSearch 3.x
- Plugins: Security Analytics plugin with chained findings monitors enabled
Do you have any additional context?
This bug compounds with a related issue in opensearch-project/security-analytics: the chained_findings monitor is created with deleteQueryIndexInEveryRun=true, which is the trigger that activates the broken condition path. Both fixes are required to fully eliminate the log storm. The security-analytics fix prevents the flag from being set unnecessarily; this fix corrects the inverted logic that acts on the flag.