Skip to content

Prune search shard groups using index-level field domains before can_match#21865

Open
laminelam wants to merge 2 commits into
opensearch-project:mainfrom
laminelam:feature/shard_pruning_m27
Open

Prune search shard groups using index-level field domains before can_match#21865
laminelam wants to merge 2 commits into
opensearch-project:mainfrom
laminelam:feature/shard_pruning_m27

Conversation

@laminelam
Copy link
Copy Markdown
Contributor

@laminelam laminelam commented May 28, 2026

Resolves #21566

Implements the search-core consumer side of index-level search pruning described in the RFC.

This adds a conservative coordinator-side pruning step before can_match that uses finalized index-level field-domain metadata from IndexMetadata custom data. If metadata is missing, malformed, unfinalized, unsupported, or unsafe to evaluate, the shard group is kept.

The implementation introduces generic field-domain abstractions with initial date-range support, mandatory query-constraint extraction, dynamic pruning settings, PIT safety guards, and test coverage.

Benchmark

  • 90 daily indices
  • 1M docs

Note: Benchmark was done on local (no network involved). Although the improvement is already significant locally it should be even more pronounced in a distributed cluster, where pruning also avoids remote can_match requests.

.

Metric Pruning disabled Pruning enabled Reduction Improvement
Avg wall time 60.084 ms 11.845 ms 80.29% 5.07x faster
P50 wall time 66.713 ms 11.610 ms 82.60% 5.75x faster
P90 wall time 85.724 ms 13.432 ms 84.33% 6.38x faster
Avg response took 59.200 ms 11.100 ms 81.25% 5.33x faster
Avg can_match requests 27 0.6 97.78% 45x fewer

.

Core Classes

Class Role
SearchIndexPruningService Main coordinator-side service. Computes pruned shard groups and fails open on pruning errors.
SearchIndexPruningSettings Dynamic cluster settings: enabled flag, min shard threshold, configured pruning fields.
SearchIndexPruningResult Immutable result describing which original shard groups may be skipped.
QueryConstraint Generic mandatory query constraint abstraction.
RangeQueryConstraint Query-side range constraint. It is intentionally not date-specific.
QueryConstraintExtractor Extracts mandatory constraints from a search request.
MandatoryQueryConstraintExtractor Walks mandatory query clauses and emits constraints for configured pruning fields.
FieldDomain Index-level description of values that may exist for one field in one concrete index.
DateRangeFieldDomain Initial field-domain implementation for date
FieldDomainProvider Resolves field domains for an index and field.
IndexFieldDomainMetadata Codec for index_field_domains custom metadata.
FieldDomainParser Parser/encoder for one field-domain metadata type.
FieldDomainParserRegistry Maps metadata type values to parser implementations.
FieldDomainEvaluator Proves whether a field domain can or cannot match a query constraint.
DateRangeFieldDomainEvaluator Evaluates date range domains against range query constraints using OpenSearch date parsing.
FieldDomainEvaluationContext Request-scoped evaluation context, currently search start time for date math now.
ClusterStateFieldDomainProvider Reads field-domain metadata from IndexMetadata.getCustomData(...).

.

image

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Lamine Idjeraoui <lidjeraoui@apple.com>
@laminelam laminelam requested a review from a team as a code owner May 28, 2026 01:10
@github-actions github-actions Bot added enhancement Enhancement or improvement to existing feature or request Search Search query, autocomplete ...etc labels May 28, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 28, 2026

PR Reviewer Guide 🔍

(Review updated until commit 02bd664)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review

Possible Issue

When all shard groups are skipped, activeShardIndexToOriginalShardIndex is empty, so firstActiveShardIndex defaults to 0. If possibleMatches.set(0) is called, it sets the bit for the original shard group at index 0, which may already be skipped. This could cause a skipped shard group to be incorrectly marked as a possible match.

if (results.hasActiveShards()) {
    possibleMatches.set(results.getFirstActiveShardIndex());
}
Possible Issue

When all active shard groups would be pruned, the method returns notPruned(shardIterators) to preserve existing skip flags. However, if some shard groups were already skipped before pruning (e.g., by another mechanism), and all remaining active groups are pruned, the result claims no pruning occurred. This could mislead callers about whether pruning logic ran successfully versus being bypassed entirely.

if (pruned == activeShardGroups) {
    /*
     * Pruning is an optimization. If every currently active shard group would be skipped, fall back to
     * the original iterators until zero-shard response semantics are covered explicitly for this
     * pre-can-match stage. Existing skip flags are preserved because they may have been set by another
     * search execution step.
     */
    return SearchIndexPruningResult.notPruned(shardIterators);
}
Possible Issue

When lowerInclusive equals Long.MAX_VALUE and includeLower is false, the code increments to produce an exclusive lower bound. This causes overflow, wrapping to Long.MIN_VALUE, which inverts the range and returns an empty range instead of detecting the overflow condition. A query like gte(Long.MAX_VALUE, false) would incorrectly prune all indexes.

if (constraint.includeLower() == false) {
    if (lowerInclusive == Long.MAX_VALUE) {
        return Optional.of(NormalizedRange.EMPTY);
    }
    lowerInclusive++;
}
Possible Issue

When upperInclusive equals Long.MIN_VALUE and includeUpper is false, the code decrements to produce an exclusive upper bound. This causes underflow, wrapping to Long.MAX_VALUE, which inverts the range and returns an empty range instead of detecting the underflow condition. A query like lte(Long.MIN_VALUE, false) would incorrectly prune all indexes.

if (constraint.includeUpper() == false) {
    if (upperInclusive == Long.MIN_VALUE) {
        return Optional.of(NormalizedRange.EMPTY);
    }
    upperInclusive--;

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 28, 2026

PR Code Suggestions ✨

Latest suggestions up to 02bd664
Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Handle underflow in upper bound

When converting an exclusive lower bound to inclusive by incrementing, overflow at
Long.MAX_VALUE is handled, but underflow at Long.MIN_VALUE for the upper bound
decrement is not checked. Add a symmetric check for upperInclusive == Long.MIN_VALUE
before decrementing to prevent incorrect range evaluation.

server/src/main/java/org/opensearch/action/search/pruning/DateRangeFieldDomainEvaluator.java [113-118]

 if (lowerInclusive == Long.MAX_VALUE) {
     return Optional.of(NormalizedRange.EMPTY);
 }
 lowerInclusive++;
+...
+if (upperInclusive == Long.MIN_VALUE) {
+    return Optional.of(NormalizedRange.EMPTY);
+}
+upperInclusive--;
Suggestion importance[1-10]: 8

__

Why: This identifies a real asymmetry in boundary handling. The code checks for Long.MAX_VALUE overflow when incrementing the lower bound but doesn't check for Long.MIN_VALUE underflow when decrementing the upper bound. This could lead to incorrect range evaluation in edge cases with exclusive upper bounds at the minimum value.

Medium
General
Verify shard availability before selection

When no shards match but active shards exist, setting only the first active shard
may not produce a valid search response if that shard was already skipped by another
mechanism. Consider verifying that the selected shard is actually available before
setting it, or document why the first active shard is guaranteed to be valid.

server/src/main/java/org/opensearch/action/search/CanMatchPreFilterSearchPhase.java [224-226]

 if (results.hasActiveShards()) {
-    possibleMatches.set(results.getFirstActiveShardIndex());
+    int firstActiveIndex = results.getFirstActiveShardIndex();
+    if (shardsIts.get(firstActiveIndex).skip() == false) {
+        possibleMatches.set(firstActiveIndex);
+    }
 }
Suggestion importance[1-10]: 5

__

Why: The suggestion adds a defensive check to ensure the selected shard isn't already skipped. While this could prevent edge cases, the existing code's getFirstActiveShardIndex() returns an index from activeShardIndexToOriginalShardIndex, which only contains non-skipped shards at construction time. The check adds safety but may be redundant given the design.

Low
Add logging for pruning fallback

The fallback when all active shards would be pruned prevents zero-result responses,
but this creates inconsistent behavior where pruning effectiveness depends on
whether other mechanisms have already skipped shards. Consider explicitly handling
the zero-shard case or documenting this limitation more prominently in the method
signature.

server/src/main/java/org/opensearch/action/search/SearchIndexPruningService.java [187-195]

 if (pruned == activeShardGroups) {
-    /*
-     * Pruning is an optimization. If every currently active shard group would be skipped, fall back to
-     * the original iterators until zero-shard response semantics are covered explicitly for this
-     * pre-can-match stage. Existing skip flags are preserved because they may have been set by another
-     * search execution step.
-     */
+    logger.debug(
+        "all {} active shard groups would be pruned; falling back to unpruned execution to preserve search semantics",
+        activeShardGroups
+    );
     return SearchIndexPruningResult.notPruned(shardIterators);
 }
Suggestion importance[1-10]: 4

__

Why: Adding debug logging when all active shards would be pruned improves observability. However, this is a minor enhancement that doesn't fix a bug or significantly improve functionality. The existing comment already explains the behavior clearly.

Low

Previous suggestions

Suggestions up to commit 0dedb3d
CategorySuggestion                                                                                                                                    Impact
General
Log fallback when all shards pruned

When all shard groups are pruned, the code falls back to notPruned to avoid
zero-shard responses. However, this fallback silently discards the pruning decision
without logging or metrics. Consider logging this fallback at debug or info level so
operators can detect when pruning logic would have eliminated all shards.

server/src/main/java/org/opensearch/action/search/SearchIndexPruningService.java [181-188]

 if (pruned == originalSize) {
     /*
      * Pruning is an optimization. If every shard group would be skipped, fall back to
      * the original iterators until zero-shard response semantics are covered explicitly
      * for this pre-can-match stage.
      */
+    logger.debug("All {} shard groups would be pruned; falling back to unpruned search", originalSize);
     return SearchIndexPruningResult.notPruned(shardIterators);
 }
Suggestion importance[1-10]: 6

__

Why: The suggestion adds logging when all shard groups would be pruned, which is valuable for operators to understand pruning behavior. This is a moderate improvement for observability and debugging, though not critical for correctness.

Low
Handle zero active shards explicitly

When cardinality == 0 and no active shards exist, the code silently skips setting
any shard in possibleMatches. This may cause downstream phases to receive an empty
shard iterator, potentially breaking aggregation or response construction. Consider
explicitly handling the zero-active-shards case or documenting the expected
behavior.

server/src/main/java/org/opensearch/action/search/CanMatchPreFilterSearchPhase.java [222-227]

 if (cardinality == 0) {
     // this is a special case where we have no hit but we need to get at least one search response in order
     // to produce a valid search result with all the aggs etc.
     if (results.hasActiveShards()) {
         possibleMatches.set(results.getFirstActiveShardIndex());
+    } else {
+        // No active shards available; downstream phases must handle empty iterator
+        logger.debug("No active shards available for can_match phase");
     }
 }
Suggestion importance[1-10]: 5

__

Why: The suggestion correctly identifies that when cardinality == 0 and no active shards exist, the code doesn't set any shard in possibleMatches. However, adding a debug log is a minor improvement that aids debugging but doesn't fix a critical issue. The existing code already handles this case by leaving possibleMatches empty, which downstream phases should handle.

Low
Clarify bounds in error message

The bounds check uses >= for the upper bound, but the error message says "for active
shard count". If activeShardIndex equals length, it's out of bounds. The check is
correct, but the error message could clarify that valid indexes are [0, length) to
avoid confusion during debugging.

server/src/main/java/org/opensearch/action/search/CanMatchPreFilterSearchPhase.java [344-355]

 private int originalShardIndex(int activeShardIndex) {
     if (activeShardIndex < 0 || activeShardIndex >= activeShardIndexToOriginalShardIndex.length) {
         throw new IllegalStateException(
             "invalid can_match shard index ["
                 + activeShardIndex
-                + "] for active shard count ["
+                + "] (valid range: [0, "
                 + activeShardIndexToOriginalShardIndex.length
-                + "]"
+                + "))"
         );
     }
     return activeShardIndexToOriginalShardIndex[activeShardIndex];
 }
Suggestion importance[1-10]: 4

__

Why: The suggestion improves the error message clarity by explicitly stating the valid range [0, length) instead of just "active shard count". This is a minor readability improvement that helps debugging but doesn't address a functional issue.

Low

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 0dedb3d: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@laminelam
Copy link
Copy Markdown
Contributor Author

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review

Possible Issue
When all shard groups are skipped by pruning, the code attempts to set the first active shard index. However, if activeShardIndexToOriginalShardIndex is empty, firstActiveShardIndex defaults to 0, which may not be a valid shard group index if all shards were skipped before can_match. This could cause an out-of-bounds access or incorrect shard selection when possibleMatches.set(results.getFirstActiveShardIndex()) is called on line 225.

if (results.hasActiveShards()) {
    possibleMatches.set(results.getFirstActiveShardIndex());
}

Possible Issue
When all shard groups would be pruned, the code falls back to the original iterators without clearing any skip flags that may have been set. If any iterator was already marked as skipped before pruning (e.g., by a previous phase), this fallback preserves that skip state, potentially causing zero shards to be queried when at least one was expected. The fallback should ensure at least one shard group is queryable.

if (pruned == originalSize) {
    /*
     * Pruning is an optimization. If every shard group would be skipped, fall back to
     * the original iterators until zero-shard response semantics are covered explicitly
     * for this pre-can-match stage.
     */
    return SearchIndexPruningResult.notPruned(shardIterators);
}

Possible Issue
When converting an exclusive lower bound to inclusive, the code increments lowerInclusive without checking for Long.MAX_VALUE overflow. If lowerInclusive is already Long.MAX_VALUE, incrementing it wraps to Long.MIN_VALUE, producing an incorrect range that may prune matching shards. This scenario is narrow but possible with date_nanos fields or extreme date math expressions.

if (constraint.includeLower() == false) {
    if (lowerInclusive == Long.MAX_VALUE) {
        return Optional.of(NormalizedRange.EMPTY);
    }
    lowerInclusive++;
}

1- When all shard groups are skipped before can_match, hasActiveShards() is false, so possibleMatches.set(…) is not called.

2- Pruning should not clear existing skip flags because they may come from another search step.

3- This won’t happen because we are already checking if (lowerInclusive == Long.MAX_VALUE)

logs reformating

Signed-off-by: Lamine Idjeraoui <lidjeraoui@apple.com>
@github-actions
Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 02bd664

@laminelam
Copy link
Copy Markdown
Contributor Author

laminelam commented May 28, 2026

@msfroh

If you have time can you please take a look?

To facilitate the review I have generated a visual and some classes description.

In short, the way it works is:

  • Another service (for ex SIM) publishes the field domain (think of it as min max for timestamp for ex) to the index metadata.
  • The field configuration is loaded from the index metadata and gets parsed and registered (hence the Registry). Only "finalized" fields are taken in account (we ignore "temporary" data)
  • At query time, the query is parsed to extract the mandatory clauses (query constraints) for the configured field.
  • Evaluators evaluate the field domain (index info) against query constraint (query info) to decide whether an index can_match
  • Non matching indices are skipped.

The implementation is completely generic, for now only DateRangeFieldDomain is implemented but it is easy to add other implementations serving other types of data.
The feature is disabled by default (opt-in).

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 02bd664: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement Enhancement or improvement to existing feature or request Search Search query, autocomplete ...etc

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[RFC/Proposal]: Index-level search pruning before can_match using field bounds metadata

1 participant