Prune search shard groups using index-level field domains before can_match #21865
laminelam wants to merge 2 commits
Signed-off-by: Lamine Idjeraoui <lidjeraoui@apple.com>
PR Reviewer Guide 🔍 (review updated until commit 02bd664)
PR Code Suggestions ✨ (latest suggestions up to 02bd664; previous suggestions up to commit 0dedb3d)
❌ Gradle check result for 0dedb3d: null
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
1. When all shard groups are skipped before `can_match`, `hasActiveShards()` is false, so `possibleMatches.set(…)` is not called.
2. Pruning should not clear existing skip flags, because they may have been set by another search step.
3. This won't happen, because we already check `if (lowerInclusive == Long.MAX_VALUE)`.
logs reformating
Signed-off-by: Lamine Idjeraoui <lidjeraoui@apple.com>
Persistent review updated to latest commit 02bd664
If you have time, could you please take a look? To facilitate the review, I have generated a visual and some class descriptions. In short, the way it works is:
The implementation is completely generic. For now only `DateRangeFieldDomain` is implemented, but it is easy to add other implementations that serve other types of data.
❌ Gradle check result for 02bd664: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Resolves #21566
Implements the search-core consumer side of index-level search pruning described in the RFC.
This adds a conservative coordinator-side pruning step before `can_match` that uses finalized index-level field-domain metadata from `IndexMetadata` custom data. If metadata is missing, malformed, unfinalized, unsupported, or unsafe to evaluate, the shard group is kept.

The implementation introduces generic field-domain abstractions with initial date-range support, mandatory query-constraint extraction, dynamic pruning settings, PIT safety guards, and test coverage.
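
The keep-unless-provably-safe rule described above can be sketched roughly as follows. This is a minimal illustration, assuming a closed date range in epoch milliseconds; `PruningSketch`, `DateRangeDomain`, `RangeConstraint`, and `canSkip` are hypothetical names chosen for the example and are not the classes added by this PR.

```java
import java.util.Optional;

// Illustrative only: hypothetical stand-ins, not the PR's classes.
final class PruningSketch {

    /** Inclusive [min, max] bounds of a date field within one index (epoch millis). */
    static final class DateRangeDomain {
        final long minInclusive;
        final long maxInclusive;
        final boolean finalized;

        DateRangeDomain(long minInclusive, long maxInclusive, boolean finalized) {
            this.minInclusive = minInclusive;
            this.maxInclusive = maxInclusive;
            this.finalized = finalized;
        }
    }

    /** Mandatory range constraint extracted from the query (inclusive bounds). */
    static final class RangeConstraint {
        final long lowerInclusive;
        final long upperInclusive;

        RangeConstraint(long lowerInclusive, long upperInclusive) {
            this.lowerInclusive = lowerInclusive;
            this.upperInclusive = upperInclusive;
        }
    }

    /**
     * Returns true only when skipping the shard group is provably safe.
     * Missing, unfinalized, or inconsistent input always keeps the shard group.
     */
    static boolean canSkip(Optional<DateRangeDomain> domain, Optional<RangeConstraint> constraint) {
        if (domain.isEmpty() || constraint.isEmpty()) {
            return false; // no metadata or no mandatory constraint: keep
        }
        DateRangeDomain d = domain.get();
        RangeConstraint c = constraint.get();
        if (!d.finalized || d.minInclusive > d.maxInclusive) {
            return false; // unfinalized or malformed domain: keep
        }
        if (c.lowerInclusive > c.upperInclusive) {
            return false; // inconsistent query range: leave the decision to later phases
        }
        // Disjoint closed intervals share no value, so no document in the index can match.
        return c.upperInclusive < d.minInclusive || c.lowerInclusive > d.maxInclusive;
    }
}
```

Every uncertain branch in the sketch returns `false`, so a mistake in the metadata can only cost a missed optimization and never a missed hit.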
Benchmark
Note: The benchmark was run locally (no network involved). Although the improvement is already significant locally, it should be even more pronounced in a distributed cluster, where pruning also avoids remote `can_match` requests.
Core Classes
- `SearchIndexPruningService`
- `SearchIndexPruningSettings`
- `SearchIndexPruningResult`
- `QueryConstraint`
- `RangeQueryConstraint`
- `QueryConstraintExtractor`
- `MandatoryQueryConstraintExtractor`
- `FieldDomain`
- `DateRangeFieldDomain`
- `FieldDomainProvider`
- `IndexFieldDomainMetadata`: … `index_field_domains` custom metadata.
- `FieldDomainParser`
- `FieldDomainParserRegistry`: … `type` values to parser implementations.
- `FieldDomainEvaluator`
- `DateRangeFieldDomainEvaluator`
- `FieldDomainEvaluationContext`: … `now`.
- `ClusterStateFieldDomainProvider`: … `IndexMetadata.getCustomData(...)`.
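
As a rough illustration of the registry idea listed above (mapping `type` values to parser implementations), here is a minimal sketch. It assumes the raw metadata for a field arrives as a string map; `ParserRegistrySketch`, its methods, and the `"date_range"` type string in the usage note are hypothetical and do not reflect the actual `FieldDomainParserRegistry` API in this PR.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Illustrative only: a hypothetical registry, not the PR's FieldDomainParserRegistry.
final class ParserRegistrySketch {

    // Maps a domain "type" value to the function that parses its raw metadata.
    private final Map<String, Function<Map<String, String>, Object>> parsers = new ConcurrentHashMap<>();

    /** Registers (or replaces) the parser for a given type value. */
    void register(String type, Function<Map<String, String>, Object> parser) {
        parsers.put(type, parser);
    }

    /**
     * Parses raw metadata with the parser registered for the type.
     * Returns empty for unknown types or malformed input, so the caller can
     * fall back to keeping the shard group.
     */
    Optional<Object> parse(String type, Map<String, String> raw) {
        if (type == null || raw == null) {
            return Optional.empty();
        }
        Function<Map<String, String>, Object> parser = parsers.get(type);
        if (parser == null) {
            return Optional.empty(); // unsupported type
        }
        try {
            return Optional.ofNullable(parser.apply(raw));
        } catch (RuntimeException e) {
            return Optional.empty(); // malformed metadata
        }
    }
}
```

A caller might register a parser with something like `registry.register("date_range", raw -> new long[] { Long.parseLong(raw.get("min")), Long.parseLong(raw.get("max")) })`. New domain types then only need a new registration, which matches the extensibility goal stated in the PR.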
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.