Skip to content

[Analytics Engine] Let constant filter predicates reach the backend#21893

Open
ahkcs wants to merge 1 commit into
opensearch-project:mainfrom
ahkcs:feature/analytics-filter-constant-predicate
Open

[Analytics Engine] Let constant filter predicates reach the backend#21893
ahkcs wants to merge 1 commit into
opensearch-project:mainfrom
ahkcs:feature/analytics-filter-constant-predicate

Conversation

@ahkcs
Copy link
Copy Markdown
Contributor

@ahkcs ahkcs commented May 29, 2026

Description

OpenSearchFilterRule.resolveViableBackends rejected any deterministic filter predicate with no field references, throwing:

UnsupportedOperationException: Constant predicate with no field references reached the filter rule:
[>(MKTIME('10/18/2003 20:07:13':VARCHAR), 1000000000)].
ReduceExpressionsRule in PlannerImpl should have eliminated it.

The rule assumed ReduceExpressionsRule always folds such predicates away. That assumption breaks when the predicate embeds a function the plan-time RexExecutor cannot evaluate — for example a PPL conversion UDF applied to a literal:

... | eval s = '10/18/2003 20:07:13' | convert mktime(s) | where s > 1000000000

mktime(<literal>) is constant and deterministic, but the constant folder skips it because no plan-time implementation exists (only the DataFusion backend evaluates the conversion UDF natively). The unfolded constant predicate then reaches the filter rule and trips the assertion, returning a 500.

Fix

A predicate with no field references carries no field, so backend selection is unconstrained by field-type support. Hand it to any backend viable for the child — exactly as the existing non-deterministic branch (RAND() > 0) already does — and let the backend evaluate the constant at runtime.

The change is strictly additive: the branch is only reachable when the previous code already threw UnsupportedOperationException, so no previously succeeding filter is affected.

Verification

Reproduced and verified against a :run cluster with the analytics-engine plugin set, routing every test index through the analytics engine.

PPL convert command, run through the analytics-engine path:

Test suite Before After
CalciteConvertCommandIT (opensearch-project/sql, analytics route) 28/29 29/29
ConversionFunctionsIT (QA, :sandbox:qa:analytics-engine-rest) 63/63 65/65

The single prior CalciteConvertCommandIT failure (testConvertTimeformatWithWhere) now passes. Two regression tests are added to ConversionFunctionsIT — one where the constant predicate passes a row through, one where it filters every row out.

Check List

  • Functionality includes testing.
  • Commits are signed per the DCO using --signoff.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

OpenSearchFilterRule.resolveViableBackends threw UnsupportedOperationException
for a deterministic predicate with no field references, asserting that
ReduceExpressionsRule must have already eliminated it. That assumption breaks
for a predicate embedding a backend-only function the plan-time RexExecutor
cannot evaluate — e.g. a PPL conversion UDF over a literal:

  ... | eval s = '10/18/2003 20:07:13' | convert mktime(s) | where s > 1000000000

mktime(<literal>) is constant and deterministic, but the constant folder skips
it (no plan-time implementation exists — only DataFusion evaluates it
natively), so the constant predicate survives to the filter rule and trips the
assertion with a 500.

Such a predicate carries no field, so backend selection is unconstrained by
field-type support. Hand it to any backend viable for the child — identical to
the existing non-deterministic branch (RAND() > 0) — and let the backend
evaluate the constant at runtime. The change is strictly additive: the branch
is only reachable when the previous code already threw, so no previously
succeeding filter is affected.

Adds two regression tests to ConversionFunctionsIT covering a constant
predicate that passes and one that filters every row out.

Signed-off-by: Kai Huang <ahkcs@amazon.com>
@ahkcs ahkcs requested a review from a team as a code owner May 29, 2026 20:43
@github-actions
Copy link
Copy Markdown
Contributor

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ No major issues detected

@github-actions
Copy link
Copy Markdown
Contributor

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Validate non-empty viable backends

Validate that childViableBackends is not empty before returning it. If no viable
backends exist for the child node, returning an empty list could lead to runtime
failures when attempting to execute the predicate. Add a check and throw an
appropriate exception if the set is empty.

sandbox/plugins/analytics-engine/src/main/java/org/opensearch/analytics/planner/rules/OpenSearchFilterRule.java [169-181]

 // A predicate with no field references is normally eliminated by
 // ReduceExpressionsRule before reaching this rule. Two cases legitimately survive:
 // 1. Non-deterministic predicates (e.g. RAND() > 0) — ReduceExpressionsRule
 // deliberately leaves them alone because folding would change semantics.
 // 2. Deterministic predicates embedding a backend-only function the plan-time
 // RexExecutor cannot evaluate (e.g. a PPL conversion UDF over a literal:
 // `where mktime('10/18/2003 20:07:13') > 1000000000`). The folder skips the
 // UDF because no plan-time implementation exists, so the constant predicate
 // reaches the filter rule unreduced.
 // In both cases the predicate carries no field, so backend selection is unconstrained
 // by field-type support — hand it to any backend viable for the child, which can
 // evaluate the constant (DataFusion executes the UDF natively at runtime).
+if (childViableBackends.isEmpty()) {
+    throw new UnsupportedOperationException(
+        "No viable backends available to evaluate constant predicate: [" + predicate + "]"
+    );
+}
 return new ArrayList<>(childViableBackends);
Suggestion importance[1-10]: 5

__

Why: The suggestion to validate childViableBackends is not empty is reasonable for defensive programming. However, the PR explicitly removes similar error-throwing logic, suggesting the codebase handles empty backends elsewhere. The score reflects a valid but potentially unnecessary check given the PR's direction.

Low

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for e4b86f9: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant