…pe mismatch (exact SqlTypeName)
OpenSearchAggregateSplitRule.shouldSkipPartialFinalSplit() skips the
PARTIAL/FINAL split for non-prefix group sets whose group-key ordinal would
land on a PARTIAL agg-output slot — the structural FINAL reuses the original
ordinals over the keys-first PARTIAL output, so a non-prefix group key would
take the agg-output column's type, which must equal the original input field
type or Volcano's Aggregate.typeMatchesInferred / row-type equivalence check
throws.
The collision check compared SqlTypeFamily, which is too coarse: BIGINT and
INTEGER both belong to the NUMERIC family, so a span group key (INTEGER) over
a measure declared BIGINT slipped through and the split was attempted,
crashing with either:
AssertionError: type mismatch: aggCall type: BIGINT inferred type: INTEGER
or
IllegalArgumentException: Type mismatch: rel rowtype ... $f2: BIGINT -> INTEGER
Compare the exact SqlTypeName instead, so these cases fall back to a SINGLE
coordinator aggregate (always correct) rather than an unsound split.
Surfaced by PPL `chart ... by <field> span=...` over an INTEGER measure on the
analytics-engine route (e.g. `chart max(balance) by age span=10`); also covers
the equivalent `stats ... by <field> span=...` shape.
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Description
`OpenSearchAggregateSplitRule.shouldSkipPartialFinalSplit()` skips the PARTIAL/FINAL split for non-prefix group sets: those where a group-key ordinal `k >= groupCount` lands on a PARTIAL agg-output slot. The structural FINAL reuses the original ordinals over the keys-first PARTIAL output, so such a group key would take the agg-output column's type, which must equal the original input-field type or Volcano's `Aggregate.typeMatchesInferred` / row-type-equivalence check throws.
The collision check compared `SqlTypeFamily`, which is too coarse: `BIGINT` and `INTEGER` both belong to the `NUMERIC` family, so a span group key (`INTEGER`) paired with a measure declared `BIGINT` slipped through, the split was attempted, and planning crashed with either:
AssertionError: type mismatch: aggCall type: BIGINT inferred type: INTEGER
or
IllegalArgumentException: Type mismatch: rel rowtype ... $f2: BIGINT -> INTEGER
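The gap between a family-level and an exact-name comparison can be sketched in plain Java. This is an illustrative model only, not the rule's actual code: the two enums are hypothetical stand-ins for Calcite's `SqlTypeFamily` and `SqlTypeName`, `shouldSkipSplit` and its parameters are invented for the example, and the three-field input layout in `main` is assumed.

```java
import java.util.List;

final class SplitCollisionSketch {
    // Hypothetical stand-ins for Calcite's SqlTypeFamily / SqlTypeName (tiny subset).
    enum TypeFamily { NUMERIC, CHARACTER }

    enum TypeName {
        INTEGER(TypeFamily.NUMERIC),
        BIGINT(TypeFamily.NUMERIC),
        VARCHAR(TypeFamily.CHARACTER);

        final TypeFamily family;

        TypeName(TypeFamily family) {
            this.family = family;
        }
    }

    /**
     * Models the collision check: returns true when the PARTIAL/FINAL split
     * should be skipped. A group-key ordinal k >= groupCount lands on agg-output
     * slot (k - groupCount) of the keys-first PARTIAL output, so the key's input
     * type is compared with that slot's type.
     */
    static boolean shouldSkipSplit(List<Integer> groupOrdinals,
                                   List<TypeName> inputTypes,
                                   List<TypeName> aggOutputTypes,
                                   boolean exactMatch) {
        int groupCount = groupOrdinals.size();
        for (int k : groupOrdinals) {
            if (k < groupCount) {
                continue; // lands on a key slot, not on an agg-output slot
            }
            int aggSlot = k - groupCount;
            if (aggSlot >= aggOutputTypes.size()) {
                return true; // assumption of this sketch: no slot to compare, so be conservative
            }
            TypeName keyType = inputTypes.get(k);
            TypeName slotType = aggOutputTypes.get(aggSlot);
            boolean matches = exactMatch
                    ? keyType == slotType                 // exact type name
                    : keyType.family == slotType.family;  // coarse type family
            if (!matches) {
                return true; // types differ at the collision slot: fall back to SINGLE
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed layout: input fields [VARCHAR, BIGINT, INTEGER], grouped on ordinals {0, 2},
        // with one aggregate whose output type is BIGINT.
        List<TypeName> input = List.of(TypeName.VARCHAR, TypeName.BIGINT, TypeName.INTEGER);
        List<Integer> groupSet = List.of(0, 2);
        List<TypeName> aggOut = List.of(TypeName.BIGINT);

        // Family comparison: INTEGER and BIGINT are both NUMERIC, so the split is not skipped.
        System.out.println(shouldSkipSplit(groupSet, input, aggOut, false)); // false
        // Exact comparison: INTEGER != BIGINT, so the split is skipped.
        System.out.println(shouldSkipSplit(groupSet, input, aggOut, true));  // true
    }
}
```

In this model, a prefix group set such as {0, 1} never reaches the comparison, and a non-prefix set whose collision-slot type already equals the key's type gives the same answer under both comparisons.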
Fix
Compare the exact `SqlTypeName` instead of the type family, so these cases fall back to a SINGLE coordinator aggregate (always correct) rather than an unsound partial/final split. This is a one-line semantics change, with no behavior change for prefix group sets (handled by the earlier early-return) or for non-prefix sets whose collision-slot types already match.
How it surfaced / verification
Surfaced by PPL `chart ... by <field> span=...` over an `INTEGER` measure on the analytics-engine route (e.g. `chart max(balance) by age span=10`, `chart usenull=... avg(balance) over gender by age span=10`). The equivalent `stats ... by <field> span=...` shape hits the same path.
Verified against `CalciteChartCommandIT` (opensearch-project/sql) force-routed through the analytics-engine path. The two span tests that crashed on planning go red → green:
Test (CalciteChartCommandIT) | Error before the fix
testChartMaxBalanceByAgeSpan | aggCall type: BIGINT inferred type: INTEGER
testChartUseNullTrueWithNullStr | Type mismatch … $f2: BIGINT -> INTEGER
Combined with the partition-boundary coercion fix (#21878), chart on the analytics-engine route goes from 8/15 → 10/15; the remaining 5 are out of scope here (datetime wire-format sql#5420 ×2, and the `attributes.client.ip` Binary/BinaryView scan mismatch ×3).
Follow-up
A self-contained QA IT under `sandbox/qa/analytics-engine-rest` exercising `stats max(<int>) by <field> span=N` would regression-cover this in OpenSearch CI (no existing Span/Stats QA IT covers the `stats max(int-measure) by … span` non-prefix shape). Not included here.
Check List
--signoff.