Skip to content

[SPARK-56915][SQL] Simplify MakeDate/MakeInterval codegen under ANSI mode#55940

Closed
gengliangwang wants to merge 2 commits into
apache:masterfrom
gengliangwang:SPARK-56915-make-date-interval
Closed

[SPARK-56915][SQL] Simplify MakeDate/MakeInterval codegen under ANSI mode#55940
gengliangwang wants to merge 2 commits into
apache:masterfrom
gengliangwang:SPARK-56915-make-date-interval

Conversation

@gengliangwang
Copy link
Copy Markdown
Member

@gengliangwang gengliangwang commented May 17, 2026

What changes were proposed in this pull request?

Introduce DateTimeExpressionUtils.java with two static helpers:

  • makeDateExact(int year, int month, int day): wraps LocalDate.of(...) in an ansiDateTimeArgumentOutOfRange(DateTimeException) try/catch, then calls DateTimeUtils.localDateToDays(...). The ArithmeticException from localDateToDays's internal toIntExact is intentionally not caught — it propagates as-is (matching the pre-PR behavior for that overflow case).
  • makeIntervalExact(int years, int months, int weeks, int days, int hours, int mins, Decimal secs): wraps IntervalUtils.makeInterval in the arithmeticOverflowError(message, "", null) try/catch.

datetimeExpressions.MakeDate and intervalExpressions.MakeInterval delegate to the helpers in their failOnError = true codegen + eval paths. The non-ANSI branch keeps the inline try/catch -> null form because it sets isNull = true on failure.

Why are the changes needed?

Part of SPARK-56908 (umbrella). The try/catch wrapper that maps DateTimeException (for MakeDate) / ArithmeticException (for MakeInterval) to the user-facing ANSI error was emitted inline in doGenCode; the helper collapses it to a single call per call site without losing the wrapped exception's message (no QueryContext argument is involved, so this PR introduces no per-row references[] regression).

Does this PR introduce any user-facing change?

No.

How was this patch tested?

build/sbt "catalyst/testOnly *DateExpressionsSuite *IntervalExpressionsSuite"

96/96 pass.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor 1.x

@gengliangwang
Copy link
Copy Markdown
Member Author


Stack overview (SPARK-56908 umbrella)

This PR is part of a stack of 8 PRs against SPARK-56908. Order:

  1. [SPARK-56909][SQL] Simplify Cast to int/long codegen under ANSI mode #55934 — [SPARK-56909][SQL] Simplify Cast to int/long codegen under ANSI mode (this stack base)
  2. [SPARK-56910][SQL] Simplify Cast to byte/short codegen under ANSI mode #55935 — [SPARK-56910][SQL] Simplify Cast to byte/short codegen under ANSI mode
  3. [SPARK-56911][SQL] Simplify Cast to decimal codegen under ANSI mode #55936 — [SPARK-56911][SQL] Simplify Cast to decimal codegen under ANSI mode
  4. [SPARK-56912][SQL] Simplify Cast to boolean codegen under ANSI mode #55937 — [SPARK-56912][SQL] Simplify Cast to boolean codegen under ANSI mode
  5. [SPARK-56914][SQL] Simplify decimal arithmetic codegen under ANSI mode #55939 — [SPARK-56914][SQL] Simplify decimal arithmetic codegen under ANSI mode (depends on [SPARK-56911][SQL] Simplify Cast to decimal codegen under ANSI mode #55936)
  6. [SPARK-56913][SQL] Simplify BinaryArithmetic byte/short codegen under ANSI mode #55938 — [SPARK-56913][SQL] Simplify BinaryArithmetic byte/short codegen under ANSI mode (independent)
  7. [SPARK-56915][SQL] Simplify MakeDate/MakeInterval codegen under ANSI mode #55940 — [SPARK-56915][SQL] Simplify MakeDate/MakeInterval codegen under ANSI mode (independent)
  8. [SPARK-56916][SQL] Simplify ElementAt array codegen under ANSI mode #55941 — [SPARK-56916][SQL] Simplify ElementAt array codegen under ANSI mode (independent)

PRs 1-4 are linearly stacked on each other (each branch is based on the previous one). PR 5 (decimal arithmetic) is stacked on top of PR 3 (cast decimal) since it uses CastUtils.changePrecisionExact. PRs 6, 7, 8 branch off master independently.

if (failOnError) {
val utils = classOf[DateTimeConstructorUtils].getName
nullSafeCodeGen(ctx, ev, (year, month, week, day, hour, min, sec) => {
val secFrac = sec.getOrElse("org.apache.spark.sql.types.Decimal$.MODULE$.apply(0, " +
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

secFrac fallback in ANSI codegen is opaque.

Two problems:
- Dead code in practice. MakeInterval.children always has 7 elements (all the auxiliary this(...) constructors pass Literal(0) / Literal(Decimal(...)) to fill missing slots), so SeptenaryExpression.nullSafeCodeGen always passes Some(...) to the lambda. getOrElse never fires.
- Inconsistency with the non-ANSI branch in the same file: val secFrac = sec.getOrElse("0") // non-ANSI branch. The non-ANSI branch's "0" is int-typed but IntervalUtils.makeInterval's 7th argument is Decimal. If the getOrElse ever fired, the non-ANSI path would emit … makeInterval(…, 0) which wouldn't compile, while the ANSI path would emit a valid Decimal$.MODULE$.apply(0, ...). You appear to have noticed the type issue but only fixed one branch. Both branches' fallbacks are unreachable, but it's still an inconsistency that will confuse future readers

The pre-PR code had the same "0" dead-code fallback. So the PR didn't introduce the dead code — but it did expand it into a more elaborate (still dead) form on one branch only. Cleanest fix: replace sec.getOrElse(...) with sec.get (assertive) and add a comment that MakeInterval always provides 7 children. Or stick with getOrElse but use the same string on both branches.

Comment on lines +41 to +45
/**
* Builds a day count for {@code MakeDate(year, month, day)} in ANSI mode.
* Throws a {@code SparkDateTimeException} if the arguments don't form a
* valid {@link LocalDate}.
*/
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tighten makeDateExact Javadoc to clarify it only catches DateTimeException, not all errors from constructing/converting a LocalDate (e.g., the ArithmeticException from localDateToDays's toIntExact is not caught).

* the user-facing {@code ansiDateTimeArgumentOutOfRange} /
* {@code arithmeticOverflowError}.
*/
public final class DateTimeConstructorUtils {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the broad name is forward-looking, document the planned scope in the PR description so the name is justified. Otherwise, rename it to match the current contents (e.g. MakeDateAndIntervalUtils).

@gengliangwang gengliangwang force-pushed the SPARK-56915-make-date-interval branch from 9ddcf71 to c13127e Compare May 18, 2026 21:57
@gengliangwang
Copy link
Copy Markdown
Member Author

Addressed in c13127e, thanks @viirya:

  • Class renamed DateTimeConstructorUtilsMakeDateAndIntervalUtils. The new name matches the current contents (only makeDateExact + makeIntervalExact) and avoids the magnet-class risk you flagged. If future PRs add more constructor-style helpers, we can rename again at that point.
  • makeDateExact javadoc tightened to clarify it only catches the DateTimeException thrown by LocalDate.of(...) for invalid year/month/day. Anything else (e.g. the ArithmeticException from DateTimeUtils.localDateToDays's internal toIntExact on day-count overflow) propagates to the caller unchanged.
  • secFrac fallback fixed. Replaced both branches' sec.getOrElse("…") (the non-ANSI's "0" was the wrong Java type anyway) with sec.get plus a comment noting that MakeInterval always passes 7 children — the auxiliary this(…) constructors fill missing slots with Literal(0) / Literal(Decimal(0, …)), so SeptenaryExpression.nullSafeCodeGen always provides Some(…) to the lambda.

PTAL.

* the user-facing {@code ansiDateTimeArgumentOutOfRange} /
* {@code arithmeticOverflowError}.
*/
public final class MakeDateAndIntervalUtils {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many similar PRs ongoing, do we have a plan for adding and consolidating these util java functions? I feel the current one-expression-one-java-class approach is a bit overkill, shall we group them?

…mode

### What changes were proposed in this pull request?

Introduce `DateTimeConstructorUtils.java` with two static helpers:
* `makeDateExact(int year, int month, int day)`: wraps
  `LocalDate.of(...) + DateTimeUtils.localDateToDays(...)` in the
  `ansiDateTimeArgumentOutOfRange(DateTimeException)` try/catch.
* `makeIntervalExact(int years, int months, int weeks, int days,
  int hours, int mins, Decimal secs)`: wraps `IntervalUtils.makeInterval`
  in the `arithmeticOverflowError(message, "", null)` try/catch.

`datetimeExpressions.MakeDate` and `intervalExpressions.MakeInterval`
delegate to the helpers in their `failOnError = true` codegen + eval
paths. The non-ANSI branch keeps the inline `try/catch -> null` form
because it sets `isNull = true` on failure.

### Why are the changes needed?

Part of SPARK-56908 (umbrella). The 7-line try/catch wrapper that maps
`DateTimeException` / `ArithmeticException` to the user-facing ANSI
error was emitted inline in `doGenCode`; the helper collapses it to a
single call per call site without losing the wrapped exception's
message (no `QueryContext` argument is involved, so this PR introduces
no per-row `references[]` regression).

### Does this PR introduce _any_ user-facing change?

No. The compiled behavior is identical; only the emitted Java source
text changes.

### How was this patch tested?

```
build/sbt "catalyst/testOnly *DateExpressionsSuite *IntervalExpressionsSuite"
```

96/96 pass.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor 1.x
@gengliangwang gengliangwang force-pushed the SPARK-56915-make-date-interval branch from c13127e to d118961 Compare May 19, 2026 04:50
@gengliangwang
Copy link
Copy Markdown
Member Author

Good point, thanks @cloud-fan. I followed the codebase convention: existing utility files like ArrayExpressionUtils.java, BitmapExpressionUtils.java, ToJavaArrayUtils.java, ExpressionImplUtils.java, etc. are category-scoped (not per-expression), so the per-expression naming I introduced here was inconsistent.

Plan across the SPARK-56908 stack:

Util class Scope Status
CastUtils.java All Cast ANSI helpers (PRs #55934-#55937, #55939) Keep — already broad
ArithmeticUtils.java All BinaryArithmetic ANSI helpers (PR #55938) Keep — already broad
DateTimeExpressionUtils.java (was MakeDateAndIntervalUtils) Shared by date/time/interval expression ANSI codegen (this PR) Renamed here in d118961
ElementAtUtils.java → fold into existing ArrayExpressionUtils.java (PR #55941) TBD on #55941 Follow-up

The new name parallels ArrayExpressionUtils.java and gives future date/time/interval ANSI helpers a natural home without rev-locking another file rename. PTAL.

@viirya
Copy link
Copy Markdown
Member

viirya commented May 19, 2026

The PR description still references DateTimeConstructorUtils.java and describes makeDateExact as wrapping LocalDate.of(...) + DateTimeUtils.localDateToDays(...) in the ansiDateTimeArgumentOutOfRange try/catch — but the final class is DateTimeExpressionUtils.java, and per the javadoc tightening in c13127e, the helper only catches the DateTimeException from LocalDate.of(...) (the ArithmeticException from localDateToDays's internal toIntExact propagates).

Could you refresh the PR body so the squash commit message picks up the accurate description? Thanks!

@gengliangwang
Copy link
Copy Markdown
Member Author

Refreshed the PR body — makeDateExact now describes the narrower catch (only DateTimeException from LocalDate.of(...); the ArithmeticException from localDateToDays's toIntExact propagates as-is), and the "Why" section calls out that the mapped exceptions differ per helper (DateTimeException for MakeDate, ArithmeticException for MakeInterval). Thanks for the catch!

@gengliangwang
Copy link
Copy Markdown
Member Author

@cloud-fan @viirya thanks for the review. Merging to master/4.x

gengliangwang added a commit that referenced this pull request May 19, 2026
…mode

### What changes were proposed in this pull request?

Introduce `DateTimeExpressionUtils.java` with two static helpers:
* `makeDateExact(int year, int month, int day)`: wraps `LocalDate.of(...)` in an `ansiDateTimeArgumentOutOfRange(DateTimeException)` try/catch, then calls `DateTimeUtils.localDateToDays(...)`. The `ArithmeticException` from `localDateToDays`'s internal `toIntExact` is intentionally not caught — it propagates as-is (matching the pre-PR behavior for that overflow case).
* `makeIntervalExact(int years, int months, int weeks, int days, int hours, int mins, Decimal secs)`: wraps `IntervalUtils.makeInterval` in the `arithmeticOverflowError(message, "", null)` try/catch.

`datetimeExpressions.MakeDate` and `intervalExpressions.MakeInterval` delegate to the helpers in their `failOnError = true` codegen + eval paths. The non-ANSI branch keeps the inline `try/catch -> null` form because it sets `isNull = true` on failure.

### Why are the changes needed?

Part of SPARK-56908 (umbrella). The try/catch wrapper that maps `DateTimeException` (for `MakeDate`) / `ArithmeticException` (for `MakeInterval`) to the user-facing ANSI error was emitted inline in `doGenCode`; the helper collapses it to a single call per call site without losing the wrapped exception's message (no `QueryContext` argument is involved, so this PR introduces no per-row `references[]` regression).

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

```
build/sbt "catalyst/testOnly *DateExpressionsSuite *IntervalExpressionsSuite"
```

96/96 pass.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor 1.x

Closes #55940 from gengliangwang/SPARK-56915-make-date-interval.

Authored-by: Gengliang Wang <gengliang@apache.org>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
(cherry picked from commit dde0863)
Signed-off-by: Gengliang Wang <gengliang@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants