
[Feature] Support Spark expression: make_interval #3099

@andygrove

What is the problem the feature request solves?

Note: This issue was generated with AI assistance. The specification details have been extracted from Spark documentation and may need verification.

Comet does not currently support the Spark make_interval function, causing queries using this function to fall back to Spark's JVM execution instead of running natively on DataFusion.

The MakeInterval expression creates a calendar interval from separate year, month, week, day, hour, minute, and second components. It supports flexible argument lists allowing omission of trailing components, which default to zero.

Supporting this expression would allow more Spark workloads to benefit from Comet's native acceleration.

Describe the potential solution

Spark Specification

Syntax:

make_interval(years, months, weeks, days, hours, mins, secs)
make_interval(years, months, weeks, days, hours, mins)
make_interval(years, months, weeks, days, hours)
-- ... (trailing arguments may be omitted, down to 0 arguments, via overloaded constructors)
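
The trailing-argument defaults can be sketched as a small padding step. This is only an illustration: `fill_defaults` is a hypothetical helper, not part of Spark or Comet, and it assumes the seconds argument has already been converted to microseconds.

```rust
/// Hypothetical helper: fills omitted trailing arguments with zero.
/// `ints` holds the integer arguments actually supplied, in order
/// (years, months, weeks, days, hours, mins), so at most six values.
/// `secs_micros` is the seconds argument in microseconds, if supplied.
fn fill_defaults(ints: &[i32], secs_micros: Option<i64>) -> Option<([i32; 6], i64)> {
    // More than six integer arguments is not a valid call shape.
    if ints.len() > 6 {
        return None;
    }
    // Seconds is the seventh argument, so it can only be present
    // when all six integer arguments precede it.
    if secs_micros.is_some() && ints.len() < 6 {
        return None;
    }
    let mut padded = [0i32; 6];
    padded[..ints.len()].copy_from_slice(ints);
    Some((padded, secs_micros.unwrap_or(0)))
}
```

For instance, `fill_defaults(&[0, 1, 0, 1], None)` pads the four supplied values to `[0, 1, 0, 1, 0, 0]` with zero seconds.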

Arguments:

| Argument | Type | Description |
|----------|------|-------------|
| years | IntegerType | Number of years in the interval |
| months | IntegerType | Number of months in the interval |
| weeks | IntegerType | Number of weeks in the interval |
| days | IntegerType | Number of days in the interval |
| hours | IntegerType | Number of hours in the interval |
| mins | IntegerType | Number of minutes in the interval |
| secs | DecimalType(MAX_LONG_DIGITS, 6) | Number of seconds with microsecond precision |

Return Type: CalendarIntervalType - Returns a calendar interval object containing the specified time components.

Supported Data Types:

  • Integer types: years, months, weeks, days, hours, minutes
  • Decimal type: seconds (with scale 6 for microsecond precision)
  • All arguments support implicit casting to their required types
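
Because the seconds argument is a decimal with scale 6, its unscaled value is a whole number of microseconds. A minimal sketch of that conversion from decimal text follows. `secs_to_micros` is a hypothetical helper, and rejecting more than six fractional digits is a simplification made for this sketch; Spark's implicit cast to the decimal type may behave differently.

```rust
const MICROS_PER_SECOND: i64 = 1_000_000;

/// Hypothetical helper: converts seconds written as decimal text
/// (at most six fractional digits) into whole microseconds.
/// Returns None for malformed input or on overflow.
fn secs_to_micros(text: &str) -> Option<i64> {
    let (negative, digits) = match text.strip_prefix('-') {
        Some(rest) => (true, rest),
        None => (false, text),
    };
    let (whole, frac) = digits.split_once('.').unwrap_or((digits, ""));
    let all_digits = whole.bytes().chain(frac.bytes()).all(|b| b.is_ascii_digit());
    if whole.is_empty() || frac.len() > 6 || !all_digits {
        return None;
    }
    let whole: i64 = whole.parse().ok()?;
    // Right-pad the fraction with zeros to six digits, so "5" means 500000 microseconds.
    let frac: i64 = format!("{:0<6}", frac).parse().ok()?;
    let magnitude = whole.checked_mul(MICROS_PER_SECOND)?.checked_add(frac)?;
    Some(if negative { -magnitude } else { magnitude })
}
```

Under these assumptions, `secs_to_micros("7.123456")` gives `Some(7_123_456)`.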

Edge Cases:

  • Null handling: Returns null if any input argument is null (null intolerant)
  • Overflow behavior:
    • When failOnError=true (ANSI mode): Throws QueryExecutionErrors.arithmeticOverflowError
    • When failOnError=false: Returns null on arithmetic overflow
  • Default values: Missing trailing arguments default to Literal(0) except seconds which defaults to Decimal(0, MAX_LONG_DIGITS, 6)
  • Nullable result: When failOnError=false, the result is always nullable; when failOnError=true, the result is nullable only if at least one child is nullable
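
The null and overflow rules above can be sketched in Rust with checked arithmetic. This is an illustration only, not Comet's implementation. `CalendarInterval` here is a hypothetical stand-in struct, and the arithmetic (months = years * 12 + months, days = weeks * 7 + days, microseconds = secs + hours + mins) is an assumption about how IntervalUtils.makeInterval combines the components. The error is a plain `String` rather than Spark's exception type.

```rust
/// Hypothetical stand-in for Spark's CalendarInterval.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct CalendarInterval {
    months: i32,
    days: i32,
    microseconds: i64,
}

const MONTHS_PER_YEAR: i32 = 12;
const DAYS_PER_WEEK: i32 = 7;
const MICROS_PER_MINUTE: i64 = 60 * 1_000_000;
const MICROS_PER_HOUR: i64 = 60 * MICROS_PER_MINUTE;

/// Combines the components with checked arithmetic; None means overflow.
/// `ints` is [years, months, weeks, days, hours, mins].
fn combine(ints: [i32; 6], secs_micros: i64) -> Option<CalendarInterval> {
    let [years, months, weeks, days, hours, mins] = ints;
    let total_months = years.checked_mul(MONTHS_PER_YEAR)?.checked_add(months)?;
    let total_days = weeks.checked_mul(DAYS_PER_WEEK)?.checked_add(days)?;
    let microseconds = secs_micros
        .checked_add(i64::from(hours).checked_mul(MICROS_PER_HOUR)?)?
        .checked_add(i64::from(mins).checked_mul(MICROS_PER_MINUTE)?)?;
    Some(CalendarInterval { months: total_months, days: total_days, microseconds })
}

/// Ok(None) models a SQL NULL result; Err models the ANSI-mode overflow error.
fn make_interval(
    ints: [Option<i32>; 6],
    secs_micros: Option<i64>,
    fail_on_error: bool,
) -> Result<Option<CalendarInterval>, String> {
    // Null intolerant: any null input yields a null result.
    let [Some(y), Some(mo), Some(w), Some(d), Some(h), Some(mi)] = ints else {
        return Ok(None);
    };
    let Some(secs) = secs_micros else {
        return Ok(None);
    };
    match combine([y, mo, w, d, h, mi], secs) {
        Some(interval) => Ok(Some(interval)),
        // failOnError=true (ANSI mode): surface the overflow as an error.
        None if fail_on_error => Err("arithmetic overflow".to_string()),
        // failOnError=false: overflow becomes null.
        None => Ok(None),
    }
}
```

With the inputs `(1, 2, 3, 4, 5, 6)` and `7_123_456` microseconds of seconds, this sketch produces 14 months, 25 days, and 18,367,123,456 microseconds, since 3 weeks plus 4 days is 25 days.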

Examples:

-- Create interval with all components
SELECT make_interval(1, 2, 3, 4, 5, 6, 7.123456);
-- Result: 1 years 2 months 25 days 5 hours 6 minutes 7.123456 seconds

-- Create interval with partial components
SELECT make_interval(0, 1, 0, 1);  
-- Result: 1 months 1 days

-- Handle overflow in non-ANSI mode
SELECT make_interval(999999999, 0, 0, 0, 0, 0, 0);
-- Result: NULL (on overflow)

// Example DataFrame API usage
import org.apache.spark.sql.functions._

df.select(expr("make_interval(1, 1, 0, 1, 0, 1, 40.000001)"))

// Referencing columns (here built from literals) as arguments
df.select(lit(1).alias("years"), lit(1).alias("months"))
  .select(expr("make_interval(years, months, 0, 1, 0, 1, 40.000001)"))

Implementation Approach

See the Comet guide on adding new expressions for detailed instructions.

  1. Scala Serde: Add expression handler in spark/src/main/scala/org/apache/comet/serde/
  2. Register: Add to appropriate map in QueryPlanSerde.scala
  3. Protobuf: Add message type in native/proto/src/proto/expr.proto if needed
  4. Rust: Implement in native/spark-expr/src/ (check if DataFusion has built-in support first)

Additional context

Difficulty: Large
Spark Expression Class: org.apache.spark.sql.catalyst.expressions.MakeInterval

Related:

  • IntervalUtils.makeInterval() - Underlying utility method
  • Calendar interval arithmetic expressions
  • extract() function for decomposing intervals
  • Date/time interval operations

This issue was auto-generated from Spark reference documentation.
