Skip to content

feat(sql): support TIMESTAMPTZ #7020

Open
Lucas61000 wants to merge 3 commits into
Eventual-Inc:mainfrom
Lucas61000:issue-3957
Open

feat(sql): support TIMESTAMPTZ #7020
Lucas61000 wants to merge 3 commits into
Eventual-Inc:mainfrom
Lucas61000:issue-3957

Conversation

@Lucas61000
Copy link
Copy Markdown
Contributor

Changes Made

Add SQL support for TIMESTAMPTZ, backed by PostgreSQL's documented behavior.

SQL type parsing (schema.rs):

  • TIMESTAMPTZTimestamp(unit, Some("UTC")). As PostgreSQL states: "the value is stored internally as UTC, and the originally stated or assumed time zone is not retained." Our mapping is consistent with this semantics.
  • TIMESTAMP WITHOUT TIME ZONE / TIME WITHOUT TIME ZONE → parsed for completeness.
  • TIMETZ → deliberately rejected. PostgreSQL itself advises in Section 8.5.3: "The type time with time zone is defined by the SQL standard, but the definition exhibits properties which lead to questionable usefulness." Arrow's Time64 also has no timezone representation.

Supertype fix (supertype.rs):

  • Timestamp(None) vs Timestamp(Some(tz)) now resolves by promoting the naive side with "localize" semantics (no conversion, just attaches the timezone label). Equivalent to PostgreSQL with session timezone = UTC. Without this, comparisons/joins between TIMESTAMP and TIMESTAMPTZ fail with TypeError.

Tests:

  • Unit tests for type mappings and error cases in schema.rs
  • Supertype tests for None vs Some(tz), same-None, two-different-tz, and empty-tz edges in supertype.rs
  • End-to-end SQL tests in test_temporal_exprs.py

Related Issues

Closes #3957

Map TIMESTAMPTZ to Timestamp(unit, "UTC"), storing values internally as
UTC (consistent with PostgreSQL semantics).

Also parse TIMESTAMP WITHOUT TIME ZONE and TIME WITHOUT TIME ZONE for
specification completeness.

TIMETZ is deliberately rejected. Arrow's Time64 type has no timezone
representation. This aligns with PostgreSQL's own documentation which
notes that the TIME WITH TIME ZONE type "exhibits properties which lead
to questionable usefulness" and recommends using TIMESTAMPTZ instead [1].

Fix supertype computation for Timestamp(None) vs Timestamp(Some(tz)) to
enable comparisons between timezone-naive and timezone-aware timestamps,
using "localize" semantics (equivalent to PostgreSQL with session
timezone = UTC).

[1] https://www.postgresql.org/docs/current/datatype-datetime.html

Closes Eventual-Inc#3957
@Lucas61000 Lucas61000 requested a review from a team as a code owner May 29, 2026 06:40
@github-actions github-actions Bot added the feat label May 29, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 29, 2026

Greptile Summary

This PR adds SQL support for TIMESTAMPTZ (and TIMESTAMP WITH TIME ZONE), mapping it to Timestamp(unit, Some("UTC")) in Daft's type system, and deliberately rejects TIMETZ/TIME WITH TIME ZONE because Arrow's Time64 has no timezone representation. It also widens the supertype resolution in supertype.rs so that mixing a timezone-naive and a timezone-aware Timestamp no longer throws a TypeError — instead the naive side is "localized" (epoch value unchanged, timezone label attached), consistent with PostgreSQL's session-timezone semantics.

  • schema.rs: TimezoneInfo::WithTimeZone | TimezoneInfo::Tz arms added for both Time (error) and Timestamp (UTC), with exhaustive match arms and rstest coverage.
  • supertype.rs: New (None, Some(tz)) | (Some(tz), None) arm with !tz.is_empty() guard inserted between the "two different Some(tz)" arm and the generic "same-tz / both-None" arm; six unit tests cover UTC, non-UTC, symmetric, two-tz, same-None, and empty-tz edge cases.
  • Tests: E2E SQL tests for CAST(... AS TIMESTAMPTZ), TIMETZ rejection, and cross-type comparison are added; the existing test_compare_timestamps_one_tz is updated from an expected error to an expected success.

Confidence Score: 4/5

The core SQL type mapping and supertype logic are correct and well-tested; the main concern is that silently accepting naive/aware timestamp comparisons (previously a hard error) is a behavioral change that should be flagged in the PR title.

The type-mapping logic in schema.rs and the match-arm ordering in supertype.rs are correct, the edge-case guards are sound, and six targeted unit tests cover the key scenarios. The behavioral shift from TypeError to silent success in test_compare_timestamps_one_tz is intentional and documented, but the PR title is missing the ! marker that the team uses to signal breaking changes. The cross-type comparison E2E test is also trivially correct since it compares the same epoch value to itself.

tests/sql/test_temporal_exprs.py — the test_timestamptz_vs_timestamp_comparison test would benefit from a non-trivial epoch-offset case to actually exercise the localize semantics.

Important Files Changed

Filename Overview
src/daft-core/src/utils/supertype.rs Adds a new match arm to promote Timestamp(None) to Timestamp(Some(tz)) when mixed; ordering and guards are correct, and six well-targeted unit tests cover the edge cases.
src/daft-sql/src/schema.rs TIMESTAMPTZ / TIMESTAMP WITH TIME ZONE → Timestamp(unit, Some("UTC")), TIMETZ / TIME WITH TIME ZONE → clear error; match arms are exhaustive and covered by rstest cases.
tests/series/test_comparisons.py Removes the expected ValueError for naive/aware comparison; behavioral change is correct but the PR title should carry a breaking-change marker.
tests/sql/test_temporal_exprs.py New SQL-level tests for TIMESTAMPTZ casting and TIMETZ rejection are good; the cross-type comparison test only exercises the trivial same-value case.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["SQL TIMESTAMP type token"] --> B{TimezoneInfo?}
    B -->|None / WITHOUT TIME ZONE| C["Timestamp(unit, None)"]
    B -->|WITH TIME ZONE / Tz| D["Timestamp(unit, Some('UTC'))"]

    E["SQL TIME type token"] --> F{TimezoneInfo?}
    F -->|None / WITHOUT TIME ZONE| G["Time(unit)"]
    F -->|WITH TIME ZONE / Tz| H["Error: Arrow Time64 has no timezone"]

    C --> I[supertype resolution]
    D --> I
    J["Timestamp(unit, None)"] --> I

    I --> K{Both sides of comparison}
    K -->|"None + None"| L["Timestamp(max_tu, None)"]
    K -->|"Some(tz) + Some(tz) same"| M["Timestamp(max_tu, Some(tz))"]
    K -->|"Some(tz1) + Some(tz2) different"| N["Timestamp(max_tu, Some('UTC'))"]
    K -->|"None + Some(tz) [NEW]"| O["Timestamp(max_tu, Some(tz)) — localize"]
Loading

Reviews (1): Last reviewed commit: "feat(sql): support TIMESTAMPTZ" | Re-trigger Greptile

Comment thread tests/series/test_comparisons.py
Comment thread tests/sql/test_temporal_exprs.py
Add a test case with a non-UTC offset (+05:30) where the naive and
timezone-aware epochs differ by the offset amount, verifying that
localize semantics produce the correct (not equal) result.
The get_time_units function was selecting the wrong time unit when
comparing timestamps with different precisions (e.g., Microseconds vs
Milliseconds). Fix it to always return the higher-precision unit
(Nanoseconds > Microseconds > Milliseconds).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support TIMEZ and TIMESTAMPZ

1 participant