Skip to content

bug: inconsistent cast to timezone #11965

@aricohen93

Description

@aricohen93

What happened?

Hi, I'm using Ibis with the Trino backend. I'm performing operations on a timestamp column that is stored in UTC in the database.

I want to extract the hour in another timezone (Europe/Paris). I can cast the timestamp to the new timezone, but when I use the time() or hour() functions, it still returns the hour in the original timezone.

I wonder if the translation from the pyhon API to the SQL query should be performeddatetime AT TIME ZONE 'Europe/Paris' instead of performeddatetime AS TIMESTAMP(3) WITH TIME ZONE

Here is my code and an example:

import ibis.expr.datatypes as dt

connection = TrinoConfig(timezone="Europe/Paris", **connection_info).get_connection()
procedure = get_table(
    table="procedure",
    connection=connection,
    select_cols=["id", "performeddatetime"],
)
procedure = procedure.mutate(
    performeddatetime_casted=procedure.performeddatetime.cast(
        dt.Timestamp(timezone="Europe/Paris", scale=3)
    )
)


procedure = procedure.filter(procedure.id == _id)
procedure = procedure.filter(
    procedure.performeddatetime
    == ibis.timestamp("2017-01-19 00:00:00", timezone="Europe/Paris")
)
procedure = procedure.mutate(hour=procedure.performeddatetime_casted.hour())
procedure = procedure.mutate(time=procedure.performeddatetime_casted.time())

procedure.to_pandas()
print(procedure_pd.to_markdown())
id performeddatetime performeddatetime_casted hour time
0 xxxx 2017-01-18 23:00:00+00:00 2017-01-19 00:00:00+01:00 23 23:00:00

When checking the datatypes of the two datetime columns:

procedure.performeddatetime.type()
# Out: Timestamp(timezone='UTC', scale=3, nullable=True)

procedure.performeddatetime_casted.type()
# Out: Timestamp(timezone='Europe/Paris', scale=3, nullable=True)

Also, when checking the SQL transformation i can see the following:

SELECT
  "t0"."id",
  "t0"."performeddatetime",
  CAST("t0"."performeddatetime" AS TIMESTAMP(3) WITH TIME ZONE) AS "performeddatetime_casted",
  EXTRACT(hour FROM CAST("t0"."performeddatetime" AS TIMESTAMP(3) WITH TIME ZONE)) AS "hour",
  CAST(CAST("t0"."performeddatetime" AS TIMESTAMP(3) WITH TIME ZONE) AS TIME) AS "time"
FROM "procedure" AS "t0"
WHERE
  "t0"."id" = 'xxxx'
  AND "t0"."performeddatetime" = CAST(FROM_ISO8601_TIMESTAMP('2017-01-19T00:00:00+01:00') AS TIMESTAMP WITH TIME ZONE)

I tried also to change the timezone in the connection but I have the same result:

from pydantic import BaseModel, Field
import ibis
from ibis.backends.trino import Backend as trino_backend
from trino.auth import BasicAuthentication

class TrinoConfig(BaseModel):
    host: str
    port: int = Field(gt=0)
    user: str
    token: str
    schema: str
    database: str
    timezone: str = "Europe/Paris"

    @property
    def auth(self):
        return BasicAuthentication(self.user, self.token)

    def get_connection(self) -> trino_backend:
        return ibis.trino.connect(
            host=self.host,
            port=self.port,
            user=self.user,
            auth=self.auth,
            schema=self.schema,
            database=self.database,
            timezone=self.timezone,
        )

    @classmethod
    def load(cls, path: str | Path) -> "TrinoConfig":
        path = Path(path)
        return cls.model_validate_json(path.read_text())

What version of ibis are you using?

12.0.0

What backend(s) are you using, if any?

Trino

Relevant log output

Code of Conduct

  • I agree to follow this project's Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugIncorrect behavior inside of ibis

    Type

    No type

    Projects

    Status

    backlog

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions