Skip to content

bug: Incorrect mapping for timezone-less timestamp between Ibis schema and sqlglot #11062

Open
@h2o1

Description

@h2o1

What happened?

When converting an Ibis schema containing a timezone-less Timestamp column (i.e. dt.Timestamp with timezone set to None) via a Databricks backend, the column is mapped to the sqlglot TIMESTAMP (defined in _from_ibis_Timestamp in the SqlglotType class). However, in Databricks, TIMESTAMP corresponds to the timestamp type with timezone information (https://docs.databricks.com/aws/en/sql/language-manual/data-types/timestamp-type). The correct type to convert to should be TIMESTAMP_NTZ (https://docs.databricks.com/aws/en/sql/language-manual/data-types/timestamp-ntz-type).

Example:

>>> conn = ibis.databricks.connect(...)
>>> schema = ibis.schema([('ts_col', 'timestamp')])
>>> schema['ts_col']
Timestamp(timezone=None, scale=None, nullable=True)
>>> conn.create_table('ts_col_table', schema=schema)
DatabaseTable: catalog_test.default.ts_col_table
  ts_col timestamp('UTC')
>>> read_schema = conn.table('ts_col_table').schema()
>>> read_schema
ibis.Schema {
  ts_col  timestamp('UTC')
}

As a workaround, I was able to monkey patch this behavior locally by implementing a custom DatabricksTypeMapper with a def _from_ibis_Timestamp method that returns sge.DataType(this=sge.DataType.Type.TIMESTAMPNTZ) when the input timestamp's timezone is None and wiring this TypeMapper into the DatabricksCompiler:

class DatabricksTypeMapper(PySparkType):
  @classmethod
  def _from_ibis_Timestamp(cls, dtype: dt.Timestamp) -> sge.DataType:
    code = sge.DataType.Type.TIMESTAMPNTZ if dtype.timezone is None else sge.DataType.Type.TIMESTAMP
    if dtype.scale is not None:
      scale = sge.DataTypeParam(this=sge.Literal.number(dtype.scale))
      return sge.DataType(this=code, expressions=[scale])
    else:
      return sge.DataType(this=code)

DatabricksCompiler.type_mapper = DatabricksTypeMapper
ibis.backends.sql.compilers.databricks.compiler = DatabricksCompiler()
ibis.backends.sql.datatypes.TYPE_MAPPERS[ibis.backends.sql.compilers.databricks.compiler.dialect] = DatabricksTypeMapper

What version of ibis are you using?

10.3.1

What backend(s) are you using, if any?

Databricks

Relevant log output

Code of Conduct

  • I agree to follow this project's Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugIncorrect behavior inside of ibisdatabricksThe Databricks backenddatatypesIssues relating to ibis's datatypes (under `ibis.expr.datatypes`)timestampsIssues related to the timestamp API

    Type

    No type

    Projects

    Status

    backlog

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions