TIMESTAMP_NTZ datatype is currently not supported in the converter

https://docs.databricks.com/aws/en/sql/language-manual/data-types/timestamp-ntz-type

TIMESTAMP_NTZ is a data type that can be used for partitioning, so according to the converter function's docstring it should be supported, but the `elif` branch for it is missing. To be precise, I think the current `timestamp` mapping should really be the `timestamp_ntz` one, since the `pd.Timestamp` it constructs is never passed a `tz` or `tzinfo` argument.

```python
# converter.py
def to_converter(schema_type) -> Callable[[str], Any]:
    """
    For types that support partitioning, a lambda to parse data into the
    corresponding type is returned. For data types that cannot be partitioned
    on, we return None. The caller is expected to check if the value is None before using.
    :param schema_type: str or json representing a data type
    :return: converter function or None
    """
    if schema_type == "boolean":
        return lambda x: None if (x is None or x == "") else (x is True or x == "true")
    elif schema_type == "byte":
        return lambda x: np.nan if (x is None or x == "") else np.int8(x)
    elif schema_type == "short":
        return lambda x: np.nan if (x is None or x == "") else np.int16(x)
    elif schema_type == "integer":
        return lambda x: np.nan if (x is None or x == "") else np.int32(x)
    elif schema_type == "long":
        return lambda x: np.nan if (x is None or x == "") else np.int64(x)
    elif schema_type == "float":
        return lambda x: np.nan if (x is None or x == "") else np.float32(x)
    elif schema_type == "double":
        return lambda x: np.nan if (x is None or x == "") else np.float64(x)
    elif isinstance(schema_type, str) and schema_type.startswith("decimal"):
        return lambda x: None if (x is None or x == "") else Decimal(x)
    elif schema_type == "string":
        return lambda x: None if (x is None or x == "") else str(x)
    elif schema_type == "date":
        return lambda x: None if (x is None or x == "") else pd.Timestamp(x).date()
    elif schema_type == "timestamp":
        return lambda x: pd.NaT if (x is None or x == "") else pd.Timestamp(x)
    elif schema_type == "binary":
        return None  # partition on binary column not supported
    elif isinstance(schema_type, dict) and schema_type["type"] in ("array", "struct", "map"):
        return None  # partition on complex column not supported

    raise ValueError(f"Could not parse datatype: {schema_type}")
```
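The point about the missing timezone can be checked in isolation. This is a minimal sketch, assuming only pandas; the lambda is copied from the `timestamp` branch above, and the sample strings are illustrative values, not taken from a real table.

```python
import pandas as pd

# Same lambda as the existing "timestamp" branch in to_converter
convert = lambda x: pd.NaT if (x is None or x == "") else pd.Timestamp(x)

# Illustrative partition value without any UTC offset
naive = convert("2024-01-02 03:04:05")
print(naive.tz)  # None, i.e. the result is timezone-naive

# Only a string that carries its own offset produces a timezone-aware value
aware = convert("2024-01-02T03:04:05Z")
print(aware.tz is not None)
```

So for plain timestamp strings the branch already produces values without a timezone, which is the behaviour one would expect for `timestamp_ntz`.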

How to reproduce: Try reading a table with a TIMESTAMP_NTZ column using the following interface:

```python
df = delta_sharing.load_as_pandas(table_url, convert_in_batches=True, use_delta_format=False)
```

Adding another `elif` branch for `timestamp_ntz` solves the issue. I can create a PR if you like.
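A rough sketch of what that branch could look like, written as a stand-alone function so it runs on its own. The helper name `ntz_converter` is hypothetical, and the schema type string `"timestamp_ntz"` is an assumption about how the type appears in the table schema; in `converter.py` this would be one more `elif` inside `to_converter`, not a separate function.

```python
import pandas as pd


def ntz_converter(schema_type):
    # Hypothetical stand-alone version of the missing branch.
    # Assumes the schema reports the type as the string "timestamp_ntz".
    if schema_type == "timestamp_ntz":
        # Mirrors the existing "timestamp" branch: no timezone is attached
        return lambda x: pd.NaT if (x is None or x == "") else pd.Timestamp(x)
    raise ValueError(f"Could not parse datatype: {schema_type}")


conv = ntz_converter("timestamp_ntz")
parsed = conv("2024-01-02 03:04:05")  # illustrative partition value
empty = conv("")                      # empty partition value maps to NaT
```

Reusing the `timestamp` lambda keeps the behaviour for empty and `None` partition values identical between the two types.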


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!
