Support GEOMETRY in Spark SQL table creation DDL #1934
Replies: 2 comments 1 reply
-
Honestly, you should not be able to write Sedona Geometry to Delta in either 1.7.0 or 1.7.1. The Delta Lake standard must support native geo types before this restriction can be lifted. We recently added the geometry type to the Iceberg standard and are working on implementing the Iceberg reader/writer. Until we implement this for Delta Lake, which may take a few more months, I recommend you save the geometry column to Delta Lake as a binary column using ST_AsEWKB and read it back using ST_GeomFromEWKB. To achieve predicate pushdown, save 4 more columns (xmin, xmax, ymin, ymax) using ST_Xmin, ... WKB format has great encoding speed and less storage overhead. If you just want to save Sedona tables into Parquet files, I recommend you save it directly as |
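A minimal sketch of that workaround, assuming Sedona's SQL functions are registered in the session. The table name `mytable_wkb`, the sample row, and the query window are hypothetical and only illustrate the round trip; the bounding-box column names follow the xmin/xmax/ymin/ymax suggestion above.

```sql
-- Hypothetical table: the geometry is stored as plain BINARY (EWKB),
-- plus four DOUBLE columns holding its bounding box.
CREATE OR REPLACE TABLE mycatalog.myschema.mytable_wkb (
  id STRING,
  geometry BINARY,
  xmin DOUBLE,
  xmax DOUBLE,
  ymin DOUBLE,
  ymax DOUBLE
)
USING DELTA;

-- Write: encode the Sedona geometry with ST_AsEWKB and record its extent.
INSERT INTO mycatalog.myschema.mytable_wkb
SELECT
  id,
  ST_AsEWKB(geom),
  ST_XMin(geom),
  ST_XMax(geom),
  ST_YMin(geom),
  ST_YMax(geom)
FROM (
  -- Illustrative sample row, not data from the original thread.
  SELECT 'a' AS id,
         ST_GeomFromWKT('POLYGON ((0 0, 2 0, 2 2, 0 2, 0 0))') AS geom
);

-- Read: filter on the numeric bbox columns first (ordinary comparisons that
-- Delta can use for data skipping), then decode the binary back to a geometry.
SELECT id, ST_GeomFromEWKB(geometry) AS geometry
FROM mycatalog.myschema.mytable_wkb
WHERE xmax >= 1.0 AND xmin <= 3.0
  AND ymax >= 1.0 AND ymin <= 3.0;
```

The bbox filter only narrows the candidate rows; if an exact spatial test is needed, a predicate such as ST_Intersects can be applied to the decoded geometry afterwards.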
-
Thank you for the explanation. I thought that was the intended behavior for Delta tables as well. I will do the conversion to WKB then. Yesterday, I looked into how the geometry column is being saved. In the Catalog tab, the table shows binary as the type of the geometry column. If I run "DESCRIBE TABLE mytable" in the notebook, the geometry column is reported with the geometry type. I assumed that in the notebook the binary is implicitly deserialized and then recognized as the geometry type. Please see the images below: |
-
Hi,
I am using Sedona in Databricks with Delta tables and Unity Catalog (UC). In version 1.7.0 of Sedona, this DDL works:
%sql
CREATE OR REPLACE TABLE mycatalog.myschema.mytable (
id STRING,
geometry GEOMETRY
)
USING DELTA;
But when I use version 1.7.1 of Sedona, I get:
[DELTA_UNSUPPORTED_DATA_TYPES] Found columns using unsupported data types: [geometry: GeometryType]. You can set 'spark.databricks.delta.schema.typeCheck.enabled' to 'false' to disable the type check. Disabling this type check may allow users to create unsupported Delta tables and should only be used when trying to read/write legacy tables. SQLSTATE: 0AKDC
With both 1.7.0 and 1.7.1, CTAS and saveAsTable work perfectly fine with the Sedona Geometry type. I understand that Sedona's geometry type is not supported in the Delta format, but I like the idea of not having to transform the geometry column to WKT every time I write the dataframe into a Delta table.
So my question is: will this feature be back in upcoming versions, or is it better for me to stick to transforming the geometry column to a type that Delta supports?
Thank you.
Cheers,
Melika