JVM Spark has the following behavior:
>>> spark.sql("SELECT v FROM VALUES (map(1, 'a', 2, 'b')) AS t(v)").printSchema()
root
|-- v: map (nullable = false)
| |-- key: integer
| |-- value: string (valueContainsNull = false)
>>> spark.sql("SELECT v FROM VALUES (map(1, 'a', 2, 'b')), (NULL) AS t(v)").printSchema()
root
|-- v: map (nullable = true)
| |-- key: integer
| |-- value: string (valueContainsNull = false)
However, in Sail, the map field always has nullable = true, even when no row is NULL.
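The rule implied by the two JVM Spark outputs above can be written down as a few lines of plain Python. This is only a toy sketch of the expected result, not Spark's or Sail's actual inference code: `expected_map_nullable` is a hypothetical helper, and it assumes each row of the VALUES list is either `None` (SQL NULL) or a `dict`.

```python
from typing import Optional, Sequence


def expected_map_nullable(rows: Sequence[Optional[dict]]) -> bool:
    """Toy model of the `nullable` flag JVM Spark reports for column v.

    Hypothetical helper for illustration; not a Spark or Sail API.
    """
    # The column is nullable only if at least one literal row is NULL.
    return any(row is None for row in rows)


# First query: VALUES (map(1, 'a', 2, 'b'))
print(expected_map_nullable([{1: "a", 2: "b"}]))
# Second query: VALUES (map(1, 'a', 2, 'b')), (NULL)
print(expected_map_nullable([{1: "a", 2: "b"}, None]))
```

Under this model the first query should give nullable = false and the second nullable = true, whereas Sail currently reports true for both.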
This issue may also affect type inference for certain map functions. For example, this is the behavior in JVM Spark:
>>> spark.sql("SELECT map_entries(v) FROM VALUES (map(1, 'a', 2, 'b')) AS t(v)").printSchema()
root
|-- map_entries(v): array (nullable = false)
| |-- element: struct (containsNull = false)
| | |-- key: integer (nullable = false)
| | |-- value: string (nullable = false)
>>> spark.sql("SELECT map_entries(v) FROM VALUES (map(1, 'a', 2, 'b')), (NULL) AS t(v)").printSchema()
root
|-- map_entries(v): array (nullable = true)
| |-- element: struct (containsNull = false)
| | |-- key: integer (nullable = false)
| | |-- value: string (nullable = false)
>>> spark.sql("SELECT map_entries(v) FROM VALUES (map(1, 'a', 2, NULL)) AS t(v)").printSchema()
root
|-- map_entries(v): array (nullable = false)
| |-- element: struct (containsNull = false)
| | |-- key: integer (nullable = false)
| | |-- value: string (nullable = true)
But in Sail, the top-level array field always has nullable = true, even when the input map column contains no NULL row.
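The three JVM Spark outputs above are consistent with a simple mapping from the input map's flags to the flags of the map_entries result. The sketch below encodes that mapping in plain Python. `map_entries_flags` is a hypothetical helper, and it assumes that the value field's nullability follows the input map's valueContainsNull flag, which the outputs suggest but this report does not confirm.

```python
def map_entries_flags(map_nullable: bool, value_contains_null: bool) -> dict:
    """Toy model of the flags JVM Spark prints for map_entries(v).

    Hypothetical helper for illustration; not a Spark or Sail API.
    """
    return {
        # The array is nullable only if the input map column is nullable.
        "array_nullable": map_nullable,
        # Entries themselves are never null in the outputs above.
        "contains_null": False,
        # Map keys are never null.
        "key_nullable": False,
        # Assumption: mirrors the map's valueContainsNull flag.
        "value_nullable": value_contains_null,
    }


# (map nullable, valueContainsNull) for the three queries above, in order.
for flags in [(False, False), (True, False), (False, True)]:
    print(flags, map_entries_flags(*flags))
```

Sail differs from this model in the first and third cases, where it reports the top-level array as nullable.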