TL;DR The synthesized hidden $row_id column is not unique per row on non-transactional Hive tables, yet the Hive connector reports $row_id as the table's unique column (ConnectorTableLayout.getUniqueColumn()), which is consumed by utilize-unique-property-in-query-planning (enabled by default). Planning optimizations that rely on $row_id uniqueness can therefore produce incorrect results.
Observed on both TEXTFILE and ORC (non-transactional) tables.
Reproduction
-- non-transactional table; reproduces on both ORC and TEXTFILE
CREATE TABLE t_rowid (col bigint, val varchar, ds varchar)
WITH (format = 'ORC', partitioned_by = ARRAY['ds'], bucketed_by = ARRAY['col'], bucket_count = 8);
INSERT INTO t_rowid
SELECT col, 'v' || CAST(col AS varchar), ds
FROM UNNEST(sequence(1, 100)) AS x(col)
CROSS JOIN (VALUES '2024-01-01','2024-01-02','2024-01-03') AS d(ds); -- 300 rows
SELECT count(*), count(DISTINCT "$row_id") FROM t_rowid;
-- observed: [300, 4] expected: [300, 300]
-- self-join on $row_id should return 300 (each row matches itself) but is many-to-many:
SELECT count(*) FROM t_rowid a JOIN t_rowid b ON a."$row_id" = b."$row_id";
-- observed: 22500 (many-to-many) instead of 300
The same is observed with format = 'TEXTFILE'. (We have not confirmed the behavior on transactional/ACID tables, where $row_id encodes {writeId, bucketId, rowId} and is presumably unique — maintainers please advise whether non-transactional $row_id is intended to be unique at all.)
Expected
$row_id should be unique per row. If it cannot be unique for non-transactional tables, the Hive connector should not advertise it as the table's unique column (so uniqueness-based planning does not kick in) for those tables.
Impact
utilize-unique-property-in-query-planning (default ON) treats $row_id as unique; on non-transactional tables that assumption is false, so optimizations built on it can return incorrect results.
- Any feature relying on
$row_id identity/uniqueness (row-level operations, joins/grouping on $row_id) is unsound on non-transactional tables.
Environment
- Presto master (0.29x), Hive connector, non-transactional tables (
TEXTFILE and ORC).
TL;DR The synthesized hidden
$row_idcolumn is not unique per row on non-transactional Hive tables, yet the Hive connector reports$row_idas the table's unique column (ConnectorTableLayout.getUniqueColumn()), which is consumed byutilize-unique-property-in-query-planning(enabled by default). Planning optimizations that rely on$row_iduniqueness can therefore produce incorrect results.Observed on both
TEXTFILEandORC(non-transactional) tables.Reproduction
The same is observed with
format = 'TEXTFILE'. (We have not confirmed the behavior on transactional/ACID tables, where$row_idencodes{writeId, bucketId, rowId}and is presumably unique — maintainers please advise whether non-transactional$row_idis intended to be unique at all.)Expected
$row_idshould be unique per row. If it cannot be unique for non-transactional tables, the Hive connector should not advertise it as the table's unique column (so uniqueness-based planning does not kick in) for those tables.Impact
utilize-unique-property-in-query-planning(default ON) treats$row_idas unique; on non-transactional tables that assumption is false, so optimizations built on it can return incorrect results.$row_ididentity/uniqueness (row-level operations, joins/grouping on$row_id) is unsound on non-transactional tables.Environment
TEXTFILEandORC).