Skip to content

Hive: $row_id is not unique on non-transactional tables but is declared as the table's unique column #28062

Description

@kaikalur

TL;DR The synthesized hidden $row_id column is not unique per row on non-transactional Hive tables, yet the Hive connector reports $row_id as the table's unique column (ConnectorTableLayout.getUniqueColumn()), which is consumed by utilize-unique-property-in-query-planning (enabled by default). Planning optimizations that rely on $row_id uniqueness can therefore produce incorrect results.

Observed on both TEXTFILE and ORC (non-transactional) tables.

Reproduction

-- non-transactional table; reproduces on both ORC and TEXTFILE
CREATE TABLE t_rowid (col bigint, val varchar, ds varchar)
WITH (format = 'ORC', partitioned_by = ARRAY['ds'], bucketed_by = ARRAY['col'], bucket_count = 8);

INSERT INTO t_rowid
SELECT col, 'v' || CAST(col AS varchar), ds
FROM UNNEST(sequence(1, 100)) AS x(col)
CROSS JOIN (VALUES '2024-01-01','2024-01-02','2024-01-03') AS d(ds);  -- 300 rows

SELECT count(*), count(DISTINCT "$row_id") FROM t_rowid;
-- observed: [300, 4]      expected: [300, 300]

-- self-join on $row_id should return 300 (each row matches itself) but is many-to-many:
SELECT count(*) FROM t_rowid a JOIN t_rowid b ON a."$row_id" = b."$row_id";
-- observed: 22500 (many-to-many) instead of 300

The same is observed with format = 'TEXTFILE'. (We have not confirmed the behavior on transactional/ACID tables, where $row_id encodes {writeId, bucketId, rowId} and is presumably unique — maintainers please advise whether non-transactional $row_id is intended to be unique at all.)

Expected

$row_id should be unique per row. If it cannot be unique for non-transactional tables, the Hive connector should not advertise it as the table's unique column (so uniqueness-based planning does not kick in) for those tables.

Impact

  • utilize-unique-property-in-query-planning (default ON) treats $row_id as unique; on non-transactional tables that assumption is false, so optimizations built on it can return incorrect results.
  • Any feature relying on $row_id identity/uniqueness (row-level operations, joins/grouping on $row_id) is unsound on non-transactional tables.

Environment

  • Presto master (0.29x), Hive connector, non-transactional tables (TEXTFILE and ORC).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions