Hive4 Partition Overwrite fails for non-managed tables #28330

@PuszekSE

Description

After upgrading Trino from 476 to 479, I observed the behavior described here and reproduced it using:
io.trino.plugin.hive.s3.S3HiveQueryRunner.S3Hive4QueryRunner#main

After adding .addHiveProperty("hive.non-managed-table-writes-enabled", "true"), it is no longer possible to overwrite an existing partition.

The underlying behavior is extremely destructive: the data being overwritten is removed, and only the statistics update fails afterwards, so the partition ends up empty.

Steps to reproduce, using Hive4:

trino:tpch> use hive.tpch;
USE
trino:tpch> SET SESSION hive.insert_existing_partitions_behavior = 'OVERWRITE';
     
CREATE TABLE test (
  history_batch_timestamp TIMESTAMP,
  dt VARCHAR
)
WITH (
  partitioned_by = ARRAY['dt']
);

INSERT INTO test VALUES (CAST(CURRENT_TIMESTAMP AS TIMESTAMP), '2026-02-03');
INSERT INTO test VALUES (CAST(CURRENT_TIMESTAMP AS TIMESTAMP), '2026-02-03');

The second INSERT fails with:
Query 20260217_061938_00031_jdjp9 failed: Invalid column statistics data: ColumnStatisticsObj(colName:history_batch_timestamp, colType:timestamp, statsData:<ColumnStatisticsData >)
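
To confirm the data loss described above, the partition can be inspected right after the failed INSERT. This is a minimal sketch that was not part of the original reproduction; it assumes the same session, catalog, and schema as the steps above, and it uses the Hive connector's `$partitions` hidden table.

```sql
-- Assumes the same session as the reproduction steps (hive.tpch).
-- Count the rows left in the partition that the failed INSERT tried to overwrite.
SELECT count(*) AS remaining_rows
FROM test
WHERE dt = '2026-02-03';

-- List the partitions still registered for the table.
SELECT * FROM "test$partitions";
```

If the behavior matches this report, the first query should show that no rows remain for dt = '2026-02-03', even though the first INSERT succeeded.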
