Problem Statement: The clean_expired_metadata parameter of expire_snapshots is not working as expected
Expected: clean_expired_metadata should also remove the unused metadata files from storage
Observed: The parameter does not clean up the files
Performed Steps
- Created schema iceberg.optimization_bucket.
- Created a partitioned table orders
- Inserted some data
- Took note of current counts of metadata and snapshots
- Executed expire_snapshots without clean_expired_metadata:
Observation:
The older snapshot file gets deleted and a new manifest JSON file is added which points to the latest snapshot. The older manifest JSON files did not get cleared from the bucket.
- Inserted some more data
- Ran the expire_snapshots query with clean_expired_metadata => TRUE:
SET SESSION iceberg.expire_snapshots_min_retention = '0s';
ALTER TABLE iceberg.optimization_bucket.orders
EXECUTE expire_snapshots(retention_threshold => '0s', retain_last => 1,
clean_expired_metadata => TRUE);
Observation: Same behaviour as running without it: the older manifest JSON files are still present in the bucket.
Question: Based on the Trino 479 release notes (https://trino.io/docs/current/release/release-479.html), my expectation is that clean_expired_metadata should also delete the old (unreferenced) manifest JSON files from the warehouse. Is this expectation correct for Trino 479 with the Iceberg REST catalog, or is the scope of clean_expired_metadata different?
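For anyone reproducing this, the metadata and snapshot counts noted in the steps above can be taken from the Iceberg metadata tables instead of only listing the bucket. A minimal sketch, assuming the `$snapshots` and `$metadata_log_entries` metadata tables of the Trino Iceberg connector are available for this table; run it before and after each expire_snapshots call and compare the numbers:

```sql
-- Number of snapshots the table currently tracks
SELECT count(*) AS snapshot_count
FROM iceberg.optimization_bucket."orders$snapshots";

-- Number of metadata files still listed in the table's metadata log
SELECT count(*) AS metadata_file_count
FROM iceberg.optimization_bucket."orders$metadata_log_entries";
```

These counts only reflect what the table metadata still references. Files left behind in the bucket still have to be confirmed by listing the storage location.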