Description
Describe the feature
Currently, Iceberg snapshot maintenance procedures available in the native Trino Iceberg Connector are not supported through the Gravitino Trino Connector.
This feature request proposes that the Gravitino Trino Connector fully delegate Iceberg system procedures, so that users can manage the snapshot lifecycle entirely within Trino without relying on external tools such as Spark or the Iceberg Java API.
Motivation
When using Gravitino as a unified metadata layer, users naturally expect that all catalog operations — including maintenance tasks — are accessible through the same interface.
Currently, snapshot cleanup must be performed via a separate tool (e.g., Spark, Iceberg Java API), which introduces operational complexity and breaks the unified access model that Gravitino aims to provide.
Supporting these procedures through the Gravitino Trino Connector would:
- Allow users to fully manage Iceberg table lifecycle within a single Trino interface.
- Eliminate the need to maintain a separate Spark or Java-based pipeline solely for snapshot cleanup.
- Strengthen Gravitino's value as a truly unified metadata and catalog management layer.
Describe the solution
The following Iceberg system procedures should be supported via the Gravitino Trino Connector:
| Procedure | Description |
|---|---|
| system.expire_snapshots | Remove snapshots older than a given timestamp |
| system.remove_orphan_files | Delete orphan data files not referenced by any snapshot |
| system.rewrite_data_files | Compact small data files into larger ones |
| system.rewrite_manifests | Rewrite manifest files for improved query performance |
Example Usage (Expected to work after this feature is implemented)
-- Expire old snapshots
CALL gravitino_catalog.system.expire_snapshots(
schema_name => 'my_schema',
table_name => 'my_table',
older_than => TIMESTAMP '2024-01-01 00:00:00'
);
-- Remove orphan files
CALL gravitino_catalog.system.remove_orphan_files(
schema_name => 'my_schema',
table_name => 'my_table'
);
-- Compact small files
CALL gravitino_catalog.system.rewrite_data_files(
schema_name => 'my_schema',
table_name => 'my_table'
);
Current Behavior
These procedure calls either fail with an error (e.g., procedure not found, unsupported operation) or complete silently without actually performing the expected maintenance operations.
This is because the Gravitino Trino Connector acts as a metadata proxy layer and does not currently delegate Iceberg-specific system procedures to the underlying catalog.
Expected Behavior
The Gravitino Trino Connector should properly intercept and delegate Iceberg system procedure calls to the underlying Iceberg catalog, in the same way that the native Trino Iceberg Connector handles them.
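For comparison, the native Trino Iceberg connector, as far as its documentation describes, exposes snapshot expiration, orphan-file removal, and compaction as table procedures invoked with ALTER TABLE ... EXECUTE rather than CALL. Below is a minimal sketch of what the delegated statements might look like through Gravitino, assuming that native syntax is passed through unchanged. The names gravitino_catalog, my_schema, and my_table and the threshold values are placeholders.

```sql
-- Expire snapshots older than the retention threshold (placeholder value)
ALTER TABLE gravitino_catalog.my_schema.my_table
    EXECUTE expire_snapshots(retention_threshold => '7d');

-- Remove files that are no longer referenced by the table metadata
ALTER TABLE gravitino_catalog.my_schema.my_table
    EXECUTE remove_orphan_files(retention_threshold => '7d');

-- Compact data files smaller than the given size (placeholder value)
ALTER TABLE gravitino_catalog.my_schema.my_table
    EXECUTE optimize(file_size_threshold => '128MB');
```

In the native connector, retention thresholds below the configured minimum (7 days by default) are rejected, so a delegating implementation would presumably need to surface the same validation.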
Environment
Apache Gravitino version: 1.2.0-rc6
Trino version: 472
Iceberg version: 1.8
Catalog type: Iceberg (backed by REST)
Additional context
This feature is particularly important for production environments where automated snapshot expiration and storage cost management are critical operational requirements.
Without this feature, Gravitino cannot be adopted as a complete metadata management solution for Iceberg-heavy workloads.
Thank you for considering this feature request!