Presto/Trino support for files produced by metadata bootstrap #18137
Replies: 2 comments
-
@bhasudha please take a look.
-
Hi @vamshipasunuru, thanks for raising the feature request. This is the current state of Hudi support in Presto and Trino:
Bootstrap reads are already supported by the new File Group Reader implementation. For the File Group Reader to work on bootstrap file groups, the caller must pass in the right information and a reader context implementation that handles bootstrap merging. To support bootstrap reads in the Hudi connector, the following needs to be done:
In general, the steps are:
All of this should happen on the latest OSS releases. @vamshipasunuru, let us know if this makes sense. It requires upgrading Trino and Presto, since the feature will be implemented on top of the latest master (backporting to older releases might be possible but may take more time).
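To make "bootstrap merging" more concrete, here is a rough conceptual sketch in Python. It is not Hudi's File Group Reader API and not connector code. It only illustrates the idea from RFC-12 that a bootstrap file group pairs a skeleton file (holding the Hudi metadata columns) with the original data file, and that a reader stitches the two together row by row. The function name `stitch_bootstrap_rows` and the sample rows are hypothetical; the `_hoodie_*` names are Hudi's standard metadata columns.

```python
from itertools import zip_longest

# Hudi's standard metadata columns, which the skeleton file stores.
HOODIE_META_COLUMNS = (
    "_hoodie_commit_time",
    "_hoodie_commit_seqno",
    "_hoodie_record_key",
    "_hoodie_partition_path",
    "_hoodie_file_name",
)

_MISSING = object()  # sentinel used to detect a row-count mismatch


def stitch_bootstrap_rows(skeleton_rows, source_rows):
    """Hypothetical helper: merge skeleton and source rows by position.

    Row i of the skeleton file is assumed to describe row i of the
    original (non-Hudi) data file, so the two inputs are paired in order.
    """
    pairs = zip_longest(skeleton_rows, source_rows, fillvalue=_MISSING)
    for skeleton, source in pairs:
        if skeleton is _MISSING or source is _MISSING:
            raise ValueError("skeleton and source row counts differ")
        merged = dict(skeleton)  # metadata columns first
        merged.update(source)    # then the original data columns
        yield merged


# Made-up sample rows, for illustration only.
skeleton_rows = [
    {"_hoodie_commit_time": "001", "_hoodie_record_key": "id:1"},
    {"_hoodie_commit_time": "001", "_hoodie_record_key": "id:2"},
]
source_rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
]

merged_rows = list(stitch_bootstrap_rows(skeleton_rows, source_rows))
```

The positional pairing is the important part: because the existing data files are never rewritten, the reader has to line the two files up itself, and a row-count mismatch has to be treated as an error rather than silently truncated.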
-
RFC-12 introduced support for metadata bootstrap. This can be a powerful feature for adopting Hudi on large non-Hudi (e.g., Hive) tables without the need to rewrite the existing files.
However, support for reading this data varies: Spark is fully supported, but query engines like Presto/Trino can't query the data.
We need to add support in the Hudi connector to:
This feature will mainly be used with Hudi 0.14 and 1.2 and Presto 0.287 within Uber.