Open
Is your feature request related to a problem or challenge?
Currently, our arrow reader accepts a FileScanTask and returns a RecordBatchStream to the user. After #630, the reader can process the delete files and merge them with the data file, which makes it ready to use out of the box. However, some compute engines want to process the delete files themselves so that they can reuse their existing join executor and their storage for spilling data. This requires reading the delete files directly rather than processing them internally.
Based on this, I suggest providing separate read interfaces to satisfy these different requirements:
- read: process the data file and delete files of the FileScanTask internally
- read_data: read the data file of the FileScanTask internally
- read_pos_delete: read the position delete files of the FileScanTask and return the result directly
- read_eq_delete: read the equality delete files of the FileScanTask and return the result directly
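
To make the split concrete, here is a rough sketch of how the four entry points could relate to each other. Everything in it is hypothetical and only for illustration: `ScanTask`, `DeleteFile`, `ReadPlan`, `TaskReader` and `SketchReader` are stand-ins I made up, not the crate's actual `FileScanTask` or reader types, and a real implementation would return a RecordBatchStream instead of a list of paths.

```rust
// Hypothetical stand-in types; the real FileScanTask and reader differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DeleteKind {
    Position,
    Equality,
}

#[derive(Debug, Clone)]
struct DeleteFile {
    path: String,
    kind: DeleteKind,
}

#[derive(Debug, Clone)]
struct ScanTask {
    data_file: String,
    deletes: Vec<DeleteFile>,
}

// Stand-in for the returned stream: which files a call would touch,
// and whether deletes are merged inside the reader.
#[derive(Debug, PartialEq)]
struct ReadPlan {
    files: Vec<String>,
    deletes_applied: bool,
}

// The four proposed entry points (names taken from the list above).
trait TaskReader {
    fn read(&self, task: &ScanTask) -> ReadPlan;
    fn read_data(&self, task: &ScanTask) -> ReadPlan;
    fn read_pos_delete(&self, task: &ScanTask) -> ReadPlan;
    fn read_eq_delete(&self, task: &ScanTask) -> ReadPlan;
}

struct SketchReader;

impl SketchReader {
    // Collect the paths of the delete files of one kind.
    fn delete_paths(task: &ScanTask, kind: DeleteKind) -> Vec<String> {
        task.deletes
            .iter()
            .filter(|d| d.kind == kind)
            .map(|d| d.path.clone())
            .collect()
    }
}

impl TaskReader for SketchReader {
    // Data file plus all delete files, merged internally.
    fn read(&self, task: &ScanTask) -> ReadPlan {
        let mut files = vec![task.data_file.clone()];
        files.extend(task.deletes.iter().map(|d| d.path.clone()));
        ReadPlan { files, deletes_applied: true }
    }

    // Data file only; the engine handles deletes itself.
    fn read_data(&self, task: &ScanTask) -> ReadPlan {
        ReadPlan { files: vec![task.data_file.clone()], deletes_applied: false }
    }

    // Position delete files only, returned as-is.
    fn read_pos_delete(&self, task: &ScanTask) -> ReadPlan {
        let files = Self::delete_paths(task, DeleteKind::Position);
        ReadPlan { files, deletes_applied: false }
    }

    // Equality delete files only, returned as-is.
    fn read_eq_delete(&self, task: &ScanTask) -> ReadPlan {
        let files = Self::delete_paths(task, DeleteKind::Equality);
        ReadPlan { files, deletes_applied: false }
    }
}

// A made-up task with one data file and one delete file of each kind.
fn sample_task() -> ScanTask {
    ScanTask {
        data_file: "data/a.parquet".to_string(),
        deletes: vec![
            DeleteFile { path: "delete/pos.parquet".to_string(), kind: DeleteKind::Position },
            DeleteFile { path: "delete/eq.parquet".to_string(), kind: DeleteKind::Equality },
        ],
    }
}
```

With this shape, an engine that wants to do its own join would call `read_data`, `read_pos_delete` and `read_eq_delete` separately and combine the three results itself, while users who just want correct rows keep calling `read`.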
Describe the solution you'd like
No response
Willingness to contribute
- I can contribute to this feature independently
- I would be willing to contribute to this feature with guidance from the Iceberg Rust community
- I cannot contribute to this feature at this time