-
Notifications
You must be signed in to change notification settings - Fork 235
feat: Add optional prefetch hint for parsing Puffin Footer #1207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Hi, I believe this hint should be optional and we shouldn't require all users to provide it. |
@Xuanwo It is optional, let me change the title. Or do you mean that i should add a function like |
Yep. I believe we should not change the |
@Xuanwo Thanks for the review, it is fixed! |
pub struct FileMetadata { | ||
pub(crate) blobs: Vec<BlobMetadata>, | ||
#[serde(skip_serializing_if = "HashMap::is_empty")] | ||
#[serde(default)] | ||
pub(crate) properties: HashMap<String, String>, | ||
/// Optional prefetch hint you can pass for retrieving footer | ||
#[serde(skip_serializing_if = "Option::is_none")] | ||
pub(crate) prefetch_hint: Option<u8>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have concerns adding this field for two reasons:
- It looks impractical to ask user to pass in the
prefetch_hint
. - This format is define by iceberg spec, and adding another field may lead to wrong file.
/// `prefetch_hint` is used to try to fetch the entire footer in one read. If | ||
/// the entire footer isn't fetched in one read the function will call the `read_no_prefetch` | ||
/// option. | ||
pub(crate) async fn read(&self, input_file: &InputFile) -> Result<FileMetadata> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pub(crate) async fn read(&self, input_file: &InputFile) -> Result<FileMetadata> { | |
pub(crate) async fn read(input_file: &InputFile) -> Result<FileMetadata> { |
Suggestions:
- Keep original api
- Add anoter method
read_with_hint
to implement the method with a hint.
Which issue does this PR close?
What changes are included in this PR?
Add prefetch hint for parsing puffin footer + change some docs
Are these changes tested?
Yes, unit tested