
fix: preserve parquet metadata size hint #29

Merged
peasee merged 4 commits into spiceai-52.5 from peasee/260421-parquet-metadata-size-hint
Apr 22, 2026

@peasee peasee commented Apr 21, 2026

  • Ensures that the Parquet metadata size hint is preserved when a distributed plan is serialized and deserialized.
  • Without the hint set, the reader falls back to no size hint, which costs 2 requests instead of 1: one to get the metadata size and one to retrieve the metadata, versus a single request that retrieves the metadata directly.
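
The request-count difference described above can be illustrated with a minimal sketch. It assumes only the standard Parquet footer layout (a 4-byte little-endian metadata length followed by the `PAR1` magic). `CountingStore`, `fake_parquet`, and `read_metadata` are hypothetical names invented for this illustration, not DataFusion or Ballista APIs, and the real reader may differ in detail.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a remote object store that counts range requests.
struct CountingStore {
    data: Vec<u8>,
    requests: Cell<usize>,
}

impl CountingStore {
    fn new(data: Vec<u8>) -> Self {
        Self { data, requests: Cell::new(0) }
    }

    /// Fetch the last `n` bytes of the object (one simulated round trip).
    fn get_suffix(&self, n: usize) -> &[u8] {
        self.requests.set(self.requests.get() + 1);
        let start = self.data.len().saturating_sub(n);
        &self.data[start..]
    }
}

/// Parquet footer: 4-byte little-endian metadata length + b"PAR1".
const FOOTER_LEN: usize = 8;

/// Build a fake file: zeroed body, opaque metadata bytes, then the footer.
fn fake_parquet(body_len: usize, metadata: &[u8]) -> Vec<u8> {
    let mut file = vec![0u8; body_len];
    file.extend_from_slice(metadata);
    file.extend_from_slice(&(metadata.len() as u32).to_le_bytes());
    file.extend_from_slice(b"PAR1");
    file
}

/// Return the metadata bytes. With a hint, the first fetch grabs that many
/// trailing bytes; without one, only the 8-byte footer is fetched first.
fn read_metadata(store: &CountingStore, size_hint: Option<usize>) -> Vec<u8> {
    let first_len = size_hint.unwrap_or(FOOTER_LEN).max(FOOTER_LEN);
    let suffix = store.get_suffix(first_len);
    let footer = &suffix[suffix.len() - FOOTER_LEN..];
    assert_eq!(&footer[4..], b"PAR1");
    let meta_len = u32::from_le_bytes([footer[0], footer[1], footer[2], footer[3]]) as usize;
    let needed = meta_len + FOOTER_LEN;

    if suffix.len() >= needed {
        // The hinted fetch already contains the metadata: one request total.
        let start = suffix.len() - needed;
        suffix[start..start + meta_len].to_vec()
    } else {
        // The first fetch only revealed the size: a second request is needed.
        let full = store.get_suffix(needed);
        full[..meta_len].to_vec()
    }
}
```

In this sketch, calling `read_metadata(&store, None)` issues two requests, while a hint at least as large as the metadata plus the footer (for example `Some(64)` for a 26-byte metadata blob) issues one. A hint that is too small falls back to the two-request path.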

@peasee peasee self-assigned this Apr 21, 2026
Copilot AI review requested due to automatic review settings April 21, 2026 07:47
@peasee peasee added the bug Something isn't working label Apr 21, 2026

Copilot AI left a comment

Pull request overview

This PR addresses a performance regression in distributed query execution by restoring Parquet metadata size hint information after the physical plan is deserialized from protobuf, avoiding extra object-store HTTP round trips when reading Parquet metadata.

Changes:

  • Add a plan-tree rewrite that rehydrates ParquetSource.metadata_size_hint from TableParquetOptions after protobuf round-trip.
  • Apply this rewrite at the start of executor stage creation so downstream Parquet reads benefit automatically.
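
The shape of such a rewrite can be sketched as follows. This is a simplified model, not the code in this PR: `PlanNode`, its fields, and `restore_size_hints` are hypothetical stand-ins for the real physical plan, `TableParquetOptions`, and `ParquetSource.metadata_size_hint`. It assumes the table-level hint survives the protobuf round trip while the source-level one does not.

```rust
/// Hypothetical, simplified plan tree used only for illustration.
#[derive(Debug, Clone, PartialEq)]
enum PlanNode {
    ParquetScan {
        /// Hint carried by the table-level options (assumed to survive serialization).
        table_hint: Option<usize>,
        /// Hint on the source itself (assumed to be lost during serialization).
        source_hint: Option<usize>,
    },
    Filter { input: Box<PlanNode> },
    Union { inputs: Vec<PlanNode> },
}

/// Walk the tree and copy the table-level hint onto any scan whose
/// source-level hint is missing. An existing source hint is left untouched.
fn restore_size_hints(plan: PlanNode) -> PlanNode {
    match plan {
        PlanNode::ParquetScan { table_hint, source_hint } => PlanNode::ParquetScan {
            table_hint,
            source_hint: source_hint.or(table_hint),
        },
        PlanNode::Filter { input } => PlanNode::Filter {
            input: Box::new(restore_size_hints(*input)),
        },
        PlanNode::Union { inputs } => PlanNode::Union {
            inputs: inputs.into_iter().map(restore_size_hints).collect(),
        },
    }
}
```

Running a pass like this once, before any stage executes, means every scan below the root picks up the hint without each operator needing to know about it.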


Copilot AI review requested due to automatic review settings April 22, 2026 00:06

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.



Comment thread ballista/executor/src/execution_engine.rs
@peasee peasee merged commit c9e7857 into spiceai-52.5 Apr 22, 2026
29 checks passed
@peasee peasee deleted the peasee/260421-parquet-metadata-size-hint branch April 22, 2026 02:28

Labels

bug Something isn't working needs-upstream

3 participants