feat!: bump arrow to 53, datafusion to 43 and sqlparser to 0.51 #1105
Codecov Report
Coverage diff, main vs. #1105:
| Metric | main | #1105 | +/- |
|---|---|---|---|
| Coverage | 96.45% | 96.44% | -0.02% |
| Files | 293 | 293 | |
| Lines | 52149 | 52173 | +24 |
| Hits | 50301 | 50318 | +17 |
| Misses | 1848 | 1855 | +7 |
Pull request overview
This PR upgrades core dependencies to their latest versions: DataFusion from 38 to 43, Arrow from 51 to 53, and SQLParser from 0.45 to 0.51. The changes address breaking API changes in these libraries, particularly around how DataFusion handles aggregate functions (now requiring UDF registration) and several method/structure renames across all three dependencies.
Key Changes:
- Aggregate Functions as UDAFs: DataFusion 43 requires explicit registration of SUM and COUNT as user-defined aggregate functions via ContextProvider, with function names now returned in lowercase (e.g., "sum" instead of "SUM")
- API Method Updates: Replaced deprecated methods (all_fields() → flattened_fields(), display_name() → schema_name()) and updated import locations (catalog::TableReference → sql::TableReference)
- Expression-based Limits: LIMIT and OFFSET values changed from direct usize to Expr types, requiring extraction logic via a new expr_to_usize() helper function
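To illustrate the expression-based limits, here is a minimal sketch of what a helper like expr_to_usize() might do. The Expr and ScalarValue enums below are simplified stand-ins defined only for this example (DataFusion's real types have many more variants), and the signature and the choice to return None for unsupported expressions are assumptions, not the PR's actual implementation.

```rust
use std::convert::TryFrom;

// Simplified stand-ins for DataFusion's ScalarValue and Expr (assumption:
// only the variants needed for this illustration are modeled).
#[derive(Debug, Clone, PartialEq)]
enum ScalarValue {
    Int64(Option<i64>),
    UInt64(Option<u64>),
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(ScalarValue),
    Column(String),
}

/// Extracts a row count from a LIMIT/OFFSET expression.
/// Returns None unless the expression is a non-negative integer literal
/// that fits in usize.
fn expr_to_usize(expr: &Expr) -> Option<usize> {
    match expr {
        Expr::Literal(ScalarValue::Int64(Some(n))) => usize::try_from(*n).ok(),
        Expr::Literal(ScalarValue::UInt64(Some(n))) => usize::try_from(*n).ok(),
        // NULL literals, columns, and anything else are not plain counts.
        _ => None,
    }
}
```

In this sketch a negative literal such as LIMIT -1 yields None, so the caller decides how to report the unsupported value.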
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| Cargo.toml | Bumps arrow to 53, datafusion to 43, sqlparser to 0.51, and updates sqlparser git patch |
| crates/proof-of-sql-benches/Cargo.toml | Updates datafusion and sqlparser versions for benchmarks crate |
| .github/workflows/lint-and-test.yml | Changes test runner to use larger GitHub runner instance |
| crates/proof-of-sql/src/utils/parse.rs | Updates CreateTable pattern matching for sqlparser 0.51 tuple variant syntax |
| crates/proof-of-sql-planner/src/context.rs | Implements aggregate UDF registration for sum/count, renames udf methods, updates schema method |
| crates/proof-of-sql-planner/src/util.rs | Replaces all_fields() with flattened_fields() |
| crates/proof-of-sql-planner/src/df_util.rs | Updates TableReference import location |
| crates/proof-of-sql-planner/src/conversion.rs | Adds new ParserOptions fields for DataFusion 43 compatibility |
| crates/proof-of-sql-planner/src/error.rs | Renames UnsupportedAggregateOperation to UnsupportedAggregateFunctionName |
| crates/proof-of-sql-planner/src/aggregate.rs | Migrates from built-in aggregate functions to UDF-based approach, updates test helpers |
| crates/proof-of-sql-planner/src/expr.rs | Updates aggregate function construction to use new_udf(), updates TableReference import |
| crates/proof-of-sql-planner/src/plan.rs | Adds expr_to_usize() helper, uses schema_name() instead of display_name(), updates aggregate function construction, converts limit/offset to expression-based, updates all test assertions for lowercase function names |
| crates/proof-of-sql-planner/tests/e2e_tests.rs | Updates expected column names to use lowercase function names |
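The parse.rs row refers to matching a tuple variant that wraps a struct. The sketch below shows that pattern shape using stand-in types defined only for this example; the field names and the create_table_info helper are hypothetical, and sqlparser's real CreateTable struct carries many more fields.

```rust
// Stand-in for the struct wrapped by the tuple variant (assumption:
// only two illustrative fields are modeled).
struct CreateTable {
    name: String,
    columns: Vec<String>,
}

// Stand-in for the statement enum: CreateTable is a tuple variant
// holding a struct, rather than a struct variant with inline fields.
enum Statement {
    CreateTable(CreateTable),
    Other,
}

/// Returns the table name and column count for a CREATE TABLE statement.
fn create_table_info(stmt: &Statement) -> Option<(&str, usize)> {
    match stmt {
        // Tuple-variant pattern with a nested struct pattern; `..` would
        // skip any fields the caller does not need.
        Statement::CreateTable(CreateTable { name, columns }) => {
            Some((name.as_str(), columns.len()))
        }
        _ => None,
    }
}
```

With a struct variant the pattern would instead read Statement::CreateTable { name, columns }, which is why code written against the older shape needs its match arms updated.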
Summary of Changes
Bump the dependencies to the target versions
(datafusion==43, arrow==53, sqlparser==0.51.0). The main work was fixing
test failures caused by API changes:
Key Changes Made:
crates/proof-of-sql-planner/src/plan.rs:
- Updated test assertions to expect lowercase function names
("sum(table.b)" instead of "SUM(table.b)", "count(Int64(1))" instead of
"COUNT(Int64(1))", etc.) because schema_name() returns lowercase function
names in DataFusion 43+
crates/proof-of-sql-planner/src/context.rs:
- Uses the aggregate function implementations from
datafusion::functions_aggregate
- Registers the "sum" and "count" aggregate functions with the context
provider
crates/proof-of-sql-planner/tests/e2e_tests.rs:
- Updated expected column names to use lowercase function names
(e.g., "count(Int64(1))" instead of "COUNT(Int64(1))")
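Root Causes:
- In DataFusion 43, SUM and COUNT are user-defined aggregate functions that
must be explicitly registered with the context provider
- schema_name() returns lowercase function names for aggregates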
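As a rough illustration of how lowercase names interact with the renamed error, the sketch below resolves an aggregate by name. AggregateKind, PlannerError, and aggregate_from_name are hypothetical names for this example only (just the UnsupportedAggregateFunctionName variant name comes from the PR), and the exact-match behavior is an assumption of the sketch.

```rust
// Hypothetical enum of the aggregates this sketch knows about.
#[derive(Debug, Clone, Copy, PartialEq)]
enum AggregateKind {
    Sum,
    Count,
}

// Hypothetical error type; only the variant name mirrors the PR.
#[derive(Debug, PartialEq)]
enum PlannerError {
    UnsupportedAggregateFunctionName { name: String },
}

/// Maps a function name, as reported in lowercase, to an aggregate kind.
/// Assumption: matching is exact, so uppercase spellings are rejected.
fn aggregate_from_name(name: &str) -> Result<AggregateKind, PlannerError> {
    match name {
        "sum" => Ok(AggregateKind::Sum),
        "count" => Ok(AggregateKind::Count),
        other => Err(PlannerError::UnsupportedAggregateFunctionName {
            name: other.to_string(),
        }),
    }
}
```

A lookup keyed on the old uppercase spelling would fall through to the error arm here, which mirrors why the tests and expected column names had to move to lowercase.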