Skip to content

Conversation

@iajoiner
Copy link
Contributor

@iajoiner iajoiner commented Dec 5, 2025

Summary of Changes

Bump the dependencies to the target versions
(datafusion==43, arrow==53, sqlparser==0.51.0). The main work was fixing
test failures caused by API changes:

Key Changes Made:

  • crates/proof-of-sql-planner/src/plan.rs:

    • Updated test alias_map keys to use lowercase function names
      ("sum(table.b)" instead of "SUM(table.b)", "count(Int64(1))" instead of
      "COUNT(Int64(1))", etc.) because schema_name() returns lowercase function
      names in DataFusion 43+
  • crates/proof-of-sql-planner/src/context.rs:

    • Added import for count_udaf and sum_udaf from
      datafusion::functions_aggregate
    • Implemented get_aggregate_meta() to return the UDFs for "sum" and
      "count" aggregate functions
    • Updated udaf_names() to return the list of available aggregate
      functions
    • Updated tests to reflect the new behavior
  • crates/proof-of-sql-planner/tests/e2e_tests.rs:

    • Updated expected result column names to use lowercase function names
      (e.g., "count(Int64(1))" instead of "COUNT(Int64(1))")

    Root Causes:

    • In DataFusion 43+, aggregate functions (SUM, COUNT) are now UDFs that
      must be explicitly registered with the context provider
    • The schema_name() method returns lowercase function names for UDF-based
      aggregates

@iajoiner iajoiner force-pushed the feat/bump-arrow-53 branch 2 times, most recently from 5e5697b to 68c48be Compare December 5, 2025 21:37
@codecov
Copy link

codecov bot commented Dec 5, 2025

Codecov Report

❌ Patch coverage is 96.35417% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.44%. Comparing base (4312e52) to head (056cb23).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
crates/proof-of-sql-planner/src/plan.rs 93.13% 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1105      +/-   ##
==========================================
- Coverage   96.45%   96.44%   -0.02%     
==========================================
  Files         293      293              
  Lines       52149    52173      +24     
==========================================
+ Hits        50301    50318      +17     
- Misses       1848     1855       +7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@iajoiner iajoiner marked this pull request as ready for review December 8, 2025 17:05
Copilot AI review requested due to automatic review settings December 8, 2025 17:05
@iajoiner iajoiner enabled auto-merge (squash) December 8, 2025 17:05
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR upgrades core dependencies to their latest versions: DataFusion from 38 to 43, Arrow from 51 to 53, and SQLParser from 0.45 to 0.51. The changes address breaking API changes in these libraries, particularly around how DataFusion handles aggregate functions (now requiring UDF registration) and several method/structure renames across all three dependencies.

Key Changes:

  • Aggregate Functions as UDAFs: DataFusion 43 requires explicit registration of SUM and COUNT as user-defined aggregate functions via ContextProvider, with function names now returned in lowercase (e.g., "sum" instead of "SUM")
  • API Method Updates: Replaced deprecated methods (all_fields()flattened_fields(), display_name()schema_name()) and updated import locations (catalog::TableReferencesql::TableReference)
  • Expression-based Limits: LIMIT and OFFSET values changed from direct usize to Expr types, requiring extraction logic via a new expr_to_usize() helper function

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated no comments.

Show a summary per file
File Description
Cargo.toml Bumps arrow to 53, datafusion to 43, sqlparser to 0.51, and updates sqlparser git patch
crates/proof-of-sql-benches/Cargo.toml Updates datafusion and sqlparser versions for benchmarks crate
.github/workflows/lint-and-test.yml Changes test runner to use larger GitHub runner instance
crates/proof-of-sql/src/utils/parse.rs Updates CreateTable pattern matching for sqlparser 0.51 tuple variant syntax
crates/proof-of-sql-planner/src/context.rs Implements aggregate UDF registration for sum/count, renames udf methods, updates schema method
crates/proof-of-sql-planner/src/util.rs Replaces all_fields() with flattened_fields()
crates/proof-of-sql-planner/src/df_util.rs Updates TableReference import location
crates/proof-of-sql-planner/src/conversion.rs Adds new ParserOptions fields for DataFusion 43 compatibility
crates/proof-of-sql-planner/src/error.rs Renames UnsupportedAggregateOperation to UnsupportedAggregateFunctionName
crates/proof-of-sql-planner/src/aggregate.rs Migrates from built-in aggregate functions to UDF-based approach, updates test helpers
crates/proof-of-sql-planner/src/expr.rs Updates aggregate function construction to use new_udf(), updates TableReference import
crates/proof-of-sql-planner/src/plan.rs Adds expr_to_usize() helper, uses schema_name() instead of display_name(), updates aggregate function construction, converts limit/offset to expression-based, updates all test assertions for lowercase function names
crates/proof-of-sql-planner/tests/e2e_tests.rs Updates expected column names to use lowercase function names

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@iajoiner iajoiner enabled auto-merge (squash) December 8, 2025 18:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants