feat: support_ansi-mode_aggregated_benchmarking#2901

Merged
andygrove merged 1 commit into
apache:mainfrom
coderfender:support_ansi_mode_agg_benchmarks
Dec 13, 2025
@coderfender coderfender commented Dec 13, 2025

Which issue does this PR close?

Closes #2883.

Rationale for this change

The current Comet aggregate benchmarks do not support ANSI mode: they throw an exception when it is enabled. These changes let us run every Comet aggregate benchmark with ANSI mode enabled as well as disabled, so that we can produce benchmark numbers for both.
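For context, the behavioural difference that makes ANSI mode worth benchmarking separately can be sketched in a few lines. This is plain Python, not Comet or Spark code. It assumes a 64-bit signed `SUM` that raises on overflow when ANSI mode is on and wraps around when it is off, which is how Spark documents integral overflow under `spark.sql.ansi.enabled`. The helper name `sum_long` is hypothetical.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def sum_long(values, ansi_enabled):
    """Sum 64-bit signed integers, emulating ANSI / non-ANSI overflow handling.

    Hypothetical helper for illustration only; not part of Comet or Spark.
    """
    total = 0
    for v in values:
        total += v
        if total < INT64_MIN or total > INT64_MAX:
            if ansi_enabled:
                # ANSI mode: overflow is reported as an error.
                raise ArithmeticError("long overflow")
            # Non-ANSI mode: wrap around as two's-complement arithmetic would.
            total = (total - INT64_MIN) % 2**64 + INT64_MIN
    return total
```

The extra overflow check on every accumulation is the kind of work an ANSI-aware aggregate has to do, which is why a separate set of ANSI-enabled numbers is useful.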

What changes are included in this PR?

How are these changes tested?

@coderfender
Contributor Author

Running benchmark: Grouped HashAgg Exec: single group key (cardinality 100), single aggregate SUM, ansi mode enabled : true
  Running case: SQL Parquet - Spark (SUM) ansi mode enabled : true
  Stopped after 32 iterations, 2013 ms
  Running case: SQL Parquet - Comet (SUM) ansi mode enabled : true
  Stopped after 38 iterations, 2032 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 16.0
Apple M2 Max
Grouped HashAgg Exec: single group key (cardinality 100), single aggregate SUM, ansi mode enabled : true:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Parquet - Spark (SUM) ansi mode enabled : true                                                                   48             63          17        219.2           4.6       1.0X
SQL Parquet - Comet (SUM) ansi mode enabled : true                                                                   36             53          17        289.7           3.5       1.3X
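The derived columns in these tables can be roughly reproduced from the best time alone. The sketch below assumes the usual layout of Spark's benchmark output, where the rate is rows divided by best time, the per-row cost is its inverse, and `Relative` is the first case's best time divided by the current case's. The row count of 10 * 1024 * 1024 is an assumption inferred from the printed rates, not something stated in this thread, and `derived_columns` is a hypothetical helper.

```python
def derived_columns(num_rows, best_ms, baseline_best_ms):
    """Return (rate in M rows/s, per-row cost in ns, relative speedup)."""
    rate_m_per_s = num_rows / (best_ms / 1000.0) / 1e6
    per_row_ns = best_ms * 1e6 / num_rows
    relative = baseline_best_ms / best_ms
    return rate_m_per_s, per_row_ns, relative


NUM_ROWS = 10 * 1024 * 1024  # assumed row count, inferred from the table

# Best times (ms) from the first table: Spark 48, Comet 36.
spark = derived_columns(NUM_ROWS, 48, 48)
comet = derived_columns(NUM_ROWS, 36, 48)

for name, (rate, per_row, rel) in (("Spark", spark), ("Comet", comet)):
    print(f"{name}: {rate:.1f} M/s, {per_row:.1f} ns/row, {rel:.1f}X")
```

Because the table prints best times rounded to whole milliseconds, the recomputed rate and per-row values land close to, but not exactly on, the printed ones.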

Running benchmark: Grouped HashAgg Exec: single group key (cardinality 1024), single aggregate SUM, ansi mode enabled : true
  Running case: SQL Parquet - Spark (SUM) ansi mode enabled : true
  Stopped after 30 iterations, 2071 ms
  Running case: SQL Parquet - Comet (SUM) ansi mode enabled : true
  Stopped after 36 iterations, 2011 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 16.0
Apple M2 Max
Grouped HashAgg Exec: single group key (cardinality 1024), single aggregate SUM, ansi mode enabled : true:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Parquet - Spark (SUM) ansi mode enabled : true                                                                    47             69          13        221.2           4.5       1.0X
SQL Parquet - Comet (SUM) ansi mode enabled : true                                                                    35             56          33        303.5           3.3       1.4X

Running benchmark: Grouped HashAgg Exec: single group key (cardinality 1048576), single aggregate SUM, ansi mode enabled : true
  Running case: SQL Parquet - Spark (SUM) ansi mode enabled : true
  Stopped after 33 iterations, 2049 ms
  Running case: SQL Parquet - Comet (SUM) ansi mode enabled : true
  Stopped after 33 iterations, 2035 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 16.0
Apple M2 Max
Grouped HashAgg Exec: single group key (cardinality 1048576), single aggregate SUM, ansi mode enabled : true:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Parquet - Spark (SUM) ansi mode enabled : true                                                                       50             62          17        211.4           4.7       1.0X
SQL Parquet - Comet (SUM) ansi mode enabled : true                                                                       37             62          33        281.9           3.5       1.3X

Running benchmark: Grouped HashAgg Exec: multiple group keys (cardinality 100), single aggregate SUM
  Running case: SQL Parquet - Spark (SUM) isANSIMode: true
  Stopped after 32 iterations, 2026 ms
  Running case: SQL Parquet - Comet (SUM) isANSIMode: true
  Stopped after 34 iterations, 2010 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 16.0
Apple M2 Max
Grouped HashAgg Exec: multiple group keys (cardinality 100), single aggregate SUM:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Parquet - Spark (SUM) isANSIMode: true                                                    49             63          23        213.9           4.7       1.0X
SQL Parquet - Comet (SUM) isANSIMode: true                                                    36             59          23        287.7           3.5       1.3X

Running benchmark: Grouped HashAgg Exec: multiple group keys (cardinality 1024), single aggregate SUM
  Running case: SQL Parquet - Spark (SUM) isANSIMode: true
  Stopped after 29 iterations, 2131 ms
  Running case: SQL Parquet - Comet (SUM) isANSIMode: true
  Stopped after 43 iterations, 2014 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 16.0
Apple M2 Max
Grouped HashAgg Exec: multiple group keys (cardinality 1024), single aggregate SUM:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Parquet - Spark (SUM) isANSIMode: true                                                     48             73          32        217.1           4.6       1.0X
SQL Parquet - Comet (SUM) isANSIMode: true                                                     36             47          18        290.1           3.4       1.3X

Running benchmark: Grouped HashAgg Exec: multiple group keys (cardinality 1048576), single aggregate SUM
  Running case: SQL Parquet - Spark (SUM) isANSIMode: true
  Stopped after 2 iterations, 2952 ms
  Running case: SQL Parquet - Comet (SUM) isANSIMode: true
  Stopped after 5 iterations, 2276 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 16.0
Apple M2 Max
Grouped HashAgg Exec: multiple group keys (cardinality 1048576), single aggregate SUM:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Parquet - Spark (SUM) isANSIMode: true                                                      1463           1476          18          7.2         139.5       1.0X
SQL Parquet - Comet (SUM) isANSIMode: true                                                       396            455          92         26.5          37.8       3.7X

Running benchmark: Grouped HashAgg Exec: single group key (cardinality 100), multiple aggregates SUM isANSIMode: true
  Running case: SQL Parquet - Spark (SUM) isANSIMode: true
  Stopped after 13 iterations, 2075 ms
  Running case: SQL Parquet - Comet (SUM) isANSIMode: true
  Stopped after 35 iterations, 2051 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 16.0
Apple M2 Max
Grouped HashAgg Exec: single group key (cardinality 100), multiple aggregates SUM isANSIMode: true:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Parquet - Spark (SUM) isANSIMode: true                                                                     56            160          78        188.0           5.3       1.0X
SQL Parquet - Comet (SUM) isANSIMode: true                                                                     34             59          45        309.5           3.2       1.6X

Running benchmark: Grouped HashAgg Exec: single group key (cardinality 1024), multiple aggregates SUM isANSIMode: true
  Running case: SQL Parquet - Spark (SUM) isANSIMode: true
  Stopped after 16 iterations, 2103 ms
  Running case: SQL Parquet - Comet (SUM) isANSIMode: true
  Stopped after 37 iterations, 2016 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 16.0
Apple M2 Max
Grouped HashAgg Exec: single group key (cardinality 1024), multiple aggregates SUM isANSIMode: true:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Parquet - Spark (SUM) isANSIMode: true                                                                      55            131          97        189.6           5.3       1.0X
SQL Parquet - Comet (SUM) isANSIMode: true                                                                      38             54          30        273.0           3.7       1.4X

Running benchmark: Grouped HashAgg Exec: single group key (cardinality 1048576), multiple aggregates SUM isANSIMode: true
  Running case: SQL Parquet - Spark (SUM) isANSIMode: true
  Stopped after 20 iterations, 2118 ms
  Running case: SQL Parquet - Comet (SUM) isANSIMode: true
  Stopped after 22 iterations, 2025 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 16.0
Apple M2 Max
Grouped HashAgg Exec: single group key (cardinality 1048576), multiple aggregates SUM isANSIMode: true:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Parquet - Spark (SUM) isANSIMode: true                                                                         55            106          71        189.8           5.3       1.0X
SQL Parquet - Comet (SUM) isANSIMode: true                                                                         35             92          73        301.9           3.3       1.6X

Running benchmark: Grouped HashAgg Exec: single group key (cardinality 100), single aggregate SUM on decimal
  Running case: SQL Parquet - Spark (SUM) ansi mode enabled : true
  Stopped after 2 iterations, 2658 ms
  Running case: SQL Parquet - Comet (SUM) ansi mode enabled : true
  Stopped after 9 iterations, 2233 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 16.0
Apple M2 Max
Grouped HashAgg Exec: single group key (cardinality 100), single aggregate SUM on decimal:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Parquet - Spark (SUM) ansi mode enabled : true                                                  1257           1329         102          8.3         119.9       1.0X
SQL Parquet - Comet (SUM) ansi mode enabled : true                                                   237            248          15         44.2          22.6       5.3X

Running benchmark: Grouped HashAgg Exec: single group key (cardinality 1024), single aggregate SUM on decimal
  Running case: SQL Parquet - Spark (SUM) ansi mode enabled : true
  Stopped after 2 iterations, 2615 ms
  Running case: SQL Parquet - Comet (SUM) ansi mode enabled : true
  Stopped after 8 iterations, 2222 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 16.0
Apple M2 Max
Grouped HashAgg Exec: single group key (cardinality 1024), single aggregate SUM on decimal:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Parquet - Spark (SUM) ansi mode enabled : true                                                   1307           1308           0          8.0         124.7       1.0X
SQL Parquet - Comet (SUM) ansi mode enabled : true                                                    257            278          29         40.7          24.6       5.1X

Running benchmark: Grouped HashAgg Exec: single group key (cardinality 1048576), single aggregate SUM on decimal
  Running case: SQL Parquet - Spark (SUM) ansi mode enabled : true
  Stopped after 2 iterations, 8222 ms
  Running case: SQL Parquet - Comet (SUM) ansi mode enabled : true
  Stopped after 2 iterations, 2853 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 16.0
Apple M2 Max
Grouped HashAgg Exec: single group key (cardinality 1048576), single aggregate SUM on decimal:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Parquet - Spark (SUM) ansi mode enabled : true                                                      4082           4111          42          2.6         389.3       1.0X
SQL Parquet - Comet (SUM) ansi mode enabled : true                                                      1412           1427          20          7.4         134.7       2.9X




Running benchmark: Grouped HashAgg Exec: single group key (cardinality 100), single aggregate SUM, ansi mode enabled : false

  Running case: SQL Parquet - Spark (SUM) ansi mode enabled : false
  Stopped after 9 iterations, 2064 ms
  Running case: SQL Parquet - Comet (SUM) ansi mode enabled : false
  Stopped after 12 iterations, 2025 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 16.0
Apple M2 Max
Grouped HashAgg Exec: single group key (cardinality 100), single aggregate SUM, ansi mode enabled : false:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Parquet - Spark (SUM) ansi mode enabled : false                                                                  221            229          12         47.5          21.0       1.0X
SQL Parquet - Comet (SUM) ansi mode enabled : false                                                                  153            169          25         68.5          14.6       1.4X

Running benchmark: Grouped HashAgg Exec: single group key (cardinality 1024), single aggregate SUM, ansi mode enabled : false
  Running case: SQL Parquet - Spark (SUM) ansi mode enabled : false
  Stopped after 9 iterations, 2207 ms
  Running case: SQL Parquet - Comet (SUM) ansi mode enabled : false
  Stopped after 12 iterations, 2053 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 16.0
Apple M2 Max
Grouped HashAgg Exec: single group key (cardinality 1024), single aggregate SUM, ansi mode enabled : false:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Parquet - Spark (SUM) ansi mode enabled : false                                                                   226            245          30         46.4          21.6       1.0X
SQL Parquet - Comet (SUM) ansi mode enabled : false                                                                   165            171           4         63.4          15.8       1.4X

Running benchmark: Grouped HashAgg Exec: single group key (cardinality 1048576), single aggregate SUM, ansi mode enabled : false
  Running case: SQL Parquet - Spark (SUM) ansi mode enabled : false
  Stopped after 2 iterations, 4153 ms
  Running case: SQL Parquet - Comet (SUM) ansi mode enabled : false
  Stopped after 3 iterations, 2645 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 16.0
Apple M2 Max
Grouped HashAgg Exec: single group key (cardinality 1048576), single aggregate SUM, ansi mode enabled : false:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Parquet - Spark (SUM) ansi mode enabled : false                                                                     2069           2077          11          5.1         197.3       1.0X
SQL Parquet - Comet (SUM) ansi mode enabled : false                                                                      841            882          49         12.5          80.2       2.5X

Running benchmark: Grouped HashAgg Exec: multiple group keys (cardinality 100), single aggregate SUM
  Running case: SQL Parquet - Spark (SUM) isANSIMode: false
  Stopped after 4 iterations, 2340 ms
  Running case: SQL Parquet - Comet (SUM) isANSIMode: false
  Stopped after 8 iterations, 2320 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 16.0
Apple M2 Max
Grouped HashAgg Exec: multiple group keys (cardinality 100), single aggregate SUM:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Parquet - Spark (SUM) isANSIMode: false                                                  574            585          12         18.3          54.8       1.0X
SQL Parquet - Comet (SUM) isANSIMode: false                                                  269            290          33         39.0          25.6       2.1X

Running benchmark: Grouped HashAgg Exec: multiple group keys (cardinality 1024), single aggregate SUM
  Running case: SQL Parquet - Spark (SUM) isANSIMode: false
  Stopped after 2 iterations, 5877 ms
  Running case: SQL Parquet - Comet (SUM) isANSIMode: false
  Stopped after 2 iterations, 2924 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 16.0
Apple M2 Max
Grouped HashAgg Exec: multiple group keys (cardinality 1024), single aggregate SUM:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Parquet - Spark (SUM) isANSIMode: false                                                  2896           2939          60          3.6         276.2       1.0X
SQL Parquet - Comet (SUM) isANSIMode: false                                                  1400           1462          88          7.5         133.5       2.1X

Running benchmark: Grouped HashAgg Exec: multiple group keys (cardinality 1048576), single aggregate SUM
  Running case: SQL Parquet - Spark (SUM) isANSIMode: false
  Stopped after 2 iterations, 13239 ms
  Running case: SQL Parquet - Comet (SUM) isANSIMode: false
  Stopped after 2 iterations, 10603 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 16.0
Apple M2 Max
Grouped HashAgg Exec: multiple group keys (cardinality 1048576), single aggregate SUM:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Parquet - Spark (SUM) isANSIMode: false                                                     6580           6620          57          1.6         627.5       1.0X

@coderfender coderfender changed the title feat: support_spark_4_cast_fix_tests feat: support_ansi- Dec 13, 2025
@coderfender coderfender changed the title feat: support_ansi- feat: support_ansi-mode_aggregated_benchmarking Dec 13, 2025
Member

@andygrove andygrove left a comment

Thanks @coderfender. This is very helpful!

@coderfender
Contributor Author

Thank you for the approval @andygrove

@codecov-commenter

codecov-commenter commented Dec 13, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 59.44%. Comparing base (f09f8af) to head (97d2a28).
⚠️ Report is 760 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2901      +/-   ##
============================================
+ Coverage     56.12%   59.44%   +3.31%     
- Complexity      976     1376     +400     
============================================
  Files           119      167      +48     
  Lines         11743    15335    +3592     
  Branches       2251     2548     +297     
============================================
+ Hits           6591     9116    +2525     
- Misses         4012     4935     +923     
- Partials       1140     1284     +144     
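The figures in the diff above are internally consistent, which can be checked with a short calculation. The sketch assumes that `Lines` is the sum of hits, misses and partials and that coverage is hits divided by lines; both relationships hold for the numbers shown, but they are not taken from Codecov's documentation. `coverage_percent` is a hypothetical helper.

```python
def coverage_percent(hits, misses, partials):
    """Coverage as a percentage, assuming lines = hits + misses + partials."""
    lines = hits + misses + partials
    return 100.0 * hits / lines


# Base (main) and head (#2901) columns from the coverage diff.
base = coverage_percent(6591, 4012, 1140)
head = coverage_percent(9116, 4935, 1284)

print(f"base={base:.4f}% head={head:.4f}% delta={head - base:.4f}%")
```

The reported percentages appear to be truncated rather than rounded to two decimals, which would explain why the delta is shown as +3.31% although 59.44 minus 56.12 is 3.32.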

@coderfender
Contributor Author

@andygrove All the checks have passed. Please merge the branch whenever you get a chance; I will then rebase my other ANSI support branches and run the benchmarks.

@andygrove andygrove merged commit ecbaa7f into apache:main Dec 13, 2025
120 checks passed

Development

Successfully merging this pull request may close these issues.

Feature : Support Aggregated benchmark generation with ANSI support

3 participants