[EPIC] Support TPC-DS benchmarks #4763

@andygrove

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
I would like to be able to run all TPC-DS queries with DataFusion, but some are not yet supported.

Old description:

I am testing with [SQLBench-DS](https://github.com/sql-benchmarks/sqlbench-ds) and I am seeing some failures. Many of these affect multiple queries, but I have listed only a single example query for each type of error.

- https://github.com/apache/arrow-datafusion/issues/4794
- https://github.com/apache/arrow-datafusion/issues/123
- `At least two values are needed to calculate variance` (q17)
- `The type of Int32 = Int64 of binary physical should be same` (q72)
- `physical plan is not yet implemented for GROUPING aggregate function` (q27)
- `Projections require unique expression names but the expression "MAX(customer_demographics.cd_dep_count)" at position 6 and "MAX(customer_demographics.cd_dep_count)" at position 7 have the same name. Consider aliasing ("AS") one of them.` (q35)
- `The function Stddev does not support inputs of type Decimal128(7, 2).` (q74)
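For context on the q17 error above: sample variance divides by n - 1, so it is undefined for a single value. The Python sketch below illustrates that arithmetic with the standard library only. It is not DataFusion code, and `sample_variance_or_null` is a hypothetical helper that shows one possible handling (returning a null-like value instead of raising), which is an assumption about the desired behavior rather than the project's chosen fix.

```python
import statistics
from typing import Optional, Sequence


def sample_variance_or_null(values: Sequence[float]) -> Optional[float]:
    """Hypothetical helper: return None instead of failing on < 2 values."""
    if len(values) < 2:
        # Sample variance divides by (n - 1), so it is undefined here.
        return None
    return statistics.variance(values)


# The standard library itself rejects a single value, much like the
# error message reported for q17.
try:
    statistics.variance([1.0])
    raised = False
except statistics.StatisticsError:
    raised = True

single = sample_variance_or_null([1.0])
pair = sample_variance_or_null([1.0, 3.0])
```

With two or more values the helper defers to `statistics.variance`; with fewer it yields `None`, so a query containing a single-row group could still complete.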

Describe the solution you'd like
Support all the queries.

Describe alternatives you've considered
N/A

Additional context
N/A

Labels: PROPOSAL EPIC (a proposal being discussed that is not yet fully underway), enhancement (new feature or request)
