[Data] Add TPCH queries 5,7,8,9 for benchmarking #60662
daiping8 wants to merge 9 commits into ray-project:master
Code Review
This pull request adds TPC-H queries 5, 7, 8, and 9 for benchmarking purposes. The overall structure of the new query files is consistent. However, I've found several correctness issues where the implementations deviate significantly from the TPC-H specifications for queries 7, 8, and 9. These need to be addressed to ensure the benchmarks are valid. Additionally, there are opportunities to improve performance in queries 5 and 9 by optimizing the join logic. The configuration changes in the YAML file are appropriate.
Signed-off-by: daiping8 <dai.ping88@zte.com.cn>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: ZTE Ray <dai.ping88@zte.com.cn>
… for improved clarity and consistency. Signed-off-by: daiping8 <dai.ping88@zte.com.cn>
@owenowenisme Please review the code. Looking forward to any suggestions.
Hi @daiping8, can you help me understand why you are adding these benchmarks?
Hi. This is a task assigned by the Ray Data Team: https://docs.google.com/document/d/1OFFp2jMMnrCPiE0Gxdi0ronXGVqtDYDbUoS3fsNc54Q/edit?pli=1&tab=t.0
owenowenisme left a comment
I think you're missing some tables in common.py. How about opening a PR first to add the name mapping?
FYI
=== region ===
Column names: ['column0', 'column1', 'column2', 'column3']
column0: int64
column1: string
column2: string
column3: string
=== supplier ===
Column names: ['column0', 'column1', 'column2', 'column3', 'column4', 'column5', 'column6', 'column7']
column0: int64
column1: string
column2: string
column3: int64
column4: string
column5: double
column6: string
column7: string
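For illustration, the name mapping suggested above could look roughly like the sketch below. The target names are the standard TPC-H column names for `region` and `supplier`; the helper names (`positional_mapping`, `rename_row`) are hypothetical and are not taken from common.py. Each dump shows one more column than the TPC-H schema defines (4 vs. 3 for `region`, 8 vs. 7 for `supplier`). The sketch assumes that extra column is the empty field left by the trailing `|` delimiter in the raw data, so it is left unmapped.

```python
# Standard TPC-H column names, listed in file order.
# Assumption: the last positional column in each dump is an empty
# trailing field and therefore has no TPC-H name.
REGION_COLUMNS = ["r_regionkey", "r_name", "r_comment"]
SUPPLIER_COLUMNS = [
    "s_suppkey",
    "s_name",
    "s_address",
    "s_nationkey",
    "s_phone",
    "s_acctbal",
    "s_comment",
]


def positional_mapping(column_names):
    """Map 'column0', 'column1', ... to the given names by position."""
    return {f"column{i}": name for i, name in enumerate(column_names)}


# Hypothetical mapping tables, one per dataset.
REGION_MAPPING = positional_mapping(REGION_COLUMNS)
SUPPLIER_MAPPING = positional_mapping(SUPPLIER_COLUMNS)


def rename_row(row, mapping):
    """Rename the keys of one row dict; unmapped keys are kept unchanged."""
    return {mapping.get(key, key): value for key, value in row.items()}
```

A dict of this shape can be passed to a column-rename step such as Ray Data's `Dataset.rename_columns`; whether the unmapped trailing column should be dropped afterwards depends on how the files are read.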
… nation, supplier, customer, orders, part, and partsupp Signed-off-by: daiping8 <dai.ping88@zte.com.cn>
Description
Adds queries Q5, Q7, Q8, and Q9 to the TPC-H benchmark tests.