[Data] Add TPCH queries 3, 10, and 18 for benchmarking #60667

Open
daiping8 wants to merge 5 commits into ray-project:master from daiping8:tpchq3

@daiping8 daiping8 (Contributor) commented Feb 2, 2026

Description

Adds queries Q3, Q10, and Q18 to the TPCH tests.

@gemini-code-assist gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request adds several new TPC-H query benchmarks (Q3, Q5, Q7, Q8, Q9, Q10, and Q18). The implementations are a great addition. However, there is a recurring critical issue across all new query files regarding the usage of `Dataset.join`. The method is consistently called with `on` and `right_on` parameters for joining columns with different names, which is not the correct API usage. The correct parameters are `left_on` and `right_on`. This needs to be fixed to ensure the queries run correctly. Additionally, I've identified some opportunities for performance improvements and code simplification in queries Q5, Q8, and Q10.
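
For readers unfamiliar with the operation under discussion, here is a minimal pure-Python sketch of an inner join on key columns that have different names on each side, as happens in TPC-H Q3 when customer rows are matched to order rows. It is not Ray Data code and makes no claim about the `Dataset.join` signature; the helper `hash_join`, its `left_key` and `right_key` parameters, and the row values are all hypothetical and exist only for illustration. The keyword names a given library expects should be checked against its own documentation.

```python
from collections import defaultdict


def hash_join(left_rows, right_rows, left_key, right_key):
    """Inner-join two lists of dicts on key columns with different names.

    Hypothetical helper for illustration only; not part of any library.
    """
    # Build phase: index the right-hand rows by their join key.
    index = defaultdict(list)
    for row in right_rows:
        index[row[right_key]].append(row)

    # Probe phase: look up each left row's key and merge every match.
    joined = []
    for row in left_rows:
        for match in index.get(row[left_key], []):
            joined.append({**row, **match})
    return joined


# Toy rows that borrow TPC-H column names; the values are made up.
customer = [
    {"c_custkey": 1, "c_mktsegment": "BUILDING"},
    {"c_custkey": 2, "c_mktsegment": "MACHINERY"},
]
orders = [
    {"o_orderkey": 10, "o_custkey": 1},
    {"o_orderkey": 11, "o_custkey": 1},
    {"o_orderkey": 12, "o_custkey": 3},
]

# customer.c_custkey = orders.o_custkey
result = hash_join(customer, orders, left_key="c_custkey", right_key="o_custkey")
```

Because the two key columns are named separately, a join that silently fell back to a single shared column name would either fail or match on the wrong column, which is why the keyword names matter in each query file.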

@daiping8 daiping8 force-pushed the tpchq3 branch 3 times, most recently from 79af0a5 to bf0c428, February 2, 2026 11:55
Signed-off-by: daiping8 <dai.ping88@zte.com.cn>
@daiping8 daiping8 changed the title from "[Data] Add TPCH queries 3, 10, and 18 for benchmarking" to "[WIP][Data] Add TPCH queries 3, 10, and 18 for benchmarking" Feb 2, 2026
@daiping8 daiping8 marked this pull request as ready for review February 3, 2026 01:26
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Signed-off-by: daiping8 <dai.ping88@zte.com.cn>
@ray-gardener ray-gardener bot added the labels data (Ray Data-related issues) and community-contribution (Contributed by the community) Feb 3, 2026
…on tables

Signed-off-by: daiping8 <dai.ping88@zte.com.cn>
@daiping8 daiping8 changed the title from "[WIP][Data] Add TPCH queries 3, 10, and 18 for benchmarking" to "[Data] Add TPCH queries 3, 10, and 18 for benchmarking" Feb 3, 2026
@daiping8 daiping8 (Contributor, Author) commented Feb 3, 2026

@owenowenisme Please review the code. Looking forward to any suggestions.

…tions

Signed-off-by: daiping8 <dai.ping88@zte.com.cn>
Labels

community-contribution (Contributed by the community), data (Ray Data-related issues)

2 participants