Improve performance of SingleRestrictionEstimatedRowCountTest #1502

k-rus · 2025-01-14T17:17:29Z

Reduces the number of created tables by creating all needed tables in advance. As a result, the test can be placed in a single test function.
This improves local test execution time from 5.5 seconds to 1.4 seconds, and CI execution time from 13 to 5 seconds.

Also stops disabling the optimizer, which wasn't necessary.
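
The idea of creating all tables up front and looking them up later can be sketched roughly as follows. This is only an illustration, not the test's actual code: `TableRegistry`, its methods, and the generated table names are hypothetical, and a real test would issue the CREATE TABLE and CREATE INDEX statements where this sketch only builds a name.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: create one table per (version, type) pair once,
// then let every check look its table up in a map.
public class TableRegistry {
    private final Map<String, String> tables = new LinkedHashMap<>();
    private int created = 0;

    // Creates all tables in advance, one per combination.
    public TableRegistry(List<String> versions, List<String> cqlTypes) {
        for (String version : versions)
            for (String type : cqlTypes)
                tables.put(key(version, type), createTable(version, type));
    }

    private static String key(String version, String type) {
        return version + ":" + type;
    }

    // Stand-in for the real table and index creation; only builds a name.
    private String createTable(String version, String type) {
        created++;
        return "tbl_" + version.toLowerCase(Locale.ROOT) + "_" + type.toLowerCase(Locale.ROOT);
    }

    // Returns a pre-created table; a lookup never creates a new one.
    public String tableFor(String version, String type) {
        String name = tables.get(key(version, type));
        if (name == null)
            throw new IllegalArgumentException("No table for " + version + "/" + type);
        return name;
    }

    public int createdCount() {
        return created;
    }

    public static void main(String[] args) {
        TableRegistry registry = new TableRegistry(List.of("DB", "EB"),
                                                   List.of("INT", "DECIMAL", "VARINT"));
        // All checks in a single test function can share these six tables.
        System.out.println(registry.tableFor("EB", "DECIMAL"));
        System.out.println(registry.createdCount());
    }
}
```

With the tables prepared once, the number of table creations depends only on the number of (version, type) combinations, not on the number of checks run against them.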

Checklist before you submit for review

Make sure there is a PR in the CNDB project updating the Converged Cassandra version
Use NoSpamLogger for log lines that may appear frequently in the logs
Verify test results on Butler
Test coverage for new/modified code is > 80%
Proper code formatting
Proper title for each commit starting with the project-issue number, like CNDB-1234
Each commit has a meaningful description
Each commit is not very long and contains related changes
Renames, moves and reformatting are in distinct commits

sonarqubecloud · 2025-01-15T11:03:35Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

cassci-bot · 2025-01-15T11:19:54Z

✔️ Build ds-cassandra-pr-gate/PR-1502 approved by Butler

pkolaczk · 2025-01-20T14:20:57Z

test/unit/org/apache/cassandra/index/sai/plan/SingleRestrictionEstimatedRowCountTest.java

        test.doTest(Version.DB, INT, 97.0);
        test.doTest(Version.EB, INT, 97.0);
        // Truncated numeric types planned differently
        test.doTest(Version.DB, DECIMAL, 97.0);
        test.doTest(Version.EB, DECIMAL, 97.0);
        test.doTest(Version.EB, VARINT, 97.0);


Suggestion: SAITester supports versioning. Why not use that feature instead of manually passing the version?

k-rus · 2025-01-21

I don't understand how it can be used. As I understand it, this would require rearranging the test cases per SSTable version, which would make the tests less useful, i.e., it would be impossible to see the count differences per restriction. Passing the version manually also makes it visible how different versions affect the count.
What am I missing in your proposal?

pkolaczk · 2025-01-24

You'd still see different counts, because there would have to be a check like if (version.onOrAfter(Version.EB)) wherever versions differ. The upside is that it would automatically test all other versions, and you'd get tests for new versions for free if they don't change anything. Just add a version to the list of versions and voilà, the test runs on the new format.

But it's up to you. I'm not insisting; that's why it was a suggestion.
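
A rough sketch of the shape described above, assuming a version type with an `onOrAfter` check. The `Version` enum here is a local stand-in and not SAITester's API; the `FA` constant is purely hypothetical and stands for a format added later. Both expected values, and the choice of EB as the point where counts change, are placeholders for illustration and are not taken from the test.

```java
// Hypothetical sketch: one loop over all versions, with an explicit
// check only at the point where the expected value changes.
public class VersionedExpectation {
    // Stand-in for an index format version, ordered oldest to newest.
    // FA is hypothetical: a future version appended to the list.
    public enum Version {
        DB, EB, FA;

        public boolean onOrAfter(Version other) {
            return compareTo(other) >= 0;
        }
    }

    // Expected row count for a single restriction.
    // Both numbers are placeholders, not values from the real test.
    public static double expectedRowCount(Version version) {
        if (version.onOrAfter(Version.EB))
            return 97.0;
        return 100.0;
    }

    public static void main(String[] args) {
        // Iterating over values() means a newly added version is covered
        // without writing a new test case for it.
        for (Version version : Version.values())
            System.out.println(version + " -> " + expectedRowCount(version));
    }
}
```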

k-rus · 2025-01-24

As I mentioned in my previous comment, your suggestion would hide an important distinction: different versions calculate row counts differently. One difference comes from maintaining cached histograms in the latest version. Different index data formats could also be a reason, but that wasn't observed and wasn't exhaustively tested.

@pkolaczk What is the functional requirement for the test behind your suggestion? Is it that not all versions are tested and, more specifically, that a newly introduced version would not be covered by the test? I.e., that the test is difficult to maintain and may become obsolete?

I can think about providing expected row counts per group of versions and leaving the latest group unbounded, i.e., unknown versions would be assumed to implement the histogram. If that sounds good, I would address it in a separate PR and merge this PR with the current limited approach. What do you think?
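
One possible reading of the "row counts per version group, latest group unbounded" idea, sketched with a sorted map keyed by the first version of each group. Everything here is an assumption made for illustration: the class name, the local `Version` enum (with `FA` as a hypothetical future version), and the numeric values are not from the PR.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: expected row counts keyed by the first version of a group.
// A group runs until the next key; the last group has no upper bound.
public class RowCountsByVersionGroup {
    // Stand-in version enum, oldest to newest. FA is a hypothetical future version.
    public enum Version { DB, EB, FA }

    private final TreeMap<Version, Double> expectedByGroupStart = new TreeMap<>();

    // Registers the expected count for all versions from firstVersion onwards,
    // until another group starts.
    public RowCountsByVersionGroup from(Version firstVersion, double expected) {
        expectedByGroupStart.put(firstVersion, expected);
        return this;
    }

    // Finds the group with the greatest start version not after the given version.
    public double expectedFor(Version version) {
        Map.Entry<Version, Double> group = expectedByGroupStart.floorEntry(version);
        if (group == null)
            throw new IllegalArgumentException("No expectation covers " + version);
        return group.getValue();
    }

    public static void main(String[] args) {
        // Placeholder values, chosen only to show the lookup.
        RowCountsByVersionGroup counts = new RowCountsByVersionGroup()
            .from(Version.DB, 100.0)
            .from(Version.EB, 97.0);
        for (Version version : Version.values())
            System.out.println(version + " -> " + counts.expectedFor(version));
    }
}
```

In this sketch a version newer than every registered group start falls into the last group, which matches the stated assumption that unknown versions behave like the latest known one.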

pkolaczk · 2025-01-24

I feel that if there is a commonly used mechanism for multi-version tests, it should be the default way of testing, rather than handling multiple versions manually in a different way. Just for consistency. And yes, being able to quickly add new versions without duplicating most tests is a bonus of using an existing system. But as I said earlier, it is fine not to do this in this PR. I just wanted to highlight that this functionality is available in SAITester, and it's really up to you whether you find it useful. If you think it would introduce unnecessary complexity in this particular test, no problem, let's merge it. I don't want to hold up perfectly fine functionality just to make the tests look nicer.

k-rus · 2025-01-24

https://github.com/riptano/cndb/issues/12559

k-rus · 2025-01-24

"I feel like if there is a commonly used mechanism to do multi version tests, it should be the default way of testing"

Applying this default would defeat the purpose of this test: to demonstrate how versions affect row counts.

k-rus force-pushed the rf-row-count-test-faster branch from cb4f9c5 to a0fe1cb on January 14, 2025 20:32

k-rus added 3 commits January 14, 2025 22:04

Allow to run optimizer

54f6c5f

There is no need to disable the optimizer, since it cannot optimize anything away. Disabling it was originally necessary while the anti-join node was being introduced.

Store created CFS in a map

464948e

Fails due to a flush when the next table is created, and on cleanup after a test run.

Improve performance of SingleRestrictionEstimatedRowCountTest

0b231f9

Reduces the number of created tables by creating all needed tables in advance. As a result, the test can be placed in a single test function. This improves local test execution time from 5.5 seconds to 1.4 seconds.

k-rus force-pushed the rf-row-count-test-faster branch from a0fe1cb to 0b231f9 on January 14, 2025 21:05

k-rus requested a review from a team January 15, 2025 08:45

Remove use of var and other minor improvements

44974c0

pkolaczk approved these changes Jan 20, 2025

k-rus mentioned this pull request Jan 21, 2025

Fix expected values in SingleRestrictionEstimatedRowCountTest #1523

Merged

9 tasks

k-rus merged commit bf469c7 into main Jan 24, 2025
465 of 472 checks passed

k-rus deleted the rf-row-count-test-faster branch January 24, 2025 13:34
