Skip to content

Reuse CRC when appending index components #1272

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 7, 2024

Conversation

pkolaczk
Copy link

Fixes #10873

@pkolaczk pkolaczk requested a review from jasonstack September 11, 2024 20:16
@pkolaczk pkolaczk force-pushed the cndb-10873-segment-checksum branch 4 times, most recently from a7fdb55 to 4697f26 Compare September 11, 2024 20:26
Copy link

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have unit tests that cover this change?

@pkolaczk
Copy link
Author

Yes, there is.
NativeIndexDDLTest.verifyRebuildCorruptedFiles.
And it is failing. Huh, checking that.

@pkolaczk pkolaczk marked this pull request as draft September 12, 2024 07:25
@pkolaczk pkolaczk force-pushed the cndb-10873-segment-checksum branch from 4697f26 to 8ba1820 Compare September 12, 2024 08:34
@pkolaczk pkolaczk marked this pull request as ready for review September 12, 2024 08:34
@pkolaczk
Copy link
Author

Ok, apparently our usage is a bit more complex.
I made the implementation safer and less sensitive to usage patterns.

@pkolaczk pkolaczk force-pushed the cndb-10873-segment-checksum branch from 8ba1820 to 4b8aae8 Compare September 12, 2024 18:40
@pkolaczk
Copy link
Author

@eolivelli can you have another look? I hardened it. It's not perfect but I can only go so far in Java without affine types ;)

@pkolaczk pkolaczk force-pushed the cndb-10873-segment-checksum branch from 4b8aae8 to 5915713 Compare September 12, 2024 18:45
Copy link

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the current approach works, but I would feel more confident if we add some unit tests about concurrent access.

@pkolaczk pkolaczk force-pushed the cndb-10873-segment-checksum branch from 48ffd81 to f506fc9 Compare September 13, 2024 11:06
@pkolaczk
Copy link
Author

pkolaczk commented Sep 13, 2024

I think that the current approach works, but I would feel more confident if we add some unit tests about concurrent access.

There are plenty of scenarios tested in NativeIndexDDLTest, which also test for things like e.g. interrupting index build in the middle or parallel index builds.
Also, not sure what kind of concurrent testing I should perform on a class that's not thread safe anyways.
You must not run two writers on the same file.

@eolivelli
Copy link

a class that's not thread safe anyways

The class is not thread safe, but we are adding some shared state, so it is not a matter of testing a single instance, but multiple instances accessing the same entries in the shared state

@pkolaczk
Copy link
Author

pkolaczk commented Sep 13, 2024

but we are adding some shared state

No, the added state is still split by the file name; it is not shared. The only thing that's shared is the cache instance, but that one is thread safe, and I guess it's well tested and does the right thing. Should I test library code?

Earlier:
A. 2 writer instances operating on different files concurrently: ok
B. 2 writer instances operating the same file concurrently: incorrect (this is what I meant by saying it's not thread safe)
C. 2 threads calling into 1 writer instance: incorrect

Now:
A. 2 writer instances operating on different files concurrently: ok (because each gets their own FileChecksum by how Cache works)
B. 2 writer instances operating on the same file concurrently: incorrect
C. 2 threads calling into 1 instance: incorrect

What particular scenario do you have in mind?
The concurrent index build test + other NativeDDLIndexTests should already cover scenario A.
Do you want me to test scenarios B and C, so I can show they are not "more incorrect" than before?

Copy link

@jasonstack jasonstack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, can you create corresponding CNDB pr and make sure the multi-segment index tests in SingleCompactorTest passes?

This commit improves performance of appending SAI components
by avoiding unnecessary computation of CRC from the beginning
of the file each time it is opened for appending.

Fixes riptano/cndb#10783
@pkolaczk pkolaczk force-pushed the cndb-10873-segment-checksum branch from 5bed2c7 to f4a176c Compare October 2, 2024 15:19
Copy link

sonarqubecloud bot commented Oct 2, 2024

@pkolaczk
Copy link
Author

pkolaczk commented Oct 3, 2024

I added warnings and I created a PR on the cndb side to test SingleCompactorTest: https://github.com/riptano/cndb/pull/11145

@jasonstack is it good now? Can I merge?

@pkolaczk
Copy link
Author

pkolaczk commented Oct 7, 2024

Zhao posted his answer on the cndb PR, so crosslinking it here:
https://github.com/riptano/cndb/pull/11145#issuecomment-2393747297

LeaderTableTest and CompactionTasksOrganizerTest are not related. The CI looks good. Let's rebase and megre

Merging then. Thank you @jasonstack !

Copy link

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@pkolaczk pkolaczk merged commit b42d418 into main Oct 7, 2024
455 of 472 checks passed
@pkolaczk pkolaczk deleted the cndb-10873-segment-checksum branch October 7, 2024 09:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants