feat: if the user asks not to load files, make EagerSnapshot _never_ load files #3580
base: main
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3580      +/-   ##
==========================================
- Coverage   74.51%   74.48%   -0.03%
==========================================
  Files         146      146
  Lines       44875    44898      +23
  Branches    44875    44898      +23
==========================================
+ Hits        33437    33441       +4
- Misses       9245     9250       +5
- Partials     2193     2207      +14
It's difficult for concurrent tests to be a thorough test of correctness, since there are many possible interleaving execution paths.
@itamarst I think I should be able to come up with a test which doesn't necessarily require the concurrency that your linked issue has. If you're okay with this pull request sitting open for a few days, I will summon the brainpower and come up with something 😄
Of course, appreciate your feedback!
Previously the files would be loaded on a conflict, now they're never loaded. Signed-off-by: Itamar Turner-Trauring <[email protected]>
7fdd69f to 7a464f4
…e read This makes it easier to understand whether or not the function has actually _read_ the ReplayStream Signed-off-by: R. Tyler Croy <[email protected]>
7a464f4 to 53ba1e8
@itamarst @rtyler - Currently I am working on take 3 of getting log replay via kernel in, which touches a lot of the same fields as this PR. More generally speaking, the However
    if self.files.is_none() {
        self.process_visitors(visitors)?;
        return Ok(read_data);
    }
I think this may actually break things in unexpected ways. Specifically, process_visitors does nothing without doing a replay. Its main use case right now is idempotent writes (Txn actions ...).
However, I'm also not sure how this would work with require_files right now; we may just have an existing bug ...
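To make the concern above concrete, here is a minimal toy sketch in Rust. It is not delta-rs code: `ToyAction`, `TxnVisitor`, and `update` are hypothetical names invented for illustration, and the assumption (taken from the comment, not verified against the codebase) is that a visitor only learns about `Txn` actions while the log is being replayed.

```rust
/// Toy log action; NOT the real delta-rs action type (hypothetical).
#[derive(Debug)]
enum ToyAction {
    Add(String),
    Txn { app_id: String, version: i64 },
}

/// Hypothetical visitor that records every transaction action it is shown.
#[derive(Default)]
struct TxnVisitor {
    seen: Vec<(String, i64)>,
}

impl TxnVisitor {
    fn visit(&mut self, action: &ToyAction) {
        if let ToyAction::Txn { app_id, version } = action {
            self.seen.push((app_id.clone(), *version));
        }
    }
}

/// Feed new log actions to the visitor unless `skip_replay` is set.
/// Returns the number of actions the visitor was shown.
fn update(actions: &[ToyAction], visitor: &mut TxnVisitor, skip_replay: bool) -> usize {
    if skip_replay {
        // Models an early return taken before any replay happens:
        // the visitor is never shown the new actions.
        return 0;
    }
    for action in actions {
        visitor.visit(action);
    }
    actions.len()
}
```

In this toy model, calling `update` with `skip_replay` set leaves `TxnVisitor::seen` empty even when the new commits contain a `Txn` action, which is the kind of silent gap an idempotent-write check would care about. Whether the real early return behaves this way is exactly the open question in this thread.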
@roeap anything is possible so sure it might break things but we have no tests to prove otherwise 🤷
Using So the main goal here from my perspective is just "allow scaling appends". So a different approach is to maybe... reopen the database on a conflict in "append" mode, so the code sticks to the fast path? Which current abstraction layers may not allow or may just not make sense. I'll take a look later. And I guess worst case there's the hopefully successful refactor.
Pull request was converted to draft
So looking at the code, the alternative solution of "reopen the database on conflict" doesn't seem to be feasible without a bunch of potentially difficult refactoring. The place you'd want to do it is
Previously the files would be loaded on a conflict even if the user requested they not be loaded, now they're never loaded.
This is an optimization, reducing the performance cost of loading unneeded files when in append mode.
For my reproducer benchmark (see #3528), this speeds things up by about 30%.
IMPORTANT: It is not clear to me if this is semantically correct! On the one hand, if you're doing append-only writes, I don't see why you'd need to load the files; this shouldn't be any different than, e.g., restarting from scratch by reopening the database. On the other hand, maybe this code path is used in other situations beyond appends?
Are there other tests I should write?
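The intended behavior described above (never materialize files when the user asked not to, even when the snapshot has to catch up after a conflict) can be sketched with a toy model. This is an illustration only: `ToySnapshot`, `ToyCommit`, and `advance` are hypothetical names, and the real `EagerSnapshot` tracks far more state than a list of paths.

```rust
/// Toy stand-in for a snapshot; NOT the real delta-rs `EagerSnapshot` (hypothetical).
#[derive(Debug)]
struct ToySnapshot {
    version: u64,
    /// `None` models "the user asked not to load files".
    files: Option<Vec<String>>,
}

/// One commit in a toy log: the file paths it adds (hypothetical).
struct ToyCommit {
    added: Vec<String>,
}

impl ToySnapshot {
    /// Advance the snapshot past `commits`, e.g. after a write conflict.
    /// Returns how many file entries were materialized in memory.
    fn advance(&mut self, commits: &[ToyCommit]) -> usize {
        self.version += commits.len() as u64;
        // Fast path: files were never requested, so do not start loading them now.
        let Some(files) = self.files.as_mut() else {
            return 0;
        };
        let mut loaded = 0;
        for commit in commits {
            for path in &commit.added {
                files.push(path.clone());
                loaded += 1;
            }
        }
        loaded
    }
}
```

With `files: None`, `advance` still moves the version forward but leaves `files` as `None`, so an append-only writer stays on the cheap path. With `files: Some(..)`, every added path is collected as before.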
Related Issue(s)
Fixes #3528