Use double‑checked locking to eliminate redundant metadata lookups #5538
CachedTablet now = findTabletInCache(row);
if (now != null) {
  if (now.getTserverLocation().isPresent() && lcSession.checkLock(now) != null) {
In support of ondemand tablets, tablets w/o a location can be cached in 4.0 (was not the case in 2.1). So these checks that look for the presence of a location do not seem correct.
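To make the concern concrete, here is a minimal sketch, assuming a simplified stand-in for the cached entry (`CachedEntrySketch` and both guard methods are hypothetical, not the project's classes). It only shows that a guard requiring a location treats a cached but unhosted tablet as a miss, while a guard that accepts any cached entry treats it as a hit.

```java
import java.util.Optional;

// Hypothetical stand-in for a cached tablet entry; not the project's actual class.
final class CachedEntrySketch {
  private final String extent;
  private final Optional<String> tserverLocation;

  CachedEntrySketch(String extent, Optional<String> tserverLocation) {
    this.extent = extent;
    this.tserverLocation = tserverLocation;
  }

  String getExtent() {
    return extent;
  }

  Optional<String> getTserverLocation() {
    return tserverLocation;
  }

  // Guard in the style of the check under review: a hit requires a location.
  static boolean hitRequiresLocation(CachedEntrySketch entry) {
    return entry != null && entry.getTserverLocation().isPresent();
  }

  // Alternative guard: any cached entry counts as a hit, hosted or not.
  static boolean hitAnyEntry(CachedEntrySketch entry) {
    return entry != null;
  }
}
```

With the first guard, an entry cached without a location falls through to another lookup even though it is already in the cache.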
try (var unused = lookupLocks.lock(ptl.getExtent())) {
  // See if entry was added to cache by another thread while we were waiting on the lock
  var cached = findTabletInCache(row);
  if (cached != null && cached.getCreationTimer().startedAfter(timer)) {
What made you suspect the timer here? I looked into the specifics of the timer and found how it's implemented here: https://github.com/openjdk/jdk/blob/b6b5ac1ef9042ed62a8358aa6943b8dc87dcf0ab/src/hotspot/os/posix/os_posix.cpp#L1474 . The docs for that system call mention that it will not go backwards, but there is also no guarantee it will go forward between two calls. So it's possible two threads got the exact same nanoTime, causing the entry to look stale; that is probably ok. We could make the time comparison >= instead of > if that is the case.
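A minimal sketch of the tie case, assuming the timer check reduces to comparing two `System.nanoTime()` readings (the class and method names below are hypothetical and are not the project's timer API):

```java
// Hypothetical helper; it only mirrors the strict vs. inclusive comparison discussed above.
final class NanoTimeTieSketch {

  // Strict form: an entry created at the same reading is not considered newer.
  static boolean startedAfterStrict(long entryNanos, long referenceNanos) {
    // Compare by difference, as the System.nanoTime() Javadoc recommends,
    // so the result stays correct if the counter overflows.
    return entryNanos - referenceNanos > 0;
  }

  // Inclusive form: an entry created at the same reading counts as newer.
  static boolean startedAfterInclusive(long entryNanos, long referenceNanos) {
    return entryNanos - referenceNanos >= 0;
  }
}
```

With equal readings the strict form reports the entry as stale, which costs one extra lookup. The inclusive form accepts it, but it would also accept an entry that was already cached before the caller started, if both happen to share a reading.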
Changing the > to a >= is iffy and probably not good, so please ignore that suggestion. One way we could avoid comparing the time or anything else is to just look for a change in the object reference. So maybe we could do the following?
CachedTablet before = findTabletInCache(row);
try (var unused = lookupLocks.lock(ptl.getExtent())) {
  CachedTablet after = findTabletInCache(row);
  if (after != before && after != null) {
    // another thread probably did the lookup while we were waiting on the lock, so let's use that
    return;
  }
  // otherwise continue with the lookup while still holding the lock
}
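The reference-comparison idea can be sketched in a self-contained form. This is only an illustration under stated assumptions: `IdentityCheckedCacheSketch`, `peek`, `refreshIfUnchanged`, and `expensiveLookup` are hypothetical names, a single `ReentrantLock` stands in for the per-extent lookup lock, and a counter stands in for the metadata lookup.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified cache; names are illustrative, not the project's API.
final class IdentityCheckedCacheSketch {
  private final ConcurrentHashMap<String, Object> cache = new ConcurrentHashMap<>();
  private final ReentrantLock lookupLock = new ReentrantLock();
  final AtomicInteger expensiveLookups = new AtomicInteger();

  // Snapshot of the current entry, taken by the caller before waiting on the lock.
  Object peek(String row) {
    return cache.get(row);
  }

  // Performs the expensive lookup unless the entry changed since `before` was taken.
  Object refreshIfUnchanged(String row, Object before) {
    lookupLock.lock();
    try {
      Object after = cache.get(row); // second check, now under the lock
      if (after != before && after != null) {
        // Another thread replaced the entry while we waited, so reuse its result.
        return after;
      }
      Object fresh = expensiveLookup(row);
      cache.put(row, fresh);
      return fresh;
    } finally {
      lookupLock.unlock();
    }
  }

  // Stand-in for the metadata lookup; each call yields a new object identity.
  private Object expensiveLookup(String row) {
    expensiveLookups.incrementAndGet();
    return new Object();
  }
}
```

Two callers that both took their snapshot before the first refresh will trigger only one lookup: the second caller sees a different, non-null reference under the lock and returns it. Identity comparison sidesteps clock resolution entirely, at the cost of relying on every refresh installing a new object.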
Fixes #5535
Swap out the Timer‑based check for a double‑checked‑locking pattern on the lock, and only short‑circuit when the cached tablet actually has a valid location.
Also made the failing IT case more concurrency-intensive to better exercise the new logic.