Skip to content

Make log servers return an empty version range only when it is correct to do so #12171

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 7 commits into from

Conversation

sbodagala
Copy link
Contributor

In the context of version vector/unicast, make log servers return an empty version range (on peeks) such that the receiver won't miss any versions that it is supposed to receive.

  • Make the cluster controller collect the set of log servers participated in recovery and propagate that information to the other processes

  • Extend the ServerPeekCursor to take a flag that tells the source log server whether it can return an empty version range or not (in the context of version vector/unicast), and make the source log server return an empty version range only when this flag is set

  • Make the peek APIs set the flag appropriately when initializing ServerPeekCursors

  • Also, take the internal logic used by SetPeekCursor and MergedPeekCursor into account while initializing the above mentioned flag.

Code-Reviewer Section

The general pull request guidelines can be found here.

Please check each of the following things and check all boxes before accepting a PR.

  • The PR has a description, explaining both the problem and the solution.
  • The description mentions which forms of testing were done and the testing seems reasonable.
  • Every function/class/actor that was touched is reasonably well documented.

For Release-Branches

If this PR is made against a release-branch, please also check the following:

  • This change/bugfix is a cherry-pick from the next younger branch (younger release-branch or main if this is the youngest branch)
  • There is a good reason why this PR needs to go into a release branch and this reason is documented (either in the description above or in a linked GitHub issue)

information, needed by the recovery algorithm when unicast is enabled:
- PrevVersion of a version
- List of log servers that a commit proxy sends a commit version to

And, extend the code to populate "unknownCommittedVersions" list (the
list of commit versions whose commit status is not known, and to be
determined by the recovery version computation algorithm) on a log
server restart.

NOTE: Please note that these changes will cause version incompatibility
and so additional code/logic will need to be added to make sure that
upgrades/restart related simulation tests work properly.
committed version onwards in the case where the known committed version
is behind LogData::persistentDataDurableVersion (and version vector
unicast is enabled). This is so we will populate "LogData::unknownCommittedVersions",
on log server restart, with all versions that are needed by unicast recovery.
…mmittedVersion"

onwards in log server's disk queue.
from known committed version onwards, when version vector is enabled.
apple#11815 in "main".
In case of spill by reference, log servers should use the logic that
is based over "TagData::popped" to decide how long to keep the disk
queue positions of versions in memory (instead of using the logic that
is based over "LogData::persistentDataVersion", which is applicable to
spill by value case).
an empty version range (on peeks) such that the receiver won't miss
any versions that it is supposed to receive.

- Make the cluster controller collect the set of log servers
participated in recovery and propagate that information to the other
processes
- Extend the ServerPeekCursor to take a flag that tells the source
log server whether it can return an empty version range or not (in
the context of version vector/unicast), and make the source log
server return an empty version range only when this flag is set
- Make the peek APIs set the flag appropriately when initializing
ServerPeekCursors
- Also, take the internal logic used by SetPeekCursor and MergedPeekCursor
into account while initializing the above mentioned flag.
@sbodagala sbodagala requested review from dlambrig and jzhou77 May 28, 2025 22:16
@sbodagala
Copy link
Contributor Author

NOTE: I didn't do a good job of making this branch. Please note that the very last commit is the only commit that is relevant to this task, thanks!

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-arm on Linux CentOS 7

  • Commit ID: 9a34167
  • Duration 0:04:48
  • Result: ❌ FAILED
  • Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; if [[ $FDB_VERSION =~ 7\.\3. ]]; then echo skip; else exit 1; fi; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr on Linux RHEL 9

  • Commit ID: 9a34167
  • Duration 0:05:00
  • Result: ❌ FAILED
  • Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; if [[ $FDB_VERSION =~ 7\.\3. ]]; then echo skip; else exit 1; fi; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-cluster-tests on Linux RHEL 9

  • Commit ID: 9a34167
  • Duration 0:05:03
  • Result: ❌ FAILED
  • Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; if [[ $FDB_VERSION =~ 7\.\3. ]]; then echo skip; else exit 1; fi; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang on Linux RHEL 9

  • Commit ID: 9a34167
  • Duration 0:05:07
  • Result: ❌ FAILED
  • Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; if [[ $FDB_VERSION =~ 7\.\3. ]]; then echo skip; else exit 1; fi; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-ide on Linux RHEL 9

  • Commit ID: 9a34167
  • Duration 0:05:07
  • Result: ❌ FAILED
  • Error: Error while executing command: if [[ $(git diff --shortstat 2> /dev/null | tail -n1) == "" ]]; then echo "CODE FORMAT CLEAN"; else echo "CODE FORMAT NOT CLEAN"; echo; echo "THE FOLLOWING FILES NEED TO BE FORMATTED"; echo; git ls-files -m; echo; if [[ $FDB_VERSION =~ 7\.\3. ]]; then echo skip; else exit 1; fi; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@sbodagala sbodagala marked this pull request as draft May 28, 2025 22:26
@@ -390,8 +393,8 @@ Future<Reference<ILogSystem::IPeekCursor>> LogRouterData::getPeekCursorData(Refe
.When(logSystemChanged,
[&](const Void&) {
if (logSystem->get()) {
result =
logSystem->get()->peekLogRouter(dbgid, beginVersion, routerTag, useSatellite, recoverAt);
result = logSystem->get()->peekLogRouter(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe LogRouter doesn't explicitly need to receive the list of participating log servers (receiving it as part of LogSystemConfig is probably good enough). I will need to look into that.

@@ -229,12 +229,15 @@ struct ILogSystem {
int fastReplies;
int unknownReplies;

bool returnEmptyIfStopped; // valid only when unicast is enabled
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The source log server can return an empty version range only if this flag is set.

@@ -212,6 +213,13 @@ struct TagPartitionedLogSystem final : ILogSystem, ReferenceCounted<TagPartition
Optional<UID> debugID,
Optional<std::unordered_map<uint16_t, Version>> tpcvMap) final;

void maybeAdjustBestServer(int bestSet,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note that this API may need to be called in other peek calls too, thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log server availability check currently done in LogSystemPeekCursor.actor.cpp (so we can avoid the need to call this logic explicitly on peek calls). Please check #12188 for details, thanks!

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: 9a34167
  • Duration 0:37:52
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: 9a34167
  • Duration 0:59:44
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@dlambrig
Copy link
Contributor

dlambrig commented Jun 2, 2025

NOTE: I didn't do a good job of making this branch. Please note that the very last commit is the only commit that is relevant to this task, thanks!

Can a clean branch be created which can be merged into main?

LogLockInfo lockInfo,
std::vector<Reference<AsyncVar<bool>>> failed = std::vector<Reference<AsyncVar<bool>>>(),
Optional<Version> lastEnd = Optional<Version>());
Optional<std::tuple<Version,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this tuple has become too big and hard to read! Can you please make it into a structure?

Copy link
Contributor Author

@sbodagala sbodagala Jun 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done already (as part of another PR). We don't see that change here because this branch is behind that commit.

@@ -2255,7 +2319,7 @@ Optional<std::tuple<Version, Version>> getRecoverVersionUnicast(
// (N - replicationFactor + 1) log servers must be available.
// @todo Also do the replication policy validation check here, and enable
// that check in "getDurableVersion()" at that point.
if (!(versionAvailableTLogs[version].count() >= tLogs.count() - replicationFactor + 1)) {
if (!((versionAvailableTLogs[version].count() >= tLogs.count() - replicationFactor + 1) || !validatePolicy)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a very complicated expression, can it be simplified- e.g.
if (versionAvailableTLogs[version].count() < tLogs.count() - replicationFactor + 1 && validatePolicy) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reply as above (and this change is not relevant to the peek issue)..

@@ -667,13 +668,49 @@ Future<Version> TagPartitionedLogSystem::push(const ILogSystem::PushVersionSet&
tLogCommitResults.push_back(commitSuccess);
location++;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove

int& bestServer,
std::vector<Reference<LogSet>> const& logSets,
Optional<Version> end,
Optional<std::map<uint8_t, std::vector<uint16_t>>> knownLockedTLogIds) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can knownLockedTLogIds be a reference ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can do that. Please check the declaration of this method, thanks!

Optional<Version> end,
Optional<std::map<uint8_t, std::vector<uint16_t>>> knownLockedTLogIds) {
if (SERVER_KNOBS->ENABLE_VERSION_VECTOR_TLOG_UNICAST && bestServer >= 0 && end.present() &&
end.get() != std::numeric_limits<Version>::max()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we ASSERT that knownLockedTLogIds is present? We are in recovery because end version != -1, and recovery is the only valid situation we will enter this code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -708,8 +749,10 @@ Reference<ILogSystem::IPeekCursor> TagPartitionedLogSystem::peekAll(UID dbgid,
.detail("Begin", begin)
.detail("End", end)
.detail("BestLogs", localSets[bestSet]->logServerString());
int bestServer = localSets[bestSet]->bestLocationFor(tag);
maybeAdjustBestServer(bestSetIdx, bestServer, tLogs, end, knownLockedTLogIds);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to clarify intent for future readers of the code, rename to validateBestServerOnRecovery() ?

bestServer = -1;
return;
}
if (!IFailureMonitor::failureMonitor()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do these conditionals check if the best server is no longer reachable even though it was earlier stopped? Was this something you encountered while testing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Essentially, we are trying to do all the checks that "ServerPeekCursor::isActive()" does. Please check PR #12188 for a modified version of this, thanks!

@sbodagala
Copy link
Contributor Author

NOTE: I didn't do a good job of making this branch. Please note that the very last commit is the only commit that is relevant to this task, thanks!

Can a clean branch be created which can be merged into main?

Done (PR: #12188).

@sbodagala
Copy link
Contributor Author

Closing this. Please check/review PR #12188, thanks!

@sbodagala sbodagala closed this Jun 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants