Add replication policy doc #12179
Conversation
The commit proxy uses the shard map to derive which log servers to send committed data to. Each SS has a "buddy" with a single corresponding log server. The buddy is found by taking the SS id modded by the number of tLogs in the dc.
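The mod rule lends itself to a one-line sketch. A minimal, illustrative version (the function name and parameter types are hypothetical, not FDB's actual identifiers):

```cpp
#include <cstdint>

// Static mapping: the SS id modded by the number of tLogs in the dc
// picks the index of that storage server's "buddy" log server.
int buddyLogServer(uint64_t storageServerId, int numTLogsInDc) {
    return static_cast<int>(storageServerId % numTLogsInDc);
}
// e.g. with 4 tLogs, SS ids 6 and 10 both map to buddy 2 -- two SS can
// share a buddy, which is why a random pick is sometimes needed (see below).
```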
When a mutation is committed, the key(s) it modifies will logically reside within one or more shards, which physically are replicated over the SS in their team. Each such SS has a buddy …
For 'more shards', clearRange is the only case. Am I understanding correctly? Thanks!
This paragraph should be modified so it talks about a "batch" of mutations. For each mutation in the batch we find the set of replicas, and then we take the union of all the sets to get the final list of destination servers. For a single mutation I believe you are right.
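A minimal sketch of that union-over-a-batch idea, with hypothetical names (not FDB code); the replica lookup is passed in as a callable to keep the sketch self-contained:

```cpp
#include <functional>
#include <set>
#include <string>
#include <vector>

using ServerId = int;
using Team = std::set<ServerId>;

// For each mutation's key, look up its replica team (e.g. via the shard
// map) and union the teams to get the commit's final destination set.
Team destinationServers(const std::vector<std::string>& mutationKeys,
                        const std::function<Team(const std::string&)>& replicasFor) {
    Team dest;
    for (const auto& key : mutationKeys) {
        Team team = replicasFor(key);
        dest.insert(team.begin(), team.end());
    }
    return dest;
}
```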
- Shards move between SS as data is distributed.
- New shards can be created.
This means the destination team must be recalculated on each commit.
To be specific, does the "dest team" here mean the SS team that the mutation goes to at a given version? If yes, does the "recalculation" mean that, given a key/range, we find the source server and the dest server (if there is a data move) at that version? This is achieved by looking up the in-memory shard map at the commit proxy.
Here, I just mean FDB has to calculate a set of destination log servers on every commit. Every commit will contain a batch of mutations with keys written, and we need to enumerate them to get the modified keys. I'll clarify the sentence.
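A toy model of such a shard-map lookup (illustrative only, not FDB's implementation): the map is keyed by shard start-keys, and a key's team is the entry with the greatest boundary at or below it.

```cpp
#include <iostream>
#include <map>
#include <set>
#include <string>

using Team = std::set<int>; // storage server ids

// Find the team owning `key`: step back from the first boundary > key.
// Assumes the map has an entry for "" covering the start of keyspace.
Team lookup(const std::map<std::string, Team>& shardMap, const std::string& key) {
    auto it = shardMap.upper_bound(key);
    --it;
    return it->second;
}

int main() {
    std::map<std::string, Team> shardMap = {
        { "",  {0, 1, 2} }, // ["", "m")  -> team {0,1,2}
        { "m", {1, 3, 4} }, // ["m", end) -> team {1,3,4}
    };
    for (int ss : lookup(shardMap, "apple")) std::cout << ss << ' '; // 0 1 2
    std::cout << '\n';
    for (int ss : lookup(shardMap, "zebra")) std::cout << ss << ' '; // 1 3 4
    std::cout << '\n';
}
```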
- There are code paths where the shard map is not consulted, for example, when a tag is added to a transaction as a consequence of a metadata mutation.
- Two SS may have the same buddy, in which case a random server must be chosen.
Though users interact with only a few predefined redundancy modes (double, triple, etc.), FoundationDB internally supports a **rich set of compositional …**
Nice to see this description! For a better understanding of the policy, can you explain what happens in Across(3, zone_id, One()).selectReplicas() when the input alsoServers contains two tLog IDs but the two IDs are of the same zone? Is this case possible? If yes, what will selectReplicas() do to meet the policy? Thanks!
In this case the "also" servers would not meet the policy. The code would have to search the set of "available" servers to find a combination that works. If more than one server from the available list meets the requirements, a random one from that set would be selected.
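A toy sketch of that fallback, assuming an Across(3, zone_id, One())-style constraint; the names and structure here are illustrative, not FDB's actual selectReplicas:

```cpp
#include <algorithm>
#include <random>
#include <set>
#include <string>
#include <vector>

struct Server { int id; std::string zone; };

// Start from the "also" servers; if they don't already cover enough
// distinct zones, draw randomly from the available servers, keeping only
// those that add a new zone. Returns empty if the policy cannot be met.
std::vector<Server> selectAcrossZones(std::vector<Server> also,
                                      std::vector<Server> available,
                                      size_t zonesNeeded = 3) {
    std::set<std::string> zones;
    for (const auto& s : also) zones.insert(s.zone);

    std::mt19937 rng{std::random_device{}()};
    std::shuffle(available.begin(), available.end(), rng); // random pick among candidates

    std::vector<Server> chosen = also;
    for (const auto& s : available) {
        if (zones.size() >= zonesNeeded) break;
        if (zones.insert(s.zone).second) // only servers from not-yet-covered zones
            chosen.push_back(s);
    }
    return zones.size() >= zonesNeeded ? chosen : std::vector<Server>{};
}
```

So two "also" servers in the same zone count as one zone toward the policy, and the remaining zones are filled at random from the available list, matching the behavior described above.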
This description of policies is super useful!
TLogs: **0** 1 **2** **3**
The "buddy" associations between the SS and LS are dynamic. |
I don't think this statement is completely true: the buddy association between a storage server and a log server is static, as it is decided by the mod function; the association between a shard and a log server is dynamic though.
replication policy. If they do not, the function will select additional servers at random out of the available servers in order to meet the policy. Because the shard map has replicas for all key ranges, the set of "also" servers passed to selectReplicas usually already satisfies the policy, but there are exceptions.
- There are code paths where the shard map is not consulted, for example, when a tag is added to a transaction as a consequence of a metadata mutation.
Just so I fully understand, what type of metadata mutation are you referring to here? Thanks!
Look for the pattern addTag() followed by writeMutation() in ApplyMetadataMutation.cpp, such as in checkSetServerKeysPrefix(), checkSetServerTagsPrefix(), checkSetChangeFeedPrefix(), etc. In those cases we obtain a tag without consulting the shard map, so we do not have replicas.
Describe the replication policy.
When reviewing in GitHub, click on "rich diff" to see the markup rendered.