Prioritized Partitions Simulation #3028
base: master
This is a fairly risky change in the critical section of replication.
Although it is controlled using cfg2, is there a way to further limit it to one host?
At the very least, IMO, we should test this change in Perf/Ei using binary hot reload.
@@ -61,7 +61,13 @@ public DataNodeTracker(DataNodeId dataNodeId, List<RemoteReplicaInfo> remoteRepl

// for each of smaller array of remote replicas create active group trackers with consecutive group ids
for (List<RemoteReplicaInfo> remoteReplicaList : remoteReplicaSegregatedList) {
ActiveGroupTracker activeGroupTracker = new ActiveGroupTracker(currentGroupId, remoteReplicaList.stream()
int size = remoteReplicaList.size();
Let's extract this code so the logic lives in a single place. Currently it is evaluated in two places, and any change in one must be replicated in the other. Roughly:
list foo(inList, bool IsReplicationEnablePrioritzation, replicationMaxPrioritizedReplicasPercent) {
  int size = inList.size();
  if (IsReplicationEnablePrioritzation) {
    // multiply first so integer division does not truncate the percentage to 0
    int maxSize = size * replicationMaxPrioritizedReplicasPercent / 100;
    size = Math.min(size, maxSize);
  }
  return inList.subList(0, size);
}
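For reference, the helper suggested here could look roughly like the following in runnable Java. This is only a sketch: the class name `PrioritizedSubset`, the method name `limitToPrioritized`, and the parameter names are hypothetical and not taken from the PR, and it assumes the percentage is an `int`. Under that assumption the multiplication has to come before the division, because `percent / 100 * size` with `int` operands evaluates to 0 for any percentage below 100.

```java
import java.util.Arrays;
import java.util.List;

class PrioritizedSubset {
  /**
   * Returns the leading portion of inList when prioritization is enabled,
   * or a view of the whole list otherwise. All names are hypothetical.
   */
  static <T> List<T> limitToPrioritized(List<T> inList, boolean prioritizationEnabled, int maxPercent) {
    int size = inList.size();
    if (prioritizationEnabled) {
      // Multiply before dividing so int division does not truncate to 0.
      int maxSize = size * maxPercent / 100;
      size = Math.min(size, maxSize);
    }
    return inList.subList(0, size);
  }

  public static void main(String[] args) {
    List<String> replicas = Arrays.asList("r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8", "r9");
    System.out.println(limitToPrioritized(replicas, true, 30));         // [r0, r1, r2]
    System.out.println(limitToPrioritized(replicas, false, 30).size()); // 10
  }
}
```

Both call sites would then go through this one method, so a change to the rounding or capping rule only has to be made once.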
ActiveGroupTracker activeGroupTracker = new ActiveGroupTracker(currentGroupId, remoteReplicaList.stream()
int size = remoteReplicaList.size();
if (isReplicaPrioritzationEnabled) {
int maxSize = replicationMaxPrioritizedReplicas/100 * size;
You will not be able to get accurate data with this methodology. Replicas for a partition can be in different threads and different data node trackers, so you will stop one replica of a partition and not another. Also, after each iteration the list could change and a different partition will be picked up.
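To make the concern concrete, here is a toy illustration in plain Java. It does not use or model the real replication classes: each "tracker" is just a list of the partition IDs of the replicas it holds, and the class `PerTrackerTruncationDemo`, its methods, and the partition names are all hypothetical. It shows that truncating each list independently can leave a partition active in one tracker and stopped in another.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class PerTrackerTruncationDemo {
  /** Keeps the first percent% of one tracker's replica list (hypothetical rule). */
  static List<String> keepFirstPercent(List<String> partitions, int percent) {
    return partitions.subList(0, partitions.size() * percent / 100);
  }

  /** Partitions active in some tracker but stopped in another tracker that also holds them. */
  static Set<String> partiallyActive(List<List<String>> trackers, int percent) {
    Set<String> result = new LinkedHashSet<>();
    for (List<String> tracker : trackers) {
      List<String> active = keepFirstPercent(tracker, percent);
      for (List<String> other : trackers) {
        List<String> otherActive = keepFirstPercent(other, percent);
        for (String partition : active) {
          if (other.contains(partition) && !otherActive.contains(partition)) {
            result.add(partition);
          }
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Same four partitions, but each tracker happens to hold them in a different order.
    List<List<String>> trackers = Arrays.asList(
        Arrays.asList("p1", "p2", "p3", "p4"),
        Arrays.asList("p3", "p4", "p1", "p2"));
    System.out.println(partiallyActive(trackers, 50)); // [p1, p2, p3, p4]
  }
}
```

In this toy setup every partition ends up with one replica replicating and one stopped, so no partition is fully prioritized and none is fully paused, which is why a completion time measured this way would be hard to interpret.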
What's the next step on this PR? It has been open for nearly a month.
Summary
This PR experiments with estimating the completion time for bootstrapping X% of the total partitions assigned to the host. This will validate the idea that prioritized partitions can be completed in Y hours.
Testing Done
./gradlew clean build && ./gradlew allJar