Prioritized Partitions Simulation #3028


Open
wants to merge 3 commits into
base: master
Conversation

mudit-saxena
Contributor

Summary

This PR experiments with estimating the completion time for bootstrapping X% of the total partitions assigned to a host. This will validate the idea that prioritized partitions can be completed within Y hours.

Testing Done

./gradlew clean build && ./gradlew allJar


@DevenAhluwalia left a comment


This is a fairly risky change in the critical section of replication. Although it is gated via cfg2, is there a way to further limit it to a single host?

At the very least, IMO, we should test this change in Perf/Ei using the binary hot-reload approach.

@@ -61,7 +61,13 @@ public DataNodeTracker(DataNodeId dataNodeId, List<RemoteReplicaInfo> remoteRepl

// for each of smaller array of remote replicas create active group trackers with consecutive group ids
for (List<RemoteReplicaInfo> remoteReplicaList : remoteReplicaSegregatedList) {
ActiveGroupTracker activeGroupTracker = new ActiveGroupTracker(currentGroupId, remoteReplicaList.stream()
int size = remoteReplicaList.size();

Let's extract this piece of code so that the logic lives in a single place. Currently it is evaluated in two places, and any change in one place must be replicated to the other. Roughly:

List<RemoteReplicaInfo> foo(List<RemoteReplicaInfo> inList, boolean isReplicationPrioritizationEnabled,
    int replicationMaxPrioritizedReplicasPercent) {
    int size = inList.size();
    if (isReplicationPrioritizationEnabled) {
        // multiply first: dividing the percent by 100 in integer arithmetic would truncate to 0
        int maxSize = replicationMaxPrioritizedReplicasPercent * size / 100;
        size = Math.min(size, maxSize);
    }
    return inList.subList(0, size);
}
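One subtle point worth calling out in this helper: in Java's integer arithmetic, dividing the percentage by 100 before multiplying truncates to 0 for any percentage below 100, so the multiplication has to happen first. A minimal standalone sketch of the intended behavior (class and method names are hypothetical, not from the PR):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class PrioritizedSubsetDemo {
  // Caps the prioritized sublist at a percentage of the input list.
  static <T> List<T> limitPrioritized(List<T> inList, boolean enabled, int percent) {
    int size = inList.size();
    if (enabled) {
      // multiply before dividing: percent / 100 * size would truncate to 0
      size = Math.min(size, percent * size / 100);
    }
    return inList.subList(0, size);
  }

  public static void main(String[] args) {
    List<Integer> replicas = IntStream.range(0, 10).boxed().collect(Collectors.toList());
    System.out.println(limitPrioritized(replicas, true, 30).size());  // 3
    System.out.println(limitPrioritized(replicas, false, 30).size()); // 10
    System.out.println(30 / 100 * 10); // 0: the truncation pitfall
  }
}
```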

ActiveGroupTracker activeGroupTracker = new ActiveGroupTracker(currentGroupId, remoteReplicaList.stream()
int size = remoteReplicaList.size();
if (isReplicaPrioritzationEnabled) {
int maxSize = replicationMaxPrioritizedReplicas/100 * size;

You will not be able to get accurate data with this methodology. Replicas for a partition can be in different threads and different data node trackers, so you may stop one replica but not another. Also, after each iteration the list could change, and a different partition may be picked up.

@gshantanu

What's the next step on this PR? It has been open for nearly a month.

4 participants