
KAFKA-13555: Use input partitions for StickyTaskAssignor [WIP] #19670


Draft: wants to merge 8 commits into base: trunk



@lorcanj lorcanj commented May 10, 2025

Addresses: KAFKA-13555

github-actions bot added the triage and streams labels on May 10, 2025

lorcanj commented May 10, 2025

The assignment logic for active tasks now primarily uses the number of input partitions as a proxy for individual task weight.
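As a rough illustration of the weighting idea, here is a minimal sketch, not the code in this PR. It assumes a task's weight is simply the count of its input partitions and that a client's weight is the sum over its active tasks. The class name, method names, and the use of plain strings for task IDs and partitions are all hypothetical.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names and types are illustrative, not taken from the PR.
final class TaskWeightSketch {

    // Assumption: the weight of a task is the number of input partitions it reads.
    static int taskWeight(List<String> inputPartitions) {
        return inputPartitions.size();
    }

    // Assumption: the weight of a client is the sum of the weights of its active tasks.
    // Tasks with no known partitions contribute a weight of zero.
    static int clientWeight(List<String> activeTaskIds,
                            Map<String, List<String>> partitionsByTask) {
        int total = 0;
        for (String taskId : activeTaskIds) {
            total += taskWeight(partitionsByTask.getOrDefault(taskId, List.of()));
        }
        return total;
    }
}
```

Under this sketch, a client holding one task with three input partitions and one task with a single input partition has weight 4, even though its task count is only 2.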

The trigger for rebalancing an active task now considers the client's current weight plus the weight of the task that would be added.

An average weight buffer has been introduced so that minor weight imbalances are less likely to break stickiness, aiming to reduce unnecessary task movements while still correcting significant imbalances.
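To make the trigger and buffer concrete, here is a hedged sketch of one way such a check could look. The multiplicative form of the buffer (`bufferFraction` applied to the average client weight) is an assumption for illustration; the PR may define the buffer differently. All names are hypothetical.

```java
// Hypothetical sketch: the buffer formula and all names are assumptions.
final class StickinessBufferSketch {

    // Returns true if the task may stay on (or go to) its previous client.
    // The client's weight after adding the task is compared against the
    // average client weight plus a tolerance, so small overshoots do not
    // force the task to move.
    static boolean keepOnPreviousClient(int clientWeight,
                                        int taskWeight,
                                        double averageWeight,
                                        double bufferFraction) {
        double threshold = averageWeight * (1.0 + bufferFraction);
        return clientWeight + taskWeight <= threshold;
    }
}
```

For example, with an average weight of 10 and a 20% buffer, a client at weight 8 can still accept a task of weight 3 (11 is within the threshold of 12), whereas with no buffer the same task would be moved.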

Standby task assignment continues to use the traditional task count-based logic, because I was unsure whether input partitions should be considered for these assignments.

This divergence in logic for active vs. standby tasks has introduced some awkwardness in the codebase, particularly around function signatures (e.g., findBestClientForTask needing different evaluation criteria).

The findLeastLoadedClient method has been refactored to remove an unnecessary loop, improving its efficiency by computing the required information within the initial loop using additional variables.
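The single-pass idea can be sketched as follows. This is not the actual findLeastLoadedClient implementation: it is a simplified illustration that tracks the best candidate in extra variables during one loop, with a hypothetical signature that takes a map from client ID to load.

```java
import java.util.Map;

// Hypothetical sketch: a simplified stand-in, not the method from the PR.
final class LeastLoadedSketch {

    // Finds the client with the smallest load in a single pass.
    // Ties are resolved in favour of the first client encountered,
    // so iteration order of the map decides between equal loads.
    // Returns null when there are no clients.
    static String findLeastLoaded(Map<String, Integer> loadByClient) {
        String best = null;
        int bestLoad = Integer.MAX_VALUE;
        for (Map.Entry<String, Integer> entry : loadByClient.entrySet()) {
            int load = entry.getValue();
            if (best == null || load < bestLoad) {
                best = entry.getKey();
                bestLoad = load;
            }
        }
        return best;
    }
}
```

Passing a map with a deterministic iteration order (for example a `LinkedHashMap` or `TreeMap`) keeps the tie-breaking reproducible.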

Some issues I’m aware of:

• Standby Assignment Strategy: With this change, standby client selection still uses the existing task count-based metrics, which is why separate evaluation functions are currently passed in. Is this approach acceptable, or should standby assignment use the input partition-based approach, or something else?

• Unassigned tasks sorting: Should the sorting of remaining unassigned active tasks be changed from TaskId to descending weight to potentially further improve weight balance?

• Should some input partitions be excluded from the calculation of a task’s weight?
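For the sorting question above, the alternative ordering could look roughly like this. It is a sketch of the proposal only, not something the PR currently does: tasks are ordered by descending weight, with the task ID as a tie-breaker so the order stays deterministic. Task IDs are plain strings here and the names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the proposed ordering; not part of the PR.
final class UnassignedTaskOrderSketch {

    // Returns a new list sorted by descending weight, then by task ID ascending.
    // Tasks without a known weight are treated as weight zero.
    static List<String> byDescendingWeight(List<String> taskIds,
                                           Map<String, Integer> weightByTask) {
        List<String> sorted = new ArrayList<>(taskIds);
        sorted.sort(
            Comparator.comparingInt((String id) -> weightByTask.getOrDefault(id, 0))
                .reversed()
                .thenComparing(Comparator.naturalOrder()));
        return sorted;
    }
}
```

Placing the heaviest tasks first is a common greedy heuristic for balancing, since the lighter tasks assigned later can fill the remaining gaps; whether it helps here would need to be checked against the existing assignor tests.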


A label of 'needs-attention' was automatically added to this PR in order to raise the
attention of the committers. Once this issue has been triaged, the triage label
should be removed to prevent this automation from happening again.
