scale-workers #4

Draft
DolevAdas wants to merge 10 commits into llm-d-incubation:main from DolevAdas:scale-workers

@DolevAdas

Execute scaling actions for llm-d prefill/decode workers.
Supports:

  • manual scaling
  • automatic WVA autoscaling
  • suspend/resume operations

Signed-off-by: Dolev Adas <dolev.adas@ibm.com>

@rachelt44 (Collaborator) left a comment

Please separate this into two different skills.

Signed-off-by: Dolev Adas <dolev.adas@ibm.com>
Signed-off-by: Dolev Adas <dolev.adas@ibm.com>
… extracted to separate files

Signed-off-by: Dolev Adas <dolev.adas@ibm.com>
Signed-off-by: Dolev Adas <dolev.adas@ibm.com>
Signed-off-by: Dolev Adas <dolev.adas@ibm.com>
…d more robust for future deployments

Signed-off-by: Dolev Adas <dolev.adas@ibm.com>
…d more robust for future deployments

Signed-off-by: Dolev Adas <dolev.adas@ibm.com>
…available section

Signed-off-by: Dolev Adas <dolev.adas@ibm.com>
Signed-off-by: Dolev Adas <dolev.adas@ibm.com>

2 participants