feat: Implement an example CloudFormation for a custom Deadline Cloud Balancer #81

Open
wants to merge 8 commits into
base: mainline

@Ahuge Ahuge commented Feb 8, 2025

What is this?

This is an example of a Lambda function that acts as a custom Balancer by modifying `maxCurrentWorkers` on a job.

What was the problem/requirement? (What/Why)

We had an internal customer that wanted to modify how the default scheduler assigned workers to tasks. They wanted it to work in a similar way to the Deadline 10 Balancer options.

What was the solution? (How)

Implement a scheduled Lambda function that runs every minute, calculates the weighting, and enforces it by setting `maxCurrentWorkers`.
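
As a rough illustration of the loop described above, here is a minimal sketch of a single balancing pass. It is not the Lambda code from this PR: the function name `run_balancer_pass`, the job dictionary keys `jobId` and `maxWorkers`, and the three injected callables are all hypothetical, chosen so the sketch runs without any AWS dependency. The once-a-minute schedule itself is assumed to live outside this function.

```python
from typing import Callable, Dict, Iterable, List


def run_balancer_pass(
    list_active_jobs: Callable[[], Iterable[dict]],
    compute_limits: Callable[[List[dict]], Dict[str, int]],
    set_max_workers: Callable[[str, int], None],
) -> Dict[str, int]:
    """Run one balancing pass and return the limits that were actually written.

    All three callables are hypothetical stand-ins:
      list_active_jobs -> yields dicts with "jobId" and the current "maxWorkers"
      compute_limits   -> maps the job list to {jobId: desired worker limit}
      set_max_workers  -> writes a new worker limit for one job
    """
    jobs = list(list_active_jobs())
    limits = compute_limits(jobs)

    applied: Dict[str, int] = {}
    for job in jobs:
        job_id = job["jobId"]
        # Jobs missing from the computed limits are treated as paused (limit 0).
        new_limit = limits.get(job_id, 0)
        # Assumption: skip the write when nothing changed, to avoid needless calls.
        if job.get("maxWorkers") != new_limit:
            set_max_workers(job_id, new_limit)
            applied[job_id] = new_limit
    return applied
```

In a real deployment the callables would wrap the service API calls; injecting them keeps the weighting logic testable in isolation.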

How was this change tested?

This has been running in the customer account for just over a week now. The client reports it works as expected.
No extensive additional testing has been done.

Was this change documented?

New README and template section

Lambda code was initially written in collaboration with Michael Yuan (thanks!)

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@Ahuge Ahuge requested a review from a team as a code owner February 8, 2025 01:00
@Ahuge Ahuge force-pushed the ah/feature/custom-balancer branch from cf2992f to 0962c7b Compare February 8, 2025 01:02
@Ahuge Ahuge changed the title Ah/feature/custom balancer feat: Implement an example CloudFormation for a custom Deadline Cloud Balancer Feb 8, 2025
@Ahuge Ahuge force-pushed the ah/feature/custom-balancer branch 2 times, most recently from 1c282f6 to 28d9382 Compare February 11, 2025 16:46
@erico-aws erico-aws added the waiting-on-maintainers Waiting on the maintainers to review. label Feb 19, 2025

@mwiebe mwiebe left a comment

Thanks for putting this together and submitting it!

@Ahuge Ahuge force-pushed the ah/feature/custom-balancer branch from 28d9382 to 44955e9 Compare March 7, 2025 18:18

mwiebe commented Mar 21, 2025

I've noodled on a bit of refactoring of this: confirming the identitystore: IAM permission isn't necessary, trying to clean up naming, etc. I've pushed that here: https://github.com/mwiebe/deadline-cloud-samples/commits/ah/feature/custom-balancer/

@Ahuge Ahuge force-pushed the ah/feature/custom-balancer branch from dbb551e to 1878d78 Compare March 24, 2025 16:19
Ahuge added 8 commits March 24, 2025 09:20
This sample implements functionality similar to one of the Deadline 10 balancer options.

Instead of a Priority/FIFO queue, this implements weighted balancing for jobs in the farm.

There are some special edge cases here:
- If any jobs are set to priority 100, the Balancer goes into a "High Priority" mode and the entire Fleet is balanced only between the jobs at priority 100. All other jobs are paused.
- Jobs at priority 0 are treated as "paused" and do not get any workers UNLESS the farm contains only priority 0 jobs.

The number of workers assigned to each job is determined by a "squared weight" policy. That means that a job at priority 20 would get twice as many workers as a job at priority 10.

Signed-off-by: Alex Hughes <[email protected]>
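
To make the weighting in the commit message above concrete, here is a small sketch of a proportional split. It is an illustration under assumptions, not the sample's implementation: `allocate_workers` and its parameters are hypothetical names. The exponent is left as a parameter because the message names a "squared weight" policy (exponent 2) while its example ratio of two to one between priorities 20 and 10 matches a linear weight (exponent 1); which one the sample uses is not confirmed here. The even split when every weight is zero and the largest-remainder rounding are also assumptions.

```python
from typing import Dict


def allocate_workers(
    priorities: Dict[str, int], total_workers: int, exponent: int = 2
) -> Dict[str, int]:
    """Split total_workers across jobs in proportion to priority ** exponent."""
    if not priorities or total_workers <= 0:
        return {job_id: 0 for job_id in priorities}

    weights = {job_id: p ** exponent for job_id, p in priorities.items()}
    total_weight = sum(weights.values())
    if total_weight == 0:
        # Assumption: when every job has priority 0, share the fleet evenly.
        weights = {job_id: 1 for job_id in priorities}
        total_weight = len(weights)

    # Integer floor of each job's proportional share.
    shares = {j: total_workers * w // total_weight for j, w in weights.items()}
    remainders = {j: total_workers * w % total_weight for j, w in weights.items()}

    # Hand out workers lost to rounding, largest remainder first (ties by job id).
    leftover = total_workers - sum(shares.values())
    for j in sorted(priorities, key=lambda j: (-remainders[j], j))[:leftover]:
        shares[j] += 1
    return shares


print(allocate_workers({"a": 20, "b": 10}, 10))              # squared weights
print(allocate_workers({"a": 20, "b": 10}, 10, exponent=1))  # linear weights
```

With ten workers, the squared variant gives job `a` eight workers and job `b` two, while the linear variant gives seven and three.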
Jobs at 100 priority should be the only ones that matter
If there are no jobs at 100, only jobs that have priority above 0 should matter
If there are only jobs with priority of 0, start allocating to those ones
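
The three rules above can be sketched as a single filter that picks which jobs take part in balancing. This is a hedged illustration only: `select_eligible_jobs` and `HIGH_PRIORITY` are hypothetical names, and the input is assumed to be a plain mapping of job id to priority.

```python
from typing import Dict

HIGH_PRIORITY = 100  # priority value that triggers "High Priority" mode


def select_eligible_jobs(priorities: Dict[str, int]) -> Dict[str, int]:
    """Return only the jobs that should receive workers in the current context."""
    # Rule 1: if any job is at priority 100, only those jobs matter.
    top = {j: p for j, p in priorities.items() if p == HIGH_PRIORITY}
    if top:
        return top

    # Rule 2: otherwise only jobs with a priority above 0 matter.
    positive = {j: p for j, p in priorities.items() if p > 0}
    if positive:
        return positive

    # Rule 3: only priority 0 jobs remain, so they all become eligible.
    return dict(priorities)
```

Jobs left out of the returned mapping would be the ones treated as paused.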

Refactor our edge case logic to filter out invalid jobs
This simplifies logic to only operate on jobs that are valid in our current context

Signed-off-by: Alex Hughes <[email protected]>
@Ahuge Ahuge force-pushed the ah/feature/custom-balancer branch from 1878d78 to fccbd53 Compare March 24, 2025 16:20