feat: add opt-in scheduling gates for queue capacity control#5033
feat: add opt-in scheduling gates for queue capacity control#5033devzizu wants to merge 24 commits intovolcano-sh:masterfrom
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Summary of ChangesHello @devzizu, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical issue where Volcano's scheduler could cause unnecessary cluster autoscaler scale-ups by marking pods as 'Unschedulable' even when they were only waiting for queue capacity. By leveraging Kubernetes scheduling gates, it introduces an opt-in mechanism to delay the 'Unschedulable' condition until a pod is truly blocked by a lack of cluster resources, while also ensuring proper queue capacity accounting for pods that have passed initial admission but are awaiting node allocation. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
1f0980d to
7d15238
Compare
There was a problem hiding this comment.
Code Review
This pull request introduces a significant and well-designed feature to add opt-in scheduling gates for queue capacity control, improving compatibility with cluster autoscalers. While the overall design is comprehensive and includes a valuable E2E test, two significant security-related issues were identified: a data race in the asynchronous gate removal workers that could lead to a scheduler panic (Denial of Service), and a race condition in the JSON Patch logic for removing scheduling gates that could result in the unauthorized removal of security-related gates (Broken Access Control). Both of these critical security issues must be addressed before merging. Additionally, there are minor suggestions for a grammatical correction in the design document and replacing context.TODO() with context.Background() in background operations for improved context management.
4077068 to
6e3165d
Compare
0f96f9a to
75d95dc
Compare
There was a problem hiding this comment.
Pull request overview
This PR implements opt-in scheduling gates for queue capacity control to prevent cluster autoscalers from triggering unnecessary scale-ups when pods are waiting for queue admission rather than cluster capacity. The solution addresses issue #4710 by using Kubernetes scheduling gates to delay the Unschedulable condition until pods pass queue capacity checks.
Changes:
- Added opt-in annotation
scheduling.volcano.sh/queue-allocation-gate: "true"for pods to participate in gate-based queue admission control - Implemented asynchronous gate removal workers in the allocate action to remove gates when pods pass capacity checks
- Enhanced capacity plugin with reserved task tracking to prevent race conditions where ungated-but-unscheduled pods could allow queue over-allocation
- Added comprehensive E2E tests and design documentation
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| staging/src/volcano.sh/apis/pkg/apis/scheduling/v1beta1/labels.go | Added QueueAllocationGateKey constant for the opt-in annotation |
| pkg/scheduler/api/helpers.go | Added helper functions to check for Volcano scheduling gates and annotations |
| pkg/webhooks/admission/pods/mutate/mutate_pod.go | Added webhook logic to inject scheduling gates for opted-in pods |
| pkg/scheduler/actions/allocate/allocate.go | Implemented async worker infrastructure and gate removal logic in allocate action |
| pkg/scheduler/plugins/capacity/capacity.go | Added reserved task cache to account for ungated pods in capacity checks |
| pkg/scheduler/framework/statement.go | Added Name() and Task() methods to operation struct for cleanup callbacks |
| pkg/scheduler/framework/session.go | Added cleanupReservationsFns map to session |
| pkg/scheduler/framework/session_plugins.go | Added AddCleanupReservationsFn and CleanupReservations methods |
| pkg/scheduler/api/types.go | Added CleanupReservationsFn function type |
| pkg/scheduler/api/job_info.go | Modified GetSchGatedPodResources to exclude Volcano-only gated pods |
| pkg/scheduler/cache/util.go | Added RemoveVolcanoSchGate utility function for gate removal |
| pkg/scheduler/cache/cache.go | Added synchronous gate removal before bind as safety guarantee |
| test/e2e/schedulingaction/allocate.go | Added comprehensive E2E test for capacity reservation behavior |
| test/e2e/util/pod.go | Added helper functions for checking gate status in tests |
| test/e2e/util/job.go | Added Annotations field to TaskSpec for test configuration |
| pkg/scheduler/util/test_utils.go | Updated test utilities to handle SubGroupPolicy without NetworkTopology |
| pkg/scheduler/actions/allocate/allocate_test.go | Added unit test for SubGroupPolicy without NetworkTopology |
| docs/design/scheduling-gates-queue-admission.md | Added comprehensive design document explaining the feature |
| docs/user-guide/version-compatibility-archive.md | Added version compatibility archive document |
| README.md | Updated Kubernetes compatibility table |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
d0bbf95 to
e50491c
Compare
Signed-off-by: devzizu <jazevedo960@gmail.com>
Signed-off-by: devzizu <jazevedo960@gmail.com>
Signed-off-by: devzizu <jazevedo960@gmail.com>
Signed-off-by: devzizu <jazevedo960@gmail.com>
Signed-off-by: devzizu <jazevedo960@gmail.com>
…nsistency Signed-off-by: devzizu <jazevedo960@gmail.com>
Signed-off-by: devzizu <jazevedo960@gmail.com>
Signed-off-by: devzizu <jazevedo960@gmail.com>
Signed-off-by: devzizu <jazevedo960@gmail.com>
Signed-off-by: devzizu <jazevedo960@gmail.com>
Signed-off-by: devzizu <jazevedo960@gmail.com>
commit f4db07d Merge: 1000067 01823d1 Author: Volcano Bot <49986348+volcano-sh-bot@users.noreply.github.com> Date: Fri Feb 13 11:26:57 2026 +0800 Merge pull request volcano-sh#5039 from volcano-sh/copilot/update-kubernetes-compatibility-map Reorganize Kubernetes compatibility matrix: show 5 latest versions with newest first commit 01823d1 Author: Jesse Stutler <jesseincomparable@hotmail.com> Date: Fri Feb 13 10:20:07 2026 +0800 Add version compatibility archive and update README with reference link commit 1000067 Merge: 6ba0e29 56c6901 Author: Volcano Bot <49986348+volcano-sh-bot@users.noreply.github.com> Date: Thu Feb 12 16:49:57 2026 +0800 Merge pull request volcano-sh#5038 from JesseStutler/fix_4871 Fixed issue where jobs with subgroups but not hard networkTopology.mode could not be scheduled. commit f22bea4 Author: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Date: Thu Feb 12 08:42:26 2026 +0000 Reorganize Kubernetes compatibility table: show only 5 latest versions with newest on top and left Co-authored-by: JesseStutler <38534065+JesseStutler@users.noreply.github.com> commit 56c6901 Author: JesseStutler <chenzicong4@huawei.com> Date: Thu Feb 12 16:05:35 2026 +0800 Fixed issue where jobs with subgroups but not hard networkTopology.mode could not be scheduled. Signed-off-by: zhengchenyu <zhengchenyu16@163.com> Signed-off-by: JesseStutler <chenzicong4@huawei.com> commit 9125f3e Author: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Date: Thu Feb 12 06:15:55 2026 +0000 Add Kubernetes compatibility for Volcano v1.13 and v1.14 Co-authored-by: JesseStutler <38534065+JesseStutler@users.noreply.github.com> commit f4e0f66 Author: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Date: Thu Feb 12 06:14:39 2026 +0000 Initial plan Signed-off-by: devzizu <jazevedo960@gmail.com>
Signed-off-by: devzizu <jazevedo960@gmail.com>
Signed-off-by: devzizu <jazevedo960@gmail.com>
Signed-off-by: devzizu <jazevedo960@gmail.com>
Signed-off-by: devzizu <jazevedo960@gmail.com>
Signed-off-by: devzizu <jazevedo960@gmail.com>
… values Signed-off-by: devzizu <jazevedo960@gmail.com>
Signed-off-by: devzizu <jazevedo960@gmail.com>
Signed-off-by: devzizu <jazevedo960@gmail.com>
b2c7960 to
c23023f
Compare
Signed-off-by: devzizu <jazevedo960@gmail.com>
4c3064c to
0e40cee
Compare
Signed-off-by: devzizu <jazevedo960@gmail.com>
hajnalmt
left a comment
There was a problem hiding this comment.
I gave an other round what do you think about my comments?
| } | ||
|
|
||
| func (alloc *Action) UnInitialize() {} | ||
| func (alloc *Action) UnInitialize() { |
There was a problem hiding this comment.
Nor Initialize() nor UnInitialize() is used in our codebase except the unit test helper:
volcano/pkg/scheduler/uthelper/helper.go
Lines 224 to 229 in 49b8f40
What a bummer. Can you create at least a follow up ticket for this please? Currently nothing will close these channels. If this gets merged in we should fix this, so this is not just dead code here.
There was a problem hiding this comment.
I also think that this is out of the scope of this feature.
There was a problem hiding this comment.
Oh, this is a surprise for me -- I was counting on it to be called, but indeed I only see it as a test helper 😄
I actually call it here:
P.S. Perhaps we can add this to #5048?
|
|
||
| // RemoveVolcanoSchGate removes the Volcano scheduling gate from a pod via strategic merge patch. | ||
| // Returns nil if gate is successfully removed or already removed (idempotent). | ||
| func RemoveVolcanoSchGate(kubeClient kubernetes.Interface, pod *v1.Pod) error { |
There was a problem hiding this comment.
Can we introduce a unit test for this function? A test that directly calls it?
I know there is no util_test.go now, but probably we need it to have this verified. I ran into issues where the strategic merge didn't replace and kept the old values, but it would probably come up in the e2e test if this wouldn't work. Nevertheless let's have this to verify that this function works properly.
There was a problem hiding this comment.
Indeed, just added a new test and created the new util_test.go as suggested. Please take a look 🙏
Signed-off-by: devzizu <jazevedo960@gmail.com>
Signed-off-by: devzizu <jazevedo960@gmail.com>
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR adds opt-in scheduling gates for queue capacity control to prevent cluster autoscalers from triggering unnecessary scale-ups when pods are waiting for queue admission rather than cluster capacity.
When a pod opts in via the
scheduling.volcano.sh/queue-allocation-gate: "true"annotation, Volcano adds a scheduling gate that delays theUnschedulablecondition until the queue has capacity.The design is further described in #4727.
Which issue(s) this PR fixes:
Fixes #4710
Special notes for your reviewer:
Does this PR introduce a user-facing change?