Add overwriteRootCapability argument to capacity plugin #4662
Conversation
Welcome @mrendi29!
Summary of Changes: Hello @mrendi29, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request enhances the capacity scheduling plugin by providing a mechanism to override the root queue's resource limits. This allows administrators to define virtual resource capacities that can differ from the physical cluster resources, which is particularly useful in environments utilizing cluster autoscaling where virtual resources might exceed real resources.
Code Review
This pull request introduces a new overwriteRootCapability argument to the capacity plugin, allowing administrators to set virtual resource limits for the root queue. The implementation is sound, but there's an opportunity to improve code robustness by cloning resource objects to prevent potential pointer aliasing issues. Additionally, the PR is missing unit tests for the new functionality. It's important to add tests to verify the behavior of the overwriteRootCapability flag and prevent future regressions.
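To illustrate the aliasing concern, here is a minimal, runnable Go sketch (the `Resource` type below is a stand-in, not the plugin's actual type): if the root queue's limits point at the same object as the cluster total instead of a clone, later mutations of the total silently change the limits too.

```go
package main

import "fmt"

// Resource is a simplified stand-in for the scheduler's resource type.
type Resource struct{ MilliCPU float64 }

// Clone returns an independent copy of the resource.
func (r *Resource) Clone() *Resource { c := *r; return &c }

func main() {
	total := &Resource{MilliCPU: 4000}

	aliased := total        // both names point at the same struct
	cloned := total.Clone() // independent copy

	total.MilliCPU += 2000 // e.g. the cluster grows by a node

	fmt.Println(aliased.MilliCPU) // 6000: silently changed along with the total
	fmt.Println(cloned.MilliCPU)  // 4000: unaffected
}
```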
This is not true. You can set it to any value. We only set it to the cluster capacity when it's empty. I am far from happy with the current queue design, and all these hacks around the root queue, but I feel that this is still a step away from the right path.
Oh sorry, I understand it now. We want to overwrite the realCapability with the root queue's capability; maybe renaming it to overwriteRootQueueRealCapability would better reflect the intention and logic behind the usage of the argument.
@hajnalmt thanks for the review. I'll update the variable to better reflect the intention.
/cc
@mrendi29 Please sign your commit.
@mrendi29 Have you already tested it in production clusters and does it work properly? |
Not yet, unfortunately; I had to fight some fires at work. Planning to do so this week. I will also address the comments on the PR and include some tests.
It looks like this would solve other issues too.
I'll push my remaining changes in a bit and give you access to the repo so we can pair on this together. Thanks for your help!
@hajnalmt I've added you on the repo; you should have access now. I'll work on adding a test case as well. If you can get to it before I do, feel free to make the change.
@JesseStutler @hajnalmt can you take a second look at this PR? I believe I have addressed all comments. Let me know what you think. FYI, the test case is AI-generated after a couple of prompts :)
```go
	}
}

func TestOverwriteRootQueueRealCapability(t *testing.T) {
```
I think just adding a simple case to Test_capacityPlugin_OnSessionOpenWithHierarchy is enough; there's no need to add a new test func with a lot of new cases :)
Fixed.
```go
			rootQueueAttr.capability, cp.totalResource)
	} else {
		// Default behavior: use actual cluster resources
		rootQueueAttr.realCapability = cp.totalResource.Clone()
```
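The excerpt above shows only the fallback branch. Here is a hedged, self-contained sketch of the overall decision described by this PR; the type, names, and exact condition are simplified stand-ins, not the plugin source:

```go
package sketch

// Resource is a simplified stand-in for the scheduler's resource type.
type Resource struct{ MilliCPU, Memory float64 }

func (r *Resource) Clone() *Resource { c := *r; return &c }
func (r *Resource) IsEmpty() bool    { return r == nil || (r.MilliCPU == 0 && r.Memory == 0) }

// rootRealCapability mirrors the branch above: when the new flag is set and the
// admin configured a capability on the root queue, use that value (it may exceed
// the physical cluster total, e.g. with cluster autoscaling); otherwise fall
// back to the actual cluster resources.
func rootRealCapability(overwrite bool, configured, clusterTotal *Resource) *Resource {
	if overwrite && !configured.IsEmpty() {
		return configured.Clone()
	}
	// Default behavior: use actual cluster resources.
	return clusterTotal.Clone()
}
```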
Wouldn't it be more appropriate to set the default value to infinity here?
The root queue is automatically created by Volcano, and users should not be aware of it. Therefore, when only the capacity plugin is enabled, users can submit an unlimited number of jobs, and the root queue does not perform any quota checks or restrictions. If users want to use the queue quota capability, they need to actively create queues for quota management.
The same logic applies to the default values for the Deserved and Guarantee of the root queue.
Our team is running into the same issue described here: #4680, which also relates to root capacity checks. I agree that it would make sense for the root capacity to be set to infinity by default, rather than requiring users to specify a value.
I do have a question, though. Are there any concerns with allowing infinite capacity on the root, especially given that the original design of Volcano seemed to care about overall cluster capacity? For example, there’s a capacity plugin that rejects jobs when the cluster is full and an overcommit plugin to allow extra jobs. It feels like the earlier designs didn’t favor an infinite root capacity. Would this change have any unintended impact on those mechanisms?
@vzhou-p Looks like different users have different requirements. Setting it to an infinite value is a good choice, but currently child queues' capability will inherit from the parent queue if not set:
(screenshot of the relevant code omitted)
So that means child queues can also submit infinite jobs, and I'm not sure whether that is what users want, because different users may have different requirements.
I understand that the default value for the root queue may have different expectations depending on different scenarios:
1. Default to infinite: After enabling the capacity plugin, users do not need to create additional queues, and job scheduling behaves the same as when the capacity plugin is not enabled.
2. Default to the actual resource capacity of the cluster: After enabling the capacity plugin, the queue's quota management capability is automatically activated. Jobs exceeding the quota limit will not be scheduled.
3. Users can specify a certain percentage increase based on the actual resource capacity of the cluster, for example, 120% of the actual resources. This allows for some overcommitment of quotas based on user configuration.
4. Users can specify the actual resource capacity and actively manage it along with the queue quotas.
Can these four methods be exposed through arguments in the capability configuration? If no specific setting is provided, the infinite mode should be used by default. In scenarios where users do not create additional queues, the behavior should remain consistent with that when the capacity plugin is not enabled.
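Purely as a sketch of how these four behaviours might be surfaced as plugin arguments (the mode and factor names below are invented for illustration; only overwriteRootQueueRealCapability comes from this PR):

```yaml
# Hypothetical arguments; only overwriteRootQueueRealCapability exists in this PR.
- name: capacity
  arguments:
    rootQueueCapabilityMode: infinite         # 1. default: no quota checks on the root queue
    # rootQueueCapabilityMode: cluster        # 2. quota equals the actual cluster capacity
    # rootQueueOvercommitFactor: 1.2          # 3. allow e.g. 120% of the actual resources
    # overwriteRootQueueRealCapability: true  # 4. admin manages the root queue capability explicitly
```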
Setting the root queue as infinity sounds like a really interesting option. That means that I as a platform admin would not need to interact with the root queue directly and can just create another queue on top of root.
(I recall that there was another PR somewhere that wanted to interact with the root queue capacity through the Helm chart; this could eliminate that.)
At the same time, having child queues submit infinite jobs seems really risky.
I wonder, though, if it makes sense for this functionality to be part of overwriteRootQueueRealCapability in L610 instead of default Volcano behavior.
Depending on what the PMCs think I can then make the necessary adjustments.
> At the same time, having child queues submit infinite jobs seems really risky.
I think the design of the Volcano queue is to let Volcano users act as admins, allowing them to specify the queue's capability value, not just submit jobs to the default queue. Therefore, as a Volcano user who may use Volcano to build a platform, I might write a custom webhook requiring users who want to create a queue to specify a capability value. So I think setting the root and default queues to infinite is aligned with the behavior when the capacity plugin is not enabled, but whether we should still allow child queues to inherit from their ancestors needs further discussion.
Thanks all for the thoughtful discussion.
I recently raised a feature request to support maxApplications on queues. I think there’s an opportunity here to address the concerns around child queues inheriting infinite capacity by considering resource quota and maxApplications together.
By supporting both resource quota (for overall resource usage limits) and maxApplications (for controlling the number of jobs), we can give users more flexibility and additional control in two dimensions. This could help mitigate the risk of child queues having unlimited capacity, while still keeping the root queue default as “infinite” for ease of use. Users could opt-in to more restrictive behaviors as needed.
Perhaps we can look at combining this work or aligning the design to solve both requirements at once. Would love to hear thoughts on whether this approach could address the current concerns. If there’s consensus, I’m happy to help move this forward.
@vzhou-p So is maxApplications part of the queue's capability, or is it a separate field? Will it be a required value?
I think maxApplications should be a separate field from the queue's capability (resource quota), rather than being part of the capability structure itself. This keeps the two concepts distinct.
- Capability (resource quota): Controls how much resources (CPU, memory, etc.) can be consumed
- maxApplications: Controls how many jobs/applications can run or be queued
I believe it should be optional, with a sensible default (likely infinite or unlimited). This aligns with the earlier discussion about keeping the root queue infinite by default. Users opt into restrictions as needed rather than being forced to configure everything upfront.
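To make the separation concrete, here is a hypothetical Go sketch of a queue spec that keeps the two fields apart; the field names and types are illustrative, not Volcano's actual Queue API:

```go
package sketch

// QueueSpec is a hypothetical sketch, not Volcano's real Queue spec.
type QueueSpec struct {
	// Capability is the resource quota: how much CPU, memory, etc. the queue may consume.
	// Empty means unlimited, matching the "infinite by default" idea discussed above.
	Capability map[string]string `json:"capability,omitempty"`

	// MaxApplications limits how many jobs/applications may run or be queued.
	// Optional; nil means unlimited, so users opt into restrictions as needed.
	MaxApplications *int32 `json:"maxApplications,omitempty"`
}
```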
I tried testing this in my k8s cluster today. I created a new image only for the scheduler and used that in my cluster; I did not change the controller and admission images. I tried submitting a Spark job but got an error from the Spark operator logs (log output omitted). It seems that the queue I created is not reporting status, whereas root does (status output omitted). Will do some more digging.
If I submit the job to the root queue directly, the driver is pending as well. (Scheduler conf omitted.)
Could you provide more logs? For example, whether there are logs showing that the driver pod was rejected from scheduling, or that the root queue exceeded its limit, or something similar.
It's weird; maybe we could check the controller's logs. I think there may be something wrong with the controller syncing the state of the queue (see volcano/pkg/controllers/queue/state/open.go, lines 43 to 48, at 83f550e).
Signed-off-by: Endi Caushi <[email protected]>
Introducing a new optional argument in the capacity plugin, overwriteRootQueueRealCapability. Currently, the root queue's resource limits are always set to the actual cluster resources, regardless of any user-configured capability values. This prevents administrators from setting virtual resource limits that differ from the physical cluster capacity. The new option helps in cases where a user has the cluster autoscaler enabled, where the "virtual" resources can be higher than the "real" resources. Signed-off-by: Endi Caushi <[email protected]>
Following the practice in other plugins, the new overwriteRootQueueRealCapability argument in the capacity plugin comes from a const key. This makes it easier to follow the configurable arguments of the plugin. Signed-off-by: Endi Caushi <[email protected]>
Minor changes to the comments of the plugin. Added a positive/negative unit test for the argument. Added some information about the argument to the capacity plugin's hierarchy documentation. Signed-off-by: Hajnal Mate <[email protected]>
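For readers following along, the "const key" practice mentioned in the commits above could look roughly like this; the constant name is taken from the argument itself, but the parsing helper shown is illustrative rather than the plugin's actual code:

```go
package sketch

// overwriteRootQueueRealCapabilityKey is the argument key; other plugins follow
// the same const-key convention for their configurable arguments.
const overwriteRootQueueRealCapabilityKey = "overwriteRootQueueRealCapability"

// overwriteEnabled reads the flag from the plugin's raw arguments map. The real
// plugin may use the scheduler framework's argument helpers instead.
func overwriteEnabled(arguments map[string]interface{}) bool {
	v, ok := arguments[overwriteRootQueueRealCapabilityKey].(bool)
	return ok && v
}
```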
/lgtm
Hello @mrendi29,
Sorry, for some reason I thought this was already merged.
I rebased your branch and modified the logging a little bit (thank you for the access). I also replaced the AI-generated tests with a simple negative/positive test case that follows the structure of the existing test suite, and added the argument with some context to the appropriate design doc.
I am not sure about the failure you got, but it shouldn't be due to these changes, and this argument can be useful in other use cases.
Please take a look @JesseStutler
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: hajnalmt. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files; approvers can indicate their approval by writing /approve in a comment.
@hajnalmt TSYM for lending a hand on this. I was OOO for the last two weeks and have been super busy at work wrapping up year-end objectives, which is why I have not been able to dedicate more time to this over the last month. Your change LGTM as well. I'll test it in production once I get a chance.
/ok-to-test
@mrendi29 Anytime! 👍 Sorry that this got out of my focus.
What type of PR is this?
Feature
What this PR does / why we need it:
Problem:
Currently, the root queue's resource limits are always set to the actual cluster resources, regardless of any user-configured capability values. This prevents administrators from setting virtual resource limits that differ from the physical cluster capacity. The new option helps in cases where a user has the cluster autoscaler enabled and the "virtual" resources can be higher than the "real" resources.
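For illustration, enabling the new argument in the scheduler configuration could look roughly like the snippet below; the tier layout and the other plugins shown are placeholders, and only the capacity plugin's arguments block is the point here:

```yaml
# Sketch of a volcano-scheduler.conf; surrounding actions and plugins are illustrative.
actions: "enqueue, allocate, backfill"
tiers:
- plugins:
  - name: priority
  - name: gang
- plugins:
  - name: capacity
    arguments:
      overwriteRootQueueRealCapability: true  # honour the root queue's configured capability
  - name: predicates
  - name: nodeorder
```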
Which issue(s) this PR fixes:
Fixes #3910
Special notes for your reviewer: I have not yet tested this in our production clusters; I plan to test it in the coming week.
Does this PR introduce a user-facing change?
None