Conversation

@mrendi29 commented Oct 2, 2025

What type of PR is this?

Feature

What this PR does / why we need it:

Problem:
Currently, the root queue's resource limits are always set to the actual cluster resources, regardless of any user-configured capability values. This prevents administrators from setting virtual resource limits that differ from the physical cluster capacity. Supporting this is useful when a user has the cluster autoscaler enabled, so that the "virtual" resources can be higher than the "real" resources.

Which issue(s) this PR fixes:

Fixes #3910

Special notes for your reviewer: I have not yet tested this in our production clusters; I plan to test it the following week.

Does this PR introduce a user-facing change?

None

Add overwriteRootCapability argument to capacity plugin to allow using user-configured root queue capability instead of actual cluster resources

@volcano-sh-bot (Contributor)

Welcome @mrendi29!

It looks like this is your first PR to volcano-sh/volcano.

Thank you, and welcome to Volcano. 😃

@gemini-code-assist

Summary of Changes

Hello @mrendi29, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the capacity scheduling plugin by providing a mechanism to override the root queue's resource limits. This allows administrators to define virtual resource capacities that can differ from the physical cluster resources, which is particularly useful in environments utilizing cluster autoscaling where virtual resources might exceed real resources.

Highlights

  • New Configuration Option: Introduced a new boolean argument, overwriteRootCapability, to the capacity plugin. When set to true, it allows the root queue's resource limits to be determined by user-configured capabilities rather than the actual cluster resources.
  • Root Queue Capability Logic: Modified the buildHierarchicalQueueAttrs function to conditionally set the rootQueueAttr.realCapability. If overwriteRootCapability is enabled, the user-defined rootQueueAttr.capability is used; otherwise, it defaults to the total cluster resources.
  • Plugin Initialization: The New function for the capacity plugin now initializes overwriteRootCapability to false by default and parses its value from the plugin arguments.
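
For orientation, here is a minimal, self-contained sketch of the decision described above. It uses a simplified Resource type rather than the plugin's actual api.Resource, and the function name is illustrative only:

package main

import "fmt"

// Simplified stand-in for the scheduler's api.Resource type.
type Resource struct{ CPU, MemoryGB float64 }

func (r *Resource) Clone() *Resource { c := *r; return &c }

// rootRealCapability mirrors the conditional added to buildHierarchicalQueueAttrs:
// when the flag is set and the user configured a capability on the root queue,
// use it; otherwise fall back to the actual cluster total.
func rootRealCapability(overwrite bool, capability, total *Resource) *Resource {
    if overwrite && capability != nil {
        return capability.Clone() // user-configured "virtual" limit
    }
    return total.Clone() // default: actual cluster resources
}

func main() {
    cluster := &Resource{CPU: 100, MemoryGB: 500}  // physical capacity
    virtual := &Resource{CPU: 500, MemoryGB: 5000} // admin-configured capability

    fmt.Println(*rootRealCapability(true, virtual, cluster))  // {500 5000}
    fmt.Println(*rootRealCapability(false, virtual, cluster)) // {100 500}
}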

@volcano-sh-bot added the size/M label (denotes a PR that changes 30-99 lines, ignoring generated files) Oct 2, 2025
@mrendi29 (Author) commented Oct 2, 2025

cc: @JesseStutler @Monokaix

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a new overwriteRootCapability argument to the capacity plugin, allowing administrators to set virtual resource limits for the root queue. The implementation is sound, but there's an opportunity to improve code robustness by cloning resource objects to prevent potential pointer aliasing issues. Additionally, the PR is missing unit tests for the new functionality. It's important to add tests to verify the behavior of the overwriteRootCapability flag and prevent future regressions.
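
To illustrate the aliasing concern in isolation, here is a minimal sketch with a simplified Resource type (not the plugin's actual code): assigning a pointer shares the underlying struct, while Clone produces an independent copy.

package main

import "fmt"

// Simplified stand-in for the scheduler's api.Resource type.
type Resource struct{ MilliCPU float64 }

func (r *Resource) Clone() *Resource { c := *r; return &c }

func main() {
    capability := &Resource{MilliCPU: 500000}

    aliased := capability        // shares the same underlying struct
    cloned := capability.Clone() // independent copy

    capability.MilliCPU = 100000 // a later mutation elsewhere in the session

    fmt.Println(aliased.MilliCPU) // 100000: silently follows the mutation
    fmt.Println(cloned.MilliCPU)  // 500000: unaffected, as intended
}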

@hajnalmt (Contributor) commented Oct 4, 2025

This is not true. You can set it to any value. We only set it to the cluster capacity when it's empty.
https://github.com/volcano-sh/volcano/blob/master/pkg/scheduler/plugins/capacity/capacity.go#L590
Aren't you using an old version or something?

I am far from happy with the current queue design, and all these hacks around the root queue, but I feel that this is still a step away from the right path.
I don't have time to implement it currently, but we should use overcommitFactors instead to align the configurations across the plugins...
#4530

@hajnalmt (Contributor) commented Oct 4, 2025

Oh sorry, I understand it now. We want to overwrite the realCapability with the root queue's capability; maybe renaming it to overwriteRootQueueRealCapability would better reflect the intention and logic behind the argument.

@mrendi29 (Author) commented Oct 4, 2025

@hajnalmt thanks for the review. I'll update the variable to better reflect the intention.

@JesseStutler (Member)

/cc

@JesseStutler (Member)

@mrendi29 Please sign your commit by using git commit -s

@JesseStutler (Member)

> Special notes for your reviewer: I have not yet tested this in our production clusters; I plan to test it the following week.

@mrendi29 Have you already tested it in production clusters and does it work properly?

@mrendi29 (Author)

> Have you already tested it in production clusters and does it work properly?

Not yet, unfortunately; I had to fight some fires at work. Planning to do so this week. I will also address the comments on the PR and include some tests.

@hajnalmt (Contributor)

It looks like this would solve other issues too:
#4684
Do you need any help with this PR? If you don't have enough time I can polish this quite fast for you (if you give me temporary permissions to your repo I will update your branch).

@mrendi29 (Author)

> It looks like this would solve other issues too: #4684. Do you need any help with this PR? […]

I'll push my remaining changes in a bit and I'll give you access to the repo and we can pair on this together. Thanks for your help!

@mrendi29 force-pushed the overwrite-root-queue-capability branch 2 times, most recently from 3f47d5b to f37fc4f on October 23, 2025 17:11
@volcano-sh-bot added the size/L and size/M labels and removed the size/M and size/L labels Oct 23, 2025
@mrendi29 force-pushed the overwrite-root-queue-capability branch from 082ba19 to d8e6bb6 on October 23, 2025 17:18
@mrendi29 (Author)
@hajnalmt I've added you on the repo; you should have access now.

I'll work on adding a test case as well. If you can get to it before I do, feel free to make the change.

@volcano-sh-bot added the size/L label and removed the size/M label Oct 28, 2025
@mrendi29 (Author)
@JesseStutler @hajnalmt can you take a second look at this PR? I believe I have addressed all comments. Let me know what you think.

FYI, the test case is AI-generated after a couple of prompts :)

}
}

func TestOverwriteRootQueueRealCapability(t *testing.T) {
Member

I think just adding a simple case to Test_capacityPlugin_OnSessionOpenWithHierarchy is enough; there's no need to add a new test func and a lot of new cases :)

Contributor

Fixed.

rootQueueAttr.capability, cp.totalResource)
} else {
// Default behavior: use actual cluster resources
rootQueueAttr.realCapability = cp.totalResource.Clone()
Member

Wouldn't it be more appropriate to set the default value to infinity here?

The root queue is automatically created by Volcano, and users should not be aware of it. Therefore, when only the capacity plugin is enabled, users can submit an unlimited number of jobs, and the root queue does not perform any quota checks or restrictions. If users want to use the queue quota capability, they need to actively create queues for quota management.

The same logic applies to the default values for the Deserved and Guarantee of the root queue.


Our team is running into the same issue described here: #4680, which also relates to root capacity checks. I agree that it would make sense for the root capacity to be set to infinity by default, rather than requiring users to specify a value.

I do have a question, though. Are there any concerns with allowing infinite capacity on the root, especially given that the original design of Volcano seemed to care about overall cluster capacity? For example, there’s a capacity plugin that rejects jobs when the cluster is full and an overcommit plugin to allow extra jobs. It feels like the earlier designs didn’t favor an infinite root capacity. Would this change have any unintended impact on those mechanisms?

Member

@vzhou-p Looks like different users have different requirements. Setting it to an infinite value is a good choice, but currently a child queue's capability will inherit from the parent queue if not set. Which means a child queue can also submit infinite jobs, so I'm not sure whether that's what users want, since different users may have different requirements.

Member

I understand that the default value for the root queue may have different expectations depending on different scenarios:

1. Default to infinite: After enabling the capacity plugin, users do not need to create additional queues, and job scheduling behaves the same as when the capacity plugin is not enabled.
2. Default to the actual resource capacity of the cluster: After enabling the capacity plugin, the queue's quota management capability is automatically activated. Jobs exceeding the quota limit will not be scheduled.
3. Users can specify a certain percentage increase based on the actual resource capacity of the cluster, for example, 120% of the actual resources. This allows for some overcommitment of quotas based on user configuration.
4. Users can specify the actual resource capacity and actively manage it along with the queue quotas.

Can these four methods be exposed through arguments in the capacity plugin configuration? If no specific setting is provided, the infinite mode should be used by default. In scenarios where users do not create additional queues, the behavior should remain consistent with when the capacity plugin is not enabled.
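
As a purely hypothetical sketch (none of these argument values exist in the plugin today), the four behaviors could be keyed off a single mode argument, defaulting to infinite:

package main

import "fmt"

// Hypothetical mode argument for the capacity plugin; all names and the
// parsing below are illustrative only.
type rootCapabilityMode string

const (
    modeInfinite     rootCapabilityMode = "infinite"     // 1. no quota checks on the root queue
    modeClusterTotal rootCapabilityMode = "clusterTotal" // 2. actual cluster resources
    modeOvercommit   rootCapabilityMode = "overcommit"   // 3. factor * cluster resources, e.g. 1.2
    modeUserDefined  rootCapabilityMode = "userDefined"  // 4. admin-managed capability
)

// defaultMode applies the proposal above: infinite unless explicitly configured.
func defaultMode(configured string) rootCapabilityMode {
    if configured == "" {
        return modeInfinite
    }
    return rootCapabilityMode(configured)
}

func main() {
    fmt.Println(defaultMode(""))            // infinite
    fmt.Println(defaultMode("userDefined")) // userDefined
}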

Author

Setting the root queue as infinity sounds like a really interesting option. That means that I as a platform admin would not need to interact with the root queue directly and can just create another queue on top of root.
(I recall that there was another PR somewhere that wanted to interact with the root queue capacity through the helm chart, this could eliminate that).

At the same time, having child queues submit infinite jobs seems really risky.

I wonder though, if it makes sense for this functionality to be part of overwriteRootQueueRealCapability in L610 instead of default Volcano behavior.

Depending on what the PMCs think I can then make the necessary adjustments.

Member

> At the same time, having child queues submit infinite jobs seems really risky.

I think the design of the Volcano queue is to let Volcano users act as admins, allowing them to specify the queue's capability value, not just submit jobs to the default queue. Therefore, as a Volcano user building a platform on top of it, I might write a custom webhook requiring that anyone who creates a queue must specify a capability value. So I think setting the root and default queues to infinite aligns with the behavior when the capacity plugin is not enabled; but whether child queues should still inherit from their ancestors needs further discussion.
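
To make the webhook idea concrete, here is a hypothetical sketch of such a validation; the function name and policy are illustrative, not Volcano's actual admission code:

package webhooksketch

import (
    "fmt"

    schedulingv1beta1 "volcano.sh/apis/pkg/apis/scheduling/v1beta1"
)

// validateQueueCapability rejects queue creation unless a capability is
// specified, while letting the built-in queues keep an infinite default.
func validateQueueCapability(q *schedulingv1beta1.Queue) error {
    if q.Name == "root" || q.Name == "default" {
        return nil // built-in queues keep the infinite default
    }
    if len(q.Spec.Capability) == 0 {
        return fmt.Errorf("queue %s must specify spec.capability", q.Name)
    }
    return nil
}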


Thanks all for the thoughtful discussion.
I recently raised a feature request to support maxApplications on queues. I think there’s an opportunity here to address the concerns around child queues inheriting infinite capacity by considering resource quota and maxApplications together.

By supporting both resource quota (for overall resource usage limits) and maxApplications (for controlling the number of jobs), we can give users more flexibility and additional control in two dimensions. This could help mitigate the risk of child queues having unlimited capacity, while still keeping the root queue default as “infinite” for ease of use. Users could opt-in to more restrictive behaviors as needed.

Perhaps we can look at combining this work or aligning the design to solve both requirements at once. Would love to hear thoughts on whether this approach could address the current concerns. If there’s consensus, I’m happy to help move this forward.

Member

@vzhou-p So is maxApplications part of the queue's capability, or is it a separate field? Will it be a required value?


I think maxApplications should be a separate field from the queue's capability (resource quota), rather than being part of the capability structure itself. This keeps the two concepts distinct.

  • Capability (resource quota): Controls how much resources (CPU, memory, etc.) can be consumed
  • maxApplications: Controls how many jobs/applications can run or be queued

I believe it should be optional, with a sensible default (likely infinite or unlimited). This aligns with the earlier discussion about keeping the root queue infinite by default. Users opt into restrictions as needed rather than being forced to configure everything upfront.
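
A hypothetical shape for this, sketched against the style of the existing queue spec (the MaxApplications field does not exist in the current API; it only illustrates the two independent dimensions):

package v1beta1sketch

import v1 "k8s.io/api/core/v1"

// QueueSpecSketch illustrates the proposal: resource quota and job-count
// limits as two separate, optional dimensions.
type QueueSpecSketch struct {
    // Capability is the resource quota: how much CPU, memory, etc. may be
    // consumed by workloads in the queue. Empty means unlimited.
    Capability v1.ResourceList `json:"capability,omitempty"`

    // MaxApplications caps how many jobs/applications can run or be queued.
    // A nil pointer means unlimited, matching the infinite-by-default idea.
    MaxApplications *int32 `json:"maxApplications,omitempty"`
}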

@mrendi29 (Author) commented Nov 6, 2025

I tried testing this in my k8s cluster today. I created a new image only for the scheduler and used that in my cluster; I did not change the controller and admission images.
I created a root queue with 500 cores and 5000GB RAM (note my cluster has only <100 cores and 500GB RAM but has autoscaling enabled)
I also created a child queue of root with 250 cores and 4500GB RAM.

I tried submitting a spark job but got the following error from the spark operator logs:

│ I1106 23:33:48.303393      10 event.go:285] Event(v1.ObjectReference{Kind:"SparkApplication", Namespace:"spark", Name:"spark-sleep-8", UID:"c3526aed-2566-42e1-a0a6-33a445cca51f", APIVersion:"sparkoperator.k8s.io/v1beta2", ResourceVersion:"221645563", FieldPath:""}): type: 'Normal' reason: 'SparkApplicationAdded' SparkApplication spark-sleep-8 was added, enqueuing it for submission                                                                                                                                │
│ E1106 23:33:48.319225      10 controller.go:676] failed to process batch scheduler BeforeSubmitSparkApplication with error failed to sync PodGroup with error: admission webhook "validatepodgroup.volcano.sh" denied the request: can only submit PodGroup to queue with state `Open`, queue `myqueue` status is ``. Abandon schedule pods via volcano     

Seems that the queue I created is not reporting status:

┌─────────────────────────────────────────────────────────────────────────── Describe(-/myqueue) ────────────────────────────────────────────────────────────────────────────┐
│ Name:         myqueue                                                                                                                                                      │
│ Namespace:                                                                                                                                                                 │
│ Labels:       <none>                                                                                                                                                       │
│ Annotations:  <none>                                                                                                                                                       │
│ API Version:  scheduling.volcano.sh/v1beta1                                                                                                                                │
│ Kind:         Queue                                                                                                                                                        │
│ Metadata:                                                                                                                                                                  │
│   Creation Timestamp:  2025-11-06T23:30:58Z                                                                                                                                │
│   Generation:          1                                                                                                                                                   │
│   Resource Version:    221644752                                                                                                                                           │
│   UID:                 b9ad6d06-4ac8-4ca0-8e73-25964af8f0d6                                                                                                                │
│ Spec:                                                                                                                                                                      │
│   Guarantee:                                                                                                                                                               │
│     Resource:                                                                                                                                                              │
│       Cpu:      240                                                                                                                                                        │
│       Memory:   4500G                                                                                                                                                      │
│   Parent:       root                                                                                                                                                       │
│   Reclaimable:  true                                                                                                                                                       │
│   Weight:       1                                                                                                                                                          │
│ Events:         <none>       

Whereas root:

│ Name:         root                                                                                                                                                         │
│ Namespace:                                                                                                                                                                 │
│ Labels:       <none>                                                                                                                                                       │
│ Annotations:  <none>                                                                                                                                                       │
│ API Version:  scheduling.volcano.sh/v1beta1                                                                                                                                │
│ Kind:         Queue                                                                                                                                                        │
│ Metadata:                                                                                                                                                                  │
│   Creation Timestamp:  2025-08-20T22:08:49Z                                                                                                                                │
│   Generation:          3                                                                                                                                                   │
│   Resource Version:    188581782                                                                                                                                           │
│   UID:                 de2d7990-f6dd-42c2-8e9d-524d6f07845b                                                                                                                │
│ Spec:                                                                                                                                                                      │
│   Capability:                                                                                                                                                              │
│     Cpu:        500                                                                                                                                                        │
│     Memory:     5000G                                                                                                                                                      │
│   Reclaimable:  false                                                                                                                                                      │
│   Weight:       1                                                                                                                                                          │
│ Status:                                                                                                                                                                    │
│   Allocated:                                                                                                                                                               │
│     Cpu:     0                                                                                                                                                             │
│     Memory:  0                                                                                                                                                             │
│   Reservation:                                                                                                                                                             │
│   State:  Open                                                                                                                                                             │
│ Events:   <none>                                                                                                                                                           │
│                    

Will do some more digging

@mrendi29 (Author) commented Nov 6, 2025

If I submit the job to the root queue directly:
Scheduler logs:

│ I1106 23:53:28.184662       1 util.go:101] schedulerPodName  is responsible to PodGroup spark/spark-spark-sleep-8-pg                                                       │
│ I1106 23:53:28.184747       1 event_handlers.go:835] Add PodGroup(spark-spark-sleep-8-pg) into cache, spec(v1beta1.PodGroupSpec{MinMember:1, MinTaskMember:map[string]int3 │
│ I1106 23:53:31.994400       1 util.go:63] schedulerPodName  is responsible to Pod spark/spark-sleep-8-driver                                                               │
│ I1106 23:53:31.994451       1 event_handlers.go:398] Added pod <spark/spark-sleep-8-driver> into cache.   
┌────────────────────────────────────────────────────────────────── Describe(spark/spark-spark-sleep-8-pg) ──────────────────────────────────────────────────────────────────┐
│ Name:         spark-spark-sleep-8-pg                                                                                                                                       │
│ Namespace:    spark                                                                                                                                                        │
│ Labels:       <none>                                                                                                                                                       │
│ Annotations:  <none>                                                                                                                                                       │
│ API Version:  scheduling.volcano.sh/v1beta1                                                                                                                                │
│ Kind:         PodGroup                                                                                                                                                     │
│ Metadata:                                                                                                                                                                  │
│   Creation Timestamp:  2025-11-06T23:53:28Z                                                                                                                                │
│   Generation:          1                                                                                                                                                   │
│   Owner References:                                                                                                                                                        │
│     API Version:           sparkoperator.k8s.io/v1beta2                                                                                                                    │
│     Block Owner Deletion:  true                                                                                                                                            │
│     Controller:            true                                                                                                                                            │
│     Kind:                  SparkApplication                                                                                                                                │
│     Name:                  spark-sleep-8                                                                                                                                   │
│     UID:                   a8ff063f-d444-4890-a75b-4be4b3bb5ad2                                                                                                            │
│   Resource Version:        221651212                                                                                                                                       │
│   UID:                     96c19dc5-4382-47e2-8a54-74f4564cd1ec                                                                                                            │
│ Spec:                                                                                                                                                                      │
│   Min Member:  1                                                                                                                                                           │
│   Min Resources:                                                                                                                                                           │
│     Cpu:     122                                                                                                                                                           │
│     Memory:  1208G                                                                                                                                                         │
│   Queue:     root                                                                                                                                                          │
│ Status:                                                                                                                                                                    │
│   Phase:  Pending                                                                                                                                                          │
│ Events:   <none>   

The driver is pending as well.

Scheduler conf:

actions: "enqueue, allocate, backfill"
tiers:
- plugins:
  - name: priority
  - name: gang
    enablePreemptable: false
  - name: conformance
- plugins:
  - name: drf
    enablePreemptable: false
  - name: predicates
  - name: capacity
    enableHierarchy: true 
    arguments:
      overwriteRootQueueRealCapability: true
  - name: nodeorder

@JesseStutler (Member)

> If I submit the job to the root queue directly: Scheduler logs: […]

Could you provide more logs? For example, anything showing that the driver pod was rejected from scheduling, or that the root queue exceeded its limit.

@JesseStutler (Member)

> I tried testing this in my k8s cluster today. […] Will do some more digging

It's weird; maybe we could check the controller's log. I think there may be something wrong with the controller syncing the queue state from "" to Open:

return SyncQueue(os.queue, func(status *v1beta1.QueueStatus, podGroupList []string) {
    specState := os.queue.Status.State
    if len(specState) == 0 || specState == v1beta1.QueueStateOpen {
        status.State = v1beta1.QueueStateOpen
        return
    }
    // ...
})

mrendi29 and others added 4 commits December 8, 2025 15:53
Introducing a new optional argument in the capacity plugin:
overwriteRootQueueRealCapability.

Currently, the root queue's resource limits are always set to the actual cluster
resources, regardless of any user-configured capability values.
This prevents administrators from setting virtual resource limits that differ from
the physical cluster capacity.

The new option helps in cases where a user has the cluster autoscaler enabled
and the "virtual" resources can be higher than the "real" resources.

Signed-off-by: Endi Caushi <[email protected]>
Following the practice in other plugins, the new overwriteRootQueueRealCapability
argument in the capacity plugin now comes from a const key.
This makes it easier to follow the plugin's configurable arguments.

Signed-off-by: Endi Caushi <[email protected]>
Minor changes to the comments of the plugin.
Added a positive/negative unit test for the argument.
Added some information about the argument to the capacity
plugin's hierarchy documentation.

Signed-off-by: Hajnal Mate <[email protected]>
@hajnalmt force-pushed the overwrite-root-queue-capability branch from f1f617d to 0800a75 on December 8, 2025 16:16
@hajnalmt (Contributor) left a comment

/lgtm

Hello @mrendi29,
Sorry, for some reason I thought this was already merged in.
I rebased your branch and modified the logging a little bit (thank you for the access). I also replaced the AI-generated tests with a simple positive/negative test case that follows the structure of the existing test suite, and added the argument with some context to the appropriate design doc.

I am not sure about the failure you got, but it shouldn't be due to these changes, and this argument can be useful in other use cases.

Please take a look @JesseStutler

@volcano-sh-bot added the lgtm label (indicates that a PR is ready to be merged) Dec 8, 2025
@volcano-sh-bot (Contributor)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: hajnalmt
Once this PR has been reviewed and has the lgtm label, please assign wpeng102 for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details: Needs approval from an approver in each of these files.

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mrendi29 (Author) commented Dec 8, 2025

@hajnalmt Thank you so much for lending a hand on this. I was OOO for the last 2 weeks and have been super busy at work wrapping up year-end objectives, which is why I haven't been able to dedicate more time to this over the last month. Your change LGTM as well. I'll test it in production once I get a chance.

@hajnalmt (Contributor) commented Dec 9, 2025

/ok-to-test

@mrendi29 Anytime! 👍 Sorry that this fell out of my focus.
One thing that came to mind regarding the error you got: you shouldn't send pods to the root queue, only to the leaf queues. I ran into these kinds of issues, I think, when I enabled hierarchy and there were already pods in the root queue.

@volcano-sh-bot added the ok-to-test label (a non-member PR verified by an org member as safe to test) Dec 9, 2025