DRA Device Binding Conditions: graduate to beta #137795
k8s-ci-robot merged 3 commits into kubernetes:master
/approve
Promote DRADeviceBindingConditions feature gate from Alpha to Beta in v1.36 with default enabled.
- Update feature gate definition to set default=true for v1.36 Beta
- Update API documentation comments from "alpha field" to "beta field" across all resource API versions (v1, v1beta1, v1beta2)
Signed-off-by: Tsubasa Watanabe <w.tsubasa@fujitsu.com>
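The promotion described in this commit can be pictured as adding a new per-release entry to the gate's history. The following is a minimal, self-contained sketch of that idea, not the actual Kubernetes feature gate code: the types `Stage` and `VersionedSpec` and the function `SpecFor` are hypothetical, and the release in which the gate was Alpha (1.34 here) is an assumption. Only the 1.36 Beta, default-enabled entry comes from this PR.

```go
package main

import "fmt"

// Stage is a hypothetical stand-in for a feature gate maturity level.
type Stage string

const (
	Alpha Stage = "ALPHA"
	Beta  Stage = "BETA"
)

// VersionedSpec is a simplified, hypothetical model of a per-release gate spec.
type VersionedSpec struct {
	Minor   int   // Kubernetes 1.<Minor> release in which this spec takes effect
	Default bool  // whether the gate is enabled unless explicitly overridden
	Stage   Stage // maturity level from that release onward
}

// specs lists the gate history in ascending release order.
// The 1.36 Beta entry reflects this PR; the Alpha release number is an assumption.
var specs = []VersionedSpec{
	{Minor: 34, Default: false, Stage: Alpha},
	{Minor: 36, Default: true, Stage: Beta},
}

// SpecFor returns the newest spec that applies to release 1.<minor>.
// The boolean is false when the gate did not exist yet in that release.
func SpecFor(minor int) (VersionedSpec, bool) {
	var found VersionedSpec
	ok := false
	for _, s := range specs {
		if s.Minor <= minor {
			found, ok = s, true
		}
	}
	return found, ok
}

func main() {
	for _, minor := range []int{35, 36} {
		s, _ := SpecFor(minor)
		fmt.Printf("1.%d: stage=%s default=%t\n", minor, s.Stage, s.Default)
	}
}
```

Keeping one entry per release, rather than overwriting a single default, lets a component that emulates an older release still resolve the older default.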
Signed-off-by: Tsubasa Watanabe <w.tsubasa@fujitsu.com>
- update expectations for the default BindingTimeout in KubeSchedulerConfiguration
- DRA unit tests:
  - enable DRADeviceBindingConditions by default
  - add allocationTimestamp to expected ResourceClaims for PreBind cases
  - disable DRADeviceBindingConditions when testing the stable allocator in TestSchedulerPerf
Signed-off-by: Tsubasa Watanabe <w.tsubasa@fujitsu.com>
Force-pushed from 5693bea to 57e649a
/test pull-kubernetes-node-e2e-containerd
Unrelated flake.
@pohly I submitted an exception request today. If it is approved, we will have an additional 3 days. I would appreciate it if you could continue the review in the meantime.
pohly left a comment
/lgtm
/approve
Swapping the defaults for feature enabled/disabled in the scheduler plugin test cases looks reasonable. Once those features are always enabled, only a few test cases need to be updated.
/label api-review
/cc @liggitt
For doc changes.
LGTM label has been added. Git tree hash: 2c238bb160f1c324b5e77cd7a140f456c58a7251
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: dom4ha, liggitt, pohly, ttsuuubasa
What type of PR is this?
/kind feature
/kind api-change
/sig scheduling
/wg device-management
What this PR does / why we need it:
This PR graduates the DRADeviceBindingConditions feature gate from Alpha to Beta in Kubernetes v1.36, enabling it by default.
Which issue(s) this PR is related to:
KEP: kubernetes/enhancements#5007
Special notes for your reviewer:
This feature is considered to have met the Beta criteria based on the merged code PRs, as well as feedback and discussions with DRA driver vendors and maintainers.
metrics
To satisfy the PRR requirements of KEP-5007, the required metrics were implemented in the following PRs, enabling better observability and validation of the feature’s behavior.
feedback from DRA Driver vendor
We implemented Device Binding Conditions support for the ComputeDomain provided by the NVIDIA DRA Driver and published the corresponding PR.
The vendor maintainers raised concerns about potential performance issues with a centralized architecture, in which the controller responsible for setting BindingConditions resides on the controller side.
In response, we discussed possible approaches for realizing a distributed architecture - where the controller logic is placed on the worker side - with SIG Scheduling maintainers in the Slack thread referenced below.
Incorporating the proposed solutions from those discussions, we completed the implementation of ComputeDomain with BindingConditions using a distributed architecture.
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: