DRA Device Binding Conditions: graduate to beta #137795

Merged
k8s-ci-robot merged 3 commits into kubernetes:master from ttsuuubasa:dra-binding-conditions-beta
Mar 18, 2026

@ttsuuubasa
Contributor

What type of PR is this?

/kind feature
/kind api-change
/sig scheduling
/wg device-management

What this PR does / why we need it:

This PR graduates the DRADeviceBindingConditions feature gate from Alpha to Beta in Kubernetes v1.36, enabling it by default.
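
As a rough illustration of what a versioned graduation like this means, the sketch below models a gate whose default depends on the release version. The names `versionedSpec` and `specFor` are hypothetical and are not the Kubernetes feature-gate API. The v1.34 alpha entry is an assumption made for illustration; only the v1.36 beta default comes from this PR.

```go
package main

import "fmt"

// stage mirrors the maturity levels of a feature gate.
type stage string

const (
	alpha stage = "ALPHA"
	beta  stage = "BETA"
)

// versionedSpec is a simplified, hypothetical stand-in for a versioned
// feature-gate spec; it is not the real Kubernetes type.
type versionedSpec struct {
	major, minor int
	enabled      bool
	preRelease   stage
}

// Assumption: the alpha entry's version (1.34) is illustrative; only the
// v1.36 beta entry is stated in this PR.
var draDeviceBindingConditions = []versionedSpec{
	{major: 1, minor: 34, enabled: false, preRelease: alpha},
	{major: 1, minor: 36, enabled: true, preRelease: beta},
}

// specFor returns the newest spec whose version is not newer than the
// requested version. specs must be sorted in ascending version order.
func specFor(specs []versionedSpec, major, minor int) (versionedSpec, bool) {
	var found versionedSpec
	ok := false
	for _, s := range specs {
		if s.major < major || (s.major == major && s.minor <= minor) {
			found, ok = s, true
		}
	}
	return found, ok
}

func main() {
	for _, minor := range []int{35, 36} {
		s, _ := specFor(draDeviceBindingConditions, 1, minor)
		fmt.Printf("v1.%d: stage=%s default=%t\n", minor, s.preRelease, s.enabled)
	}
}
```

Under these assumptions, a v1.35 lookup resolves to the alpha entry (off by default) and a v1.36 lookup resolves to the beta entry (on by default).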

Which issue(s) this PR is related to:

KEP: kubernetes/enhancements#5007

Special notes for your reviewer:

This feature is considered to have met the Beta criteria based on the merged code PRs, as well as feedback and discussions with DRA driver vendors and maintainers.

  • metrics
    To satisfy the PRR requirements of KEP-5007, the required metrics were implemented in the following PRs, enabling better observability and validation of the feature’s behavior.

  • feedback from DRA Driver vendor
    We implemented Device Binding Conditions support for the ComputeDomain provided by the NVIDIA DRA Driver and published the corresponding PR.

    The vendor maintainers raised concerns about potential performance issues with a centralized architecture, in which the controller responsible for setting BindingConditions resides on the controller side.

    In response, we discussed with SIG Scheduling maintainers possible approaches for realizing a distributed architecture, in which the controller logic is placed on the worker side, in the Slack thread referenced below.

    Incorporating the proposed solutions from those discussions, we completed the implementation of ComputeDomain with BindingConditions using a distributed architecture.

Does this PR introduce a user-facing change?

DRA: graduate Device Binding Conditions (KEP #5007) to beta. The feature is now enabled by default in v1.36.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://github.com/kubernetes/enhancements/issues/5007

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. wg/device-management Categorizes an issue or PR as relevant to WG Device Management. labels Mar 17, 2026
@github-project-automation github-project-automation bot moved this to Needs Triage in SIG Scheduling Mar 17, 2026
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 17, 2026
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 17, 2026
@k8s-ci-robot
Contributor

Hi @ttsuuubasa. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Tip

We noticed you've done this a few times! Consider joining the org to skip this step and gain /lgtm and other bot rights. We recommend asking approvers on your previous PRs to sponsor you.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added area/code-generation area/test sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Mar 17, 2026
@k8s-triage-robot

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

@ttsuuubasa
Contributor Author

/cc @pohly @dom4ha @macsko @johnbelamaric

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 17, 2026
@dom4ha
Member

dom4ha commented Mar 17, 2026

/approve

@dom4ha dom4ha moved this from Needs Triage to Done in SIG Scheduling Mar 18, 2026
Promote DRADeviceBindingConditions feature gate from Alpha to Beta
in v1.36 with default enabled.

- Update feature gate definition to set default=true for v1.36 Beta
- Update API documentation comments from "alpha field" to "beta field"
  across all resource API versions (v1, v1beta1, v1beta2)

Signed-off-by: Tsubasa Watanabe <w.tsubasa@fujitsu.com>
Signed-off-by: Tsubasa Watanabe <w.tsubasa@fujitsu.com>
- update expectations for the default BindingTimeout in KubeSchedulerConfiguration
- DRA unit tests:
  - enable DRADeviceBindingConditions by default
  - add allocationTimestamp to expected ResourceClaims for PreBind cases
- disable DRADeviceBindingConditions when testing the stable allocator in TestSchedulerPerf

Signed-off-by: Tsubasa Watanabe <w.tsubasa@fujitsu.com>
@ttsuuubasa ttsuuubasa force-pushed the dra-binding-conditions-beta branch from 5693bea to 57e649a Compare March 18, 2026 04:58
@pohly
Contributor

pohly commented Mar 18, 2026

/test pull-kubernetes-node-e2e-containerd

Unrelated flake.

@ttsuuubasa
Contributor Author

@pohly
Unit tests and integration tests are already passing.
Enabling DRADeviceBindingConditions by default turned out to require changes across a broader set of test cases than initially expected.

I have submitted an Exception request today. If it is approved, we will have an additional 3 days.

I would appreciate it if you could continue with the review in the meantime.

Contributor

@pohly pohly left a comment

/lgtm
/approve

Swapping the defaults for feature enabled/disabled in the scheduler plugin test cases looks reasonable. Once those features are always enabled, only a few test cases need to be updated.

/label api-review
/cc @liggitt

For doc changes.
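
The idea of swapping test defaults mentioned in the review above can be sketched as a table-driven case that opts out of a feature instead of opting in. This is a hypothetical pattern for illustration; `testCase` and `gateEnabled` are invented names and do not reflect the actual scheduler plugin tests.

```go
package main

import "fmt"

// testCase is a hypothetical scheduler-plugin test case. The feature is on
// unless a case explicitly disables it, so the zero value matches the
// beta default.
type testCase struct {
	name                     string
	disableBindingConditions bool
}

// gateEnabled returns the feature-gate value a case should run with.
func gateEnabled(tc testCase) bool {
	return !tc.disableBindingConditions
}

func main() {
	cases := []testCase{
		{name: "default"},
		{name: "feature-disabled", disableBindingConditions: true},
	}
	for _, tc := range cases {
		fmt.Printf("%s: DRADeviceBindingConditions=%t\n", tc.name, gateEnabled(tc))
	}
}
```

With this shape, only the cases that exercise the disabled path need an explicit field, which is consistent with the remark that few cases need updating once the feature is always enabled.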

@k8s-ci-robot k8s-ci-robot requested a review from liggitt March 18, 2026 08:02
@k8s-ci-robot k8s-ci-robot added api-review Categorizes an issue or PR as actively needing an API review. lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Mar 18, 2026
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 2c238bb160f1c324b5e77cd7a140f456c58a7251

@dom4ha dom4ha moved this from Done to Needs Final Approver in SIG Scheduling Mar 18, 2026
@liggitt liggitt moved this to In progress in API Reviews Mar 18, 2026
@liggitt
Member

liggitt commented Mar 18, 2026

/approve
for API comment changes

@liggitt liggitt moved this from In progress to API review completed, 1.36 in API Reviews Mar 18, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dom4ha, liggitt, pohly, ttsuuubasa

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 18, 2026
@k8s-ci-robot k8s-ci-robot merged commit 0d28578 into kubernetes:master Mar 18, 2026
30 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.36 milestone Mar 18, 2026
@github-project-automation github-project-automation bot moved this from Needs Final Approver to Done in SIG Scheduling Mar 18, 2026
@github-project-automation github-project-automation bot moved this from Triage to Done in SIG Node CI/Test Board Mar 18, 2026
@pohly pohly moved this from 👀 In review to ✅ Done in Dynamic Resource Allocation Mar 18, 2026

Labels

- api-review: Categorizes an issue or PR as actively needing an API review.
- approved: Indicates a PR has been approved by an approver from all required OWNERS files.
- area/code-generation
- area/test
- cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
- kind/api-change: Categorizes issue or PR as related to adding, removing, or otherwise changing an API.
- kind/feature: Categorizes issue or PR as related to a new feature.
- lgtm: "Looks good to me", indicates that a PR is ready to be merged.
- needs-priority: Indicates a PR lacks a `priority/foo` label and requires one.
- needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
- ok-to-test: Indicates a non-member PR verified by an org member that is safe to test.
- release-note: Denotes a PR that will be considered when it comes time to generate release notes.
- sig/api-machinery: Categorizes an issue or PR as relevant to SIG API Machinery.
- sig/node: Categorizes an issue or PR as relevant to SIG Node.
- sig/scheduling: Categorizes an issue or PR as relevant to SIG Scheduling.
- sig/testing: Categorizes an issue or PR as relevant to SIG Testing.
- size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.
- wg/device-management: Categorizes an issue or PR as relevant to WG Device Management.

Projects

Status: API review completed, 1.36
Status: ✅ Done
Status: Done

6 participants