Define FG graduation #954

Open
guptaNswati wants to merge 1 commit into kubernetes-sigs:main from guptaNswati:FG-policy
@guptaNswati
Contributor

Address #931

Signed-off-by: Swati Gupta <swatig@nvidia.com>
@copy-pr-bot
copy-pr-bot bot commented Mar 18, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@guptaNswati guptaNswati requested a review from Copilot March 18, 2026 21:14
@guptaNswati guptaNswati self-assigned this Mar 18, 2026
@guptaNswati guptaNswati moved this from Backlog to In Progress in Planning Board: k8s-dra-driver-gpu Mar 18, 2026
@guptaNswati guptaNswati added the documentation (Issue/PR focused on fixing/editing/adding documentation bits) and maintenance/chores (issue/PR for maintenance, release work, code cleanup, chores) labels Mar 18, 2026
@guptaNswati guptaNswati added this to the v26.4.0 milestone Mar 18, 2026
Copilot AI left a comment

Pull request overview

Adds a formal policy document describing how feature gates in the NVIDIA DRA Driver for GPUs should progress from Alpha to Beta to Stable, including evidence expectations and a snapshot of the current gate inventory.

Changes:

  • Introduces graduation criteria (entry/graduation requirements) for Alpha, Beta, and Stable feature gates.
  • Defines deprecation/removal expectations and upstream Kubernetes dependency coupling rules.
  • Documents current feature-gate inventory and highlights gaps to reach/maintain desired stages.


### 2.3 Stable (GA) — Production Grade

**Default:** `true`, **Locked:** feature gate cannot be disabled
Comment on lines +155 to +156
| `DynamicMIG` | Alpha | `false` | v25.12 | [KEP-4815] (Alpha 1.35, Beta target 1.36) | Mutually exclusive with PassthroughSupport, NVMLDeviceHealthCheck, MPSSupport |
| `NVMLDeviceHealthCheck` | Alpha | `false` | v25.12 | [KEP-5055] (Alpha 1.33, Beta target 1.36) | Mutually exclusive with DynamicMIG |
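
The mutual-exclusion notes in the two inventory rows above could be checked when the driver starts. The following is a minimal Python sketch for illustration only: the gate names come from the rows above, while the `MUTUALLY_EXCLUSIVE` table and the `validate_gates` helper are hypothetical and are not the driver's actual (Go) code.

```python
# Hypothetical sketch: gate names come from the inventory rows above;
# the table layout and helper function are illustrative assumptions.
MUTUALLY_EXCLUSIVE = {
    "DynamicMIG": {"PassthroughSupport", "NVMLDeviceHealthCheck", "MPSSupport"},
    "NVMLDeviceHealthCheck": {"DynamicMIG"},
}


def validate_gates(enabled):
    """Return a sorted list of conflicting gate pairs among the enabled gates."""
    conflicts = set()
    for gate in enabled:
        for other in MUTUALLY_EXCLUSIVE.get(gate, ()):
            if other in enabled:
                # Store each pair once, regardless of which side declared it.
                conflicts.add(tuple(sorted((gate, other))))
    return sorted(conflicts)
```

Calling `validate_gates` with a set of enabled gate names returns an empty list when the combination is allowed, so a caller could refuse to start whenever the list is non-empty.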
@@ -0,0 +1,184 @@
# Policy on Feature Gate Graduation
Comment on lines +24 to +25
**Default:** `false` (opt-in)
**Signal:** "Try it out and give us feedback."
Comment on lines +44 to +46
**Default:** `true` (opt-out)
**Signal:** "We're confident in the design. Early production use is
encouraged."
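
The per-stage defaults quoted in the excerpts above (Alpha off, Beta on, Stable on and locked) can be summarized in a few lines. This is a minimal Python sketch under stated assumptions: the `Stage` enum, `DEFAULT_ENABLED` table, and `resolve_gate` function are hypothetical names chosen for illustration, not part of the proposed policy or the driver.

```python
from enum import Enum


class Stage(Enum):
    ALPHA = "Alpha"
    BETA = "Beta"
    STABLE = "Stable"


# Defaults per stage as stated in the policy excerpts above.
DEFAULT_ENABLED = {Stage.ALPHA: False, Stage.BETA: True, Stage.STABLE: True}


def resolve_gate(stage, override=None):
    """Return the effective on/off state of a gate given an optional user override."""
    default = DEFAULT_ENABLED[stage]
    if override is None:
        return default
    if stage is Stage.STABLE and override is False:
        # Stable gates are locked: they cannot be disabled.
        raise ValueError("stable feature gate is locked and cannot be disabled")
    return override
```

In this sketch an Alpha gate stays off unless the user opts in, a Beta gate stays on unless the user opts out, and an attempt to disable a Stable gate is rejected.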
When the upstream dependency is not at the required level, the feature must
detect and degrade gracefully, require and fail loudly, or defer promotion.
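
Two of the three options in the sentence above are runtime behaviors (degrade gracefully, or require and fail loudly); deferring promotion is a release decision and has no runtime form. A minimal Python sketch of the two runtime options, assuming a hypothetical `Strategy` enum and `effective_state` function that are not taken from the driver:

```python
from enum import Enum


class Strategy(Enum):
    DEGRADE = "degrade"  # detect and degrade gracefully
    REQUIRE = "require"  # require and fail loudly


def effective_state(requested, upstream_ready, strategy):
    """Decide whether a gated feature actually runs.

    requested:      the user asked for the feature gate to be on
    upstream_ready: the upstream Kubernetes dependency is at the required level
    """
    if not requested:
        return False
    if upstream_ready:
        return True
    if strategy is Strategy.DEGRADE:
        # Keep the driver running with the feature off; a real
        # implementation would also log a warning here.
        return False
    # Strategy.REQUIRE: refuse to start rather than run half-configured.
    raise RuntimeError("feature gate requires an unavailable upstream feature")
```

Which strategy fits a given gate is a per-feature choice; the sketch only shows that the choice should be explicit rather than implied.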

## 3. Current Feature Gate Inventory
Contributor

I think it's better to keep this doc about policy only and move the details of the current feature gates to a different doc.
Even better would be for each feature gate to have a dedicated page with its details.

Contributor Author

That's a good idea. Right now, we don't have any doc on the FGs. A dedicated page would let us keep the design, discussions, and different stages in a single place.

Yes. We should at least split the static policy section and the dynamic feature gate section.


@rajatchopra rajatchopra left a comment

How do we reconcile information here with roadmap/issues?

@k8s-triage-robot

Unknown CLA label state. Rechecking for CLA labels.

Send feedback to sig-contributor-experience at kubernetes/community.

/check-cla
/easycla

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.) label Apr 2, 2026
Labels

cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
documentation (Issue/PR focused on fixing/editing/adding documentation bits)
maintenance/chores (issue/PR for maintenance, release work, code cleanup, chores)

6 participants