Commit 4ca7248

Add ComputeDomain v1beta2 as a hub, CRD conversion webhook, and Helm-templated CRD
Introduce resource.nvidia.com/v1beta2 as the storage/conversion hub for ComputeDomain, keep v1beta1 as a deprecated served version, and wire the CRD to a conversion webhook so the apiserver can round-trip between API versions.

Ship the computedomains CRD as a Helm chart template so webhook clientConfig (service name/namespace/port, optional caBundle, cert-manager CA injection) resolves at install time instead of baking static cluster settings into the manifest.

Helm validation now rejects resources.computeDomains.enabled=true when webhook.enabled=false, matching the CRD's conversion.webhook requirement.

Caveats
-------
- The conversion webhook must be enabled when ComputeDomains are enabled: spec.conversion.strategy=Webhook on the CRD requires a reachable webhook, so with webhook.enabled=false the cluster cannot honor v1beta1/v1beta2 conversion. Install with webhook.enabled=true (and valid TLS) or disable ComputeDomains (resources.computeDomains.enabled=false).
- The computedomains CRD is intentionally templatized (Helm templates/, not only chart crds/). As a result, the CRD cannot be applied directly with kubectl; install or upgrade via Helm (or reproduce the same templating) so clientConfig and the optional cert-manager annotations align with the deployed webhook.
- For the same reason, Helm chart deletion will not remove the CRD or any CR instances associated with it. The ideal fix is a separate Helm chart for CRDs, so a TODO has been added for that.

Signed-off-by: Shiva Krishna, Merla <smerla@nvidia.com>
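The Helm validation rule described in the commit message can be pictured with a small Go sketch. This is only an illustration of the rule, not the chart's actual template logic: the `chartValues` struct and `validate` function are hypothetical names, and only the two value keys come from the message.

```go
package main

import (
	"errors"
	"fmt"
)

// chartValues is a hypothetical, trimmed-down view of the two Helm values
// named in the commit message; the real chart has many more settings.
type chartValues struct {
	ComputeDomainsEnabled bool // resources.computeDomains.enabled
	WebhookEnabled        bool // webhook.enabled
}

// validate rejects the one combination the commit message calls out:
// ComputeDomains enabled while the conversion webhook is disabled.
func validate(v chartValues) error {
	if v.ComputeDomainsEnabled && !v.WebhookEnabled {
		return errors.New("resources.computeDomains.enabled=true requires webhook.enabled=true")
	}
	return nil
}

func main() {
	// Walk through the three interesting combinations of the two flags.
	for _, v := range []chartValues{
		{ComputeDomainsEnabled: true, WebhookEnabled: true},
		{ComputeDomainsEnabled: true, WebhookEnabled: false},
		{ComputeDomainsEnabled: false, WebhookEnabled: false},
	} {
		fmt.Printf("%+v -> %v\n", v, validate(v))
	}
}
```

Only the second combination yields an error; disabling ComputeDomains together with the webhook remains a valid install.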
1 parent 279897d commit 4ca7248

File tree

40 files changed

+2063
-45
lines changed

Makefile

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -112,18 +112,20 @@ coverage: test
112112

113113
generate: generate-crds generate-informers fmt
114114

115+
# Only copy the CRD for CDClique, since the CRD for CD is templatized with webhook configuration
116+
# and that is under helm templates/ instead of crds/
117+
# TODO: Automate templatizing the CRD for CD every time we update the CD API
115118
generate-crds: generate-deepcopy .remove-crds
116119
for dir in $(CLIENT_SOURCES); do \
117120
controller-gen crd:crdVersions=v1 \
118121
paths=$(CURDIR)/$${dir} \
119122
output:crd:dir=$(CURDIR)/deployments/helm/tmp_crds; \
120123
done
121124
mkdir -p $(CURDIR)/deployments/helm/$(DRIVER_NAME)/crds
122-
cp -R $(CURDIR)/deployments/helm/tmp_crds/* \
125+
cp -R $(CURDIR)/deployments/helm/tmp_crds/resource.nvidia.com_computedomaincliques.yaml \
123126
$(CURDIR)/deployments/helm/$(DRIVER_NAME)/crds
124127
rm -rf $(CURDIR)/deployments/helm/tmp_crds
125128

126-
127129
# Regenerate everything and fail if the tree is dirty (used by `make check`).
128130
check-generate: generate
129131
git diff --exit-code HEAD

api/nvidia.com/resource/v1beta1/computedomain.go

Lines changed: 16 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -33,9 +33,13 @@ const (
3333
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
3434
// +k8s:openapi-gen=true
3535
// +kubebuilder:resource:scope=Namespaced
36+
// +kubebuilder:deprecatedversion
3637
// +kubebuilder:subresource:status
3738

3839
// ComputeDomain prepares a set of nodes to run a multi-node workload in.
40+
//
41+
// Deprecated: use resource.nvidia.com/v1beta2 ComputeDomain. This version is
42+
// retained for compatibility.
3943
type ComputeDomain struct {
4044
metav1.TypeMeta `json:",inline"`
4145
metav1.ObjectMeta `json:"metadata,omitempty"`
@@ -59,6 +63,10 @@ type ComputeDomainList struct {
5963

6064
// +kubebuilder:validation:XValidation:rule="self == oldSelf", message="A computeDomain.spec is immutable"
6165

66+
// AnnotationComputeDomainNumNodes stores the v1beta1-only numNodes field on the hub
67+
// object so it survives conversion. It is not part of the v1beta2 API.
68+
const AnnotationComputeDomainNumNodes = "resource.nvidia.com/computedomain-num-nodes"
69+
6270
// ComputeDomainSpec provides the spec for a ComputeDomain.
6371
type ComputeDomainSpec struct {
6472
// Intended number of IMEX daemons (i.e., individual compute nodes) in the
@@ -83,9 +91,14 @@ type ComputeDomainSpec struct {
8391
// `numNodes` IMEX daemons. Pods from more than `numNodes` nodes trying to
8492
// join the ComputeDomain may lead to unexpected behavior.
8593
//
86-
// The `numNodes` parameter is deprecated and will be removed in the next
87-
// API version.
88-
NumNodes int `json:"numNodes"`
94+
// The `numNodes` field exists only on this deprecated API version; it is
95+
// not present on resource.nvidia.com/v1beta2 and is round-tripped via
96+
// metadata.annotations["resource.nvidia.com/computedomain-num-nodes"] on the hub.
97+
//
98+
// +kubebuilder:default:=0
99+
// +kubebuilder:validation:Minimum=0
100+
// +kubebuilder:validation:Optional
101+
NumNodes int `json:"numNodes,omitempty"`
89102
Channel *ComputeDomainChannelSpec `json:"channel"`
90103
}
91104

Lines changed: 156 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,156 @@
1+
/*
2+
Copyright The Kubernetes Authors.
3+
4+
Licensed under the Apache License, Version 2.0 (the "License");
5+
you may not use this file except in compliance with the License.
6+
You may obtain a copy of the License at
7+
8+
http://www.apache.org/licenses/LICENSE-2.0
9+
10+
Unless required by applicable law or agreed to in writing, software
11+
distributed under the License is distributed on an "AS IS" BASIS,
12+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13+
See the License for the specific language governing permissions and
14+
limitations under the License.
15+
*/
16+
17+
package v1beta1
18+
19+
import (
20+
"fmt"
21+
"strconv"
22+
23+
v1beta2 "sigs.k8s.io/nvidia-dra-driver-gpu/api/nvidia.com/resource/v1beta2"
24+
25+
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
26+
)
27+
28+
// ConvertTo converts this v1beta1 ComputeDomain to the hub version (v1beta2), following the kubebuilder multi-version conversion pattern.
29+
func (src *ComputeDomain) ConvertTo(dst *v1beta2.ComputeDomain) error {
30+
if src == nil || dst == nil {
31+
return fmt.Errorf("ConvertTo: nil ComputeDomain")
32+
}
33+
dst.TypeMeta = metav1.TypeMeta{
34+
APIVersion: v1beta2.SchemeGroupVersion.String(),
35+
Kind: "ComputeDomain",
36+
}
37+
dst.ObjectMeta = *src.ObjectMeta.DeepCopy()
38+
syncNumNodesAnnotation(&dst.ObjectMeta, src.Spec.NumNodes)
39+
dst.Spec = v1beta2.ComputeDomainSpec{}
40+
if src.Spec.Channel != nil {
41+
dst.Spec.Channel = &v1beta2.ComputeDomainChannelSpec{
42+
ResourceClaimTemplate: v1beta2.ComputeDomainResourceClaimTemplate{
43+
Name: src.Spec.Channel.ResourceClaimTemplate.Name,
44+
},
45+
AllocationMode: src.Spec.Channel.AllocationMode,
46+
}
47+
}
48+
dst.Status = v1beta1StatusToV1beta2(&src.Status)
49+
return nil
50+
}
51+
52+
// ConvertFrom restores a deprecated v1beta1 view from the hub (v1beta2) representation.
53+
func (dst *ComputeDomain) ConvertFrom(src *v1beta2.ComputeDomain) error {
54+
if src == nil || dst == nil {
55+
return fmt.Errorf("ConvertFrom: nil ComputeDomain")
56+
}
57+
dst.TypeMeta = metav1.TypeMeta{
58+
APIVersion: SchemeGroupVersion.String(),
59+
Kind: "ComputeDomain",
60+
}
61+
dst.ObjectMeta = *src.ObjectMeta.DeepCopy()
62+
dst.Spec = ComputeDomainSpec{
63+
NumNodes: NumNodesFromAnnotation(&src.ObjectMeta),
64+
}
65+
// Hide hub-only storage key from the deprecated API surface.
66+
if dst.ObjectMeta.Annotations != nil {
67+
delete(dst.ObjectMeta.Annotations, AnnotationComputeDomainNumNodes)
68+
}
69+
if src.Spec.Channel != nil {
70+
dst.Spec.Channel = &ComputeDomainChannelSpec{
71+
ResourceClaimTemplate: ComputeDomainResourceClaimTemplate{
72+
Name: src.Spec.Channel.ResourceClaimTemplate.Name,
73+
},
74+
AllocationMode: src.Spec.Channel.AllocationMode,
75+
}
76+
}
77+
dst.Status = v1beta2StatusToV1beta1(&src.Status)
78+
return nil
79+
}
80+
81+
func syncNumNodesAnnotation(meta *metav1.ObjectMeta, n int) {
82+
if meta.Annotations == nil {
83+
meta.Annotations = map[string]string{}
84+
}
85+
if n == 0 {
86+
delete(meta.Annotations, AnnotationComputeDomainNumNodes)
87+
return
88+
}
89+
meta.Annotations[AnnotationComputeDomainNumNodes] = strconv.Itoa(n)
90+
}
91+
92+
// NumNodesFromAnnotation returns the v1beta1 numNodes value carried on the hub object.
93+
func NumNodesFromAnnotation(meta *metav1.ObjectMeta) int {
94+
if meta == nil || meta.Annotations == nil {
95+
return 0
96+
}
97+
s, ok := meta.Annotations[AnnotationComputeDomainNumNodes]
98+
if !ok || s == "" {
99+
return 0
100+
}
101+
n, err := strconv.Atoi(s)
102+
if err != nil {
103+
return 0
104+
}
105+
return n
106+
}
107+
108+
func v1beta1StatusToV1beta2(in *ComputeDomainStatus) v1beta2.ComputeDomainStatus {
109+
if in == nil {
110+
return v1beta2.ComputeDomainStatus{}
111+
}
112+
out := v1beta2.ComputeDomainStatus{
113+
Status: in.Status,
114+
}
115+
if len(in.Nodes) > 0 {
116+
out.Nodes = make([]*v1beta2.ComputeDomainNode, len(in.Nodes))
117+
for i, n := range in.Nodes {
118+
if n == nil {
119+
continue
120+
}
121+
out.Nodes[i] = &v1beta2.ComputeDomainNode{
122+
Name: n.Name,
123+
IPAddress: n.IPAddress,
124+
CliqueID: n.CliqueID,
125+
Index: n.Index,
126+
Status: n.Status,
127+
}
128+
}
129+
}
130+
return out
131+
}
132+
133+
func v1beta2StatusToV1beta1(in *v1beta2.ComputeDomainStatus) ComputeDomainStatus {
134+
if in == nil {
135+
return ComputeDomainStatus{}
136+
}
137+
out := ComputeDomainStatus{
138+
Status: in.Status,
139+
}
140+
if len(in.Nodes) > 0 {
141+
out.Nodes = make([]*ComputeDomainNode, len(in.Nodes))
142+
for i, n := range in.Nodes {
143+
if n == nil {
144+
continue
145+
}
146+
out.Nodes[i] = &ComputeDomainNode{
147+
Name: n.Name,
148+
IPAddress: n.IPAddress,
149+
CliqueID: n.CliqueID,
150+
Index: n.Index,
151+
Status: n.Status,
152+
}
153+
}
154+
}
155+
return out
156+
}
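The annotation round trip performed by the conversion functions above can be exercised in isolation. The following is a minimal sketch under simplifying assumptions: it works on plain annotation maps rather than the real `ComputeDomain` types, and `toHub` and `fromHub` are hypothetical helper names that only mirror the behavior of `syncNumNodesAnnotation` and `NumNodesFromAnnotation`.

```go
package main

import (
	"fmt"
	"strconv"
)

// numNodesKey has the same value as AnnotationComputeDomainNumNodes in the diff.
const numNodesKey = "resource.nvidia.com/computedomain-num-nodes"

// toHub copies the annotations and records numNodes under numNodesKey.
// A value of 0 removes the key, as in syncNumNodesAnnotation.
func toHub(annotations map[string]string, numNodes int) map[string]string {
	out := map[string]string{}
	for k, v := range annotations {
		out[k] = v
	}
	if numNodes == 0 {
		delete(out, numNodesKey)
		return out
	}
	out[numNodesKey] = strconv.Itoa(numNodes)
	return out
}

// fromHub recovers numNodes and returns the annotations with the hub-only
// key hidden. A missing or non-numeric value falls back to 0.
func fromHub(annotations map[string]string) (int, map[string]string) {
	out := map[string]string{}
	for k, v := range annotations {
		if k != numNodesKey {
			out[k] = v
		}
	}
	n, err := strconv.Atoi(annotations[numNodesKey])
	if err != nil {
		return 0, out
	}
	return n, out
}

func main() {
	hub := toHub(map[string]string{"team": "a"}, 4)
	n, visible := fromHub(hub)
	fmt.Println(hub[numNodesKey], n, len(visible))
}
```

One consequence of this scheme is that a v1beta1 object with `numNodes: 0` and one without the field are indistinguishable on the hub, since neither carries the annotation.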
Lines changed: 122 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,122 @@
1+
/*
2+
Copyright The Kubernetes Authors.
3+
4+
Licensed under the Apache License, Version 2.0 (the "License");
5+
you may not use this file except in compliance with the License.
6+
You may obtain a copy of the License at
7+
8+
http://www.apache.org/licenses/LICENSE-2.0
9+
10+
Unless required by applicable law or agreed to in writing, software
11+
distributed under the License is distributed on an "AS IS" BASIS,
12+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13+
See the License for the specific language governing permissions and
14+
limitations under the License.
15+
*/
16+
17+
package v1beta2
18+
19+
import (
20+
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
21+
)
22+
23+
const (
24+
ComputeDomainStatusNone = ""
25+
ComputeDomainStatusReady = "Ready"
26+
ComputeDomainStatusNotReady = "NotReady"
27+
28+
ComputeDomainChannelAllocationModeSingle = "Single"
29+
ComputeDomainChannelAllocationModeAll = "All"
30+
)
31+
32+
// +genclient
33+
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
34+
// +k8s:openapi-gen=true
35+
// +kubebuilder:resource:scope=Namespaced
36+
// +kubebuilder:storageversion
37+
// +kubebuilder:subresource:status
38+
39+
// ComputeDomain prepares a set of nodes to run a multi-node workload in.
40+
//
41+
// This version is the storage and conversion hub for ComputeDomain API versions.
42+
type ComputeDomain struct {
43+
metav1.TypeMeta `json:",inline"`
44+
metav1.ObjectMeta `json:"metadata,omitempty"`
45+
46+
Spec ComputeDomainSpec `json:"spec,omitempty"`
47+
// Global ComputeDomain status. Can be used to guide debugging efforts.
48+
// Workloads, however, should not rely on inspecting this field at any point
49+
// during its lifecycle.
50+
Status ComputeDomainStatus `json:"status,omitempty"`
51+
}
52+
53+
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
54+
55+
// ComputeDomainList provides a list of ComputeDomains.
56+
type ComputeDomainList struct {
57+
metav1.TypeMeta `json:",inline"`
58+
metav1.ListMeta `json:"metadata,omitempty"`
59+
60+
Items []ComputeDomain `json:"items"`
61+
}
62+
63+
// +kubebuilder:validation:XValidation:rule="self == oldSelf", message="A computeDomain.spec is immutable"
64+
65+
// ComputeDomainSpec provides the spec for a ComputeDomain.
66+
//
67+
// The deprecated resource.nvidia.com/v1beta1 API carries `spec.numNodes`; that
68+
// value is not part of this version and is preserved on the stored object via
69+
// metadata.annotations["resource.nvidia.com/computedomain-num-nodes"].
70+
type ComputeDomainSpec struct {
71+
Channel *ComputeDomainChannelSpec `json:"channel"`
72+
}
73+
74+
// ComputeDomainChannelSpec provides the spec for a channel used to run a workload inside a ComputeDomain.
75+
type ComputeDomainChannelSpec struct {
76+
ResourceClaimTemplate ComputeDomainResourceClaimTemplate `json:"resourceClaimTemplate"`
77+
// Allows for requesting all IMEX channels (the maximum per IMEX domain) or
78+
// precisely one.
79+
// +kubebuilder:validation:Enum=All;Single
80+
// +kubebuilder:default:=Single
81+
// +kubebuilder:validation:Optional
82+
AllocationMode string `json:"allocationMode,omitempty"`
83+
}
84+
85+
// ComputeDomainResourceClaimTemplate provides the details of the ResourceClaimTemplate to generate.
86+
type ComputeDomainResourceClaimTemplate struct {
87+
Name string `json:"name"`
88+
}
89+
90+
// ComputeDomainStatus provides the status for a ComputeDomain.
91+
type ComputeDomainStatus struct {
92+
// +kubebuilder:validation:Enum=Ready;NotReady
93+
// +kubebuilder:default=NotReady
94+
Status string `json:"status"`
95+
// +listType=map
96+
// +listMapKey=name
97+
Nodes []*ComputeDomainNode `json:"nodes,omitempty"`
98+
}
99+
100+
// ComputeDomainNode provides information about each node added to a ComputeDomain.
101+
type ComputeDomainNode struct {
102+
Name string `json:"name"`
103+
IPAddress string `json:"ipAddress"`
104+
CliqueID string `json:"cliqueID"`
105+
// The Index field is used to ensure a consistent IP-to-DNS name
106+
// mapping across all machines within an IMEX domain. Each node's index
107+
// directly determines its DNS name within a given NVLink partition
108+
// (i.e. clique). In other words, the 2-tuple of (CliqueID, Index) will
109+
// always be unique. This field is marked as optional (but not
110+
// omitempty) in order to support downgrades and avoid an API bump.
111+
// +kubebuilder:validation:Optional
112+
Index int `json:"index"`
113+
// The Status field tracks the readiness of the IMEX daemon running on
114+
// this node. It gets switched to Ready whenever the IMEX daemon is
115+
// ready to broker GPU memory exchanges and switches to NotReady when
116+
// it is not. It is marked as optional in order to support downgrades
117+
// and avoid an API bump.
118+
// +kubebuilder:validation:Optional
119+
// +kubebuilder:validation:Enum=Ready;NotReady
120+
// +kubebuilder:default:=NotReady
121+
Status string `json:"status,omitempty"`
122+
}
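The comment on `Index` notes that the field is optional but deliberately not `omitempty`. A short sketch shows what that means on the wire. The `node` struct below is a stand-in that copies only the JSON tags of `ComputeDomainNode`; it is not the generated type, and `encode` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// node is a stand-in carrying the same JSON tags as ComputeDomainNode.
type node struct {
	Name      string `json:"name"`
	IPAddress string `json:"ipAddress"`
	CliqueID  string `json:"cliqueID"`
	Index     int    `json:"index"`            // no omitempty: 0 is serialized
	Status    string `json:"status,omitempty"` // omitempty: "" is dropped
}

// encode marshals a node to its JSON string form.
func encode(n node) string {
	b, err := json.Marshal(n)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Index 0 appears in the output; the empty Status does not.
	fmt.Println(encode(node{Name: "node-a"}))
}
```

With Go's `encoding/json`, a zero `Index` is therefore always written as `"index":0`, while an empty `Status` is left out and can be filled in by the CRD default.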
Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
1+
/*
2+
Copyright The Kubernetes Authors.
3+
4+
Licensed under the Apache License, Version 2.0 (the "License");
5+
you may not use this file except in compliance with the License.
6+
You may obtain a copy of the License at
7+
8+
http://www.apache.org/licenses/LICENSE-2.0
9+
10+
Unless required by applicable law or agreed to in writing, software
11+
distributed under the License is distributed on an "AS IS" BASIS,
12+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13+
See the License for the specific language governing permissions and
14+
limitations under the License.
15+
*/
16+
17+
// +k8s:deepcopy-gen=package
18+
// +groupName=resource.nvidia.com
19+
20+
package v1beta2
