-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
📖 Update autoscaling from zero enhancement proposal with support for platform-aware autoscale from zero #11962
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
@@ -107,8 +107,8 @@ node group. But, during a scale from zero situation (ie when a node group has ze | |||||||||||||||
autoscaler needs to acquire this information from the infrastructure provider. | ||||||||||||||||
|
||||||||||||||||
An optional status field is proposed on the Infrastructure Machine Template which will be populated | ||||||||||||||||
by infrastructure providers to contain the CPU, memory, and GPU capacities for machines described by that | ||||||||||||||||
template. The cluster autoscaler will then utilize this information by reading the appropriate | ||||||||||||||||
by infrastructure providers to contain the CPU, CPU architecture, memory, and GPU capacities for machines | ||||||||||||||||
described by that template. The cluster autoscaler will then utilize this information by reading the appropriate | ||||||||||||||||
infrastructure reference from the resource it is scaling (MachineSet or MachineDeployment). | ||||||||||||||||
|
||||||||||||||||
A user may override the field in the associated infrastructure template by applying annotations to the | ||||||||||||||||
|
@@ -160,6 +160,10 @@ the template. Internally, this field will be represented by a Go `map` type uti | |||||||||||||||
for the keys and `k8s.io/apimachinery/pkg/api/resource.Quantity` as the values (similar to how resource | ||||||||||||||||
limits and requests are handled for pods). | ||||||||||||||||
|
||||||||||||||||
Additionally, the status field could contain information about the node, such as the architecture and | ||||||||||||||||
operating system. This information is not required for the autoscaler to function, but it can be useful in | ||||||||||||||||
scenarios where the autoscaler needs to make decisions for clusters with heterogeneous node groups in architecture, OS, or both. | ||||||||||||||||
|
||||||||||||||||
It is worth mentioning that the Infrastructure Machine Templates are not usually reconciled by themselves. | ||||||||||||||||
Each infrastructure provider will be responsible for determining the best implementation for adding the | ||||||||||||||||
status field based on the information available on their platform. | ||||||||||||||||
|
@@ -175,6 +179,7 @@ const ( | |||||||||||||||
// DockerMachineTemplateStatus defines the observed state of a DockerMachineTemplate | ||||||||||||||||
type DockerMachineTemplateStatus struct { | ||||||||||||||||
Capacity corev1.ResourceList `json:"capacity,omitempty"` | ||||||||||||||||
NodeInfo *platform.NodeSystemInfo `json:"nodeInfo,omitempty"` | ||||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||||||||
} | ||||||||||||||||
|
||||||||||||||||
// DockerMachineTemplate is the Schema for the dockermachinetemplates API. | ||||||||||||||||
|
@@ -188,6 +193,29 @@ type DockerMachineTemplate struct { | |||||||||||||||
``` | ||||||||||||||||
_Note: the `ResourceList` and `ResourceName` referenced are from k8s.io/api/core/v1`_ | ||||||||||||||||
|
||||||||||||||||
`platform.NodeSystemInfo` is a struct that contains the architecture and operating system information of the node, to | ||||||||||||||||
implement in the `util` package of the `cluster-api` project. The defined types and constants are exported for use by the | ||||||||||||||||
cluster-api providers integrations. | ||||||||||||||||
Its definition would look like this: | ||||||||||||||||
Comment on lines
+196
to
+199
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we should not put an API struct in the util package. On the other side I'm also not sure if we should put it in any of our existing packages of our api groups. Because as soon we create a new apiVersion every infra provider CRD also requires a new apiVersion to bump to our new apiVersion. So I think it might be a good idea to create a new API package for this. One option could be In general I would like to see this API versioned, so we have a chance to evolve it over time. |
||||||||||||||||
|
||||||||||||||||
```go | ||||||||||||||||
package platform | ||||||||||||||||
|
||||||||||||||||
type Architecture string | ||||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is the idea to make this an enum? (i.e. enumerate all valid values) If yes, let's add the appropriate enum marker here |
||||||||||||||||
|
||||||||||||||||
const ( | ||||||||||||||||
ArchitectureAmd64 Architecture = "amd64" | ||||||||||||||||
ArchitectureArm64 Architecture = "arm64" | ||||||||||||||||
ArchitectureS390x Architecture = "s390x" | ||||||||||||||||
ArchitecturePpc64le Architecture = "ppc64le" | ||||||||||||||||
) | ||||||||||||||||
|
||||||||||||||||
type NodeSystemInfo struct { | ||||||||||||||||
Architecture Architecture `json:"architecture,omitempty"` | ||||||||||||||||
OperatingSystem string `json:"operatingSystem,omitempty"` | ||||||||||||||||
Comment on lines
+214
to
+215
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||||||||
} | ||||||||||||||||
``` | ||||||||||||||||
|
||||||||||||||||
When used as a manifest, it would look like this: | ||||||||||||||||
|
||||||||||||||||
``` | ||||||||||||||||
|
@@ -204,8 +232,16 @@ status: | |||||||||||||||
memory: 500mb | ||||||||||||||||
cpu: "1" | ||||||||||||||||
nvidia.com/gpu: "1" | ||||||||||||||||
nodeInfo: | ||||||||||||||||
aleskandro marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||||||||||||
architecture: arm64 | ||||||||||||||||
operatingSystem: linux | ||||||||||||||||
``` | ||||||||||||||||
|
||||||||||||||||
The information stored in the `status.nodeInfo` field is rendered as labels for the node object that is created within | ||||||||||||||||
the node group by the cluster autoscaler and fed into the cluster autoscaler's scheduler simulator `framework.NodeInfo` struct. | ||||||||||||||||
In particular, the `architecture` and `operatingSystem` fields are used to determine the simulated node's labels | ||||||||||||||||
`kubernetes.io/arch` and `kubernetes.io/os`. This logic will be implemented in the cluster autoscaler's ClusterAPI cloud provider code. | ||||||||||||||||
|
||||||||||||||||
#### MachineSet and MachineDeployment Annotations | ||||||||||||||||
|
||||||||||||||||
In cases where a user needs to provide specific resource information for a | ||||||||||||||||
|
@@ -229,6 +265,8 @@ metadata: | |||||||||||||||
capacity.cluster-autoscaler.kubernetes.io/memory: "500mb" | ||||||||||||||||
capacity.cluster-autoscaler.kubernetes.io/cpu: "1" | ||||||||||||||||
capacity.cluster-autoscaler.kubernetes.io/ephemeral-disk: "100Gi" | ||||||||||||||||
node-info.cluster-autoscaler.kubernetes.io/cpu-architecture: "arm64" | ||||||||||||||||
node-info.cluster-autoscaler.kubernetes.io/os: "linux" | ||||||||||||||||
Comment on lines
+268
to
+269
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hm the suffix here matches neither the kubernetes.io/ labels nor the fields in the NodeInfo struct. I think it would make sense to align to the NodeInfo struct field names |
||||||||||||||||
``` | ||||||||||||||||
_Note: the annotations will be defined in the cluster autoscaler, not in cluster-api._ | ||||||||||||||||
|
||||||||||||||||
|
@@ -246,6 +284,8 @@ metadata: | |||||||||||||||
capacity.cluster-autoscaler.kubernetes.io/taints: "key1=value1:NoSchedule,key2=value2:NoExecute" | ||||||||||||||||
``` | ||||||||||||||||
|
||||||||||||||||
If the `capacity.cluster-autoscaler.kubernetes.io/labels` annotation specifies a label that would otherwise be generated from the `capacity` or `node-info` annotations, the autoscaler will use the label defined in `capacity.cluster-autoscaler.kubernetes.io/labels`, overriding any labels produced by processing the other annotations. | ||||||||||||||||
|
||||||||||||||||
### Security Model | ||||||||||||||||
|
||||||||||||||||
This feature will require the service account associated with the cluster autoscaler to have | ||||||||||||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is modelled after corev1.NodeSystemInfo but I'm wondering why they have different names for the struct and field. I would lean towards also using NodeInfo as struct name
@JoelSpeed opinions?