KREP-007: Horizontal scaling in KRO (Scale subresource + RGD Set) #866

tjamet wants to merge 1 commit into kubernetes-sigs:main from tjamet:krep-horizontal-scaling (+223 −0)

# Horizontal scaling in KRO

This proposal complements [KREP-002] and defines:
- An interface to control `ResourceGraphDefinition` (RGD) instance scale using the Kubernetes “scale subresource”.
- An opt-in mechanism to replicate RGD instances and declaratively control rollout across replicas.

It is inspired by Kubernetes Set controllers (for example, `ReplicaSet`) to achieve safe, observable, and controllable rollouts.

## Goals

- Integrate KRO with the Kubernetes horizontal scaling ecosystem (for example, `kubectl scale`, [HPA], and [keda]) via the [scale subresource][scale-subresources].
- Keep a single RGD instance as a “unit of deployment”; KRO’s graph resolution ensures correct ordering and propagation within that unit.
- Provide an optional “Set” abstraction to replicate RGD instances and control rollout across replicas.

## Non-goals

- Implement all possible scaling or rollout strategies.
- Replace or reimplement HPA/VPA behavior.
- Manage progressive rollout within a single RGD instance. The Set abstraction controls rollout across replicas only.

## Background and problem statement

In [KREP-002] we introduced looping/collections. Collections may be required for:
- Different environments (for example, private vs. public ingresses).
- Different customers (duplicated stacks, different hostnames).
- Horizontal scale based on load (request rate, queue depth, number of replicas, and so on).

Required replica counts can vary rapidly. Kubernetes already provides a rich ecosystem (`kubectl scale`, [HPA], [keda]). KRO should integrate with this ecosystem.

Replicas can be:
- A single resource (for example, a Pod or a Deployment), or
- A set of resources that together form a functional unit (for example, a [cluster-api] [capa-cluster]).

When a replica is a set of resources, we need controlled rollout to limit blast radius if a new version/configuration causes issues.

## Design overview

This proposal proceeds in two phases:

1) Scale subresource on RGD (integration with the Kubernetes scaling ecosystem).
2) RGD Set controller (replicate RGD instances and control rollout across replicas).

### Phase 1: Scale subresource on RGD

Users add explicit schema annotations on RGD fields to expose the Scale subresource:
- `| scale` on integer fields to represent desired/current replicas.
- `| scaleSelector` on a status string field that holds a label selector for Pods (used by HPA/VPA).

Validation rules:
- `| scale` applies to integers only.
- `| scale` must be present on both spec and status (or on neither).
- `| scaleSelector` applies to a string attribute in status. KRO does not validate selector correctness nor enforce that it matches Pods.

Example (minimum to expose Scale):

```yaml
apiVersion: kro.run/v1alpha1
kind: ResourceGraphDefinition
schema:
  spec:
    desiredReplicas: "integer | scale"
  status:
    replicas: "integer | scale"
    podSelector: "string | scaleSelector"
```

> Contributor comment: Noticing that a podSelector is not a string, it's a `map[string]string`. How would you model this?
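
For context, the `| scale` and `| scaleSelector` markers map onto the standard Kubernetes CRD [scale subresource][scale-subresources] fields. The fragment below is a hypothetical sketch of what KRO could generate for the example above; the group `example.org`, version `v1`, and resource names are placeholders, and the exact generated shape is not prescribed by this KREP.

```yaml
# Hypothetical sketch only: a CRD fragment KRO could generate from the
# `| scale` / `| scaleSelector` annotations above (group/version/names are placeholders).
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: myobjects.example.org
spec:
  group: example.org
  names:
    kind: MyObject
    plural: myobjects
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object  # full schema generated from the RGD schema
      subresources:
        status: {}
        scale:
          specReplicasPath: .spec.desiredReplicas   # from "integer | scale" in spec
          statusReplicasPath: .status.replicas      # from "integer | scale" in status
          labelSelectorPath: .status.podSelector    # from "string | scaleSelector"
```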

Usage:
- `kubectl scale` works against the RGD instances.
- HPA can target the RGD Scale subresource and use `status.podSelector` to select Pods.

Example HPA (illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-rgd-hpa
spec:
  scaleTargetRef:
    apiVersion: example.org/v1
    kind: MyObject
    name: some-name
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
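
Because the Scale subresource is the standard Kubernetes contract, event-driven scalers such as [keda] can target RGD instances in the same way. The `ScaledObject` below is illustrative only; the Prometheus address, query, and threshold are placeholders.

```yaml
# Illustrative only: a KEDA ScaledObject driving the RGD instance's scale
# subresource. Prometheus address, query, and threshold are placeholders.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-rgd-scaler
spec:
  scaleTargetRef:
    apiVersion: example.org/v1
    kind: MyObject
    name: some-name
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        query: sum(rate(http_requests_total{app="my-app"}[2m]))
        threshold: "100"
```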

### Phase 2: RGD Set controller (replication and rollout)

An RGD can define a `scaling` section to emit an additional CRD that manages sets of RGD instances.
A dedicated dynamic controller reconciles the Set and performs creation, update, and deletion of the underlying RGD instances.

High-level behavior:
- A new `MyObjectSet` CRD is generated next to `MyObject` (the RGD instance CRD).
- The Set controller lives alongside existing KRO controllers and has one responsibility: manage RGD instance replicas.
- `OwnerReferences` are set from each RGD instance to its Set.
- Update order: oldest to newest.
- Deletion order: newest to oldest.
- Rolling update concurrency is configurable.
- The controller waits for each updated instance to become “Ready” (KRO Ready condition) before proceeding.

Example RGD enabling a Set:

```yaml
apiVersion: kro.run/v1alpha1
kind: ResourceGraphDefinition
scaling:
  kind: MyObjectSet
  method: replicas
schema:
  kind: MyObject
  spec:
    desiredReplicas: "integer | scale"
  status:
    replicas: "integer | scale"
    podSelector: "string | scaleSelector"
```

This generates CRDs `MyObject` and `MyObjectSet`.
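
The KREP text does not spell out the generated `MyObjectSet` CRD, but for the `kubectl scale MyObjectSet/...` usage shown below to work, the Set CRD would itself expose the scale subresource. A hypothetical fragment of that generated CRD:

```yaml
# Hypothetical fragment of the generated MyObjectSet CRD (not spelled out in
# this KREP): the Set exposes its own scale subresource so kubectl and
# autoscalers can drive spec.replicas.
subresources:
  status: {}
  scale:
    specReplicasPath: .spec.replicas
    statusReplicasPath: .status.replicas
```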

Users can define a Set to replicate `MyObject`:

```yaml
apiVersion: example.org/v1
kind: MyObjectSet
metadata:
  name: some-name
spec:
  replicas: 1
  rollingUpdate:
    maxConcurrentUpdated: 2
  template:
    spec: ${specThatMatchesResourceGraphDefinitionSchema}
```

The Set controller creates a `MyObject` instance:

```yaml
apiVersion: example.org/v1
kind: MyObject
metadata:
  name: some-name-ldcs4
spec: ${specThatMatchesResourceGraphDefinitionSchema}
```
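
Given that the high-level behavior above sets `OwnerReferences` from each instance to its Set, the created instance would also carry a standard controller owner reference along the lines of the sketch below; the UID is a placeholder.

```yaml
# Illustrative only: standard controller owner reference from the created
# MyObject instance back to its MyObjectSet. The UID is a placeholder.
apiVersion: example.org/v1
kind: MyObject
metadata:
  name: some-name-ldcs4
  ownerReferences:
    - apiVersion: example.org/v1
      kind: MyObjectSet
      name: some-name
      uid: 00000000-0000-0000-0000-000000000000  # placeholder
      controller: true
      blockOwnerDeletion: true
spec: ${specThatMatchesResourceGraphDefinitionSchema}
```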

Scaling via CLI:
- `kubectl scale MyObjectSet/some-name --replicas=10`

Rolling updates:
- When the Set template changes, the controller updates `MyObject` instances (up to `maxConcurrentUpdated` at a time) and waits for each to become Ready before proceeding (an illustrative status sketch follows).
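
The KREP does not define a status schema for the Set. Purely as an illustration of the state observable during such a rollout, a Deployment-like shape could look as follows; none of these fields are prescribed here.

```yaml
# Hypothetical only: a possible MyObjectSet status mid-rollout, mirroring
# Deployment/ReplicaSet conventions. None of these fields are specified by this KREP.
status:
  replicas: 10
  updatedReplicas: 4
  readyReplicas: 8   # some updated instances are still converging
  conditions:
    - type: Progressing               # placeholder condition type
      status: "True"
      reason: RollingUpdateInProgress # placeholder reason
```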

## Controller details (for developers)

Triggers and watch setup:
- Reconcile on Set changes and on underlying RGD instance changes.
- Set and RGD instances share scope (same namespace or cluster-scoped).

Readiness and progression:
- The Set controller uses the RGD instance Ready condition to determine when to continue rolling updates (illustrated below).
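
For illustration, the gate is the standard Kubernetes condition of type `Ready` on each RGD instance; the `reason` and `message` values below are placeholders rather than an exact KRO contract.

```yaml
# Illustrative only: the Ready condition on an RGD instance that the Set
# controller waits for before moving on to the next instance.
status:
  conditions:
    - type: Ready
      status: "True"
      lastTransitionTime: "2025-01-01T00:00:00Z"
      reason: ResourcesReady   # placeholder reason
      message: All resources in the graph have been reconciled and are ready
```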

Ordering and naming:
- Instances are named with randomized suffixes (ReplicaSet-like).
- Update oldest-to-newest; delete newest-to-oldest.

Error handling and backoff:
- Standard controller-runtime requeue with exponential backoff on errors.

Observability:
- Emit Events on Set and instances for create/update/delete.
- Expose metrics for reconciliation latency and in-progress rollouts.

## Validation summary

- `| scale` only on integer fields.
- `| scale` must appear in both spec and status together.
- `| scaleSelector` only on a status string field; KRO does not validate selector semantics.
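
As a hypothetical illustration of these rules, the schema below would be rejected on two counts:

```yaml
# Hypothetical example of a schema that violates the validation rules above.
apiVersion: kro.run/v1alpha1
kind: ResourceGraphDefinition
schema:
  spec:
    desiredReplicas: "integer | scale"      # rejected: no matching "| scale" in status
  status:
    podSelector: "integer | scaleSelector"  # rejected: "| scaleSelector" requires a string
```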

## Scoping

### In scope

- “Count/replicas” scaling with randomized names (ReplicaSet-like) and in-place updates of RGD instances.
- Creating instances when scaling up; deleting instances when scaling down.
- Updating instances oldest-to-newest; deleting newest-to-oldest.
- Configurable parallelism for rolling updates.

### Out of scope (future work candidates)

- Replicating RGDs based on other objects (for example, Nodes, Namespaces).
- Customizable update/delete orders beyond the default.
- Alternate naming conventions (for example, ordinal numbering).

## Testing strategy

Unit tests:
- CRD generation with `| scale` and `| scaleSelector` annotations.
- Validation of schema rules (types, presence in spec/status).
- Set controller logic: ordering, concurrency, readiness gates, and error paths.

Integration tests:
- Scale subresource end-to-end: `kubectl scale`, HPA interactions.
- Set controller end-to-end: creation, scaling up/down, rolling update respecting concurrency and readiness.

## Other solutions considered

- Implement Set behavior directly inside the RGD reconciler together with collections.
  This couples graph resolution, failure domains, and rollout strategies, reducing user control and observability over rollouts.
- Implement horizontal scaling in an external controller.
  This could scale any resource but increases user complexity (more controllers to install, configure, and monitor).

[KREP-002]: https://github.com/kubernetes-sigs/kro/pull/679
[cluster-api]: https://cluster-api.sigs.k8s.io/introduction
[capa-cluster]: https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/main/templates/cluster-template-eks.yaml
[scale-subresources]: https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#scale-subresource
[HPA]: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
[keda]: https://keda.sh/

Review comment:

> Firstly, this KREP is really cool.
> I wonder a bit about the proliferation of keywords. WDYT about building this concept to be a bit more aware of kubernetes subresources? e.g.
> https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#scale-subresource