Explore concept of VNA - Vertical Node Autoscaler #354

@eytan-avisror

In some cases it may be appropriate to scale nodes vertically, e.g. from m5.xlarge to m5.2xlarge.
This could happen when we detect that a larger instance type would allow better binpacking, or when the IG has reached its maxSize and there are still pending pods.
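A minimal sketch of the second trigger (IG at max with pending pods), just to make the condition concrete. `GroupState` and `should_scale_vertically` are hypothetical names, not part of instance-manager, and the binpacking trigger is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class GroupState:
    """Hypothetical snapshot of an InstanceGroup, for illustration only."""
    current_size: int   # nodes currently in the group
    max_size: int       # spec.eks.maxSize
    pending_pods: int   # pods that could not be scheduled


def should_scale_vertically(state: GroupState) -> bool:
    """True when horizontal scaling is exhausted but pods are still pending."""
    at_max = state.current_size >= state.max_size
    return at_max and state.pending_pods > 0
```

With `maxSize: 6`, a group of 6 nodes and 3 pending pods would qualify; the same group at 4 nodes would not, since it can still scale out horizontally.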

For example, we could abstract the instance type away completely:

apiVersion: instancemgr.keikoproj.io/v1alpha1
kind: InstanceGroup
metadata:
  name: my-instance-group
  namespace: instance-manager
spec:
  provisioner: eks
  strategy:
    type: rollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  eks:
    minSize: 3
    maxSize: 6
    configuration:

      # < instanceType not provided >

      instanceFamily: m5  # optional

      resources:
        requests:
          mem: 8Gi
          cpu: 2
        limits:
          mem: 64Gi
          cpu: 16
      ...

Initially we would spin up m5.large (if instanceFamily is provided; otherwise we can decide the best match), which provides 2 vCPU / 8 GiB of memory, and we can scale up to m5.4xlarge, which has 16 vCPU / 64 GiB.
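To illustrate how `requests` and `limits` could bound the set of instance types, here is a rough sketch. The table is a hand-written subset of AWS's published m5 sizes, and `allowed_types` / `next_size_up` are hypothetical helpers, not existing instance-manager code. A real implementation would presumably query instance type data from AWS rather than hardcode it.

```python
from typing import List, NamedTuple, Optional


class InstanceType(NamedTuple):
    name: str
    vcpu: int
    mem_gib: int


# Subset of the m5 family, ordered smallest to largest.
# Note: 2 vCPU / 8 GiB corresponds to m5.large in AWS's published sizes.
M5 = [
    InstanceType("m5.large", 2, 8),
    InstanceType("m5.xlarge", 4, 16),
    InstanceType("m5.2xlarge", 8, 32),
    InstanceType("m5.4xlarge", 16, 64),
    InstanceType("m5.8xlarge", 32, 128),
]


def allowed_types(types: List[InstanceType], req_cpu: int, req_mem: int,
                  lim_cpu: int, lim_mem: int) -> List[InstanceType]:
    """Keep types whose capacity lies between requests and limits (inclusive)."""
    return [
        t for t in types
        if req_cpu <= t.vcpu <= lim_cpu and req_mem <= t.mem_gib <= lim_mem
    ]


def next_size_up(allowed: List[InstanceType],
                 current_name: str) -> Optional[InstanceType]:
    """Return the next larger allowed type, or None if already at the ceiling.

    Raises ValueError if current_name is not in the allowed list.
    """
    names = [t.name for t in allowed]
    idx = names.index(current_name)
    return allowed[idx + 1] if idx + 1 < len(allowed) else None


# requests: cpu 2 / mem 8Gi, limits: cpu 16 / mem 64Gi (from the spec above)
ladder = allowed_types(M5, req_cpu=2, req_mem=8, lim_cpu=16, lim_mem=64)
initial = ladder[0]    # smallest type satisfying requests
ceiling = ladder[-1]   # largest type within limits
```

With the values from the example spec, the ladder runs from m5.large up to m5.4xlarge, and `next_size_up(ladder, "m5.4xlarge")` returns `None` because the limits have been reached.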

Another option is to keep this new spec inside a VerticalScalingPolicy, so that the IG simply omits instanceType and the VSP is provided as follows:

apiVersion: instancemgr.keikoproj.io/v1alpha1
kind: VerticalScalingPolicy
metadata:
  name: default
  namespace: instance-manager
spec:

  instanceFamily: m5  # optional

  resources:
    requests:
      mem: 8Gi
      cpu: 2
    limits:
      mem: 64Gi
      cpu: 16

  scaleTargetRef:
    apiVersion: instancemgr.keikoproj.io/v1alpha1
    kind: InstanceGroup
    name: my-instance-group

We should probably also explore supporting something like HPA's behavior spec, applied to node capacity:

behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Percent
      value: 100  # should be between 0 and 40
      periodSeconds: 15
  scaleUp:
    stabilizationWindowSeconds: 0
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
    - type: Pods
      value: 4
      periodSeconds: 15
    selectPolicy: Max
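As a rough sketch of how such policies might be evaluated, the function below mirrors HPA's `selectPolicy` semantics: each policy yields a maximum allowed change, and `Max` / `Min` / `Disabled` chooses among them. What the units mean for nodes is an open question in this proposal. Here `current` is assumed to be some capacity count (for example, vCPUs per node), and the `Pods` type is treated as an absolute number of those units. `allowed_change` is a hypothetical name, and `periodSeconds` and `stabilizationWindowSeconds` are ignored.

```python
import math
from typing import Dict, List


def allowed_change(current: int, policies: List[Dict],
                   select_policy: str = "Max") -> int:
    """Largest change permitted in one period, HPA-style.

    Percent policies allow a change of value% of `current`;
    any other type is treated as an absolute amount (an assumption).
    """
    if select_policy == "Disabled":
        return 0
    changes = []
    for policy in policies:
        if policy["type"] == "Percent":
            changes.append(math.ceil(current * policy["value"] / 100))
        else:
            changes.append(policy["value"])
    return max(changes) if select_policy == "Max" else min(changes)


# The scaleUp policies from the example above.
scale_up = [
    {"type": "Percent", "value": 100, "periodSeconds": 15},
    {"type": "Pods", "value": 4, "periodSeconds": 15},
]
```

For a current capacity of 8, `allowed_change(8, scale_up, "Max")` permits a change of 8 (the Percent policy wins), while `"Min"` would permit only 4.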

@backjo any thoughts on this? Would you find this useful?
