Neural Autoscaler

A Kubernetes operator that forecasts workload resource usage and resizes pods in place, with no restarts and no replica churn. The controller collects usage from metrics-server, runs a Chronos-2 ONNX time-series model over a rolling history buffer, and applies new container requests through the pods/resize subresource.

How it works

On each reconcile tick the controller runs a predict→resize loop:

1. Collect: Fetch current CPU/memory usage from metrics-server for the workload named in spec.metrics.metricsServer.targetRef, and append each sample to an in-memory, per-resource history buffer.
2. Forecast: Once the buffer holds enough samples (16+), run the bundled ONNX forecaster to produce a future usage series over spec.forecast.horizon at spec.forecast.step intervals.
3. Derive targets: When spec.resize is set, compute per-pod container requests from forecast peaks (max over horizon and quantiles, with headroom, divided by matching pod count) and clamp them to the min/max in spec.resize.resources. The resize is skipped if the change is below spec.resize.minChangePercent.
4. Apply: Patch pod requests in place via pods/resize. Resize scope is controlled by the optional spec.resize.containerName: unset = primary container (spec.containers[0]), named value = that container only, "*" = all containers in each pod. Only requests are predicted; a limit is raised only when it would otherwise fall below the new request.
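
The Derive targets and Apply steps above can be sketched roughly as follows. This is an illustrative Python sketch, not the operator's code: the function names, the Bounds type, and the headroom default are hypothetical, and the sketch uses only the forecast maximum (the quantile handling mentioned above is left out). Values are plain numbers, for example CPU in millicores.

```python
from dataclasses import dataclass


@dataclass
class Bounds:
    """Min/max for one resource, as in spec.resize.resources (hypothetical type)."""
    minimum: float
    maximum: float


def derive_request(forecast, pod_count, bounds, headroom=0.25):
    """Turn a workload-wide forecast into a clamped per-pod request.

    headroom is an assumed fractional safety margin; the README does not
    state the real value or where it is configured.
    """
    if pod_count < 1:
        raise ValueError("pod_count must be at least 1")
    peak = max(forecast)                           # max over the horizon
    per_pod = peak * (1.0 + headroom) / pod_count  # add headroom, split across pods
    return min(max(per_pod, bounds.minimum), bounds.maximum)


def should_resize(current, target, min_change_percent):
    """Skip the resize when the relative change is below the threshold."""
    if current <= 0:
        return True  # nothing meaningful to compare against
    change = abs(target - current) / current * 100.0
    return change >= min_change_percent


def adjust_limit(limit, new_request):
    """Raise a limit only when it would fall below the new request."""
    if limit is None:
        return None  # no limit set, nothing to adjust
    return max(limit, new_request)
```

For example, a forecast peaking at 1500 millicores across 3 pods with 25% headroom yields a 625m request per pod, which is applied only if it differs from the current request by at least minChangePercent.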

Example

apiVersion: autoscaling.plural.sh/v1alpha1
kind: NeuralAutoscaler
metadata:
  name: neuralautoscaler-sample
spec:
  metrics:
    type: MetricsServer
    metricsServer:
      targetRef:
        kind: Deployment
        name: api
      resources:
        - cpu
        - memory
  forecast:
    horizon: 12
    step: 1m
  resize:
    minChangePercent: 5
    resources:
      cpu:
        min: 100m
        max: "8"
      memory:
        min: 128Mi
        max: 16Gi

Point targetRef at a Deployment, StatefulSet, or ReplicaSet in the same namespace. Omit resize to run forecast-only (metrics collection and logging, no pod changes).

Container targeting

spec.resize.containerName is optional and controls which containers in each pod are resized:

unset: resize only the primary container (spec.containers[0])
"app" (or any exact name): resize only that container
"*": resize all containers in each matched pod

Example:

spec:
  resize:
    containerName: "*"
    minChangePercent: 5
    resources:
      cpu:
        min: 100m
        max: "8"
      memory:
        min: 128Mi
        max: 2Gi
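
The three containerName cases listed above can be expressed as one small selection function. The Python sketch below is only an illustration of the documented rules; select_containers is a hypothetical name, and how the operator treats a name that matches no container is not stated in this README (the sketch returns an empty list).

```python
def select_containers(container_names, container_name=None):
    """Return the names of the containers to resize in one pod.

    container_names: names in pod order (spec.containers).
    container_name:  value of spec.resize.containerName, or None when unset.
    """
    if not container_names:
        return []
    if not container_name:
        # Unset: only the primary container, spec.containers[0].
        return [container_names[0]]
    if container_name == "*":
        # Wildcard: every container in the pod.
        return list(container_names)
    # Exact name: only that container (empty list if it is absent).
    return [name for name in container_names if name == container_name]
```

With containers ["app", "sidecar"], an unset value selects ["app"], "sidecar" selects ["sidecar"], and "*" selects both.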

Prometheus metrics source

Instead of metrics-server, you can point the autoscaler at a Prometheus-compatible API (for example, the Prometheus server bundled with Kubecost). PromQL is built automatically from targetRef and resources. Beyond the workload selector, you supply only url, lookback, and step; auth is optional:

spec:
  metrics:
    type: Prometheus
    prometheus:
      url: http://kubecost-prometheus-server.kubecost.svc
      targetRef:
        kind: Deployment
        name: api
      resources:
        - cpu
        - memory
      lookback: 1h
      step: 1m
  resize:
    minChangePercent: 5
    resources:
      cpu:
        min: 100m
        max: "8"
      memory:
        min: 128Mi
        max: 16Gi
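
This README does not show the queries the operator generates. To give a feel for what "built automatically from targetRef and resources" could mean, the Python sketch below builds one query per resource and a range-query URL. Everything specific here is an assumption: the cAdvisor metric names, the pod-name prefix match, the 5m rate window, and the helper names are illustrative, not the operator's actual PromQL. Only the /api/v1/query_range endpoint and its query, start, end, and step parameters come from the Prometheus HTTP API.

```python
from urllib.parse import urlencode

# Assumed query templates, keyed by the entries allowed in `resources`.
QUERY_TEMPLATES = {
    "cpu": (
        'sum(rate(container_cpu_usage_seconds_total'
        '{{namespace="{ns}",pod=~"{name}-.*",container!=""}}[{window}]))'
    ),
    "memory": (
        'sum(container_memory_working_set_bytes'
        '{{namespace="{ns}",pod=~"{name}-.*",container!=""}})'
    ),
}


def build_query(resource, namespace, name, window="5m"):
    """Build a PromQL query for one resource of one workload (illustrative)."""
    return QUERY_TEMPLATES[resource].format(ns=namespace, name=name, window=window)


def build_range_url(base_url, query, start, end, step_seconds):
    """Build a Prometheus range-query URL; start and end are Unix timestamps."""
    params = urlencode(
        {"query": query, "start": start, "end": end, "step": step_seconds}
    )
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"
```

A URL built this way can be fetched with curl to confirm that the configured Prometheus endpoint returns data for the workload before the autoscaler is pointed at it.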

Quick start

Prerequisites

Kubernetes 1.27+ with in-place pod vertical scaling (InPlacePodVerticalScaling; enabled by default on 1.33+). For local kind clusters on Kubernetes < 1.33, enable the feature gate with hack/kind-inplace-config.yaml.
metrics-server installed in the cluster.

Install the operator

The chart pulls the controller image from ghcr.io/pluralsh/neural-autoscaler. Set image.tag to the release version you are installing (chart default: 0.2.0).

helm install neural-autoscaler ./charts/neural-autoscaler \
  --namespace neural-autoscaler-system \
  --create-namespace

Samples

Manifests under config/samples/ include the api demo workload and several NeuralAutoscaler variants. Running kubectl apply -k config/samples applies the metrics-server sample (autoscaling_v1alpha1_neuralautoscaler_metrics_server.yaml), which needs about 16 reconciles at a 20s interval (roughly 5 min) to warm up before it forecasts. The Prometheus samples (autoscaling_v1alpha1_neuralautoscaler_prometheus*.yaml) need about 16 min at a 1m step before forecasting starts. To resize all containers, use autoscaling_v1alpha1_neuralautoscaler_metrics_server_all_containers.yaml (containerName: "*").
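
The warm-up times quoted above follow from the 16-sample minimum multiplied by the sampling interval. The Python sketch below shows that arithmetic together with a minimal rolling buffer. It is an illustration only: HistoryBuffer, warmup_time, and the capacity of 256 are hypothetical, and only the 16-sample threshold comes from this README.

```python
from collections import deque
from datetime import timedelta

MIN_SAMPLES = 16  # from this README: forecasting starts once 16+ samples exist


class HistoryBuffer:
    """Rolling in-memory history for one resource (illustrative)."""

    def __init__(self, capacity=256, min_samples=MIN_SAMPLES):
        # deque with maxlen drops the oldest sample once capacity is reached.
        self._samples = deque(maxlen=capacity)
        self._min_samples = min_samples

    def append(self, value):
        self._samples.append(value)

    def ready(self):
        """True once enough samples exist to run the forecaster."""
        return len(self._samples) >= self._min_samples


def warmup_time(sample_interval, min_samples=MIN_SAMPLES):
    """Time until the buffer is ready when one sample arrives per interval."""
    return sample_interval * min_samples
```

At a 20s reconcile interval this gives 320 seconds, a little over 5 minutes; at a 1m step it gives 16 minutes.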

Local development

# Download the Chronos-2 ONNX model (needed for local runs and image builds)
make download-chronos-onnx

# Run the controller locally against your kubeconfig
make run-local

Upgrade / uninstall

helm upgrade neural-autoscaler ./charts/neural-autoscaler \
  --namespace neural-autoscaler-system \
  --set image.tag=<version>

helm uninstall neural-autoscaler --namespace neural-autoscaler-system
