A Kubernetes operator that forecasts workload resource usage and resizes pods in place, with no restarts and no replica churn.
The controller collects usage from metrics-server, runs a Chronos-2 ONNX time-series model over a rolling history buffer,
and applies new container requests through the pods/resize subresource.
On each reconcile tick the controller runs a predict→resize loop:
- Collect: Fetch current CPU/memory usage from metrics-server for the workload named in spec.metrics.metricsServer.targetRef, and append each sample to an in-memory per-resource history buffer.
- Forecast: Once the buffer holds enough samples (16+), run the bundled ONNX forecaster to produce a future usage series over spec.forecast.horizon at spec.forecast.step intervals.
- Derive targets: When spec.resize is set, compute per-pod container requests from forecast peaks (max over horizon and quantiles, with headroom, divided by matching pod count) and clamp to spec.resize.resources min/max. Skip the resize if the change is below minChangePercent.
- Apply: Patch pod requests in place via pods/resize. Resize scope is controlled by spec.resize.containerName (optional): unset = primary container (spec.containers[0]), named value = that container only, "*" = all containers in each pod. Only requests are predicted; limits are raised only when they would fall below the new request.
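The Derive targets and Apply rules above can be sketched in a few lines of Python. This is an illustrative sketch, not the controller's code: the function names, the 20% headroom default, and the use of plain numbers (millicores or bytes) instead of Kubernetes quantities are all assumptions, and the quantile handling is reduced to a simple max over the forecast.

```python
from typing import Optional, Sequence


def derive_request(
    forecast: Sequence[float],        # forecast workload-level usage over the horizon
    pod_count: int,                   # number of pods matched by targetRef
    current: float,                   # current per-pod request
    lo: float,                        # spec.resize.resources.<resource>.min
    hi: float,                        # spec.resize.resources.<resource>.max
    headroom: float = 0.20,           # assumed default, not taken from the project
    min_change_percent: float = 5.0,  # spec.resize.minChangePercent
) -> Optional[float]:
    """Return the new per-pod request, or None when the resize should be skipped."""
    if not forecast or pod_count <= 0:
        return None
    peak = max(forecast)                           # peak over the horizon
    per_pod = peak * (1.0 + headroom) / pod_count  # add headroom, split across pods
    target = min(max(per_pod, lo), hi)             # clamp to the configured min/max
    if current > 0:
        change = abs(target - current) / current * 100.0
        if change < min_change_percent:
            return None                            # change too small, skip
    return target


def adjust_limit(limit: Optional[float], new_request: float) -> Optional[float]:
    """Raise the limit only when it would fall below the new request."""
    if limit is not None and limit < new_request:
        return new_request
    return limit
```

Under these assumptions, a forecast peak of 600m CPU across 3 pods gives a per-pod target of 240m. A pod currently requesting 200m would be resized, while a pod requesting 235m would be left alone because the change is under 5%.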
apiVersion: autoscaling.plural.sh/v1alpha1
kind: NeuralAutoscaler
metadata:
  name: neuralautoscaler-sample
spec:
  metrics:
    type: MetricsServer
    metricsServer:
      targetRef:
        kind: Deployment
        name: api
      resources:
        - cpu
        - memory
  forecast:
    horizon: 12
    step: 1m
  resize:
    minChangePercent: 5
    resources:
      cpu:
        min: 100m
        max: "8"
      memory:
        min: 128Mi
        max: 16Gi

Point targetRef at a Deployment, StatefulSet, or ReplicaSet in the same namespace. Omit resize to run forecast-only (metrics collection and logging, no pod changes).
spec.resize.containerName is optional and controls which containers in each pod are resized:
- unset: resize only the primary container (spec.containers[0])
- "app" (or any exact name): resize only that container
- "*": resize all containers in each matched pod
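The three containerName cases can be expressed as a small selection function. This is a hedged sketch: select_containers is a hypothetical helper, and returning an empty list when the named container is absent from a pod is an assumption about the behavior, not something the project documents.

```python
from typing import List, Optional


def select_containers(container_names: List[str], container_name: Optional[str]) -> List[str]:
    """Return the names of the containers to resize in one pod."""
    if not container_names:
        return []
    if not container_name:
        # unset: only the primary container, spec.containers[0]
        return [container_names[0]]
    if container_name == "*":
        # wildcard: every container in the pod
        return list(container_names)
    # exact name: only that container (assumed: nothing if the pod lacks it)
    return [name for name in container_names if name == container_name]
```

For a pod with containers app and sidecar, leaving containerName unset selects app, "sidecar" selects only the sidecar, and "*" selects both.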
Example (resize every container in each pod):
spec:
  resize:
    containerName: "*"
    minChangePercent: 5
    resources:
      cpu:
        min: 100m
        max: "8"
      memory:
        min: 128Mi
        max: 2Gi

Instead of metrics-server, you can point at a Prometheus-compatible API (for example Kubecost). PromQL is built automatically from targetRef and resources; beyond the workload selector you only need to set url, lookback, and step, plus auth if the endpoint requires it:
spec:
  metrics:
    type: Prometheus
    prometheus:
      url: http://kubecost-prometheus-server.kubecost.svc
      targetRef:
        kind: Deployment
        name: api
      resources:
        - cpu
        - memory
      lookback: 1h
      step: 1m
  resize:
    minChangePercent: 5
    resources:
      cpu:
        min: 100m
        max: "8"
      memory:
        min: 128Mi
        max: 16Gi
- Kubernetes 1.27+ with in-place pod vertical scaling (InPlacePodVerticalScaling; enabled by default on 1.33+). For local kind clusters on Kubernetes < 1.33, enable the feature gate with hack/kind-inplace-config.yaml.
- metrics-server installed in the cluster.
The chart pulls the controller image from ghcr.io/pluralsh/neural-autoscaler. Set image.tag to the release version you are installing (chart default: 0.2.0).
helm install neural-autoscaler ./charts/neural-autoscaler \
  --namespace neural-autoscaler-system \
  --create-namespace

Manifests under config/samples/ include the api demo workload and NeuralAutoscaler variants. kubectl apply -k config/samples applies the metrics-server sample (autoscaling_v1alpha1_neuralautoscaler_metrics_server.yaml), which warms up in ~16 reconciles at 20s (~5 min). Prometheus samples (autoscaling_v1alpha1_neuralautoscaler_prometheus*.yaml) need ~16 min at 1m step before forecasting. For all-containers resizing, use autoscaling_v1alpha1_neuralautoscaler_metrics_server_all_containers.yaml (containerName: "*").
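The warm-up times quoted above follow from the 16-sample minimum multiplied by the sampling interval. A minimal sketch of that arithmetic, where warmup_seconds is a hypothetical helper and one sample per interval is assumed:

```python
MIN_SAMPLES = 16  # samples required before the forecaster runs


def warmup_seconds(sample_interval_seconds: float, min_samples: int = MIN_SAMPLES) -> float:
    """Time until the history buffer holds enough samples to forecast."""
    return min_samples * sample_interval_seconds


# metrics-server sample: one sample per 20s reconcile, a little over 5 minutes
print(warmup_seconds(20) / 60)
# Prometheus sample: one sample per 1m step, 16 minutes
print(warmup_seconds(60) / 60)
```

If you change the reconcile interval or the Prometheus step, the time before the first forecast scales with it.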
# Download the Chronos-2 ONNX model (needed for local runs and image builds)
make download-chronos-onnx
# Run the controller locally against your kubeconfig
make run-local

helm upgrade neural-autoscaler ./charts/neural-autoscaler \
  --namespace neural-autoscaler-system \
  --set image.tag=<version>
helm uninstall neural-autoscaler --namespace neural-autoscaler-system