Kubernetes Events for Scaling Decisions #918

@ev-shindin

Emit Kubernetes events on VariantAutoscaling (VA) resources for key scaling decisions and errors. Currently, only ServiceMonitorDeleted warning events are emitted.

Events:

| Event | Type | Reason | When |
|---|---|---|---|
| Scale-up decision | Normal | ScaledUp | Target replicas increased |
| Scale-down decision | Normal | ScaledDown | Target replicas decreased |
| Limiter constrained | Warning | ResourceConstrained | Decision was limited by GPU availability |
| Metrics unavailable | Warning | MetricsUnavailable | Prometheus query failed or returned no data |
| Scale-to-zero applied | Normal | ScaledToZero | Enforcer set replicas to 0 (no active requests) |
| Optimization error | Warning | OptimizationFailed | Analysis/optimizer error for this VA |
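
As a rough illustration of how the table could map onto code, the sketch below picks a single event for one optimization cycle. The `cycleResult` struct, its fields, the `errNoData` sentinel, the `pickEvent` helper, and the priority order (errors first, then constraints, then plain scaling) are all assumptions made for this example, not existing code in the project. Only the type and reason strings come from the table.

```go
package main

import (
	"errors"
	"fmt"
)

// Event types; the strings match corev1.EventTypeNormal and corev1.EventTypeWarning.
const (
	typeNormal  = "Normal"
	typeWarning = "Warning"
)

// Reasons taken from the table above.
const (
	reasonScaledUp            = "ScaledUp"
	reasonScaledDown          = "ScaledDown"
	reasonResourceConstrained = "ResourceConstrained"
	reasonMetricsUnavailable  = "MetricsUnavailable"
	reasonScaledToZero        = "ScaledToZero"
	reasonOptimizationFailed  = "OptimizationFailed"
)

// errNoData is a hypothetical sentinel, defined only to exercise the sketch.
var errNoData = errors.New("prometheus query returned no data")

// cycleResult is a hypothetical summary of one optimization cycle for one VA.
type cycleResult struct {
	Current, Target int32
	Saturation      float64
	GPULimited      bool  // limiter capped the decision
	ZeroEnforced    bool  // enforcer set replicas to 0
	MetricsErr      error // Prometheus query failed or returned no data
	OptimizeErr     error // analysis/optimizer error
}

// vaEvent holds the three pieces an Eventf call needs.
type vaEvent struct{ Type, Reason, Message string }

// pickEvent returns the one event to emit for this cycle, or ok=false when
// nothing noteworthy happened. The case order is an assumed priority.
func pickEvent(r cycleResult) (ev vaEvent, ok bool) {
	switch {
	case r.OptimizeErr != nil:
		return vaEvent{typeWarning, reasonOptimizationFailed,
			fmt.Sprintf("Optimization failed: %v", r.OptimizeErr)}, true
	case r.MetricsErr != nil:
		return vaEvent{typeWarning, reasonMetricsUnavailable,
			fmt.Sprintf("Metrics unavailable: %v", r.MetricsErr)}, true
	case r.GPULimited:
		return vaEvent{typeWarning, reasonResourceConstrained,
			fmt.Sprintf("Target limited to %d replicas by GPU availability", r.Target)}, true
	case r.ZeroEnforced && r.Target == 0 && r.Current > 0:
		return vaEvent{typeNormal, reasonScaledToZero,
			fmt.Sprintf("Scaled to zero from %d replicas, no active requests", r.Current)}, true
	case r.Target > r.Current:
		return vaEvent{typeNormal, reasonScaledUp,
			fmt.Sprintf("Scaled up from %d to %d replicas, saturation=%.2f", r.Current, r.Target, r.Saturation)}, true
	case r.Target < r.Current:
		return vaEvent{typeNormal, reasonScaledDown,
			fmt.Sprintf("Scaled down from %d to %d replicas, saturation=%.2f", r.Current, r.Target, r.Saturation)}, true
	}
	return vaEvent{}, false
}
```

Returning at most one event per cycle keeps this helper compatible with a one-event-per-VA-per-cycle budget. Whether a constrained scale-up should report ResourceConstrained or ScaledUp is a design choice the issue leaves open.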

Implementation:

  • Use r.Recorder.Eventf() in applySaturationDecisions() and the reconciler
  • Rate-limit: at most 1 event per VA per optimization cycle to avoid flooding the API server
  • Include relevant context in event message (e.g., "Scaled up from 2 to 4 replicas, saturation=0.85")
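
One possible shape for the per-cycle rate limit is a thin wrapper around the recorder that remembers which VAs already received an event in the current cycle. This is a minimal sketch under stated assumptions: `eventSink` is a local stand-in whose `Eventf` has the same parameter order as client-go's `record.EventRecorder`, except that the object is a plain string key so the example has no dependencies. `cycleLimiter`, `NextCycle`, and `memorySink` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// eventSink is a dependency-free stand-in for an event recorder.
// Assumption: the real recorder takes a runtime.Object instead of a string key.
type eventSink interface {
	Eventf(object, eventtype, reason, messageFmt string, args ...interface{})
}

// cycleLimiter forwards at most one event per VA per optimization cycle.
type cycleLimiter struct {
	mu   sync.Mutex
	sink eventSink
	seen map[string]bool // VA keys that already emitted in the current cycle
}

func newCycleLimiter(sink eventSink) *cycleLimiter {
	return &cycleLimiter{sink: sink, seen: make(map[string]bool)}
}

// NextCycle resets the per-VA budget; call it once at the start of each cycle.
func (l *cycleLimiter) NextCycle() {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.seen = make(map[string]bool)
}

// Eventf emits the event unless vaKey already emitted one this cycle.
// It reports whether the event was forwarded to the sink.
func (l *cycleLimiter) Eventf(vaKey, eventtype, reason, messageFmt string, args ...interface{}) bool {
	l.mu.Lock()
	if l.seen[vaKey] {
		l.mu.Unlock()
		return false
	}
	l.seen[vaKey] = true
	l.mu.Unlock()

	l.sink.Eventf(vaKey, eventtype, reason, messageFmt, args...)
	return true
}

// memorySink records formatted events in a slice. It is for demonstration
// only and is not safe for concurrent use.
type memorySink struct{ events []string }

func (m *memorySink) Eventf(object, eventtype, reason, messageFmt string, args ...interface{}) {
	m.events = append(m.events, object+" "+eventtype+" "+reason+" "+fmt.Sprintf(messageFmt, args...))
}
```

With this shape, a second call for the same VA inside one cycle returns false and is dropped, and the budget is restored after `NextCycle()`. Dropping the later event rather than the earlier one is a choice of this sketch; a real implementation might instead keep the highest-priority event of the cycle.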

Acceptance Criteria:

  • All 6 events listed above are emitted on their respective conditions
  • Events visible via kubectl describe va <name>
  • Events are rate-limited (no flooding on rapid cycles)
  • Unit/integration tests verify event emission
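
For the test criterion, one common pattern is to inject a recorder that captures events on a buffered channel and then assert on the captured strings. The sketch below is dependency-free and only loosely modeled on client-go's `record.FakeRecorder`. `chanRecorder` omits the object argument, and `applyDecision` and `checkEmission` are hypothetical stand-ins for the code under test and for a `testing.T`-based test.

```go
package main

import "fmt"

// chanRecorder captures each event as one "<type> <reason> <message>" string
// on a buffered channel. Assumption: the real recorder also takes the object.
type chanRecorder struct{ Events chan string }

func (r *chanRecorder) Eventf(eventtype, reason, messageFmt string, args ...interface{}) {
	r.Events <- eventtype + " " + reason + " " + fmt.Sprintf(messageFmt, args...)
}

// applyDecision is a hypothetical stand-in for the code under test.
func applyDecision(r *chanRecorder, current, target int32) {
	switch {
	case target > current:
		r.Eventf("Normal", "ScaledUp", "Scaled up from %d to %d replicas", current, target)
	case target < current:
		r.Eventf("Normal", "ScaledDown", "Scaled down from %d to %d replicas", current, target)
	}
}

// checkEmission runs table-driven cases and returns the first mismatch.
func checkEmission() error {
	cases := []struct {
		current, target int32
		want            string // "" means no event is expected
	}{
		{2, 4, "Normal ScaledUp Scaled up from 2 to 4 replicas"},
		{4, 1, "Normal ScaledDown Scaled down from 4 to 1 replicas"},
		{3, 3, ""},
	}
	for _, c := range cases {
		rec := &chanRecorder{Events: make(chan string, 1)}
		applyDecision(rec, c.current, c.target)

		got := ""
		select {
		case got = <-rec.Events:
		default: // nothing was emitted
		}
		if got != c.want {
			return fmt.Errorf("%d->%d: got %q, want %q", c.current, c.target, got, c.want)
		}
	}
	return nil
}
```

The non-blocking `select` lets the no-change case assert that no event was emitted without the check hanging on an empty channel.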

Labels

triage/accepted: Indicates an issue or PR is ready to be actively worked on.