Emit Kubernetes events on VariantAutoscaling resources for key scaling decisions and errors. Currently only ServiceMonitorDeleted warning events are emitted.
Events:
| Event | Type | Reason | When |
|---|---|---|---|
| Scale-up decision | Normal | ScaledUp | Target replicas increased |
| Scale-down decision | Normal | ScaledDown | Target replicas decreased |
| Limiter constrained | Warning | ResourceConstrained | Decision was limited by GPU availability |
| Metrics unavailable | Warning | MetricsUnavailable | Prometheus query failed or returned no data |
| Scale-to-zero applied | Normal | ScaledToZero | Enforcer set replicas to 0 (no active requests) |
| Optimization error | Warning | OptimizationFailed | Analysis/optimizer error for this VA |
Implementation:
- Use `r.Recorder.Eventf()` in `applySaturationDecisions()` and the reconciler
- Rate-limit: at most 1 event per VA per optimization cycle to avoid flooding the API server
- Include relevant context in event message (e.g., "Scaled up from 2 to 4 replicas, saturation=0.85")
Acceptance Criteria:
- Events are visible via `kubectl describe va <name>`