feat: hpa with cpu + mem util scaling options #628
burnjake wants to merge 1 commit into influxdata:master from
Conversation
```yaml
name: memory
target:
  type: Utilization
  averageUtilization: {{ .Values.autoscaling.targetMemoryUtilizationPercentage }}
```
OK, so I understand that the autoscaler will launch additional Telegraf pods if you get above a certain memory or CPU usage, but what ensures that the first pod's usage actually goes down? Is there a load balancer or some other proxy in front that would round-robin the traffic?
Trying to understand the full use case and how a user would take advantage of this without needing to make modifications to their config. Thanks!
Hi! Apologies, I've been away for a few days. Our use case is to utilise the opentelemetry input, aggregate with basicstats, and output with prometheus_client. We have a traffic pattern where the number of connections varies quite a lot within the day, so varying our replica count is prudent.
As the opentelemetry input expects connections via gRPC, we can't depend on normal load balancing via a k8s Service; instead we rely on an external LB, plumbed into the cluster's ingress, which discovers the new replicas and spreads the traffic across them (by updating its connection pool, I think). In short, we don't need extra configuration within Telegraf for this to work, but our use case is indeed very specific!
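A rough sketch of the kind of pipeline described above, expressed as chart config values. The plugin names (opentelemetry, basicstats, prometheus_client) come from the comment; the exact `config` schema, and whether this chart renders an `aggregators` list, are assumptions rather than something taken from this PR:

```yaml
# Hypothetical values.yaml excerpt for the described pipeline.
config:
  inputs:
    - opentelemetry:
        # gRPC listener; an external LB in front of the cluster ingress
        # spreads connections across the replicas.
        service_address: "0.0.0.0:4317"
  aggregators:
    - basicstats:
        period: "30s"
        stats: ["count", "min", "max", "mean"]
  outputs:
    - prometheus_client:
        listen: ":9273"
```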
We would like to scale the number of replicas based on usage, which is currently a slight pain: if we roll our own HPA resource, we have to set the `deployment.spec.replicas` field to none. There's also a pre-existing issue: #624.
Setting `autoscaling.enabled: true` templates the following Deployment and HPA resources:
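The rendered manifests from the PR description are not reproduced in this extract. As a sketch, the important parts are a Deployment whose `spec.replicas` is left unset plus an HPA along these lines (the API version and numbers are illustrative, not copied from the chart):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: telegraf
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: telegraf
  minReplicas: 1
  maxReplicas: 4
  metrics:
    # Scale out when either CPU or memory utilisation crosses its target.
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```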
Setting `autoscaling.enabled: false` templates the following Deployment resource:
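Likewise, the `autoscaling.enabled: false` output is missing from this extract; the relevant difference is simply that the Deployment keeps an explicit replica count and no HPA is templated, roughly (illustrative values, field sourcing assumed):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: telegraf
spec:
  replicas: 1  # fixed count (e.g. from a replicaCount value); no HPA manages it
  selector:
    matchLabels:
      app.kubernetes.io/name: telegraf
  template:
    metadata:
      labels:
        app.kubernetes.io/name: telegraf
    spec:
      containers:
        - name: telegraf
          image: telegraf:1.28
```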
An example with `behaviour`:
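The behaviour example itself is also missing from this extract. A sketch of the kind of thing meant, driving the HPA's `spec.behavior` field from values; how the chart actually exposes this under `autoscaling`, and the numbers, are assumptions:

```yaml
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80
  targetMemoryUtilizationPercentage: 80
  behaviour:  # assumed key; maps onto the HPA's spec.behavior field
    scaleDown:
      stabilizationWindowSeconds: 300   # wait 5 minutes before scaling in
      policies:
        - type: Pods
          value: 1
          periodSeconds: 60             # remove at most one pod per minute
    scaleUp:
      policies:
        - type: Percent
          value: 100
          periodSeconds: 60             # at most double the replica count per minute
```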