Adjust backend-listen HPA for cost saving #5640

Merged
beastoin merged 1 commit into main from task/adjust-backend-listen-hpa-for-cost-saving
Mar 15, 2026

Conversation

thainguyensunya (Collaborator) commented Mar 15, 2026

Change:
Thanks to the new backend-listen WS connections metric, we can now tune the HPA against the real number of WS connections in backend-listen and lower the replica bounds for cost optimization:

  • minReplicas: 26 -> 20
  • maxReplicas: 50 -> 40

greptile-apps bot (Contributor) commented Mar 15, 2026

Greptile Summary

Reduces the backend-listen HPA replica bounds for cost optimization, leveraging the new WebSocket connections metric (activeConnectionsPerPod: 20). minReplicas drops from 26 to 20 and maxReplicas from 50 to 40.

  • Discrepancy: PR description says minReplicas: 26 -> 22, but the code sets it to 20. This needs clarification — the difference is significant for capacity.
  • Scaling behavior policies (conservative scale-down, aggressive scale-up) remain unchanged, which is appropriate for a long-lived WebSocket service.
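
To make the capacity impact of the new bounds concrete, here is a rough sketch of the replica arithmetic. It assumes the HPA applies the standard Kubernetes average-value calculation (desired replicas = ceil(total metric / per-pod target), clamped to the min/max bounds) and ignores the controller's tolerance band and stabilization windows. The function name and the sample connection counts are illustrative and not taken from the PR.

```python
import math

TARGET_CONNECTIONS_PER_POD = 20  # activeConnectionsPerPod, as quoted in the summary


def desired_replicas(total_connections: int, min_replicas: int, max_replicas: int,
                     target_per_pod: int = TARGET_CONNECTIONS_PER_POD) -> int:
    """Simplified HPA sizing: ceil(total / target), clamped to [min, max]."""
    raw = math.ceil(total_connections / target_per_pod)
    return max(min_replicas, min(max_replicas, raw))


OLD_BOUNDS = (26, 50)  # minReplicas, maxReplicas before this PR
NEW_BOUNDS = (20, 40)  # minReplicas, maxReplicas after this PR

# Hypothetical total WS connection counts, chosen to hit each regime.
for total in (300, 450, 700, 900):
    old = desired_replicas(total, *OLD_BOUNDS)
    new = desired_replicas(total, *NEW_BOUNDS)
    print(f"{total} connections: old={old} new={new}")
```

Under these assumptions the new floor applies up to about 400 total connections (20 × 20) and the new ceiling saturates at 800 (40 × 20), compared with 520 and 1000 under the old bounds. Between those points both configurations size the deployment identically, so the savings come from low-traffic periods and the risk sits at peak load.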

Confidence Score: 3/5

  • Low-risk config change but the mismatch between PR description and actual values needs author confirmation before merging
  • The change itself is straightforward HPA tuning, but the PR description claims minReplicas goes to 22 while the code sets it to 20. This discrepancy could indicate either an outdated description or an incorrect value, and since this affects production capacity for a WebSocket service, it warrants clarification.
  • Pay close attention to backend/charts/backend-listen/prod_omi_backend_listen_values.yaml — confirm that minReplicas=20 (not 22) is intentional

Important Files Changed

Filename | Overview
backend/charts/backend-listen/prod_omi_backend_listen_values.yaml | Reduces HPA minReplicas from 26 to 20 and maxReplicas from 50 to 40 for cost savings. PR description states minReplicas should be 22, but code sets it to 20; the discrepancy needs clarification.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[WS Connection Metric] -->|activeConnectionsPerPod: 20| B[HPA Decision]
    B -->|Below threshold| C[Scale Down]
    B -->|Above threshold| D[Scale Up]
    C -->|stabilization: 600s\nselectPolicy: Min| E["Min: 20 replicas (was 26)"]
    D -->|stabilization: 120s\nselectPolicy: Max| F["Max: 40 replicas (was 50)"]
    E --> G[Cost Savings]
    F --> H[Handle Traffic Spikes]

Last reviewed commit: 48e0833
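
The stabilization windows in the flowchart above (600s for scale-down, 120s for scale-up) can be sketched as follows. This is a simplified model loosely based on how the Kubernetes HPA controller is documented to stabilize recommendations: scale-down is limited by the highest recommendation seen in its window, and scale-up by the lowest recommendation seen in its window. The `Stabilizer` class and its method are hypothetical names, and the `selectPolicy` rate limits are not modeled.

```python
from collections import deque
from typing import Deque, Tuple

SCALE_DOWN_WINDOW_S = 600  # scale-down stabilization, from the flowchart
SCALE_UP_WINDOW_S = 120    # scale-up stabilization, from the flowchart


class Stabilizer:
    """Keeps recent replica recommendations and dampens changes in both directions."""

    def __init__(self) -> None:
        # (timestamp in seconds, recommended replica count)
        self.history: Deque[Tuple[float, int]] = deque()

    def stabilize(self, now: float, current: int, recommendation: int) -> int:
        self.history.append((now, recommendation))
        # Forget entries older than the longest window.
        horizon = now - max(SCALE_DOWN_WINDOW_S, SCALE_UP_WINDOW_S)
        while self.history and self.history[0][0] < horizon:
            self.history.popleft()

        down = [r for t, r in self.history if t >= now - SCALE_DOWN_WINDOW_S]
        up = [r for t, r in self.history if t >= now - SCALE_UP_WINDOW_S]
        up_floor = min(up)        # scale up only once every recent sample agrees
        down_ceiling = max(down)  # scale down only once every recent sample agrees
        # Move the current count only as far as both windows allow.
        return min(max(current, up_floor), down_ceiling)


s = Stabilizer()
print(s.stabilize(0, 30, 30))    # steady state
print(s.stabilize(300, 30, 22))  # lower recommendation, but 30 is still in the window
print(s.stabilize(700, 30, 22))  # the old 30 has aged out, so scale-down proceeds
```

The asymmetry is the point for a long-lived WebSocket service: a brief dip in connections does not remove pods for ten minutes, while a sustained rise is acted on after two.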

  enabled: true
- minReplicas: 26
- maxReplicas: 50
+ minReplicas: 20

PR description does not match actual change
The PR description states minReplicas: 26 -> 22, but the actual diff shows 26 -> 20. Please update the PR description to reflect the correct value, or update the code if 22 was the intended minReplicas. The choice between 20 and 22 (a cut of 6 versus 4 replicas from 26) matters for capacity planning on a WebSocket-heavy service.

@beastoin beastoin merged commit ff732ed into main Mar 15, 2026
2 checks passed
@beastoin beastoin deleted the task/adjust-backend-listen-hpa-for-cost-saving branch March 15, 2026 02:19
beastoin (Collaborator) commented

lgtm
