balance, proxy: support evicting backends by config#1116

Open
djshow832 wants to merge 1 commit into pingcap:main from djshow832:evict_backend

@djshow832
Collaborator

What problem does this PR solve?

Issue Number: close #1115

Problem Summary:
When a backend is confirmed to have failed, we need a way to evict it manually.

What is changed and how it works:

  • add proxy.fail-backend-list and proxy.failover-timeout config
  • stop routing new connections to failed backends and migrate existing ones away
  • force close remaining connections after the failover timeout
  • allow fail-backend-list to match by pod name or backend address
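
As a rough illustration of the behaviour listed above, the following Go sketch shows one way an entry in fail-backend-list could match a backend by pod name or by address, and how the failover timeout could separate migration from force-closing. It is not the code in this PR: the helper names (podName, isFailed, decide), the action constants, and the assumption that the pod name is the first DNS label of the backend host are all hypothetical.

```go
package main

import (
	"fmt"
	"net"
	"strings"
	"time"
)

// podName is a hypothetical helper. It ASSUMES a Kubernetes-style address
// such as "tidb-0.tidb-peer.ns.svc:4000", where the pod name is the first
// DNS label of the host. It returns "" for plain IP addresses.
func podName(addr string) string {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		host = addr // no port present; treat the whole string as the host
	}
	if net.ParseIP(host) != nil {
		return ""
	}
	name, _, _ := strings.Cut(host, ".")
	return name
}

// isFailed reports whether addr is listed in failList, either by its full
// backend address or by its (assumed) pod name.
func isFailed(addr string, failList []string) bool {
	pod := podName(addr)
	for _, entry := range failList {
		if entry == addr || (pod != "" && entry == pod) {
			return true
		}
	}
	return false
}

// action is a hypothetical summary of what to do with a connection.
type action int

const (
	actionKeep       action = iota // backend is healthy: leave the connection alone
	actionMigrate                  // backend is failed: try to move the connection away
	actionForceClose               // failover timeout elapsed: close what is left
)

// decide picks an action from how long the backend has been marked failed.
// A timeout of 0 means remaining connections are closed immediately.
func decide(failed bool, elapsed, timeout time.Duration) action {
	if !failed {
		return actionKeep
	}
	if elapsed >= timeout {
		return actionForceClose
	}
	return actionMigrate
}

func main() {
	failList := []string{"tidb-0", "10.0.0.2:4000"}
	for _, addr := range []string{"tidb-0.tidb-peer.ns.svc:4000", "tidb-1.tidb-peer.ns.svc:4000", "10.0.0.2:4000"} {
		failed := isFailed(addr, failList)
		fmt.Println(addr, failed, decide(failed, 30*time.Second, time.Minute))
	}
}
```

Matching on either form lets an operator list a backend by whichever identifier is at hand; the actual matching rules and config parsing are defined by the PR itself.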

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Notable changes

  • Has configuration change
  • Has HTTP API interfaces change
  • Has tiproxyctl change
  • Other user behavior changes

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

- Support evicting failed backends by config

@ti-chi-bot ti-chi-bot bot requested review from YangKeao and xhebox April 2, 2026 07:50
@ti-chi-bot

ti-chi-bot bot commented Apr 2, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign xhebox for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the size/XL label Apr 2, 2026
@djshow832 djshow832 changed the title from "balance, proxy" to "balance, proxy: support evicting backends by config" Apr 2, 2026
@codecov-commenter

Codecov Report

❌ Patch coverage is 93.46405% with 10 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@4d841da). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/balance/router/group.go | 85.18% | 2 Missing and 2 partials ⚠️ |
| pkg/balance/router/router.go | 93.33% | 2 Missing and 1 partial ⚠️ |
| pkg/proxy/backend/backend_conn_mgr.go | 78.57% | 2 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1116   +/-   ##
=======================================
  Coverage        ?   67.43%           
=======================================
  Files           ?      141           
  Lines           ?    14959           
  Branches        ?        0           
=======================================
  Hits            ?    10087           
  Misses          ?     4192           
  Partials        ?      680           

| Flag | Coverage Δ |
|---|---|
| unit | 67.43% <93.46%> (?) |


@djshow832
Collaborator Author

@codex review


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ce40a52848


Comment on lines +307 to +308
    if conn.phase == phaseClosed || conn.forceClosing {
        continue

[P1] Skip force-closing connections that are mid-redirect

CloseTimedOutFailoverConnections currently force-closes any non-closed connection, including ones in phaseRedirectNotify. With failover-timeout=0 (or a very short timeout), rebalance() can queue a redirect in Balance() and then immediately call ForceClose() on the same session in the same tick, which races with the pending redirect signal and can leave router bookkeeping inconsistent (incorrect connScore/list state when close and redirect callbacks arrive in opposite order). Guarding phaseRedirectNotify here (or cancelling/marking the pending redirect before close) avoids this failover race.
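
The guard suggested here can be sketched in a simplified, self-contained model. This is only an illustration under stated assumptions, not the PR's implementation: the conn struct, the phaseNotRedirected constant, and the selectForceClose function are hypothetical stand-ins, and only phaseClosed, phaseRedirectNotify, and forceClosing are names taken from the review comment.

```go
package main

import "fmt"

// connPhase is a simplified stand-in for the router's per-connection phase.
type connPhase int

const (
	phaseNotRedirected  connPhase = iota // hypothetical name for the idle state
	phaseRedirectNotify                  // a redirect has been queued but not yet acknowledged
	phaseClosed
)

// conn is a hypothetical, minimal connection record.
type conn struct {
	id           int
	phase        connPhase
	forceClosing bool
}

// selectForceClose marks and returns the IDs of connections that are safe to
// force-close in this tick. Connections with a pending redirect are skipped so
// that the close and redirect callbacks cannot arrive in conflicting order;
// they can be reconsidered on a later tick once the redirect has resolved.
func selectForceClose(conns []*conn) []int {
	var ids []int
	for _, c := range conns {
		if c.phase == phaseClosed || c.forceClosing {
			continue // already closed or already being closed
		}
		if c.phase == phaseRedirectNotify {
			continue // redirect in flight: do not race with it
		}
		c.forceClosing = true
		ids = append(ids, c.id)
	}
	return ids
}

func main() {
	conns := []*conn{
		{id: 1, phase: phaseNotRedirected},
		{id: 2, phase: phaseRedirectNotify},
		{id: 3, phase: phaseClosed},
	}
	fmt.Println(selectForceClose(conns))
}
```

The alternative mentioned in the comment, cancelling or marking the pending redirect before closing, would close such connections sooner at the cost of extra bookkeeping; which trade-off fits better depends on the router's actual callback ordering.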


Development

Successfully merging this pull request may close these issues.

Support evicting backends

2 participants