Skip to content

*: mark killed standby connections as normal closed#68347

Merged
ti-chi-bot[bot] merged 1 commit into
pingcap:masterfrom
ChangRui-Ryan:changrui_cse_1882
May 19, 2026
Merged

*: mark killed standby connections as normal closed#68347
ti-chi-bot[bot] merged 1 commit into
pingcap:masterfrom
ChangRui-Ryan:changrui_cse_1882

Conversation

@ChangRui-Ryan
Copy link
Copy Markdown
Contributor

@ChangRui-Ryan ChangRui-Ryan commented May 13, 2026

What problem does this PR solve?

Issue Number: ref #67765

Problem Summary:
/tidb-pool/checkconn can confirm some expected TiDB-side connection closes, such as read packet timeout, for standby/tidb-pool deployments. However, when a connection is closed by SQL KILL CONNECTION, the connection is intentionally closed by TiDB but is not recorded as normal closed. As a result, the manager may see the closed gateway connection as unconfirmed.

What changed and how does it work

This change extends SessionManager.Kill with an optional normal-close reason. SQL KILL paths pass a reason, while internal query-kill paths keep passing an empty reason. Server.Kill records the gateway connection ID into normalClosedConns only when all of these are true: the operation is KILL CONNECTION, a normal-close reason is provided, and the server has a StandbyController for /tidb-pool/checkconn. KILL QUERY and non-standby deployments do not write the normal-closed cache.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Summary by CodeRabbit

  • New Features

    • Support for killing connections with optional "normal close" messages, including distinct messages for local vs remote statement kills and propagation of keyspace/gateway info.
    • Added a public method to allow callers to supply a normal-close message when terminating connections.
  • Tests

    • Expanded tests to cover normal-close message recording and connection status changes under standby controller.
  • Chores

    • Build/test targets updated to include new session manager component and adjust test sharding.

Review Change Stack

@ti-chi-bot ti-chi-bot Bot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/needs-tests-checked labels May 13, 2026
@pantheon-ai
Copy link
Copy Markdown

pantheon-ai Bot commented May 13, 2026

@ChangRui-Ryan I've received your pull request and will start the review. I'll conduct a thorough review covering code quality, potential issues, and implementation details.

⏳ This process typically takes 10-30 minutes depending on the complexity of the changes.

ℹ️ Learn more details on Pantheon AI.

@ti-chi-bot ti-chi-bot Bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed do-not-merge/needs-tests-checked labels May 13, 2026
@tiprow
Copy link
Copy Markdown

tiprow Bot commented May 13, 2026

Hi @ChangRui-Ryan. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 78c47e6b-0a7c-490e-a231-4afe1e70280c

📥 Commits

Reviewing files that changed from the base of the PR and between afeb4b8 and b8c6d4a.

📒 Files selected for processing (7)
  • cmd/tidb-server/BUILD.bazel
  • pkg/executor/BUILD.bazel
  • pkg/executor/simple.go
  • pkg/server/BUILD.bazel
  • pkg/server/server.go
  • pkg/server/server_test.go
  • pkg/session/sessmgr/processinfo.go
✅ Files skipped from review due to trivial changes (1)
  • pkg/executor/BUILD.bazel
🚧 Files skipped from review as they are similar to previous changes (4)
  • cmd/tidb-server/BUILD.bazel
  • pkg/server/server_test.go
  • pkg/server/server.go
  • pkg/executor/simple.go

📝 Walkthrough

Walkthrough

The PR extends the session-manager kill API with a normalCloseMsg path, adds Server.KillWithNormalCloseMsg and internal s.kill(normalCloseMsg) to record standby "normal closed" connections, routes executor KILL statements through sessmgr.KillWithNormalCloseMsg with local/remote context, updates tests to validate standby behavior, and adds Bazel build tweaks.

Changes

Kill Method Messaging and Standby Integration

Layer / File(s) Summary
Session manager kill API extension
pkg/session/sessmgr/processinfo.go
Exports NormalCloseMsgKillStmt and NormalCloseMsgKillStmtFromRemote; defines NormalCloseKiller interface and KillWithNormalCloseMsg(sm, ...) helper that routes to NormalCloseKiller or falls back to sm.Kill.
Server kill implementation with standby recording
pkg/server/server.go
Imports keyspace; adds internal s.kill(..., normalCloseMsg) and exported Server.KillWithNormalCloseMsg(..., normalCloseMsg); when query==false and normalCloseMsg non-empty and StandbyController present, calls SetNormalClosedConn with keyspace name and gateway connection id and marks connection wait-shutdown.
Kill statement execution through session manager helper
pkg/executor/simple.go
Adds sessmgr import; introduces killBySQLStmt to choose normalCloseMsg for statement kills (local vs remote) and calls sessmgr.KillWithNormalCloseMsg; updates CONNECTION_ID, TiDB-compatible, remote-redirection, and same-server kill paths to call the helper with correct fromRemote.
Server kill behavior tests with standby scenarios
pkg/server/server_test.go
Adds net/http import and testStandbyController stub; introduces newConn helper; extends TestGetConAttrs with assertions for no-record when standby unset, record+wait-shutdown when standby present for connection kills, and no-record for query kills.
Bazel deps and shard updates
pkg/executor/BUILD.bazel, pkg/server/BUILD.bazel, cmd/tidb-server/BUILD.bazel
Add //pkg/session/sessmgr to executor and server_test deps; increase tidb-server_test shard_count from 6 to 7.

Estimated code review effort:
🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested Labels:
ok-to-test

Suggested Reviewers:

  • YangKeao
  • hawkingrei
  • windtalker
  • mjonss

🐰 I hop with a tiny log of code,
A gentle kill message on the road,
Standby listens, notes a gentle close,
Tests ensure the tidy prose,
Hooray — connections end in peace.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title '*: mark killed standby connections as normal closed' directly and clearly summarizes the main change—recording killed standby connections as normal closed.
Description check ✅ Passed The description provides an issue reference (ref #67765), a clear problem summary, and an explanation of what changed. However, it is missing the 'Side effects' section checklist items, which appear to be unchecked but present in the template.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 13, 2026

Codecov Report

❌ Patch coverage is 7.40741% with 25 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.9584%. Comparing base (c6056f8) to head (b8c6d4a).
⚠️ Report is 52 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #68347        +/-   ##
================================================
- Coverage   77.7306%   76.9584%   -0.7722%     
================================================
  Files          1990       2036        +46     
  Lines        551898     580539     +28641     
================================================
+ Hits         428994     446774     +17780     
- Misses       121984     130835      +8851     
- Partials        920       2930      +2010     
Flag Coverage Δ
integration 46.2827% <7.4074%> (+6.4808%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 60.4888% <ø> (ø)
parser ∅ <ø> (∅)
br 65.4419% <ø> (+2.3485%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented May 13, 2026

@better0332: adding LGTM is restricted to approvers and reviewers in OWNERS files.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot ti-chi-bot Bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 15, 2026
Copy link
Copy Markdown
Member

@bb7133 bb7133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-chi-bot ti-chi-bot Bot added approved needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels May 19, 2026
@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented May 19, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bb7133, better0332, wjhuang2016

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot Bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels May 19, 2026
@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented May 19, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-05-19 03:39:53.301696324 +0000 UTC m=+235522.805827000: ☑️ agreed by bb7133.
  • 2026-05-19 05:38:43.239633147 +0000 UTC m=+242652.743763823: ☑️ agreed by wjhuang2016.

@hawkingrei
Copy link
Copy Markdown
Member

/retest

1 similar comment
@ChangRui-Ryan
Copy link
Copy Markdown
Contributor Author

/retest

@tiprow
Copy link
Copy Markdown

tiprow Bot commented May 19, 2026

@ChangRui-Ryan: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@hawkingrei
Copy link
Copy Markdown
Member

/retest

@ti-chi-bot ti-chi-bot Bot merged commit 037f05b into pingcap:master May 19, 2026
36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved lgtm release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants