server, session: interrupt autocommit DML after disconnect#68237
Conversation
|
Skipping CI for Draft Pull Request. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds bufReadConn-driven liveness monitoring wired into SQLKiller, cancels in-flight dispatch on client disconnect for eligible autocommit INSERT/UPDATE/DELETE, performs a pre-commit liveness check, refactors killQuery to use cancelDispatch, and adds integration tests and helpers. ChangesAutocommit Write Statement Interruption on Client Disconnect
Sequence DiagramsequenceDiagram
participant Client
participant TiDBHandler
participant Monitor
participant BufConn
participant SQLKiller
Client->>TiDBHandler: SEND INSERT request
TiDBHandler->>Monitor: setSQLKillerConnectionAlive(bufReadConn.IsAlive)
Monitor->>BufConn: periodic IsAlive()
BufConn-->>Monitor: alive=false
Monitor->>SQLKiller: sendKillSignal(QueryInterrupted)
Monitor->>TiDBHandler: cancelDispatch()
TiDBHandler->>SQLKiller: FinishResultSet()
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Skipping CI for Draft Pull Request. |
|
@King-Dylan I've received your pull request and will start the review. I'll conduct a thorough review covering code quality, potential issues, and implementation details. ⏳ This process typically takes 10-30 minutes depending on the complexity of the changes. ℹ️ Learn more details on Pantheon AI. |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (6)
pkg/util/sqlkiller/sqlkiller.go (1)
237-243: ⚡ Quick winConsider routing through
sendKillSignalfor consistent kill logging.
CheckConnectionAlivedirectly CASeskiller.Signaland skips the "kill initiated" log emitted bysendKillSignal(line 126). When a disconnect-triggered interrupt fires from the pre-commit path, no audit log records that the kill was initiated, which makes diagnosing post-disconnect aborts harder. The existingHandleSignalperiodic check has the same gap, but adding a new code path is a good time to fix it.♻️ Suggested change
// CheckConnectionAlive checks whether the connection is alive immediately. func (killer *SQLKiller) CheckConnectionAlive() { fn := killer.IsConnectionAlive.Load() if fn != nil && !(*fn)() { - atomic.CompareAndSwapUint32(&killer.Signal, 0, QueryInterrupted) + killer.sendKillSignal(QueryInterrupted) } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/util/sqlkiller/sqlkiller.go` around lines 237 - 243, CheckConnectionAlive currently updates killer.Signal directly via atomic.CompareAndSwapUint32 which bypasses the "kill initiated" audit log; change CheckConnectionAlive to call the existing sendKillSignal method on SQLKiller instead of performing the CAS itself (i.e., replace the direct atomic.CompareAndSwapUint32(&killer.Signal, 0, QueryInterrupted) with killer.sendKillSignal(QueryInterrupted)). Ensure sendKillSignal continues to perform the atomic CAS and emit the kill-initiated log so both disconnect-triggered and periodic checks use the same logging path; reference CheckConnectionAlive, sendKillSignal, SQLKiller, Signal, QueryInterrupted and HandleSignal when locating the change.pkg/server/conn.go (3)
2181-2189: ⚡ Quick winSame defer-cleanup hardening recommended as
conn_stmt.go.This
handleStmtsite mirrors the lifecycle inexecutePreparedStmtAndWriteResult. Ifcc.ctx.ExecuteStmt(line 2186) panics with monitoring set up, the eager cleanup at line 2188 is skipped and the deferred cleanup at line 2231 only runs on thers != nilbranch. The cleanup function issync.Once-wrapped, so adefer clearConnectionAlive()immediately aftersetSQLKillerConnectionAlive()is safe and would close the panic-leak window in one place.See the matching comment on
pkg/server/conn_stmt.golines 313-330 for the suggested refactor — applying the same change here would keep the two execution paths symmetric.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/server/conn.go` around lines 2181 - 2189, The cleanup function clearConnectionAlive can be skipped if cc.ctx.ExecuteStmt panics when monitoringConnectionAlive is true; mirror the fix used in executePreparedStmtAndWriteResult by calling clearConnectionAlive = cc.setSQLKillerConnectionAlive() and immediately deferring clearConnectionAlive() (i.e., add defer clearConnectionAlive() right after setSQLKillerConnectionAlive() inside handleStmt) so the sync.Once-protected cleanup runs on panic as well; refer to the symbols clearConnectionAlive, monitoringConnectionAlive, setSQLKillerConnectionAlive, cc.ctx.ExecuteStmt and the handleStmt path to apply the same hardening as in conn_stmt.go.
2113-2132: ⚖️ Poor tradeoffConsider per-connection monitor reuse instead of per-statement goroutine churn.
Every monitored autocommit DML spawns a fresh goroutine + ticker, runs for the duration of the statement, and tears down. On a busy TiDB instance with many short-lived
INSERTs, this adds measurable scheduler/allocation overhead (goroutine stack + ticker timer per statement). A single per-clientConnmonitor that toggles between active/idle states on achan bool(or driven fromdispatch) would avoid the churn.Not blocking for this PR — the current design is simple and correct — but worth tracking as a follow-up if perf telemetry shows up. Also, the 1ms test interval (line 2116) is fine for unit tests but will hot-loop under heavy parallel test runs; acceptable trade-off for fast disconnect detection.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/server/conn.go` around lines 2113 - 2132, monitorConnectionAlive currently spawns a new goroutine + ticker per statement which causes goroutine/ticker churn; refactor to a single long-lived per-clientConn monitor that lives on clientConn and accepts activation messages (e.g., an activate/deactivate chan bool or context-driven signal) instead of creating a new goroutine for each autocommit DML. Move monitorConnectionAlive to be started once for each clientConn (e.g., in clientConn initialization), have it read from the new activation channel and only run the ticker loop while active, call cc.ctx.GetSessionVars().SQLKiller.SendKillSignal(sqlkiller.QueryInterrupted) and cc.cancelDispatch() on isAlive failure as before, and update callers that currently spawn this goroutine to instead send activate/deactivate signals to the clientConn monitor.
2093-2111: ⚡ Quick winAdd a brief lifecycle comment to the new helper.
setSQLKillerConnectionAlivedoes three coupled things — install anIsConnectionAlivecallback on the SQLKiller, fork a poller goroutine, and return a once-wrapped cleanup. None of those are obvious from the name alone, and a future reader has to walk throughmonitorConnectionAliveandReset()semantics (which does not clearIsConnectionAlive) to understand why the cleanup contract matters.Per the coding guidelines, "Comments SHOULD explain non-obvious intent, constraints, invariants, concurrency guarantees…". A short doc comment naming the contract (caller MUST invoke the returned cleanup; idempotent; stops the monitor and clears the callback) would help.
♻️ Suggested doc comment
+// setSQLKillerConnectionAlive installs a connection-liveness probe on the +// session's SQLKiller and starts a background goroutine that interrupts the +// statement when the underlying TCP connection is detected as dead. +// Callers MUST invoke the returned cleanup once the statement is done; it is +// idempotent and stops the monitor and clears the SQLKiller probe. func (cc *clientConn) setSQLKillerConnectionAlive() func() {As per coding guidelines: "Comments SHOULD explain non-obvious intent, constraints, invariants, concurrency guarantees".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/server/conn.go` around lines 2093 - 2111, Add a short doc comment above setSQLKillerConnectionAlive describing its lifecycle and contract: explain that the function installs an IsConnectionAlive callback on the session SQLKiller, starts a background poller goroutine via monitorConnectionAlive, and returns a cleanup function that the caller MUST invoke; note the cleanup is idempotent (uses sync.Once), it stops the monitor (closes stopMonitor) and clears the IsConnectionAlive store, and callers are expected to call the returned func to avoid leaving the callback/goroutine running.pkg/server/conn_stmt.go (1)
313-330: ⚡ Quick winDefer cleanup right after setup to harden the panic path.
If
ExecuteStmt(line 321) panics withmonitoringConnectionAlive == true, neither the eager cleanup at line 322-324 nor the deferred cleanup at line 330 runs, and the monitor goroutine plus theIsConnectionAlivefunction pointer leak until the underlying connection finally closes.SQLKiller.Reset()does not clearIsConnectionAlive, so subsequent statements on the same session can observe a stale liveness function until the leaked goroutine self-exits.Since
setSQLKillerConnectionAlivereturns async.Once-wrapped cleanup, deferring it immediately is idempotent with the eager call below — recommend pulling the lifecycle into a single deferred call:♻️ Proposed change
- clearConnectionAlive := func() {} - monitoringConnectionAlive := false - if planCacheStmt != nil && planCacheStmt.PreparedAst != nil { - monitoringConnectionAlive = shouldMonitorConnectionAliveDuringExecute(planCacheStmt.PreparedAst.Stmt, vars) - if monitoringConnectionAlive { - clearConnectionAlive = cc.setSQLKillerConnectionAlive() - } - } + clearConnectionAlive := func() {} + monitoringConnectionAlive := false + if planCacheStmt != nil && planCacheStmt.PreparedAst != nil && + shouldMonitorConnectionAliveDuringExecute(planCacheStmt.PreparedAst.Stmt, vars) { + monitoringConnectionAlive = true + clearConnectionAlive = cc.setSQLKillerConnectionAlive() + defer clearConnectionAlive() + } rs, err := (&cc.ctx).ExecuteStmt(ctx, execStmt) - if rs == nil || err != nil { - clearConnectionAlive() - } var lazy bool if rs != nil { if !monitoringConnectionAlive { clearConnectionAlive = cc.setSQLKillerConnectionAlive() + defer clearConnectionAlive() } - defer clearConnectionAlive()(Same pattern applies to
handleStmtinconn.go.)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/server/conn_stmt.go` around lines 313 - 330, The current setup can leak the connection-alive monitor if ExecuteStmt panics because clearConnectionAlive is only deferred after ExecuteStmt; change the lifecycle so you call cc.setSQLKillerConnectionAlive() as soon as monitoringConnectionAlive is true and immediately defer the returned cleanup (clearConnectionAlive) before calling (&cc.ctx).ExecuteStmt, relying on the idempotent sync.Once cleanup to allow the eager nil-check later to call the same cleanup safely; update the block that uses monitoringConnectionAlive, clearConnectionAlive, setSQLKillerConnectionAlive and ExecuteStmt so the defer is installed immediately when setSQLKillerConnectionAlive is called (and remove redundant separate defers/eager calls), ensuring IsConnectionAlive/monitor goroutine cannot leak on panic.pkg/server/tests/commontest/tidb_test.go (1)
3507-3602: ⚡ Quick winCover the prepared/binary DML path as well.
Both new regressions use
conn.ExecContext(...), so they only exercise the COM_QUERY path. This PR also changes prepared execution behavior, and that branch can still regress unnoticed without aPrepareContext(... INSERT ...)/Stmt.ExecContext(...)variant here.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/server/tests/commontest/tidb_test.go` around lines 3507 - 3602, Add a variant of each failing test that exercises the prepared/binary DML path: in TestClientDisconnectKillsAutocommitInsert and TestClientDisconnectCancelsAutocommitInsertPrewrite, after creating the table obtain conn via dbt.GetDB().Conn(...), call conn.PrepareContext(...) for the INSERT SQL to get a *Stmt, then run stmt.ExecContext(...) inside the goroutine (instead of conn.ExecContext), and follow the same netConn.Close(), processlist checks (processlistCountByInfo), error assertions, and final row-count assertions; ensure you Close() the prepared Stmt and the Conn in deferred cleanup.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/server/tests/commontest/tidb_test.go`:
- Around line 3521-3595: Capture the server process id for the long-running
INSERT before closing the raw network connection in both
TestClientDisconnectCancelsInsert and
TestClientDisconnectCancelsAutocommitInsert (use the same query logic you use in
processlistCountByInfo to find the matching "insert into issue57531_*" entry),
then register a t.Cleanup that opens a separate DB session (dbt.GetDB().Conn)
and issues a best-effort KILL for that PID if the processlist entry still
exists; ensure you perform the PID capture just after the require.Eventually
that confirms the statement is present and before netConn.Close(), and make the
cleanup tolerant of errors so it never fails the test teardown.
---
Nitpick comments:
In `@pkg/server/conn_stmt.go`:
- Around line 313-330: The current setup can leak the connection-alive monitor
if ExecuteStmt panics because clearConnectionAlive is only deferred after
ExecuteStmt; change the lifecycle so you call cc.setSQLKillerConnectionAlive()
as soon as monitoringConnectionAlive is true and immediately defer the returned
cleanup (clearConnectionAlive) before calling (&cc.ctx).ExecuteStmt, relying on
the idempotent sync.Once cleanup to allow the eager nil-check later to call the
same cleanup safely; update the block that uses monitoringConnectionAlive,
clearConnectionAlive, setSQLKillerConnectionAlive and ExecuteStmt so the defer
is installed immediately when setSQLKillerConnectionAlive is called (and remove
redundant separate defers/eager calls), ensuring IsConnectionAlive/monitor
goroutine cannot leak on panic.
In `@pkg/server/conn.go`:
- Around line 2181-2189: The cleanup function clearConnectionAlive can be
skipped if cc.ctx.ExecuteStmt panics when monitoringConnectionAlive is true;
mirror the fix used in executePreparedStmtAndWriteResult by calling
clearConnectionAlive = cc.setSQLKillerConnectionAlive() and immediately
deferring clearConnectionAlive() (i.e., add defer clearConnectionAlive() right
after setSQLKillerConnectionAlive() inside handleStmt) so the
sync.Once-protected cleanup runs on panic as well; refer to the symbols
clearConnectionAlive, monitoringConnectionAlive, setSQLKillerConnectionAlive,
cc.ctx.ExecuteStmt and the handleStmt path to apply the same hardening as in
conn_stmt.go.
- Around line 2113-2132: monitorConnectionAlive currently spawns a new goroutine
+ ticker per statement which causes goroutine/ticker churn; refactor to a single
long-lived per-clientConn monitor that lives on clientConn and accepts
activation messages (e.g., an activate/deactivate chan bool or context-driven
signal) instead of creating a new goroutine for each autocommit DML. Move
monitorConnectionAlive to be started once for each clientConn (e.g., in
clientConn initialization), have it read from the new activation channel and
only run the ticker loop while active, call
cc.ctx.GetSessionVars().SQLKiller.SendKillSignal(sqlkiller.QueryInterrupted) and
cc.cancelDispatch() on isAlive failure as before, and update callers that
currently spawn this goroutine to instead send activate/deactivate signals to
the clientConn monitor.
- Around line 2093-2111: Add a short doc comment above
setSQLKillerConnectionAlive describing its lifecycle and contract: explain that
the function installs an IsConnectionAlive callback on the session SQLKiller,
starts a background poller goroutine via monitorConnectionAlive, and returns a
cleanup function that the caller MUST invoke; note the cleanup is idempotent
(uses sync.Once), it stops the monitor (closes stopMonitor) and clears the
IsConnectionAlive store, and callers are expected to call the returned func to
avoid leaving the callback/goroutine running.
In `@pkg/server/tests/commontest/tidb_test.go`:
- Around line 3507-3602: Add a variant of each failing test that exercises the
prepared/binary DML path: in TestClientDisconnectKillsAutocommitInsert and
TestClientDisconnectCancelsAutocommitInsertPrewrite, after creating the table
obtain conn via dbt.GetDB().Conn(...), call conn.PrepareContext(...) for the
INSERT SQL to get a *Stmt, then run stmt.ExecContext(...) inside the goroutine
(instead of conn.ExecContext), and follow the same netConn.Close(), processlist
checks (processlistCountByInfo), error assertions, and final row-count
assertions; ensure you Close() the prepared Stmt and the Conn in deferred
cleanup.
In `@pkg/util/sqlkiller/sqlkiller.go`:
- Around line 237-243: CheckConnectionAlive currently updates killer.Signal
directly via atomic.CompareAndSwapUint32 which bypasses the "kill initiated"
audit log; change CheckConnectionAlive to call the existing sendKillSignal
method on SQLKiller instead of performing the CAS itself (i.e., replace the
direct atomic.CompareAndSwapUint32(&killer.Signal, 0, QueryInterrupted) with
killer.sendKillSignal(QueryInterrupted)). Ensure sendKillSignal continues to
perform the atomic CAS and emit the kill-initiated log so both
disconnect-triggered and periodic checks use the same logging path; reference
CheckConnectionAlive, sendKillSignal, SQLKiller, Signal, QueryInterrupted and
HandleSignal when locating the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 36d77292-d723-4dc8-b5a2-fc303967de25
📒 Files selected for processing (6)
pkg/server/conn.gopkg/server/conn_stmt.gopkg/server/server.gopkg/server/tests/commontest/tidb_test.gopkg/session/tidb.gopkg/util/sqlkiller/sqlkiller.go
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #68237 +/- ##
================================================
- Coverage 77.2806% 75.7098% -1.5709%
================================================
Files 2010 2008 -2
Lines 555350 563442 +8092
================================================
- Hits 429178 426581 -2597
- Misses 125252 136804 +11552
+ Partials 920 57 -863
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
pkg/server/conn.go (1)
2104-2132: ⚡ Quick winCleanup does not synchronize with goroutine exit.
close(stopMonitor)returns immediately; the goroutine may have already entered thecase <-ticker.Carm (Go'sselectpicks randomly when both cases are ready), soSendKillSignal/cancelDispatchcan still fire after the caller treats monitoring as cleared. In practice the dispatch loop resetsSQLKillerbetween commands so it's mostly benign, but a stray cancel landing on the next dispatch's context (Line 1373-1376) is observable. Consider re-checkingstopafter theisAlive()call, or have the cleanup wait for the goroutine via adonechannel /sync.WaitGroup.♻️ Sketch — re-check stop after liveness check
for { select { case <-ticker.C: + select { + case <-stop: + return + default: + } if !isAlive() { cc.ctx.GetSessionVars().SQLKiller.SendKillSignal(sqlkiller.QueryInterrupted) cc.cancelDispatch() return } case <-stop: return } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/server/conn.go` around lines 2104 - 2132, The monitor cleanup races with monitorConnectionAlive: close(stopMonitor) can return before the goroutine exits, so SQLKiller.SendKillSignal / cc.cancelDispatch may run after cleanup. Fix by adding a done channel (or sync.WaitGroup) that monitorConnectionAlive closes/signals before it returns and have the clearOnce closure wait for that signal after closing stopMonitor; also ensure monitorConnectionAlive checks stop non-blockingly after detecting !isAlive() (e.g., select { case <-stop: return default: } ) before calling SQLKiller.SendKillSignal and cc.cancelDispatch. Update the symbols clearOnce, stopMonitor, monitorConnectionAlive, isAlive, SQLKiller.SendKillSignal, and cancelDispatch accordingly so cleanup waits for the goroutine to finish.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/server/conn.go`:
- Around line 2220-2225: The shutdown short-circuit in the
ExecuteStmt/connection handling branch can return before registering defer
clearConnectionAlive(), leaking the monitor goroutine and
SQLKiller.IsConnectionAlive pointer; modify the logic in the conn handling
around ExecuteStmt/rs so that when monitoringConnectionAlive is true and rs !=
nil you always call cc.setSQLKillerConnectionAlive() and register defer
clearConnectionAlive() (or call clearConnectionAlive() just before any early
return caused by cc.getStatus() == connStatusShutdown) to ensure the SQLKiller
monitor is cleaned up; update the code paths that currently return on
connStatusShutdown so they either set/defer clearConnectionAlive first or invoke
clearConnectionAlive synchronously before returning, ensuring no dangling
SendKillSignal/cancelDispatch targets remain.
- Around line 2093-2099: In clientConn.setSQLKillerConnectionAlive add a brief
clarifying comment next to the cc.bufReadConn.IsAlive() != 0 check explaining
that IsAlive() returns 1 for alive, 0 for dead and -1 for an error/unknown state
(e.g., SetReadDeadline failure), and that the code intentionally treats any
non-zero value (1 or -1) as “not dead” to conservatively avoid terminating
queries when liveness is uncertain; reference cc.bufReadConn.IsAlive() and the
setSQLKillerConnectionAlive closure in the comment so future readers understand
the safety rationale.
---
Nitpick comments:
In `@pkg/server/conn.go`:
- Around line 2104-2132: The monitor cleanup races with monitorConnectionAlive:
close(stopMonitor) can return before the goroutine exits, so
SQLKiller.SendKillSignal / cc.cancelDispatch may run after cleanup. Fix by
adding a done channel (or sync.WaitGroup) that monitorConnectionAlive
closes/signals before it returns and have the clearOnce closure wait for that
signal after closing stopMonitor; also ensure monitorConnectionAlive checks stop
non-blockingly after detecting !isAlive() (e.g., select { case <-stop: return
default: } ) before calling SQLKiller.SendKillSignal and cc.cancelDispatch.
Update the symbols clearOnce, stopMonitor, monitorConnectionAlive, isAlive,
SQLKiller.SendKillSignal, and cancelDispatch accordingly so cleanup waits for
the goroutine to finish.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 7aa8e686-da73-41a6-8533-fa3ce3f6ef73
📒 Files selected for processing (2)
pkg/server/conn.gopkg/session/tidb.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/session/tidb.go
|
/retest |
1 similar comment
|
/retest |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/server/tests/commontest/tidb_test.go (1)
3604-3612: ⚡ Quick winPrefer
selectwithtime.Afterover polling a buffered channel.
doneis a buffered, single-shot channel; usingrequire.Eventuallywith a non-blockingselectpolls every 50 ms instead of just blocking on the receive. A plainselectwith a timeout is more idiomatic, fails faster on a real hang, and makes the timeout message explicit.♻️ Proposed simplification
- var execErr error - require.Eventually(t, func() bool { - select { - case execErr = <-done: - return true - default: - return false - } - }, 5*time.Second, 50*time.Millisecond) - require.Error(t, execErr) + var execErr error + select { + case execErr = <-done: + case <-time.After(5 * time.Second): + t.Fatal("timeout waiting for autocommit insert to return after client disconnect") + } + require.Error(t, execErr)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/server/tests/commontest/tidb_test.go` around lines 3604 - 3612, The current test polls the buffered single-shot channel done using require.Eventually and a non-blocking select; replace that polling with a blocking select that waits on the done channel and a timeout via time.After so the test blocks until the goroutine finishes or the timeout fires. Modify the code that assigns execErr from <-done (the done channel and execErr variable in tidb_test.go) to use: select { case execErr = <-done: /* success */ case <-time.After(5 * time.Second): t.Fatalf("timed out waiting for done") } so it fails fast and reports an explicit timeout instead of polling.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@pkg/server/tests/commontest/tidb_test.go`:
- Around line 3604-3612: The current test polls the buffered single-shot channel
done using require.Eventually and a non-blocking select; replace that polling
with a blocking select that waits on the done channel and a timeout via
time.After so the test blocks until the goroutine finishes or the timeout fires.
Modify the code that assigns execErr from <-done (the done channel and execErr
variable in tidb_test.go) to use: select { case execErr = <-done: /* success */
case <-time.After(5 * time.Second): t.Fatalf("timed out waiting for done") } so
it fails fast and reports an explicit timeout instead of polling.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 4cae3642-411d-485c-b3f3-b9c3e2f21040
📒 Files selected for processing (1)
pkg/server/tests/commontest/tidb_test.go
|
/retest |
1 similar comment
|
/retest |
76574ec to
60060bd
Compare
|
/retest |
|
/retest |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: bb7133, Defined2014 The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
|
In response to a cherrypick label: new pull request created to branch |
What problem does this PR solve?
Issue Number: close #68236
Problem Summary:
Autocommit write statements such as
INSERTdo not write a result set, so the connection-liveness check added for the result-set path does not cover the whole statement lifecycle. If the client disconnects after TiDB has read the request packet, the statement can keep running until expression evaluation, commit, prewrite, or retry/backoff finishes naturally.What changed and how does it work?
INSERT,UPDATE, andDELETE, including text protocolEXECUTEand binary prepared execution.QueryInterruptedand cancels the current dispatch context, so prewrite/backoff can exit promptly.INSERT/UPDATE/DELETE, so a disconnect observed after execution but before commit is converted into an interrupt.KILL QUERY.INSERTdisconnect before commit and during prewrite/backoff.Check List
Tests
Validated locally:
make bazel_preparewas also attempted, but the current checkout'sgo.modcontains a local replacementgithub.com/pingcap/tidb/pkg/parser => ./pkg/parser, and Gazelle failed with:make bazel_ci_simple_preparepassed and did not generate additional Bazel diffs.Side effects
No expected performance regression. The extra monitor is scoped to
INSERT/UPDATE/DELETEexecution while the statement is active. It checks once per second in normal builds and exits when the statement finishes.Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
Summary by CodeRabbit
New Features
Bug Fixes
Tests