-
Notifications
You must be signed in to change notification settings - Fork 1
Implement nc.reconnect() method for manual reconnection (#86) #96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Add public reconnect() method that allows users to manually trigger reconnection process for scenarios like refreshing authentication credentials or rebalancing client connections. Features: - Manual reconnection with full subscription recovery - Comprehensive error handling for invalid states - Reuses existing robust reconnection infrastructure - Thread-safe implementation with proper callback support 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
WalkthroughAdds a public Connection.reconnect() method, extends the public ConnectionError union with four new variants, integrates manual reconnect with the existing background reconnect thread, and adds tests for manual reconnect success and error cases plus a small test-helper mutex call change. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
✨ Finishing Touches🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
Status, Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (5)
src/connection.zig (3)
507-513: Make the synthetic reconnect cause internal (don’t expose ManualReconnect publicly)
ManualReconnectis only used as an internal sentinel to drive cleanup. Keeping it in the publicConnectionErrorset leaks an implementation detail.
- Remove
ManualReconnectfromConnectionError(see separate diff comment).- Add a private error set and use it here:
- const synthetic_error = ConnectionError.ManualReconnect; + const synthetic_error = InternalReconnect.ManualReconnect;Add near the top-level (outside this range):
const InternalReconnect = error{ ManualReconnect };
112-116: Scope public errors to user-facing cases only; move ManualReconnect to a private error set
ManualReconnectneed not be part of the public API.Apply:
pub const ConnectionError = error{ @@ - NotConnected, - ManualReconnect, + NotConnected, } || PublishError || std.Thread.SpawnError || std.posix.WriteError || std.posix.ReadError;Then define a private set (outside this block), e.g.:
const InternalReconnect = error{ ManualReconnect };
496-499: Minor: tweak wording for connecting-state logThe log says “initial connection not yet established” but returns
error.NotConnected. Consider clarifying the log to “Cannot reconnect while initial connect is in progress”.- .connecting => { - log.warn("Cannot reconnect: initial connection not yet established", .{}); + .connecting => { + log.warn("Cannot reconnect: initial connect is in progress", .{}); return ConnectionError.NotConnected; },tests/reconnection_test.zig (2)
154-160: Use the canonical condvar pattern: hold the lock while checking the predicateAs written, there’s a small race between reading
reconnected_calledand waiting, which can cause an extra 100ms delay. Hold the lock across the predicate check.- var timer = try std.time.Timer.start(); - while (tracker.reconnected_called == 0) { - if (timer.read() >= 5000 * std.time.ns_per_ms) { - return error.ReconnectionTimeout; - } - try tracker.timedWait(100); - } + var timer = try std.time.Timer.start(); + tracker.mutex.lock(); + defer tracker.mutex.unlock(); + while (tracker.reconnected_called == 0) { + if (timer.read() >= 5000 * std.time.ns_per_ms) { + return error.ReconnectionTimeout; + } + _ = tracker.cond.timedWait(&tracker.mutex, 100 * std.time.ns_per_ms) catch {}; + }
183-202: Add coverage for AlreadyReconnecting (optional)Call
nc.reconnect()twice back-to-back and asserterror.AlreadyReconnectingon the second call.test "reconnect() returns AlreadyReconnecting when called twice" { tracker.reset(); const nc = try utils.createConnection(.node1, .{ .reconnect = .{ .allow_reconnect = true, .reconnect_wait_ms = 100 }, }); defer utils.closeConnection(nc); try nc.reconnect(); try testing.expectError(error.AlreadyReconnecting, nc.reconnect()); }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
src/connection.zig(2 hunks)tests/reconnection_test.zig(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.zig
📄 CodeRabbit inference engine (CLAUDE.md)
Format code using
zig fmtbefore commit
Files:
src/connection.zigtests/reconnection_test.zig
🪛 GitHub Actions: CI
tests/reconnection_test.zig
[error] 1-1: zig fmt --check . failed. Formatting issues detected in ./tests/reconnection_test.zig. Run 'zig fmt' to fix formatting.
🔇 Additional comments (2)
tests/reconnection_test.zig (2)
65-65: Correct mutex usage with cond.timedWaitPassing
&self.mutexis the right fix; this matches Zig’s condvar API.
1-203: Fix CI: runzig fmt
CI requires all Zig sources formatted. Unable to verify here (nozigavailable); please runzig fmt --check .locally and commit any formatting changes.
When std.Thread.spawn() fails in reconnect(), ensure proper thread cleanup by using deferred close pattern instead of setting status under mutex. This mirrors the pattern used in markNeedsReconnect() and prevents reader/flusher threads from hanging on acquireSocket(). Co-authored-by: Lukáš Lalinský <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
src/connection.zig (1)
467-530: Manual reconnect flow looks solid; spawn-failure path now safeLocking, status transitions, buffer freezing, and the deferred disconnected_cb are correct. The spawn error fallback with needs_close + deferred close happens outside the mutex, resolving the prior deadlock risk.
🧹 Nitpick comments (6)
src/connection.zig (3)
112-116: Don’t expose internal “ManualReconnect” in the public error setIt’s only used as a synthetic trigger and never surfaced to callers. Keeping it public widens the error set type and can ripple through downstream code. Make it a local error set in reconnect() instead.
Apply:
@@ NoResponders, - ReconnectDisabled, - AlreadyReconnecting, - NotConnected, - ManualReconnect, + ReconnectDisabled, + AlreadyReconnecting, + NotConnected, @@ - const synthetic_error = ConnectionError.ManualReconnect; + const SyntheticReconnectError = error{ ManualReconnect }; + const synthetic_error: anyerror = SyntheticReconnectError.ManualReconnect;
467-530: Minor API polish: document non-blocking behavior and returned errorsAdd to the doc comment that reconnect() is non-blocking and list the possible errors (ReconnectDisabled, AlreadyReconnecting, ConnectionClosed, NotConnected).
1400-1512: Reduce mutex hold during reconnect attempts (optional)doReconnect holds the main mutex across connect(), handshake, sleeps, and server-pool walks. That can stall publishers/subscribers during long network delays. Consider narrowing the critical sections: compute next server + update state under the lock, perform blocking socket/handshake outside, then re-enter to commit results.
tests/reconnection_test.zig (3)
154-161: Avoid data race on tracker.reconnected_called; hold the mutex across the wait loopThe loop reads reconnected_called without the mutex while callbacks increment it under the mutex. Make the read+wait atomic with the condition variable.
Apply:
- var timer = try std.time.Timer.start(); - while (tracker.reconnected_called == 0) { - if (timer.read() >= 5000 * std.time.ns_per_ms) { - return error.ReconnectionTimeout; - } - try tracker.timedWait(100); - } + var timer = try std.time.Timer.start(); + tracker.mutex.lock(); + defer tracker.mutex.unlock(); + while (tracker.reconnected_called == 0) { + if (timer.read() >= 5000 * std.time.ns_per_ms) return error.ReconnectionTimeout; + _ = tracker.cond.timedWait(&tracker.mutex, 100 * std.time.ns_per_ms) catch {}; + }
183-193: Optional: assert no callbacks fired when reconnect is disabledReconnectDisabled should not change state. Assert disconnected_called and reconnected_called remain 0 to catch regressions.
195-202: Optional: add tests for AlreadyReconnecting and NotConnected
- Call nc.reconnect() twice back-to-back; the second should yield error.AlreadyReconnecting.
- Trigger during initial connect (or fake .connecting) to validate error.NotConnected.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
src/connection.zig(2 hunks)tests/reconnection_test.zig(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.zig
📄 CodeRabbit inference engine (CLAUDE.md)
Format code using
zig fmtbefore commit
Files:
src/connection.zigtests/reconnection_test.zig
🧠 Learnings (1)
📚 Learning: 2025-08-30T18:48:28.333Z
Learnt from: lalinsky
PR: lalinsky/nats.zig#66
File: src/connection.zig:317-323
Timestamp: 2025-08-30T18:48:28.333Z
Learning: In src/connection.zig, the establishConnection() call in connectToServer() is during initial connection setup where no race conditions exist - the connection starts disconnected and other threads would see the connecting status and wait appropriately.
Applied to files:
src/connection.zig
🔇 Additional comments (1)
tests/reconnection_test.zig (1)
65-65: Good fix: pass &self.mutex to timedWaitThis matches Condition.timedWait’s signature and avoids UB.
Summary
Implements a public
reconnect()method that allows users to manually trigger the reconnection process, addressing issue #86.This provides flexibility for scenarios like:
Implementation Details
nc.reconnect()method with comprehensive error handlingReconnectDisabled,AlreadyReconnecting,NotConnected,ManualReconnectTest Coverage
Usage Example
🤖 Generated with Claude Code