fix(core): Prevent Redis connection recovery from being missed #28256
Open
Codecov Report
✅ All modified and coverable lines are covered by tests.
No issues found across 2 files
Architecture diagram
sequenceDiagram
participant Redis as ioRedis Library
participant RCS as RedisClientService
participant Emitter as Internal Emitter (Debounced)
Note over Redis, RCS: Connection Failure
Redis->>RCS: retryStrategy() callback invoked
rect rgb(240, 240, 240)
Note right of RCS: NEW: Atomic State Update
RCS->>RCS: Set lostConnection = true
end
RCS->>Emitter: emit('connection-lost', timeout)
Note over Emitter: Debounce Timer Starts (1s)
Note over Redis, RCS: Connection Re-established
Redis->>RCS: Event: 'ready'
alt is lostConnection == true
Note right of RCS: Recovery Path Triggered
RCS->>RCS: internalRecover()
RCS->>Emitter: emit('connection-recovered')
RCS->>RCS: Set lostConnection = false
else CHANGED: Previous Race Condition (if ready fired before debounce)
Note right of RCS: Recovery skipped (Bug fixed)
end
Note over Emitter: Debounce Timer Expires
Emitter-->>RCS: Listener: 'connection-lost' fires
rect rgb(240, 240, 240)
Note right of RCS: CHANGED: Only logs warning now
RCS->>RCS: logger.warn(...)
end
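
The ordering problem in the diagram can be reproduced with a small, dependency-free model. This is only a sketch under stated assumptions: `ConnectionTracker`, `onRetry`, `onReady`, `flushDebounced`, and `simulateReadyFirst` are hypothetical names invented for illustration, not the actual `RedisClientService` or ioredis API, and the 1s debounce is replaced by a manual flush so the event order is deterministic.

```typescript
// Hypothetical model of the race; not the real RedisClientService code.
type Mode = 'before-fix' | 'after-fix';

class ConnectionTracker {
  lostConnection = false;
  recoveries = 0;
  warnings = 0;
  private mode: Mode;
  // Stand-in for the debounce: callbacks wait here until flushed.
  private pending: Array<() => void> = [];

  constructor(mode: Mode) {
    this.mode = mode;
  }

  // Models the retryStrategy callback invoked when the connection fails.
  onRetry(): void {
    if (this.mode === 'after-fix') {
      // The fix: set the flag synchronously, bypassing the debounce.
      this.lostConnection = true;
    }
    this.pending.push(() => this.onConnectionLost());
  }

  // Models the debounced 'connection-lost' listener.
  private onConnectionLost(): void {
    this.warnings += 1; // stands in for logger.warn(...)
    if (this.mode === 'before-fix') {
      // Old behaviour: the flag is only set once the debounce expires.
      this.lostConnection = true;
    }
  }

  // Models the 'ready' event handler.
  onReady(): void {
    if (this.lostConnection) {
      this.recoveries += 1; // stands in for the recovery path
      this.lostConnection = false;
    }
  }

  // Models the debounce timer expiring.
  flushDebounced(): void {
    const callbacks = this.pending;
    this.pending = [];
    callbacks.forEach((cb) => cb());
  }
}

// Replays the unlucky ordering: 'ready' fires before the debounce expires.
function simulateReadyFirst(mode: Mode): ConnectionTracker {
  const tracker = new ConnectionTracker(mode);
  tracker.onRetry();
  tracker.onReady();
  tracker.flushDebounced();
  return tracker;
}
```

In this model, `simulateReadyFirst('before-fix')` ends with no recovery recorded and `lostConnection` stuck at `true`, while `simulateReadyFirst('after-fix')` records one recovery and leaves the flag cleared. The debounced listener still runs in both modes, but after the fix it no longer touches the flag.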
Performance Comparison
Comparing current → latest master → 14-day baseline
docker-stats
Memory consumption baseline with starter plan resources
Idle baseline with Instance AI module loaded
Summary
RedisClientService debounces all event emissions by 1s, and the reconnection retry interval is also 1s. When a connection drops, retryStrategy queues a debounced connection-lost event. Meanwhile, ioRedis waits 1s and reconnects, so the debounce timer and the ready event race each other. Typically the debounce wins and sets lostConnection = true before ready checks it. Under load, however, ready can fire first: it finds lostConnection still false and skips recovery, and then the late connection-lost event flips the flag to true, where it stays forever.

This PR sets lostConnection = true synchronously in retryStrategy to bypass the debounce. The debounced listener now only logs.
Related Linear tickets, GitHub issues, and Community forum posts
https://linear.app/n8n/issue/CAT-2721
Closes #27753
Review / Merge checklist
Backport to Beta, Backport to Stable, or Backport to v1 (if the PR is an urgent fix that needs to be backported)