fix: serialize killchannel map access to prevent concurrent-map crash#318
fix: serialize killchannel map access to prevent concurrent-map crash#318ThiagoBauken wants to merge 2 commits into
Conversation
killchannel (map[string]chan bool) was read, written and deleted from both HTTP request goroutines (Connect/Disconnect/logout/admin-delete) and the per-session startClient goroutine with no synchronization, which can crash the process with "concurrent map read and map write" / "concurrent map writes" under simultaneous connects. Guard every map operation with a dedicated mutex via four helpers (setKillChannel / getKillChannel / deleteKillChannel / signalKill). The lock is held only around the map access, never while sending on or receiving from a channel, so a slow or absent receiver cannot block another session. The startClient loop now reads its channel through getKillChannel each iteration instead of indexing the map directly. Behaviour is otherwise unchanged; this complements the buffered-channel fix from asternic#242. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request replaces direct, unguarded access to the killchannel map with mutex-guarded helper functions to prevent data races, and adds concurrent unit tests. The review feedback highlights a critical concurrency bug and performance issue in wmiau.go where retrieving the kill channel inside a for loop on every iteration can lead to an infinite busy-loop, resource leaks, and high mutex contention. The reviewer suggests retrieving the channel once before entering the loop.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| for { | ||
| kill, _ := getKillChannel(userID) | ||
| select { | ||
| case <-killchannel[userID]: | ||
| case <-kill: |
There was a problem hiding this comment.
Critical Concurrency Bug & Performance Overhead
By calling getKillChannel(userID) inside the for loop on every iteration, two major issues are introduced:
- Resource Leak & Infinite Busy-Loop (Bug): If a new connection is established for the same
userIDwhile an olderstartClientgoroutine is still running, the older goroutine will retrieve the new connection's kill channel from the map. When a disconnect signal is sent, only one of the goroutines (non-deterministically) will receive the signal and exit. The remaining goroutine will continue running, and on its next iteration, it will find that the channel has been deleted from the map (getKillChannelreturnsnil). Reading from anilchannel blocks forever, causing theselectto always fall through to thedefaultbranch and sleep for 1 second. This results in an infinite busy-loop that leaks CPU and memory indefinitely. - High Mutex Contention (Performance): Calling
getKillChannelevery second for every active session constantly locks and unlocks the globalkillchannelMumutex, leading to severe CPU overhead and lock contention under high load.
Solution
Retrieve the kill channel once before entering the loop. This ensures the goroutine always listens to its own dedicated channel, preventing the leak/race bug and completely eliminating the repeated mutex locking overhead.
| for { | |
| kill, _ := getKillChannel(userID) | |
| select { | |
| case <-killchannel[userID]: | |
| case <-kill: | |
| kill, _ := getKillChannel(userID) | |
| for { | |
| select { | |
| case <-kill: |
Reading getKillChannel(userID) on every loop iteration re-locked killchannelMu once per second per session (contention under load) and could switch to a replaced channel after a reconnect. Capture the channel once before the loop so each session goroutine listens on its own channel without re-locking. Addresses Gemini Code Assist review on asternic#318. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Closing to consolidate my recent PRs into a smaller, focused set that's easier to review — I'll resubmit the changes grouped together. Sorry for the churn, and thanks! |
Problem
killchannel(map[string]chan bool, declared inmain.go) is accessed from two kinds of goroutines with no synchronization:Connectwriteskillchannel[txtid] = …;Disconnect,Logoutand the admin delete path send onkillchannel[txtid].startClientgoroutine — readskillchannel[userID]in its loop anddeletes the entry on kill.Concurrent connects/disconnects therefore read, write and delete the same map at the same time. Go's runtime detects this and aborts the whole process with
fatal error: concurrent map read and map write(orconcurrent map writes). This is consistent with the crash reports in this area (#191, #153, #212).This is separate from #242, which fixed the channel buffering; the map itself was never guarded.
Fix
Introduce a dedicated
sync.Mutex(killchannelMu) and route every access through four small helpers:setKillChannel,getKillChannel,deleteKillChannel— lock only around the map op;signalKill— looks up the channel under the lock, then does the existing non-blocking send outside the lock.The lock is never held while sending on or receiving from a channel, so a slow or absent receiver cannot block another session or deadlock. The
startClientloop reads its channel viagetKillChanneleach iteration instead of indexing the map directly. No lifecycle behaviour changes — it purely makes the existing operations thread-safe, complementing the buffered-channel fix from #242.Testing
TestKillChannelHelpers— set/get/signal/delete cycle + safe no-op signal on a missing entry.TestKillChannelConcurrent— 100 users hammered from concurrent goroutines; undergo test -race(CI) the old bare-map access reports the race, the guarded helpers are clean.go build ./...✅ ·go vet ./...✅ ·go test ./...✅