Summary
When the remote FreeSWITCH server closes the TCP connection (EOF), receiveLoop panics with send on closed channel because it attempts to send to a response channel that has already been closed by a concurrent close() call.
Environment
- Library: github.com/percipia/eslgo v1.5.0
- Go version: 1.21+
Panic Output
panic: send on closed channel
goroutine 3110 [running]:
github.com/percipia/eslgo.(*Conn).receiveLoop(0xc0006803c0)
.../vendor/github.com/percipia/eslgo/connection.go:299 +0x345
created by github.com/percipia/eslgo.newConnection in goroutine 1219
.../vendor/github.com/percipia/eslgo/connection.go:92 +0x5ff
How to Reproduce
Real-world scenario
- Establish an inbound or outbound ESL connection to FreeSWITCH
- Have FreeSWITCH terminate the connection from its side (e.g., fs_cli -x "reload mod_event_socket", a FreeSWITCH restart, or a network drop)
- Simultaneously call conn.ExitAndClose() from the application side (e.g., triggered by a session timeout or context cancellation racing with the EOF)
- The panic occurs non-deterministically; it depends on goroutine scheduling. Running under load or with -race makes it more consistent.
Minimal reproducer
The following test reliably triggers the race by simulating an abrupt server-side close concurrent with a client Close() call:
package eslgo_test
import (
"net"
"testing"
"time"
"github.com/percipia/eslgo"
)
// fakeFreeSWITCH writes a minimal ESL auth/response handshake then immediately
// closes the connection, simulating a server-side EOF.
func fakeFreeSWITCH(t *testing.T) net.Listener {
t.Helper()
ln, err := net.Listen("tcp", "127.0.0.1:0")
if err != nil {
t.Fatal(err)
}
go func() {
conn, err := ln.Accept()
if err != nil {
return
}
// Minimal ESL handshake: send auth request, accept any auth reply, then drop
conn.Write([]byte("Content-Type: auth/request\r\n\r\n"))
buf := make([]byte, 256)
conn.Read(buf) // consume auth reply
conn.Write([]byte("Content-Type: command/reply\r\nReply-Text: +OK accepted\r\n\r\n"))
time.Sleep(50 * time.Millisecond)
conn.Close() // triggers EOF in receiveLoop
}()
return ln
}
func TestReceiveLoopPanicOnEOF(t *testing.T) {
ln := fakeFreeSWITCH(t)
defer ln.Close()
conn, err := eslgo.Dial(ln.Addr().String(), "ClueCon", eslgo.DefaultOptions)
if err != nil {
t.Fatal(err)
}
// Race: server closes (EOF) vs client calling ExitAndClose concurrently
go func() {
time.Sleep(60 * time.Millisecond)
conn.ExitAndClose() // may race with receiveLoop's EOF handling
}()
time.Sleep(200 * time.Millisecond) // give enough time for the panic to surface
}
Run with the race detector to confirm:
go test -race -run TestReceiveLoopPanicOnEOF -count=10 ./...
Root Cause
There is a race between receiveLoop and close() on the responseChannels map.
close() closes all response channels under a write lock:
func (c *Conn) close() {
c.stopFunc()
c.responseChanMutex.Lock()
defer c.responseChanMutex.Unlock()
for key, responseChan := range c.responseChannels {
close(responseChan) // TypeDisconnect channel is closed here
delete(c.responseChannels, key)
}
c.conn.Close()
}
receiveLoop sends to TypeDisconnect on EOF — without holding the mutex:
func (c *Conn) receiveLoop() {
for c.runningContext.Err() == nil {
err := c.doMessage()
if err != nil {
if err.Error() == "EOF" {
c.logger.Warn("Connection closed, stopping receive loop\n")
select {
// No mutex held — channel may already be closed by close()
case c.responseChannels[TypeDisconnect] <- &RawResponse{...}:
default:
}
return
}
break
}
}
}
Note that doMessage() correctly holds responseChanMutex.RLock() when writing to response channels. The EOF path in receiveLoop skips this protection.
Race Sequence
- FreeSWITCH drops the TCP connection → receiveLoop receives EOF from doMessage()
- Simultaneously, a caller invokes ExitAndClose() → close() acquires the write lock → closes and deletes all channels, including TypeDisconnect
- receiveLoop evaluates c.responseChannels[TypeDisconnect] and obtains the now-closed channel (the map read is unprotected)
- select attempts to send on the closed channel → panic
Proposed Fix
Mirror the same responseChanMutex.RLock() pattern used by doMessage() around the disconnect send in receiveLoop:
func (c *Conn) receiveLoop() {
for c.runningContext.Err() == nil {
err := c.doMessage()
if err != nil {
if err.Error() == "EOF" {
c.logger.Warn("Connection closed, stopping receive loop\n")
c.responseChanMutex.RLock()
disconnectCh, ok := c.responseChannels[TypeDisconnect]
if ok {
select {
case disconnectCh <- &RawResponse{
Headers: textproto.MIMEHeader{
"Content-Type": []string{TypeDisconnect},
"Error": []string{err.Error()},
},
Body: []byte("connection closed: " + err.Error()),
}:
default:
}
}
c.responseChanMutex.RUnlock()
return
}
break
}
}
}
This guarantees that receiveLoop never reads a channel from the map, nor sends on it, after close() has closed and deleted it, matching how doMessage() protects the same channels.