Skip to content

Connection can't be closed when there is unread DataFrame in the stream #97

Open
@bitoku

Description

@bitoku

When a stream gets a DataFrame and nothing reads data from the stream of the connection, it can't close the connection and leads to memory leak.

When I'm working on https://github.com/cri-o/cri-o memory leak issue, I found that in the situation above, spdy stream server causes memory leak.
The leak happens when it tries to close the connection, and it waits wg.Wait() forever.

spdystream/connection.go

Lines 392 to 394 in 57d1ca2

// wait for all frame handler workers to indicate they've drained their queues
// before handling the go away frame
wg.Wait()

The cause of the issue is that it can't finish frameHandler() because of the deadlock.
There are some workers calling frameHandler(), and wg.Add() when each worker starts and wg.Done() when each worker is done.

spdystream/connection.go

Lines 317 to 333 in 57d1ca2

for i := 0; i < FRAME_WORKERS; i++ {
frameQueues[i] = NewPriorityFrameQueue(QUEUE_SIZE)
// Ensure frame queue is drained when connection is closed
go func(frameQueue *PriorityFrameQueue) {
<-s.closeChan
frameQueue.Drain()
}(frameQueues[i])
wg.Add(1)
go func(frameQueue *PriorityFrameQueue) {
// let the WaitGroup know this worker is done
defer wg.Done()
s.frameHandler(frameQueue, newHandler)
}(frameQueues[i])
}

DataFrames queued in a PriorityFrameQueue are processed in frameHandler() and if it is a DataFrame, it is handled by dataFrameHandler() ( = handleDataFrame()).
In the function, it blocks until either the stream is closed or something reads the stream dataChan (e.g. something calls Read() or ReadData())
Otherwise the function can't finish, which means frameHandler() also can't finish, and it never calls wg.Done().

spdystream/connection.go

Lines 598 to 607 in 57d1ca2

if len(frame.Data) > 0 {
stream.dataLock.RLock()
select {
case <-stream.closeChan:
debugMessage("(%p) (%d) Data frame not sent (stream shut down)", stream, stream.streamId)
case stream.dataChan <- frame.Data:
debugMessage("(%p) (%d) Data frame sent", stream, stream.streamId)
}
stream.dataLock.RUnlock()
}

If nothing reads the data from the stream, the only way of blocking it is to close the stream.
However streams in a connection are closed after wg.Wait().

spdystream/connection.go

Lines 394 to 409 in 57d1ca2

wg.Wait()
if goAwayFrame != nil {
s.handleGoAwayFrame(goAwayFrame)
}
// now it's safe to close remote channels and empty s.streams
s.streamCond.L.Lock()
// notify streams that they're now closed, which will
// unblock any stream Read() calls
for _, stream := range s.streams {
stream.closeRemoteChannels()
}
s.streams = make(map[spdy.StreamId]*Stream)
s.streamCond.Broadcast()
s.streamCond.L.Unlock()

The simplest fix is to move stream.closeRemoteChannels() before wg.Wait(), but I'm unsure if it causes other race condition errors.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions