Description
Summary
During testing, we observed that a TCP connection can be closed while the TCP work queue still holds a pending I/O request for that connection. When the worker thread later processes the now-stale work item, attemptIO throws a NullPointerException because the connection link's SocketIOChannel is already null.
Observed Behavior
- Socket is closed:
SocketChannel closing, local: localhost/127.0.0.1:56988 remote: localhost/127.0.0.1:8030
- The worker processes a queue item afterwards and throws:
FFDC1015I: An FFDC Incident has been created: "java.lang.NullPointerException: Cannot invoke "com.ibm.ws.tcpchannel.internal.SocketIOChannel.getSocket()" because the return value of "com.ibm.ws.tcpchannel.internal.TCPConnLink.getSocketIOChannel()" is null com.ibm.ws.tcpchannel.internal.WorkQueueManager workerRun(req)" at ffdc_25.10.22_01.19.19.0.log
Root Cause
WorkQueueManager.attemptIO(...) reads the TCPConnLink from the request and, assuming it is still valid, immediately dereferences the result of conn.getSocketIOChannel(). There is a race condition in which the connection close/destroy has already nulled out the SocketIOChannel by the time the worker picks up the request. In that case getSocketIOChannel() returns null, and the unguarded call on its return value throws the NPE.
We should defensively guard the SocketIOChannel in the attemptIO method.
Doing this does not change behavior for valid connections; only the error path changes from throwing the NPE to a graceful close.
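The proposed guard could look roughly like the sketch below. It is not the actual Liberty code: the classes are simplified, hypothetical stand-ins that reuse the names from this issue, and `GuardedWorker`, the `Result` enum, and the `close()` method are assumptions made only for illustration. The point it shows is to read the channel once into a local variable, check it for null, and take a graceful exit path instead of dereferencing it.

```java
// Hypothetical stand-in for the real SocketIOChannel (assumption: simplified to one field).
final class SocketIOChannel {
    private final String remote;
    SocketIOChannel(String remote) { this.remote = remote; }
    String getSocket() { return remote; }
}

// Hypothetical stand-in for the real TCPConnLink.
final class TCPConnLink {
    // volatile so a worker thread sees the null written by the closing thread
    private volatile SocketIOChannel ioChannel;
    TCPConnLink(SocketIOChannel ioChannel) { this.ioChannel = ioChannel; }
    SocketIOChannel getSocketIOChannel() { return ioChannel; }
    // Models close/destroy nulling out the channel (assumed behavior, per the root cause above).
    void close() { ioChannel = null; }
}

// Hypothetical worker showing only the guard, not the real attemptIO logic.
final class GuardedWorker {
    enum Result { IO_ATTEMPTED, CONNECTION_CLOSED }

    static Result attemptIO(TCPConnLink conn) {
        // Read the channel once into a local so a concurrent close cannot
        // null it out between the check and the use.
        SocketIOChannel ioChannel = (conn == null) ? null : conn.getSocketIOChannel();
        if (ioChannel == null) {
            // Stale work item: the connection was closed after the request was queued.
            return Result.CONNECTION_CLOSED;
        }
        ioChannel.getSocket(); // safe: the local reference cannot become null
        return Result.IO_ATTEMPTED;
    }
}
```

Calling `GuardedWorker.attemptIO(conn)` on a live connection takes the normal path, while calling it after `conn.close()` returns `CONNECTION_CLOSED` instead of throwing. How the real code should report the stale request (for example, completing it with an error or silently dropping it) is a design decision this sketch leaves open.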