We are continuously addressing and improving the SDK. If possible, make sure the problem persists in the latest SDK version.
The issue appears to be present in both 3.55.0 and 3.58.0 (latest stable) based on IL inspection of Microsoft.Azure.Cosmos.Direct.dll. Since the source code for the Direct transport layer is not publicly available, this investigation was done by decompiling the binary, so we may be missing context.
Describe the bug
TCP keepalive time and interval are never applied on Linux in Direct TCP mode. Connection.SetKeepAliveSocketOptions passes uint values to Socket.SetSocketOption, which resolves to the object overload. A boxed uint fails the is int check inside that overload, causing ArgumentException. The exception is silently caught by IsKeepAliveCustomizationSupported(), which returns false, making the entire method a no-op.
On Windows, the IOControl(KeepAliveValues) code path uses a byte[] and returns early, so it works correctly and never hits the broken code.
Result: Linux connections use OS default keepalive (tcp_keepalive_time=7200s, 2 hours) instead of the intended 30 seconds. Dead idle RNTBD connections survive for 2+ hours instead of ~39 seconds.
To Reproduce
The core issue is a C# type system problem — boxed uint is not int:
// This is what the SDK does on Linux (from IL of Microsoft.Azure.Cosmos.Direct.dll):
using var socket = new Socket(SocketType.Stream, ProtocolType.Tcp);
uint keepAliveTime = 30u; // field type in Connection.cs
// Resolves to SetSocketOption(level, name, object) — NOT the int overload
// Inside: "optionValue is int" → false for boxed uint → throws ArgumentException
socket.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveTime, keepAliveTime);
// This works — what it should do:
socket.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveTime, (int)keepAliveTime);
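For anyone who wants to run the repro directly, here is a minimal self-contained sketch. It assumes a .NET 6+ console project using top-level statements. The variable names are illustrative and are not taken from the SDK.

```csharp
using System;
using System.Net.Sockets;

// Illustrative stand-in for the uint field the SDK passes (name is an assumption).
uint keepAliveTime = 30u;

using var socket = new Socket(SocketType.Stream, ProtocolType.Tcp);

bool uintCallThrew = false;
try
{
    // uint has no implicit conversion to int, so overload resolution
    // boxes the value and binds to SetSocketOption(level, name, object).
    socket.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveTime, keepAliveTime);
}
catch (ArgumentException ex)
{
    uintCallThrew = true;
    Console.WriteLine($"uint call threw {ex.GetType().Name}");
}

// The explicit cast selects SetSocketOption(level, name, int).
socket.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveTime, (int)keepAliveTime);
Console.WriteLine($"uintCallThrew = {uintCallThrew}; int call completed");
```

The socket is never connected, so this only exercises overload resolution and option validation, not actual keepalive probing.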
IL from Connection.SetKeepAliveSocketOptions in v3.58.0 confirming the uint → box UInt32 → object overload path:
IL_0010: ldsfld uint32 ...Connection::SocketOptionTcpKeepAliveInterval
IL_0015: box [netstandard]System.UInt32
IL_001a: callvirt instance void Socket::SetSocketOption(SocketOptionLevel, SocketOptionName, object)
And IsKeepAliveCustomizationSupported silently swallows the exception:
try {
    socket.SetSocketOption(..., socketOptionTcpKeepAliveInterval); // uint → throws
    return true;
} catch {
    return false; // keepalive customization "not supported"
}
Expected behavior
On Linux, TCP keepalive should be configured to tcp_keepalive_time=30s, tcp_keepalive_intvl=1s — matching the Windows IOControl(KeepAliveValues) configuration. Dead idle connections should be detected within ~39 seconds.
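The ~39 second figure can be reproduced with simple arithmetic, sketched below. It assumes the Linux default of 9 keepalive probes (net.ipv4.tcp_keepalive_probes) is left unchanged. The default interval of 75 seconds (net.ipv4.tcp_keepalive_intvl) is likewise an assumption about the host, not something the SDK sets.

```csharp
using System;

// Assumption: Linux default probe count, not overridden by the SDK or the host.
const int Probes = 9;

// Worst-case detection time = idle time before first probe + probes * interval.
int intendedSeconds = 30 + Probes * 1;      // intended SDK settings: 30s idle, 1s interval
int osDefaultSeconds = 7200 + Probes * 75;  // OS defaults: 7200s idle, 75s interval

Console.WriteLine($"intended:   {intendedSeconds}s");   // 39s
Console.WriteLine($"OS default: {osDefaultSeconds}s");  // 7875s, about 2h 11m
```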
Actual behavior
On Linux, SetKeepAliveSocketOptions is a no-op. Keepalive is enabled (SO_KEEPALIVE=true) but with OS defaults: tcp_keepalive_time=7200s (2 hours). Dead idle RNTBD connections remain in the connection pool for up to 2 hours.
We discovered this during a production incident where a transient network issue affected cross-region Direct TCP connections to the CosmosDB write endpoint. We run 5 identical partitions — 4 on Windows and 1 on Linux. All hit the same issue simultaneously. The 4 Windows partitions recovered in under 5 minutes. The Linux partition took ~55 minutes. From service logs, we identified dead RNTBD connections (status: Connected, callsPendingReceive: 0, lastReceive timestamps 20-57 minutes stale) that remained in the pool producing intermittent 408 (RequestTimeout) errors whenever randomly selected by the load balancer.
Environment summary
SDK Version: 3.55.0 (also confirmed in 3.58.0 IL)
OS Version: Microsoft Azure Linux 3.0 (AKS), .NET 9.0.14
Additional context
Suggested fix — cast to int at the call site or change the field types:
// Option A: cast at call site
clientSocket.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveInterval, (int)SocketOptionTcpKeepAliveInterval);
clientSocket.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveTime, (int)SocketOptionTcpKeepAliveTime);
// Option B: change field types from uint to int
private static readonly int SocketOptionTcpKeepAliveInterval = (int)GetUInt32FromEnvironmentVariableOrDefault(...);
private static readonly int SocketOptionTcpKeepAliveTime = (int)GetUInt32FromEnvironmentVariableOrDefault(...);
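Either option can be checked by reading the values back with Socket.GetSocketOption after setting them. The sketch below is a standalone check on an unconnected socket, not SDK code. The literal values 30 and 1 mirror the intended settings described above, and the variable names are illustrative.

```csharp
using System;
using System.Net.Sockets;

using var socket = new Socket(SocketType.Stream, ProtocolType.Tcp);

// Enable keepalive, then apply the intended values through the int overload.
socket.SetSocketOption(SocketOptionLevel.Socket, SocketOptionName.KeepAlive, true);
socket.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveTime, 30);
socket.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveInterval, 1);

// Read the values back to confirm the OS accepted them.
int appliedTime = Convert.ToInt32(
    socket.GetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveTime));
int appliedInterval = Convert.ToInt32(
    socket.GetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveInterval));

Console.WriteLine($"TcpKeepAliveTime = {appliedTime}s, TcpKeepAliveInterval = {appliedInterval}s");
```

On a live Linux connection, the same values should also be visible in the timer column of `ss -o` output.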