Description
Hi,
I have stumbled upon an error scenario that is difficult to reproduce. I am using OkHttp3 (5.0.0-alpha.11) and Retrofit (2.9.0) with Java 11.
I use Retrofit with OkHttp3 in a backend service that sends a large number of POST requests to another backend server in asynchronous mode. It works fine almost all the time, but in a few cases the OkHttp3 client failed many of the in-flight requests with an IOException with the message "Connection reset" (sorry, I don't have the stack trace). This is not an issue in itself, because we retry these requests and the retries succeeded.
The real problem is that a few POST requests we created just a few hundred milliseconds after the in-flight requests failed with the above exception NEVER got a response. By "response" I mean any outcome at all, including a failure such as an exception or a timeout. It was as if these requests never existed. The application relies on getting some outcome (success or failure) to move on, so getting nothing at all is very bad.
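To make that expectation concrete, here is a minimal sketch of a safety net that guarantees a terminal outcome even if the underlying call never reports one. It assumes each asynchronous call is bridged to a CompletableFuture; the class and method names (CallWatchdog, withDeadline) are hypothetical, it uses only the JDK (Java 9+), and it is not the code from the service or part of OkHttp or Retrofit.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, for illustration only.
class CallWatchdog {

    /**
     * Returns a future that mirrors the outcome of {@code source}, or fails
     * with a TimeoutException if {@code source} has not completed in time.
     */
    public static <T> CompletableFuture<T> withDeadline(
            CompletableFuture<T> source, long timeout, TimeUnit unit) {
        // copy() leaves the caller's future untouched; orTimeout() completes
        // the copy exceptionally once the deadline passes.
        return source.copy().orTimeout(timeout, unit);
    }

    public static void main(String[] args) {
        // Simulates a request whose callback is never invoked.
        CompletableFuture<String> neverCompleted = new CompletableFuture<>();
        try {
            withDeadline(neverCompleted, 200, TimeUnit.MILLISECONDS).join();
        } catch (CompletionException e) {
            // The watchdog turned "no outcome" into an explicit failure.
            System.out.println(e.getCause() instanceof TimeoutException);
        }
    }
}
```

A guard like this would only mask the symptom by turning a silent hang into an explicit failure that can be retried; it would not explain why the callbacks were never invoked.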
I suspect this could be a sign of some kind of race condition somewhere. Unfortunately, I can't reproduce it locally...
I tried to understand what the "Connection reset" error means, but I couldn't find any documentation on it; even searching the Square libraries on GitHub didn't turn up anything.
OkHttp3 default client configuration:
ConnectionSpec connectionSpec = new ConnectionSpec.Builder(ConnectionSpec.MODERN_TLS)
        .tlsVersions(TlsVersion.TLS_1_2, TlsVersion.TLS_1_3)
        .build();
client = new OkHttpClient.Builder()
        .connectionSpecs(Arrays.asList(connectionSpec))
        .connectionPool(new ConnectionPool(2_000, 300, TimeUnit.SECONDS))
        .addInterceptor(getUserAgentHeaderInterceptor())
        .build();
client.dispatcher().setMaxRequests(10_000);
client.dispatcher().setMaxRequestsPerHost(10_000);
New client based on default client:
protected OkHttpClient createHttpClient(OkHttpClient defaultClient, BaseClientConfig config) {
    OkHttpClient.Builder builder = defaultClient.newBuilder();
    builder.callTimeout(Duration.ofMillis(30_000));
    builder.connectTimeout(Duration.ofMillis(5_000));
    builder.readTimeout(Duration.ofMillis(30_000));
    builder.writeTimeout(Duration.ofMillis(10_000));
    return builder.build();
}
Thanks,
Camiel