Description
TLDR: HTTP::Client.get HTTPS request in 16.92 ms 🐢 vs 1.43 ms 🐇
HTTPS GET request:
new SSL_CTX 59.11 ( 16.92ms) (± 5.70%) 66.9kB/op 11.87× slower
reuse SSL_CTX 701.40 ( 1.43ms) (± 1.21%) 66.8kB/op fastest
The bug: HTTP::Client creates a new OpenSSL::SSL::Context::Client for every https://... connection.
Desired fix: use a global, default OpenSSL::SSL::Context::Client in HTTP::Client.
We have a sharded cluster of Crystal processes doing a large number of HTTPS requests per second at Heii On-Call because we're continuously monitoring our customers' API endpoints and websites. I'm embarrassed that I've only just discovered that we're burning an order of magnitude more CPU time on HTTPS requests than we need to be, all because HTTP::Client is creating a new OpenSSL::SSL::Context::Client for every new connection.
Creating a new SSL_CTX is slow because of loading CA certificates
OpenSSL::SSL::Context::Client.new calls OpenSSL::SSL::Context#set_default_verify_paths, which calls LibSSL.ssl_ctx_set_default_verify_paths. This loads all of the system CA certificates:
The CAfile is processed on execution of the SSL_CTX_load_verify_locations() function.
Here's a quick benchmark:
docker run --rm -it crystallang/crystal:1.15.1 /bin/bash
# crystal eval --release << EOF
require "benchmark"
require "openssl"
Benchmark.ips(calculation: 5.seconds, warmup: 1.second, interactive: false) { |x|
  x.report("new") { OpenSSL::SSL::Context::Client.new }
  x.report("insecure") { OpenSSL::SSL::Context::Client.insecure }
  x.report("insecure+load") { OpenSSL::SSL::Context::Client.insecure.set_default_verify_paths }
}
EOF
new 126.77 ( 7.89ms) (± 1.80%) 118B/op 75.68× slower
insecure 9.59k (104.23µs) (± 3.46%) 112B/op fastest
insecure+load 126.61 ( 7.90ms) (± 2.41%) 113B/op 75.78× slower
# crystal version
Crystal 1.15.1 [89944bf17] (2025-02-04)
LLVM: 18.1.6
Default target: x86_64-unknown-linux-gnu
# dpkg -l openssl | tail -n 1
ii openssl 3.0.13-0ubuntu3.4 amd64 Secure Sockets Layer toolkit - cryptographic utility
(Note that this issue may vary with the OpenSSL version. It is possibly a performance regression that is being tracked in openssl/openssl#20286.)
SSL_CTX is designed to be set up once and reused
A single SSL_CTX object can be used to create many connections (each represented by a separate SSL object). [...] Note that you should not normally make changes to an SSL_CTX after the first SSL object has been created from it.
SSL_CTX: This is the global context structure which is created by a server or client once per program life-time and which holds mainly default values for the SSL structures which are later created for the connections.
One SSL_CTX can be used for an unlimited number of connections
An SSL_CTX object should not be changed after it is used to create any SSL objects or from multiple threads concurrently, since the implementation does not provide serialization of access for these cases.
An SSL_CTX may be used on multiple threads provided it is not reconfigured.
HTTP::Client.get benchmark
docker run --rm -it crystallang/crystal:1.15.1 /bin/bash
# crystal eval --release << EOF
require "benchmark"
require "http"
require "openssl"
uri = URI.parse(ENV["SSLTEST_URL"])
global_openssl_client_context = OpenSSL::SSL::Context::Client.new
headers = HTTP::Headers{"Connection" => "close"}
puts "HTTPS GET request:"
Benchmark.ips(calculation: 5.seconds, warmup: 1.second, interactive: false) do |x|
  x.report("new SSL_CTX") do
    HTTP::Client.get(uri, headers: headers)
  end
  x.report("reuse SSL_CTX") do
    client = HTTP::Client.new(uri, tls: global_openssl_client_context)
    client.get(uri.request_target, headers: headers)
    client.close
  end
end
EOF
HTTPS GET request:
new SSL_CTX 59.11 ( 16.92ms) (± 5.70%) 66.9kB/op 11.87× slower
reuse SSL_CTX 701.40 ( 1.43ms) (± 1.21%) 66.8kB/op fastest
(SSLTEST_URL is an internal HTTPS service running on the same machine.)
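Until the standard library changes, an application can apply the same reuse itself. The following is a minimal sketch, not a tested drop-in: the SharedTLS module and its get helper are hypothetical names, and it assumes that no code reconfigures the shared context after the first connection has been made.

```crystal
require "http/client"
require "openssl"

# Hypothetical helper module (not part of the standard library).
module SharedTLS
  # One context for the whole process, so the system CA certificates
  # are loaded once instead of once per connection.
  CONTEXT = OpenSSL::SSL::Context::Client.new

  # One-shot GET that opens a new connection but reuses CONTEXT.
  # HTTP::Client.new rejects a TLS context for http:// URIs, so the
  # context is only passed for https://.
  def self.get(uri : URI, headers : HTTP::Headers? = nil) : HTTP::Client::Response
    tls = uri.scheme == "https" ? CONTEXT : nil
    client = HTTP::Client.new(uri, tls: tls)
    begin
      client.get(uri.request_target, headers: headers)
    ensure
      client.close
    end
  end
end
```

A call such as SharedTLS.get(URI.parse(ENV["SSLTEST_URL"])) would then correspond to the "reuse SSL_CTX" case in the benchmark above.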
Possible resolution
In src/openssl/ssl/context.cr, I could imagine:
class_getter(default) { new }
and then in src/http/client.cr#initialize replacing the call to OpenSSL::SSL::Context::Client.new with OpenSSL::SSL::Context::Client.default.
I've tested it -- this works and fixes the performance issue.
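For anyone who wants to try the idea without editing the standard library sources, the getter can be sketched as a reopened class. This is an illustration only: the explicit type annotation is my addition, and the HTTP::Client side of the change is not shown.

```crystal
require "openssl"

# Sketch of the proposed getter, added by reopening the class.
class OpenSSL::SSL::Context::Client
  # Builds the context lazily on the first call and returns the same
  # instance on every later call.
  class_getter(default : OpenSSL::SSL::Context::Client) { new }
end

first = OpenSSL::SSL::Context::Client.default
second = OpenSSL::SSL::Context::Client.default

# Both calls should hand back the same object, so the CA certificates
# are loaded only once.
puts first.same?(second)
```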
However, in #2689 I saw that @jhass removed an OpenSSL::SSL::Context.default so as not to expose a global mutable default, which makes sense.
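To make the hazard concrete, here is a small sketch of how one caller could silently change TLS behaviour for everyone else. The shared variable stands in for a global default, and configure_for_internal_service is a hypothetical piece of application code.

```crystal
require "openssl"

# Stands in for a process-wide default context.
shared = OpenSSL::SSL::Context::Client.new

# Hypothetical code that believes it is configuring its own context.
def configure_for_internal_service(ctx : OpenSSL::SSL::Context::Client)
  ctx.verify_mode = OpenSSL::SSL::VerifyMode::NONE
end

# A client context starts out verifying the peer certificate.
puts shared.verify_mode == OpenSSL::SSL::VerifyMode::PEER

configure_for_internal_service(shared)

# Every later connection created from the shared context would now
# skip certificate verification.
puts shared.verify_mode == OpenSSL::SSL::VerifyMode::NONE
```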
Is there a good solution here? It would be great to use a single global default SSL_CTX for performance, but it seems like leaking the potentially-mutable context is inevitable. For example, could we create a ReadOnlyContext that didn't allow mutation?