Description
In one ScyllaDB workload using an HTTPS server, we noticed that each connection establishment causes a roughly-30ms stall.
While it is sad that each TLS handshake takes 30ms (it means that each shard can only do about 30 of those per second...), what is much more troubling for a Seastar application is that these handshakes happen without preemption points, so each one causes a 30ms stall and potentially huge latencies for other requests running on this shard.
This issue isn't about making handshakes faster (which we should do) or reducing their number (which we've been doing - #2154 is one such attempt). This issue is about avoiding the stall during the handshake if we can't avoid the handshake.
I can think of different ways to avoid these stalls, in decreasing order of desirability but increasing ease of implementation:
- Modify the TLS implementation to use Seastar futures and incorporate preemption checks. This is probably not a realistic solution without massive modifications to OpenSSL - unless OpenSSL comes with hooks to do that, which I'm guessing it doesn't.
- A simpler version of the above, probably still requiring modifications to OpenSSL but far fewer, is to run these TLS handshakes in a `seastar::thread` and add preemption points in the right places.
- An approach that could work without modifications to OpenSSL is to run the handshake in a different Linux thread. This will be ugly, but in some setups we already reserve separate cores for networking, so maybe it makes sense to do the same for TLS requests. Or, even if we run these TLS threads on the same cores as ordinary Seastar (the horror!), we'll still get stalls (when the Seastar thread isn't running) but probably not 30ms stalls.
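
To make the last option concrete, here is a minimal sketch in plain standard C++ (deliberately not using Seastar or OpenSSL APIs). Everything in it is an assumption for illustration: `blocking_handshake()` is a hypothetical stand-in that just sleeps, and `handshake_off_thread()` and `run_other_work_until_done()` are made-up names. It only shows the shape of the idea: the blocking call runs on another OS thread while the calling loop keeps doing small units of other work and polls for completion.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical stand-in for a blocking TLS handshake. A real one would call
// into the TLS library; this one only burns wall-clock time.
inline bool blocking_handshake(std::chrono::milliseconds cost) {
    std::this_thread::sleep_for(cost);
    return true; // pretend the handshake succeeded
}

// Start the handshake on a separate OS thread and hand back a future, so the
// caller is never blocked for the full duration of the handshake.
inline std::future<bool> handshake_off_thread(std::chrono::milliseconds cost) {
    return std::async(std::launch::async, blocking_handshake, cost);
}

// Poll the future without blocking, the way an event loop might, and do other
// work in between. Returns how many units of other work were done meanwhile.
inline int run_other_work_until_done(std::future<bool>& f) {
    assert(f.valid());
    int units = 0;
    while (f.wait_for(std::chrono::milliseconds(0)) != std::future_status::ready) {
        ++units; // placeholder for serving other requests on this shard
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    return units;
}
```

In a real Seastar application the completion would have to be delivered back to the reactor through a proper cross-thread mechanism rather than by polling a `std::future`; the polling loop here only stands in for "the shard keeps running other tasks".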
Another thing we should do, which I'll tack onto this issue but which perhaps should be split into a separate issue, is to add metrics that will be useful for analyzing these slow TLS handshake problems. Perhaps we can count the number of handshakes, or the number of various cryptographic calculations, and perhaps also the amount of time that each handshake takes (if there is no preemption, this time is easy to calculate).
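
As a rough illustration of the timing metric, here is a sketch in plain standard C++. The names `handshake_stats` and `timed_handshake` are hypothetical and not part of Seastar; the sketch assumes the handshake runs to completion without preemption (so wall-clock time around the call is the handshake time), that the callable returns a non-void result, and that the counters are per-shard and therefore need no atomics.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical per-shard handshake counters (single-threaded, so no atomics).
struct handshake_stats {
    using duration = std::chrono::steady_clock::duration;

    std::uint64_t count = 0; // number of handshakes performed
    duration total{0};       // total time spent in handshakes
    duration max{0};         // longest single handshake seen

    void record(duration d) {
        assert(d >= duration::zero());
        ++count;
        total += d;
        if (d > max) {
            max = d;
        }
    }
};

// Run a handshake callable and record how long it took. This is only
// meaningful if the callable is not preempted partway through.
template <typename Func>
auto timed_handshake(handshake_stats& stats, Func&& do_handshake) {
    auto start = std::chrono::steady_clock::now();
    auto result = do_handshake();
    stats.record(std::chrono::steady_clock::now() - start);
    return result;
}
```

Counters like these could then be exported through whatever metrics mechanism the application already uses; the average handshake time is `total / count`.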