Description
I played around with some load testing last night using h2load and wanted to put the results up somewhere in case they're useful. I also put together a few experiments with batched writing at potatosalad/hyper@chunked; the various patterns can be seen under `lib/hyper/handlers`. The `Hyper.Handlers.ChunkBatchSpawn` implementation wound up having the most balanced performance in preliminary testing, and is what "hyperbeam chunked" refers to in the results below.
Changes made that increased performance slightly (a sketch of the batching pattern follows this list):

- Added a `batch_send_resp` function
- Added a `Hyper.Batch` process to enqueue writes and send them via `batch_send_resp`
- Changed `#[derive(NifMap)]` to `#[derive(NifTuple)]` for the request struct
- Wrote a simplified `proc_lib`-based `Hyper.Protocol` to replace `Hyper.Server`
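For reference, here is a minimal sketch (in Elixir) of the batching pattern described above: handlers enqueue finished responses into a `Hyper.Batch` process, and the accumulated queue is flushed to the NIF in a single `batch_send_resp` call. The `Hyper.NIF.batch_send_resp/1` name, the `{token, response}` tuple shape, and the 1 ms flush interval are assumptions for illustration, not the exact code on the branch:

```elixir
defmodule Hyper.Batch do
  @moduledoc false
  # Sketch of a write-batching process: request handlers enqueue
  # responses here, and the queue is handed to the NIF in one call
  # instead of one NIF call per response.
  use GenServer

  # Flush interval in ms; an assumed tuning knob, not from the branch.
  @flush_interval 1

  def start_link(_opts \\ []) do
    GenServer.start_link(__MODULE__, [], name: __MODULE__)
  end

  # Called from request handlers; a cast keeps the hot path non-blocking.
  def enqueue(token, response) do
    GenServer.cast(__MODULE__, {:enqueue, token, response})
  end

  @impl true
  def init([]) do
    schedule_flush()
    {:ok, []}
  end

  @impl true
  def handle_cast({:enqueue, token, response}, queue) do
    {:noreply, [{token, response} | queue]}
  end

  @impl true
  def handle_info(:flush, []) do
    # Nothing queued; just re-arm the timer.
    schedule_flush()
    {:noreply, []}
  end

  def handle_info(:flush, queue) do
    # One NIF call for the whole batch, in enqueue order.
    # Hyper.NIF.batch_send_resp/1 is a hypothetical name.
    :ok = Hyper.NIF.batch_send_resp(Enum.reverse(queue))
    schedule_flush()
    {:noreply, []}
  end

  defp schedule_flush do
    Process.send_after(self(), :flush, @flush_interval)
  end
end
```

And a similarly hedged sketch of what a stripped-down `proc_lib`-based protocol process can look like; `:proc_lib.start_link/3` plus `:proc_lib.init_ack/2` is the standard OTP way to run a special process without the full `GenServer` machinery. The message shapes below are placeholders, not the actual wire-up on the branch:

```elixir
defmodule Hyper.Protocol do
  @moduledoc false
  # Sketch of a proc_lib-based protocol process; the {:request, ...}
  # message shape and the Hyper.Batch.enqueue/2 call are assumptions
  # for illustration only.

  def start_link(ref) do
    :proc_lib.start_link(__MODULE__, :init, [self(), ref])
  end

  def init(parent, ref) do
    # Acknowledge the start so start_link/1 can return.
    :proc_lib.init_ack(parent, {:ok, self()})
    loop(ref)
  end

  defp loop(ref) do
    receive do
      {:request, token, _req} ->
        # Hand the response off to the batching process above.
        Hyper.Batch.enqueue(token, "Hello, world!")
        loop(ref)

      :shutdown ->
        :ok
    end
  end
end
```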
Note: There is something wrong with the HTTP/1.1 path on the rusterlium/hyper@batched branch that occasionally causes the server to slow down to the ~100 req/sec range. It's inconsistent, and the effect seems more pronounced as more pipelining is used (keeping the number of clients small, but the stream count high). Not sure what that's all about.
All tests were done on a 2017 MacBook Pro 13" with a 2-core 2.5GHz Intel Core i7 and 16 GB of LPDDR3 RAM.
Protocol | Clients | Streams | Server | Requests / sec |
---|---|---|---|---|
h1 | 100 | 10 | hyper single_threaded | 140,000 |
h1 | 10 | 100 | hyper single_threaded | 145,000 |
h2c | 100 | 10 | hyper single_threaded | 100,000 |
h2c | 10 | 100 | hyper single_threaded | 145,000 |
h1 | 100 | 10 | hyper hello | 220,000 |
h1 | 10 | 100 | hyper hello | 260,000 |
h2c | 100 | 10 | hyper hello | 135,000 |
h2c | 10 | 100 | hyper hello | 180,000 |
h1 | 100 | 10 | hyperbeam master | 65,000 |
h1 | 10 | 100 | hyperbeam master | 55,000 |
h2c | 100 | 10 | hyperbeam master | 80,000 |
h2c | 10 | 100 | hyperbeam master | 90,000 |
h1 | 100 | 10 | hyperbeam batched | 70,000 |
h1 | 10 | 100 | hyperbeam batched | 45,000 |
h2c | 100 | 10 | hyperbeam batched | 95,000 |
h2c | 10 | 100 | hyperbeam batched | 100,000 |
h1 | 100 | 10 | hyperbeam chunked | 80,000 |
h1 | 10 | 100 | hyperbeam chunked | 50,000 |
h2c | 100 | 10 | hyperbeam chunked | 100,000 |
h2c | 10 | 100 | hyperbeam chunked | 125,000 |
h1 | 100 | 10 | cowboy handler | 30,000 |
h1 | 10 | 100 | cowboy handler | 35,000 |
h2c | 100 | 10 | cowboy handler | 17,000 |
h2c | 10 | 100 | cowboy handler | 17,000 |
h2c | 100 | 10 | cowboy stream handler | 35,000 |
h2c | 10 | 100 | cowboy stream handler | 40,000 |
h1 | 100 | 10 | go net/http | 85,000 |
h1 | 10 | 100 | go net/http | 95,000 |
h2c | 100 | 10 | go net/http | 45,000 |
h2c | 10 | 100 | go net/http | 55,000 |
h1 | 100 | 10 | h2o | 155,000 |
h1 | 10 | 100 | h2o | 125,000 |
h2c | 100 | 10 | h2o | 200,000 |
h2c | 10 | 100 | h2o | 300,000 |
h1 | 100 | 10 | nginx | 95,000 |
h1 | 10 | 100 | nginx | 55,000 |
h2c | 100 | 10 | nginx | 1,000 |
h2c | 10 | 100 | nginx | 0 |
Server | Version | Command |
---|---|---|
hyper single_threaded | v0.12.24-12-g2114950c | `cargo run --release --example single_threaded` |
hyper hello | v0.12.24-12-g2114950c | `cargo run --release --example hello` |
hyperbeam master | 09459c0 | `elixir -S mix run --no-halt` |
hyperbeam batched | 97f2ad9 | `elixir -S mix run --no-halt` |
hyperbeam chunked | c7f9c48 | `elixir -S mix run --no-halt` |
cowboy handler | OTP/21.1, elixir/1.8.1, cowboy/2.6.1 | `elixir -S mix run --no-halt` |
cowboy stream handler | OTP/21.1, elixir/1.8.1, cowboy/2.6.1 | `elixir -S mix run --no-halt` |
go net/http | go1.11.4 | `go build server.go && ./server` |
h2o | 2.3.0-DEV@330e23eb | `h2o --conf "$(pwd)/h2o.conf"` |
nginx | nginx/1.15.9 | `nginx -c "$(pwd)/nginx.conf"` |
Client commands were:

```sh
# h1 (100 clients, 10 streams)
h2load --h1 --duration=5s --warm-up-time=1s --interval=1s --clients=100 --max-concurrent-streams=10 --requests=0 'http://127.0.0.1:3000/'

# h1 (10 clients, 100 streams)
h2load --h1 --duration=5s --warm-up-time=1s --interval=1s --clients=10 --max-concurrent-streams=100 --requests=0 'http://127.0.0.1:3000/'

# h2c (100 clients, 10 streams)
h2load --duration=5s --warm-up-time=1s --interval=1s --clients=100 --max-concurrent-streams=10 --requests=0 'http://127.0.0.1:3000/'

# h2c (10 clients, 100 streams)
h2load --duration=5s --warm-up-time=1s --interval=1s --clients=10 --max-concurrent-streams=100 --requests=0 'http://127.0.0.1:3000/'
```
Server implementations aren't fully documented anywhere (yet), but the cowboy and go servers are in potatosalad/ssm-stress-test.