Description
I noticed that when a server runs in a krun container and exposes a service via TSI, performance drops significantly under many concurrent short-lived connections. So I wrote a small program to benchmark how many connections TSI can handle:
https://github.com/karuboniru/socket_bench
The program runs a ping-pong exchange between a client and a server. The client uses 16 threads; each thread disconnects after 10-100 iterations and then opens a new connection.
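For readers who don't want to open the repository, the connection-churn pattern described above can be sketched roughly as follows. This is not the actual socket_bench code, only a minimal loopback approximation. The 512-byte payload, the function names (`run_bench`, `client_worker`, `echo_server`) and the in-process echo server are all assumptions made for illustration.

```python
import random
import socket
import threading
import time

PAYLOAD = b"x" * 512  # hypothetical message size, not taken from socket_bench


def recv_exact(sock, n):
    """Read exactly n bytes, or return None if the peer closed first."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            return None
        buf += chunk
    return buf


def echo_server(listener, stop):
    """Accept connections until `stop` is set; echo each message back."""
    def handle(conn):
        with conn:
            while True:
                data = recv_exact(conn, len(PAYLOAD))
                if data is None:
                    return
                conn.sendall(data)

    while not stop.is_set():
        try:
            conn, _ = listener.accept()
        except socket.timeout:
            continue  # re-check the stop flag
        threading.Thread(target=handle, args=(conn,), daemon=True).start()


def client_worker(addr, deadline, stats, lock):
    """Open short connections back to back, 10-100 ping-pongs each."""
    conns = sent = recvd = 0
    while time.monotonic() < deadline:
        with socket.create_connection(addr) as s:
            conns += 1
            for _ in range(random.randint(10, 100)):
                s.sendall(PAYLOAD)
                sent += 1
                if recv_exact(s, len(PAYLOAD)) is None:
                    break
                recvd += 1
    with lock:
        stats["connections"] += conns
        stats["sent"] += sent
        stats["recv"] += recvd


def run_bench(threads=16, duration=1.0):
    """Run the churn benchmark against a local echo server."""
    listener = socket.create_server(("127.0.0.1", 0))
    listener.settimeout(0.2)
    stop = threading.Event()
    server = threading.Thread(target=echo_server, args=(listener, stop), daemon=True)
    server.start()

    stats = {"connections": 0, "sent": 0, "recv": 0}
    lock = threading.Lock()
    deadline = time.monotonic() + duration
    workers = [
        threading.Thread(target=client_worker,
                         args=(listener.getsockname(), deadline, stats, lock))
        for _ in range(threads)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()

    stop.set()
    server.join()
    listener.close()
    stats["conn_rate"] = stats["connections"] / duration
    return stats


if __name__ == "__main__":
    print(run_bench(threads=16, duration=2.0))
```

The point of the pattern is that connection setup and teardown dominate, so a transport that serializes connect/close shows up as a collapsed connection rate rather than as high CPU usage.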
The results are:
Normal crun container with pasta networking
=== BENCHMARK RESULTS ===
Threads : 16
Time (Total) : 5.74 s
Connections : 15988
Conn. Rate : 2783 conn/s
--------------------------------
Sent Pkts : 873111
Recv Pkts : 873111
Loss Pkts : 0
Loss Rate : 0.000000 %
Matched : 873111
Success Rate : 100 %
--------------------------------
Tx QPS : 152028
Throughput : 74.23 MB/s
================================
Fedora 43 QEMU guest with passt network (vhost-user)
=== BENCHMARK RESULTS ===
Threads : 16
Time (Total) : 10.18 s
Connections : 58533
Conn. Rate : 5747 conn/s
--------------------------------
Sent Pkts : 3211271
Recv Pkts : 3211271
Loss Pkts : 0
Loss Rate : 0.000000 %
Matched : 3211271
Success Rate : 100 %
--------------------------------
Tx QPS : 315317
Throughput : 153.96 MB/s
================================
Krun container with TSI network
=== BENCHMARK RESULTS ===
Threads : 16
Time (Total) : 20.03 s
Connections : 64
Conn. Rate : 3 conn/s
--------------------------------
Sent Pkts : 3276
Recv Pkts : 321
Loss Pkts : 2955
Loss Rate : 90.201465 %
Matched : 321
Success Rate : 100 %
--------------------------------
Tx QPS : 163
Throughput : 0.08 MB/s
================================
Note that the connection rate of 3 conn/s is very low for krun with TSI. I don't see high CPU usage during the krun test, which suggests there might be severe lock contention in krun's network stack when creating or destroying connections.
Also, the number of connections is an integer multiple of the number of threads (64 = 4 × 16 over about 20 s). I suspect something is forcing connection creation/destruction to be synchronous, so that new connections can only be created in one small window roughly every 5 s.
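The arithmetic behind that suspicion can be checked directly from the TSI numbers reported above. This is only a back-of-the-envelope sketch. The function name `stall_estimate` is made up for illustration, and it assumes every thread completed the same number of connections, which the benchmark output does not confirm.

```python
def stall_estimate(threads, connections, total_time_s):
    """Estimate rounds of connections per thread and the time per round.

    Assumes connections are spread evenly across threads.
    """
    rounds = connections // threads
    interval_s = total_time_s / rounds
    return rounds, interval_s


# Values taken from the "Krun container with TSI network" results
rounds, interval_s = stall_estimate(threads=16, connections=64, total_time_s=20.03)
print(f"{rounds} rounds per thread, about {interval_s:.1f} s per round")

# Cross-check the reported loss rate from the sent/lost packet counts
loss_rate = 2955 / 3276 * 100
print(f"loss rate: {loss_rate:.2f} %")
```

If the even-spread assumption holds, each thread got through only 4 connections in the 20 s run, which is what a periodic stall of roughly 5 s would produce.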