This documents an issue with gRPC on high-latency links where the call latency becomes a multiple of the round-trip network latency once the message size exceeds a certain value.
Related: grpc/grpc-go#5599, which was closed without any conclusion.
- Create two test VMs on GCP in regions `us-central1` and `asia-southeast1`:
  - Machine type: `c2-standard-4`
  - All other settings are defaults (Debian bookworm boot image, default network, ...)
- Use `gcloud compute ssh ... --tunnel-through-iap` to SSH into both machines.
- Get their internal network IPs:

  ```
  lacop@grpc-latency-us:~$ ip addr
  2: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc mq state UP group default qlen 1000
      inet 10.128.0.57/32 metric 100 scope global dynamic ens4

  #####

  lacop@grpc-latency-sg:~$ ip addr
  2: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc mq state UP group default qlen 1000
      inet 10.148.0.3/32 metric 100 scope global dynamic ens4
  ```

- Run ping to get baseline latency numbers:

  ```
  lacop@grpc-latency-us:~$ ping -q -c 100 -i 0.1 10.148.0.3
  PING 10.148.0.3 (10.148.0.3) 56(84) bytes of data.

  --- 10.148.0.3 ping statistics ---
  100 packets transmitted, 100 received, 0% packet loss, time 9963ms
  rtt min/avg/max/mdev = 205.520/205.555/206.562/0.101 ms, pipe 3

  #####

  lacop@grpc-latency-sg:~$ ping -q -c 100 -i 0.1 10.128.0.57
  PING 10.128.0.57 (10.128.0.57) 56(84) bytes of data.

  --- 10.128.0.57 ping statistics ---
  100 packets transmitted, 100 received, 0% packet loss, time 9963ms
  rtt min/avg/max/mdev = 205.523/205.548/205.624/0.017 ms, pipe 3
  ```

- Set up iperf (`sudo apt-get install iperf3`) and measure bandwidth:

  ```
  lacop@grpc-latency-us:~$ iperf3 -s # If you didn't start the daemon on install

  #####

  lacop@grpc-latency-sg:~$ iperf3 -c 10.128.0.57 -t 30 -i 5
  Connecting to host 10.128.0.57, port 5201
  [  5] local 10.148.0.3 port 50344 connected to 10.128.0.57 port 5201
  [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
  [  5]   0.00-5.00   sec  53.2 MBytes  89.3 Mbits/sec    0   6.04 MBytes
  [  5]   5.00-10.00  sec  70.0 MBytes   117 Mbits/sec    0   6.04 MBytes
  [  5]  10.00-15.00  sec  72.5 MBytes   122 Mbits/sec    0   6.04 MBytes
  [  5]  15.00-20.00  sec  70.0 MBytes   117 Mbits/sec    0   6.04 MBytes
  [  5]  20.00-25.00  sec  70.0 MBytes   117 Mbits/sec    0   6.04 MBytes
  [  5]  25.00-30.00  sec  73.8 MBytes   124 Mbits/sec    0   6.04 MBytes
  - - - - - - - - - - - - - - - - - - - - - - - - -
  [ ID] Interval           Transfer     Bitrate         Retr
  [  5]   0.00-30.00  sec   410 MBytes   115 Mbits/sec    0            sender
  [  5]   0.00-30.21  sec   409 MBytes   114 Mbits/sec                 receiver
  ```

  (The other direction yields similar results.)

- Install golang and uv on both VMs:

  ```
  wget https://go.dev/dl/go1.24.4.linux-amd64.tar.gz
  sudo rm -rf /usr/local/go && sudo tar -C /usr/local -xzf go1.24.4.linux-amd64.tar.gz && rm go1.24.4.linux-amd64.tar.gz
  export PATH=$PATH:/usr/local/go/bin
  sudo apt-get install pipx && pipx install uv
  ```

- Clone this repository on both VMs.
We will run a gRPC echo service with the following schema:

```proto
message EchoRequest {
  bytes payload = 1;
}

message EchoResponse {
  int64 payload_length = 1;
}

service EchoService {
  rpc Echo(EchoRequest) returns (EchoResponse);
}
```
Updating the schema

All the generated files are committed to the repo to make it simpler to run, so there is likely no need to run this. Included only for reference.

```
go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest

uv run --with=grpcio==1.73.0,grpcio-tools==1.73.0 \
  python3 -m grpc_tools.protoc \
  -I. \
  --python_out=python/ --pyi_out=python/ --grpc_python_out=python/ \
  --go_out=golang/echo/ --go_opt=paths=source_relative --go-grpc_out=golang/echo/ --go-grpc_opt=paths=source_relative \
  echo.proto
```
Start the server on one VM:
```
lacop@grpc-latency-us:~/grpc-repro/golang$ go run server/server.go
2025/07/08 09:44:29 server listening at [::]:50051
```
Run the client with a small payload size first; the expected latency is the same as the ping RTT (ignoring the initial connection setup):
```
lacop@grpc-latency-sg:~/grpc-repro/golang$ go run client/client.go --address 10.128.0.57 --size 16384 --repeats 5 --delay 1
Connecting to 10.128.0.57:50051...
Payload size: 16384 bytes, repeats: 5, delay: 1.000000 seconds
Sending payload of size 16384 bytes...
Response received: Payload length = 16384 bytes
Round trip time: 822.929856ms
Response received: Payload length = 16384 bytes
Round trip time: 410.677761ms
Response received: Payload length = 16384 bytes
Round trip time: 410.665896ms
Response received: Payload length = 16384 bytes
Round trip time: 410.685099ms
Response received: Payload length = 16384 bytes
Round trip time: 410.678903ms
All requests completed successfully.
```
Now try larger payload sizes. Latency will double at some point:
- With `--size 14000` still get 200ms consistently.
- With `--size 15000` get a mix of 200ms and 400ms.
- With `--size 16000` get 400ms consistently.
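For reference, that cutoff is close to what a single TCP first flight would predict. This is my own back-of-the-envelope arithmetic, not part of the original repro; it assumes the Linux default initial congestion window of 10 segments (RFC 6928) and the 1460-byte MTU reported by `ip addr`:

```go
package main

import "fmt"

// firstFlightBytes estimates how much data a cold (or idle-reset) TCP
// connection can send before it must wait one RTT for ACKs.
func firstFlightBytes(mtu, headerBytes, initCwndSegments int) int {
	return (mtu - headerBytes) * initCwndSegments
}

// flightsNeeded is a crude model: how many ACK-paced flights a payload
// needs when at most `window` bytes can be in flight at once (it
// ignores window growth between flights).
func flightsNeeded(payload, window int) int {
	return (payload + window - 1) / window
}

func main() {
	// MTU 1460, ~40 bytes of IPv4+TCP headers, 10-segment initial cwnd.
	window := firstFlightBytes(1460, 40, 10)
	fmt.Println(window)                       // 14200
	fmt.Println(flightsNeeded(14000, window)) // 1 flight  -> ~1 RTT
	fmt.Println(flightsNeeded(16000, window)) // 2 flights -> ~2 RTT
}
```

Under this model a 14000-byte request fits in one flight (~1 RTT) while a 16000-byte request needs two (~2 RTT), matching the thresholds observed above.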
I noticed that sending requests more often reduces the latency to RTT:
```
lacop@grpc-latency-sg:~/grpc-repro/golang$ go run client/client.go --address 10.128.0.57 --size 32000 --repeats 5 --delay 1
Connecting to 10.128.0.57:50051...
Payload size: 32000 bytes, repeats: 5, delay: 1.000000 seconds
Sending payload of size 32000 bytes...
Response received: Payload length = 32000 bytes
Round trip time: 822.89423ms
Response received: Payload length = 32000 bytes
Round trip time: 410.679966ms
Response received: Payload length = 32000 bytes
Round trip time: 411.580503ms
Response received: Payload length = 32000 bytes
Round trip time: 411.614844ms
Response received: Payload length = 32000 bytes
Round trip time: 411.134367ms
All requests completed successfully.

lacop@grpc-latency-sg:~/grpc-repro/golang$ go run client/client.go --address 10.128.0.57 --size 32000 --repeats 5 --delay 0.1
Connecting to 10.128.0.57:50051...
Payload size: 32000 bytes, repeats: 5, delay: 0.100000 seconds
Sending payload of size 32000 bytes...
Response received: Payload length = 32000 bytes
Round trip time: 822.8642ms
Response received: Payload length = 32000 bytes
Round trip time: 207.584479ms
Response received: Payload length = 32000 bytes
Round trip time: 207.65431ms
Response received: Payload length = 32000 bytes
Round trip time: 207.610368ms
Response received: Payload length = 32000 bytes
Round trip time: 207.738869ms
All requests completed successfully.
```
So does sending dummy requests in the background. This run sends a 32k payload every second, while a background goroutine sends a single-byte payload every 250ms on the same channel:
```
lacop@grpc-latency-sg:~/grpc-repro/golang$ go run client/client.go --address 10.128.0.57 --size 32000 --repeats 5 --delay 1 --background-sender-delay 0.25
Connecting to 10.128.0.57:50051...
Payload size: 32000 bytes, repeats: 5, delay: 1.000000 seconds
Background sender started with a delay of 0.250000 seconds...
Sending payload of size 32000 bytes...
Response received: Payload length = 32000 bytes
Round trip time: 409.942309ms
Response received: Payload length = 32000 bytes
Round trip time: 207.081078ms
Response received: Payload length = 32000 bytes
Round trip time: 207.058354ms
Response received: Payload length = 32000 bytes
Round trip time: 207.003944ms
Response received: Payload length = 32000 bytes
Round trip time: 207.266047ms
All requests completed successfully.
```
Similar timings can be achieved even when sending 10x larger payloads, as long as requests are frequent.
All of this suggests the problem is in gRPC / HTTP/2 flow control: the window size is reduced after some idle period, and each new request then has to wait 2×RTT or longer for the window to be resized and the response to arrive.
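As a quick sanity check on window sizing (my own arithmetic from the ping and iperf numbers above, not from the repro): the path's bandwidth-delay product is only about 3 MB, so 32MiB windows are far more than the link can use, and any stall would have to come from an effective window temporarily much smaller than the configured maximum.

```go
package main

import "fmt"

// bdpBytes returns the bandwidth-delay product of a path in bytes.
func bdpBytes(bitsPerSec, rttSec float64) float64 {
	return bitsPerSec / 8 * rttSec
}

func main() {
	// ~115 Mbit/s measured by iperf3, ~205.5 ms RTT measured by ping.
	fmt.Printf("BDP ≈ %.2f MB\n", bdpBytes(115e6, 0.2055)/1e6)
}
```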
I tried every permutation of the gRPC settings I could find.
On the server side:

```go
s := grpc.NewServer(
	// 32MiB static window sizes
	grpc.StaticStreamWindowSize(32*1024*1024),
	grpc.StaticConnWindowSize(32*1024*1024),
	// 32MiB initial window sizes
	grpc.InitialWindowSize(32*1024*1024),
	grpc.InitialConnWindowSize(32*1024*1024),
	// 32MiB max message and buffer sizes
	grpc.MaxRecvMsgSize(32*1024*1024),
	grpc.MaxSendMsgSize(32*1024*1024),
	grpc.ReadBufferSize(32*1024*1024),
	grpc.WriteBufferSize(32*1024*1024),
)
```

On the client side:
```go
conn, err := grpc.NewClient(fmt.Sprintf("%s:%d", *address, *port),
	grpc.WithTransportCredentials(insecure.NewCredentials()),
	// 32MiB static window sizes
	grpc.WithStaticConnWindowSize(32*1024*1024),
	grpc.WithStaticStreamWindowSize(32*1024*1024),
	// 32MiB initial window sizes
	grpc.WithInitialConnWindowSize(32*1024*1024),
	grpc.WithInitialWindowSize(32*1024*1024),
	// 32MiB max message and buffer sizes
	grpc.WithReadBufferSize(32*1024*1024),
	grpc.WithWriteBufferSize(32*1024*1024),
	grpc.WithDefaultCallOptions(
		grpc.MaxCallRecvMsgSize(32*1024*1024),
		grpc.MaxCallSendMsgSize(32*1024*1024),
	),
)
```

This requires a `replace google.golang.org/grpc => google.golang.org/grpc v1.75.0-dev` directive in go.mod to get grpc/grpc-go#8283 for the `StaticStreamWindowSize` and `StaticConnWindowSize` options.
None of these (or various combinations of them) had any effect.
Ran `sudo tcpdump -i ens4 -w session1.pcap` on the client side and captured the call `go run client/client.go --address 10.128.0.57 --size 16000 --repeats 5 --delay 1 --background-sender-delay 0`. Filtered down to the one relevant TCP stream; the result is stored in the `filtered.pcap` file.
10.148.0.3 is the client, 10.128.0.57 is the server.
The initial window messages look fine; they configure the window to 32MiB:

However, looking at a sample message:

We see the initial POST from the client to the server (packet #48), but the body is only sent at T+204ms (#54), after the ACK (#53) arrives.
Using the Python client (with the `grpcio` package) against the same golang server behaves the same way:
```
lacop@grpc-latency-sg:~/grpc-repro/python$ ./client.py 10.128.0.57:50051 12000 5 1 0
Connecting to 10.128.0.57:50051...
Payload size: 12000 bytes, repeats: 5, delay: 1.0 seconds, background delay: 0.0 seconds
Response received: Payload length = 12000 bytes
Round trip time: 631.498ms
Response received: Payload length = 12000 bytes
Round trip time: 205.205ms
Response received: Payload length = 12000 bytes
Round trip time: 205.154ms
Response received: Payload length = 12000 bytes
Round trip time: 205.291ms
Response received: Payload length = 12000 bytes
Round trip time: 205.247ms

lacop@grpc-latency-sg:~/grpc-repro/python$ ./client.py 10.128.0.57:50051 16000 5 1 0
Connecting to 10.128.0.57:50051...
Payload size: 16000 bytes, repeats: 5, delay: 1.0 seconds, background delay: 0.0 seconds
Response received: Payload length = 16000 bytes
Round trip time: 819.888ms
Response received: Payload length = 16000 bytes
Round trip time: 409.416ms
Response received: Payload length = 16000 bytes
Round trip time: 409.456ms
Response received: Payload length = 16000 bytes
Round trip time: 409.979ms
Response received: Payload length = 16000 bytes
Round trip time: 409.448ms

lacop@grpc-latency-sg:~/grpc-repro/python$ ./client.py 10.128.0.57:50051 16000 5 1 0.25
Connecting to 10.128.0.57:50051...
Payload size: 16000 bytes, repeats: 5, delay: 1.0 seconds, background delay: 0.25 seconds
Response received: Payload length = 16000 bytes
Round trip time: 831.116ms
Response received: Payload length = 16000 bytes
Round trip time: 206.303ms
Response received: Payload length = 16000 bytes
Round trip time: 268.517ms
Response received: Payload length = 16000 bytes
Round trip time: 206.253ms
Response received: Payload length = 16000 bytes
Round trip time: 206.335ms
```
The previous report in grpc/grpc-go#5599 went nowhere, mostly due to the lack of a self-contained reproduction.
I have tried to build a local reproduction in `local.go`, using https://pkg.go.dev/google.golang.org/grpc/benchmark/latency to inject latency into the local connection.
This does not behave in exactly the same way: much larger payload sizes are needed to increase the latency, and it is not consistently 2x beyond a certain size:
```
$ go run local/local.go --latency=0.1 --payload-size=1500000 --delay=2
2025/07/08 14:18:09 Simulating latency of 0.10 seconds
2025/07/08 14:18:09 server listening at 127.0.0.1:56747
2025/07/08 14:18:10 Connecting to server at 127.0.0.1:56747...
2025/07/08 14:18:10 [client] Sending payload of size 1500000 bytes...
2025/07/08 14:18:11 [server] Received: 1500000
2025/07/08 14:18:11 [client] Response received: Payload length = 1500000 bytes, RTT: 1.832299625s
2025/07/08 14:18:13 [client] Sending payload of size 1500000 bytes...
2025/07/08 14:18:14 [server] Received: 1500000
2025/07/08 14:18:14 [client] Response received: Payload length = 1500000 bytes, RTT: 617.209541ms
2025/07/08 14:18:16 [client] Sending payload of size 1500000 bytes...
2025/07/08 14:18:16 [server] Received: 1500000
2025/07/08 14:18:16 [client] Response received: Payload length = 1500000 bytes, RTT: 508.438ms
2025/07/08 14:18:18 [client] Sending payload of size 1500000 bytes...
2025/07/08 14:18:19 [server] Received: 1500000
2025/07/08 14:18:19 [client] Response received: Payload length = 1500000 bytes, RTT: 406.6395ms
2025/07/08 14:18:21 [client] Sending payload of size 1500000 bytes...
2025/07/08 14:18:21 [server] Received: 1500000
2025/07/08 14:18:21 [client] Response received: Payload length = 1500000 bytes, RTT: 303.73075ms
2025/07/08 14:18:23 Server stopped gracefully.
```

```
$ go run local/local.go --latency=0.1 --payload-size=2000000 --delay=2
2025/07/08 14:18:54 Simulating latency of 0.10 seconds
2025/07/08 14:18:54 server listening at 127.0.0.1:56751
2025/07/08 14:18:55 Connecting to server at 127.0.0.1:56751...
2025/07/08 14:18:55 [client] Sending payload of size 2000000 bytes...
2025/07/08 14:18:57 [server] Received: 2000000
2025/07/08 14:18:57 [client] Response received: Payload length = 2000000 bytes, RTT: 2.03785975s
2025/07/08 14:18:59 [client] Sending payload of size 2000000 bytes...
2025/07/08 14:19:00 [server] Received: 2000000
2025/07/08 14:19:00 [client] Response received: Payload length = 2000000 bytes, RTT: 812.116083ms
2025/07/08 14:19:02 [client] Sending payload of size 2000000 bytes...
2025/07/08 14:19:02 [server] Received: 2000000
2025/07/08 14:19:02 [client] Response received: Payload length = 2000000 bytes, RTT: 412.359167ms
2025/07/08 14:19:04 [client] Sending payload of size 2000000 bytes...
2025/07/08 14:19:05 [server] Received: 2000000
2025/07/08 14:19:05 [client] Response received: Payload length = 2000000 bytes, RTT: 411.370292ms
2025/07/08 14:19:07 [client] Sending payload of size 2000000 bytes...
2025/07/08 14:19:07 [server] Received: 2000000
2025/07/08 14:19:07 [client] Response received: Payload length = 2000000 bytes, RTT: 413.100041ms
2025/07/08 14:19:09 Server stopped gracefully.
```
Sending the dummy background messages does not seem to help here; the latency slowly improves without them too, even with a longer delay.
It is not clear to me what is different about this setup compared to the real network latency between two VMs. But at least it reproduces the unexpected result that with 200ms RTT (100ms each way), the gRPC latency is 2x that.