High TTFT and Throughput Drop in gRPC Mode Under High Concurrency (vs HTTP Mode) #17001
Unanswered
yansiyu550 asked this question in Q&A
Under the same benchmarking conditions, I tested the Router in both HTTP and gRPC modes using our benchmark tool.
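What I observed is that, under high concurrency, gRPC mode shows a higher TTFT (time to first token) and lower throughput than HTTP mode.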
My question is:
Is this behavior expected?
Could it be because the current gRPC implementation has not been fully optimized for high-concurrency scenarios (e.g., connection management, flow control, or thread scheduling)?
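For clarity on what is being compared, here is a minimal sketch of how TTFT and throughput can be derived from per-request timestamps. The `RequestTrace` fields and the `summarize` helper are hypothetical names used only for illustration; they are not taken from the benchmark tool or the Router.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class RequestTrace:
    """Timestamps (in seconds) recorded by the client for one request.

    Hypothetical structure; a real benchmark tool may record these differently.
    """
    sent_at: float          # when the request was sent
    first_token_at: float   # when the first streamed token arrived
    finished_at: float      # when the last token arrived
    output_tokens: int      # number of generated tokens


def ttft_seconds(trace: RequestTrace) -> float:
    # TTFT = time from sending the request to receiving the first token.
    return trace.first_token_at - trace.sent_at


def summarize(traces: List[RequestTrace]) -> Dict[str, float]:
    # Throughput is measured over the whole run: total output tokens
    # divided by the wall-clock span from first send to last finish.
    ttfts = [ttft_seconds(t) for t in traces]
    wall_clock = max(t.finished_at for t in traces) - min(t.sent_at for t in traces)
    total_tokens = sum(t.output_tokens for t in traces)
    return {
        "mean_ttft_s": mean(ttfts),
        "max_ttft_s": max(ttfts),
        "output_tokens_per_s": total_tokens / wall_clock,
    }


if __name__ == "__main__":
    # Two made-up concurrent requests, purely to show the arithmetic.
    demo = [
        RequestTrace(sent_at=0.0, first_token_at=0.5, finished_at=2.0, output_tokens=100),
        RequestTrace(sent_at=0.0, first_token_at=1.5, finished_at=4.0, output_tokens=100),
    ]
    print(summarize(demo))
```

Running the same summary over an HTTP run and a gRPC run at the same concurrency level would give directly comparable numbers.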
I’m using four 1P1D (1 Prefill + 1 Decode) pairs. The router command is as follows: