RPC results between main llamacpp and ik lcpp (GLM 4.6 full GPU, soon DeepSeek V3/R1 or Kimi K2) #1043
Replies: 4 comments · 5 replies
- Why not @Thireus quants? Why not llama-sweep-bench?
- @Panchovix For the OOM issues in …
- Thank you!
- Well, I am using the …
Hello guys, hope you're having a good day.
Just some small tests with RPC as of today, December 6th, 2025, in case it helps as a reference.
My setup is:
Server PC:
Client PC:
For the first test, I used GLM 4.6
Without RPC, using all GPUs on the server but the 3090s:
In lcpp: 1105.13 t/s PP, 27.80 t/s TG
In iklcpp (24576 ctx): 1176.91 t/s PP, 26.12 t/s TG
I had to reduce the ctx on iklcpp as I was getting OOM.
With RPC, replacing one 5090 with the RPC device:
In lcpp: 782.66 t/s PP, 23.88 t/s TG
In iklcpp (24576 ctx): 825.39 t/s PP, 22.5 t/s TG (when specifying devices, use RPC0[192.168.50.2:50052] instead of plain RPC0)
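For reference, this is roughly what the RPC setup looks like with mainline llama.cpp. It is only a sketch: the endpoint matches the RPC0[192.168.50.2:50052] device above, but the build options, model path, and the -ngl/-c values are assumptions, not the exact commands behind these numbers.

```bash
# Client PC (the machine whose GPU is shared over the network):
# build with the RPC backend (plus CUDA for the GPU backend) and start
# rpc-server on the port used above.
cmake -B build -DGGML_CUDA=ON -DGGML_RPC=ON
cmake --build build --config Release
./build/bin/rpc-server --host 0.0.0.0 --port 50052

# Server PC (the machine running inference): point llama-server at the
# remote GPU with --rpc; it then shows up as a device named like
# RPC0[192.168.50.2:50052]. Model path, -ngl and -c are placeholders.
./build/bin/llama-server \
  -m /path/to/GLM-4.6.gguf \
  --rpc 192.168.50.2:50052 \
  -ngl 99 -c 24576
```

ik_llama.cpp carries the same RPC backend (as the results above show), though its build option and exact flag spellings may differ slightly from mainline.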
For DeepSeek V3 0324:
RPC without a 5090:
lcpp: 211.25 t/s PP, 10.73 t/s TG
iklcpp: 217.68 t/s PP, 10.63 t/s TG
RPC with 8 GPUs:
lcpp: 216.95 t/s PP, 11.43 t/s TG
iklcpp: 234.02 t/s PP, 11.52 t/s TG
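On how PP/TG figures like these are usually measured: below is a quick llama-bench sketch over the same RPC endpoint (one of the replies also asks about ik_llama.cpp's llama-sweep-bench). The model path and the prompt/generation lengths are placeholders; this is not necessarily how the numbers above were produced.

```bash
# Measure prompt processing (pp) and token generation (tg) throughput,
# with one device reached over RPC. Model path is a placeholder.
./build/bin/llama-bench \
  -m /path/to/DeepSeek-V3-0324-Q3_K_XL.gguf \
  --rpc 192.168.50.2:50052 \
  -ngl 99 -p 512 -n 128
```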
Hope it helps!
EDIT 19/12/25: Added info for DeepSeek V3 0324 Q3_K_XL with RPC.