llm software analytics tools? #1105
magikRUKKOLA started this conversation in Ideas
I am trying to estimate how much faster inference could run on 16 x RTX GPUs with 24 GB of VRAM each versus ... any comparable combination of CPUs from AMD or Intel. Instead of building and testing such a rig myself, I would rather start from the technical data (performance, capabilities, the possibility of tighter optimisations at some point (?)). Then I would have to factor in PCIe P2P GPU-to-GPU traffic (or, alternatively, could some additional NVLink be used?), scheduler overhead, etc. ... ? In the end it should be possible to give a fairly precise estimate of real-world performance.
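For the back-of-the-envelope part, here is a rough sketch of the kind of estimate I mean, assuming single-token decode is memory-bandwidth bound (typical at batch size 1). Every constant in it (model size, 4090-class bandwidth, PCIe throughput, layer count, activation size) is a placeholder assumption to be replaced with real measurements:

```python
# Rough decode-speed estimate, assuming single-token generation is
# memory-bandwidth bound. Every constant here is an assumption.

model_bytes = 140e9        # hypothetical ~140 GB quantized model
gpu_bw      = 1.0e12       # ~1 TB/s per 4090-class card (spec sheet)
n_gpus      = 16

# Ideal tensor parallelism: each GPU streams 1/16 of the weights per token.
t_read = (model_bytes / n_gpus) / gpu_bw

# Tensor parallelism also exchanges activations every layer over PCIe.
# P2P bandwidth and latency below are placeholders for measured values.
n_layers  = 90             # assumed layer count
act_bytes = 2 * 8192       # hidden_size * 2 bytes (fp16), assumed
pcie_bw   = 25e9           # ~PCIe 4.0 x16 effective
pcie_lat  = 5e-6           # per-transfer latency, assumed
t_comm = n_layers * (act_bytes / pcie_bw + pcie_lat)

print(f"weights-only ceiling: {1 / t_read:7.1f} tok/s")
print(f"with comm overhead:   {1 / (t_read + t_comm):7.1f} tok/s")
```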
But at the same time, does such a tensor-parallel GPU setup even make sense? Maybe it is better to upgrade to the latest 12-channel DDR5 platform and overclock the memory? Hm ...
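As a first-order comparison, decode throughput scales with aggregate memory bandwidth, so the two candidate setups can at least be ranked by that alone (DDR5-6400 and 4090-class cards assumed):

```python
# First-order comparison: decode throughput scales with aggregate
# memory bandwidth, so rank the two candidate setups by that alone.
gpu_agg = 16 * 1.0e12        # 16 x 4090-class cards, ~1 TB/s each
cpu_agg = 12 * 8 * 6.4e9     # 12-channel DDR5-6400: channels * 8 B * MT/s
print(f"GPU aggregate: {gpu_agg / 1e12:5.1f} TB/s")
print(f"CPU aggregate: {cpu_agg / 1e9:5.0f} GB/s (~{gpu_agg / cpu_agg:.0f}x less)")
```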
But my actual question is about a parser for LLM analytics on the nature of the quants inside a model: their types, their sizes, the distribution of those sizes, and so on. Some GPUs support FP8, for example -- whether that FP8 will be (or even can be) applied to LLM inference is unclear, but my point is that it is better to plan ahead.
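A minimal sketch of such a parser, assuming GGUF model files and the gguf-py package that ships with llama.cpp (pip install gguf):

```python
# Minimal quant-analytics parser for GGUF files. It tallies tensor
# count, total bytes, byte share, and average bits-per-weight for
# every quantization type found in the file.
import sys
from collections import Counter, defaultdict

from gguf import GGUFReader  # gguf-py reader for .gguf model files

def quant_report(path: str) -> None:
    reader = GGUFReader(path)
    counts = Counter()
    nbytes = defaultdict(int)
    nelems = defaultdict(int)
    for t in reader.tensors:
        qtype = t.tensor_type.name          # e.g. Q4_K, Q6_K, F16
        counts[qtype] += 1
        nbytes[qtype] += int(t.n_bytes)
        nelems[qtype] += int(t.n_elements)
    total = sum(nbytes.values())
    print(f"{'type':8s} {'tensors':>7s} {'GB':>8s} {'share':>6s} {'bpw':>5s}")
    for qtype, n in counts.most_common():
        share = 100.0 * nbytes[qtype] / total
        bpw = 8.0 * nbytes[qtype] / nelems[qtype]
        print(f"{qtype:8s} {n:7d} {nbytes[qtype] / 1e9:8.2f} "
              f"{share:5.1f}% {bpw:5.2f}")

if __name__ == "__main__":
    quant_report(sys.argv[1])
```

The same per-tensor walk could be extended to flag which quant types a given GPU can handle natively (FP8 tensor cores on Ada/Hopper, for instance) before committing to hardware.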
So, any ideas?