This repository was archived by the owner on May 29, 2025. It is now read-only.
Replies: 1 comment
I've never tested on Xeons with a good amount of fast memory, but in theory they should work well: this project has AVX2 kernels written to support CPU inference and uses OpenMP for multi-threading. Some tuning may be required if the target hardware has a lot of memory bandwidth available, though, as CPU performance was only tested on desktop systems with moderate memory bandwidth (e.g. ~50 GB/s on a Zen 4 platform).
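As a rough back-of-the-envelope sketch of why memory bandwidth matters here (my own illustration, not from the project): single-stream CPU decode is usually memory-bandwidth bound, since each generated token streams roughly the whole weight set from RAM once. An upper bound on tokens/sec is therefore bandwidth divided by model size. The Xeon bandwidth and model size below are assumed figures for illustration; only the ~50 GB/s desktop number comes from the reply above.

```python
def est_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    # Memory-bound estimate: each decoded token reads the full weight
    # set from RAM once, so throughput <= bandwidth / model size.
    return bandwidth_gb_s / model_size_gb

# Desktop figure mentioned above: ~50 GB/s (Zen 4), with an assumed
# 10 GB quantized model.
desktop = est_tokens_per_sec(50, 10)
# A multi-channel Xeon might offer ~250 GB/s (assumed figure).
xeon = est_tokens_per_sec(250, 10)
print(f"desktop: {desktop:.1f} tok/s, xeon: {xeon:.1f} tok/s (upper bounds)")
```

By this estimate a high-bandwidth Xeon would have a proportionally higher decode ceiling, which is why kernel and thread-count tuning done on a ~50 GB/s desktop may leave bandwidth on the table there.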
Xeon seems like a good platform for hosting large models