search on the CPU was slow #4399
Replies: 2 comments
-
If your vector index is 40 GB and your GPU has only 20 GB of VRAM, you won’t be able to load the entire index onto the GPU directly. Quantization compresses the index significantly, often to a size that fits in GPU memory. You’ll typically get a big speed boost in exchange for a small drop in accuracy. There are several quantizer types, each selected through its index factory string, so you have to find the right tradeoff between accuracy and speed for your data.
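To get a feel for how much quantization can shrink the index, here is a rough back-of-the-envelope sketch. It assumes the 40 GB is mostly raw float32 vectors and picks a dimension of 768 purely for illustration (the thread does not state the dimension). It also ignores overhead such as IVF lists, vector ids, and codebooks, so treat the numbers as lower bounds. The helper names `bytes_per_vector` and `estimate_gb` are made up for this example and are not part of Faiss.

```python
# Approximate stored size per vector for a few Faiss index factory strings.
# Assumptions (not from the thread): float32 input, d = 768, no index overhead.

def bytes_per_vector(factory: str, d: int) -> int:
    """Approximate code size in bytes for one stored vector."""
    if factory == "Flat":
        return 4 * d                      # raw float32, no compression
    if factory == "SQfp16":
        return 2 * d                      # scalar quantizer, 16-bit floats
    if factory == "SQ8":
        return d                          # scalar quantizer, 8 bits per dimension
    if factory.startswith("PQ") and factory[2:].isdigit():
        m = int(factory[2:])              # PQ<M>: M sub-quantizers, 8 bits each
        if d % m != 0:
            raise ValueError("d must be divisible by the number of sub-quantizers")
        return m
    raise ValueError(f"unsupported factory string in this sketch: {factory}")


def estimate_gb(raw_gb: float, factory: str, d: int) -> float:
    """Scale a raw float32 index size by the compression ratio of `factory`."""
    return raw_gb * bytes_per_vector(factory, d) / (4 * d)


if __name__ == "__main__":
    for name in ("Flat", "SQfp16", "SQ8", "PQ64"):
        print(f"{name:8s} ~{estimate_gb(40, name, 768):.2f} GB")
```

Under these assumptions, `SQfp16` lands at about 20 GB, which leaves no headroom on a 20 GB card, while `SQ8` comes to about 10 GB and `PQ64` to under 1 GB. The more aggressive the compression, the larger the accuracy drop tends to be, so it is worth measuring recall on your own data before settling on one.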
-
Feel free to provide more info about which Faiss index you are using, so folks can comment with recommendations. A repro script showing how you call Faiss would also help.
-
My vector knowledge base was initially created with a size of 40 GB, but search on the CPU was slow. I wanted to move it to the GPU for search, but my GPU only has 20 GB of video memory. Is there a good solution to this problem? Or is there a better way to improve search speed without moving it to the GPU?