Preconditioning #137997
Conversation
…ctors to become indistinguishable
```diff
-        final FloatVectorValues floatVectorValues = getFloatVectorValues(fieldWriter.fieldInfo, fieldWriter.delegate, maxDoc);
+        FloatVectorValues floatVectorValues = getFloatVectorValues(fieldWriter.fieldInfo, fieldWriter.delegate, maxDoc);
+        // precondition the vectors if necessary
+        floatVectorValues = preconditionVectors(floatVectorValues);
```
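For reference, a minimal sketch of the kind of elementwise transform a permutation-based preconditioner could apply, written against plain arrays rather than Lucene's `FloatVectorValues` (the method below is hypothetical, not the PR's actual implementation):

```java
// Hypothetical sketch: apply a fixed permutation matrix P to each vector,
// with P represented as an index array so that applying it just reorders
// components. The same permutation must be applied to query vectors at
// search time, since component reordering preserves dot products and L2
// distances only when both sides are permuted identically.
static float[][] preconditionVectors(float[][] vectors, int[] permutation) {
  float[][] out = new float[vectors.length][];
  for (int i = 0; i < vectors.length; i++) {
    float[] v = vectors[i];
    float[] p = new float[permutation.length];
    for (int d = 0; d < permutation.length; d++) {
      p[d] = v[permutation[d]];
    }
    out[i] = p;
  }
  return out;
}
```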
I wonder if we can precondition just immediately before quantization and precondition the centroids after clustering. Maybe applying it before clustering works OK... I need to think about that.
Tried both; preconditioning before creating the centroids seems to work fine, though it's still worth thinking about. I will post numbers for review.
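One way to see why the ordering is flexible, assuming the preconditioner is a pure permutation (all names below are illustrative, not PR code): a permutation commutes with taking means, so centroids computed in the permuted space are exactly the permuted centroids, and since permutations preserve Euclidean distance, cluster assignments are unchanged. A small self-contained check:

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative check: permuting before clustering is equivalent to
// permuting the centroids afterwards, because a permutation commutes
// with the mean and preserves Euclidean distances.
public class PermutationCommutes {
  public static void main(String[] args) {
    Random rnd = new Random(42);
    float[][] vectors = new float[100][8];
    for (float[] v : vectors) {
      for (int d = 0; d < v.length; d++) v[d] = rnd.nextFloat();
    }
    int[] perm = {3, 1, 7, 0, 5, 2, 6, 4};
    // mean of permuted vectors == permuted mean of original vectors
    float[] a = mean(permuteAll(vectors, perm));
    float[] b = permute(mean(vectors), perm);
    System.out.println(Arrays.equals(a, b)); // prints: true
  }

  static float[] permute(float[] v, int[] perm) {
    float[] out = new float[perm.length];
    for (int d = 0; d < perm.length; d++) out[d] = v[perm[d]];
    return out;
  }

  static float[][] permuteAll(float[][] vs, int[] perm) {
    float[][] out = new float[vs.length][];
    for (int i = 0; i < vs.length; i++) out[i] = permute(vs[i], perm);
    return out;
  }

  static float[] mean(float[][] vs) {
    float[] m = new float[vs[0].length];
    for (float[] v : vs) {
      for (int d = 0; d < m.length; d++) m[d] += v[d];
    }
    for (int d = 0; d < m.length; d++) m[d] /= vs.length;
    return m;
  }
}
```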
Ran on a variety of datasets so we could see the impact of preconditioning on QPS, both in scenarios where we expect improvements and in scenarios where we don't. My quick assessment from these runs is that we generally get the best result from the variance-based permutation matrix: for equivalent recall it has reasonable QPS, and when attempting to recover recall there are some datasets where preconditioning is absolutely necessary. It would seem this is valuable even at a segment level, imo. It makes sense to follow up by investigating whether sampling the average variances across dimensions could serve as a mechanism to decide automatically when to use preconditioning (a sketch of one possible construction follows the list). Below are all the discussed datasets except for mnist:

- dbpedia
- cohere
- glove 100d
- hotpotqa
- multi_tower
- fashion mnist
- gist
- sift
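The PR's actual variance-based construction isn't shown in this thread, but one plausible version, sketched here purely for illustration, estimates per-dimension variance from a sample of indexed vectors and orders dimensions by it; the same statistics could feed a threshold that decides when preconditioning is worth applying at all:

```java
import java.util.Arrays;
import java.util.Comparator;

// Assumed construction, for illustration only: build a permutation that
// orders dimensions by descending variance, estimated from a sample of
// the indexed vectors. The mean of `variance` across dimensions is also
// the kind of statistic that could gate preconditioning automatically.
static int[] varianceBasedPermutation(float[][] sample) {
  int dim = sample[0].length;
  double[] mean = new double[dim];
  double[] variance = new double[dim];
  for (float[] v : sample) {
    for (int d = 0; d < dim; d++) mean[d] += v[d];
  }
  for (int d = 0; d < dim; d++) mean[d] /= sample.length;
  for (float[] v : sample) {
    for (int d = 0; d < dim; d++) {
      double diff = v[d] - mean[d];
      variance[d] += diff * diff / sample.length;
    }
  }
  // argsort dimensions by descending variance
  Integer[] order = new Integer[dim];
  for (int d = 0; d < dim; d++) order[d] = d;
  Arrays.sort(order, Comparator.comparingDouble(d -> -variance[d]));
  int[] perm = new int[dim];
  for (int d = 0; d < dim; d++) perm[d] = order[d];
  return perm;
}
```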
I realize that this is likely best, in that it's already applying the permutation per segment. But the huge benefit of the random permutation is that it can be applied once, globally, to the query vector; conceivably a random permutation will also be much cheaper at index and merge time, because the permutation matrix doesn't need to evolve with the vector distributions and is instead static.
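To make that tradeoff concrete, a hedged sketch of the static alternative: a permutation derived from a fixed seed depends on no data statistics, so the identical transform can be rebuilt from `(dim, seed)` at index, merge, and query time with no per-segment state (again illustrative, not the PR's code):

```java
import java.util.Random;

// Illustrative only: a data-independent permutation derived from a fixed
// seed. Any component that knows (dim, seed) reconstructs the exact same
// permutation, so nothing needs to be stored or evolved per segment.
static int[] randomPermutation(int dim, long seed) {
  int[] perm = new int[dim];
  for (int d = 0; d < dim; d++) perm[d] = d;
  Random rnd = new Random(seed);
  // Fisher-Yates shuffle for an unbiased random permutation
  for (int d = dim - 1; d > 0; d--) {
    int j = rnd.nextInt(d + 1);
    int tmp = perm[d];
    perm[d] = perm[j];
    perm[j] = tmp;
  }
  return perm;
}
```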
Initial pass at preconditioning utils, wired through for the disk bbq next format so we can iterate and see its impact on quantization.