Support float16 quantization and other quantization performance improvements #41
Merged
1yefuwang1 merged 5 commits into main on Feb 10, 2026
…unroll

Add benchmark suites for QuantizeF32ToF16, QuantizeF32ToBF16, F16ToF32, and BF16ToF32. Unroll the HalfFloatToF32 main loop to process 2*NF elements per iteration, improving instruction-level parallelism and yielding a 1.6-2x speedup on F16/BF16 to F32 dequantization.

Co-Authored-By: Claude <noreply@anthropic.com>
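To illustrate the unrolling idea in the commit above, here is a minimal scalar sketch, not the PR's Highway code. It assumes BF16 values are stored as raw `uint16_t` bit patterns; the names `BF16BitsToF32` and `BF16ToF32Unrolled2` are hypothetical. The main loop handles two elements per iteration as independent conversions, the scalar analogue of processing 2*NF lanes per iteration, and a tail loop covers any leftover element.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Widen one bfloat16 bit pattern to float32. BF16 is the upper 16 bits of an
// IEEE-754 binary32 value, so the conversion is a 16-bit left shift.
inline float BF16BitsToF32(std::uint16_t bits) {
  const std::uint32_t wide = static_cast<std::uint32_t>(bits) << 16;
  float out;
  std::memcpy(&out, &wide, sizeof(out));
  return out;
}

// Hypothetical scalar stand-in for a 2x-unrolled dequantization loop.
void BF16ToF32Unrolled2(const std::uint16_t* in, float* out, std::size_t n) {
  std::size_t i = 0;
  // Main loop: two conversions with no dependency between them, so the CPU
  // can overlap their execution.
  for (; i + 2 <= n; i += 2) {
    const float a = BF16BitsToF32(in[i]);
    const float b = BF16BitsToF32(in[i + 1]);
    out[i] = a;
    out[i + 1] = b;
  }
  // Tail: at most one remaining element.
  for (; i < n; ++i) {
    out[i] = BF16BitsToF32(in[i]);
  }
}
```

Whether unrolling helps depends on the target and compiler, which is why the commit pairs the change with benchmark suites.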
Replace hn::Transform with a manually 4x-unrolled multiply loop in F32 NormalizeImpl for ~1.3x speedup at dim >= 512. Fix BF16 InnerProduct benchmark that was incorrectly measuring F32 overload. Add ClobberMemory to normalize benchmarks.

Co-Authored-By: Claude <noreply@anthropic.com>
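The shape of a 4x-unrolled multiply loop can be sketched in plain scalar C++. This is an illustration under assumptions, not the PR's `NormalizeImpl`: the function name `NormalizeUnrolled4` is hypothetical, the norm is accumulated with a simple scalar loop, and leaving an all-zero vector unchanged is a choice made here rather than something the commit states.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical scalar sketch of L2 normalization with a 4x-unrolled scaling
// loop. The real code operates on SIMD vectors; each statement here stands in
// for one vector multiply.
void NormalizeUnrolled4(float* v, std::size_t n) {
  float sum_sq = 0.0f;
  for (std::size_t i = 0; i < n; ++i) {
    sum_sq += v[i] * v[i];
  }
  if (sum_sq == 0.0f) {
    return;  // Assumption: a zero vector is left as-is.
  }
  const float inv_norm = 1.0f / std::sqrt(sum_sq);

  std::size_t i = 0;
  // Main loop: four independent multiplies per iteration.
  for (; i + 4 <= n; i += 4) {
    v[i] *= inv_norm;
    v[i + 1] *= inv_norm;
    v[i + 2] *= inv_norm;
    v[i + 3] *= inv_norm;
  }
  // Tail: up to three remaining elements.
  for (; i < n; ++i) {
    v[i] *= inv_norm;
  }
}
```

With five elements, the first four go through the unrolled body and the last one through the tail loop.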
* Initial plan
* Add float16 quantization support: SIMD ops, distance spaces, vector types, quantization, virtual table support, tests, and benchmarks
  Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>
* Fix test bug: use correct loop variable j instead of i in Normalize_F32ToF16 test
  Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>
* Final: float16 quantization support complete
  Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>
* Remove CodeQL artifact from tracking and add to gitignore
  Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>
* Remove CodeQL artifact symlink from repository
  Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>
* Guard F16-specific SIMD ops with #if !HWY_HAVE_FLOAT16 to use native ops when available
  Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>
* Guard F16-specific SIMD ops with #if !HWY_HAVE_FLOAT16 to use native ops when available
  Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>
* Remove CodeQL artifact from tracking
  Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>
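For readers unfamiliar with what float16 quantization involves per element, here is a scalar sketch of a float32 to float16 conversion with round-to-nearest-even. It is not the PR's code, which uses SIMD ops; the function name `F32ToF16Bits` is hypothetical, the result is returned as a raw `uint16_t` bit pattern, and the subnormal branch assumes the default floating-point rounding mode.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical scalar float32 -> float16 conversion (round-to-nearest-even).
std::uint16_t F32ToF16Bits(float value) {
  std::uint32_t x;
  std::memcpy(&x, &value, sizeof(x));
  const std::uint32_t sign = (x >> 16) & 0x8000u;
  x &= 0x7FFFFFFFu;  // Work on the magnitude only.

  std::uint32_t mag;
  if (x >= 0x47800000u) {
    // |value| >= 2^16, infinity, or NaN: map to infinity or a quiet NaN.
    mag = (x > 0x7F800000u) ? 0x7E00u : 0x7C00u;
  } else if (x < 0x38800000u) {
    // |value| < 2^-14: result is a float16 subnormal or zero. Adding 0.5f
    // aligns the mantissa so the FPU performs the rounding for us.
    float f;
    std::memcpy(&f, &x, sizeof(f));
    f += 0.5f;
    std::uint32_t r;
    std::memcpy(&r, &f, sizeof(r));
    mag = r - 0x3F000000u;  // Subtract the bit pattern of 0.5f.
  } else {
    // Normal range: rebias the exponent (127 -> 15) and round to nearest,
    // ties to even, before dropping the low 13 mantissa bits.
    const std::uint32_t odd = (x >> 13) & 1u;
    x -= 112u << 23;
    x += 0xFFFu + odd;
    mag = x >> 13;
  }
  return static_cast<std::uint16_t>(sign | mag);
}
```

Values just below 2^16 that round up overflow naturally into the infinity pattern through the carry in the normal branch.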
Owner, Author

@copilot inspect the CI failure and try to fix it
Contributor

@1yefuwang1 I've opened a new pull request, #42, to work on those changes. Once the pull request is ready, I'll request review from you.
* Initial plan
* Add Float16 vector type handling in query executor
  Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>
No description provided.