[ENH] Add SIMD for maxscore by Sicheng-Pan · Pull Request #6865 · chroma-core/chroma

Sicheng-Pan · 2026-04-10T01:43:55Z

Description of changes

This is PR #4 of the BlockMaxMaxScore series, stacked on hammad/maxscore_lazy_cursor. It adds SIMD acceleration for the two hottest scalar paths in the query engine.

New functionality
- SIMD f16→f32 bulk conversion (sparse_posting_block.rs): Replaces the scalar convert_f16_to_f32 with a platform-dispatched implementation:
  - aarch64 NEON: Bit-manipulation approach (shift+mask+bias) processing 8 values per iteration via vld1q_u16 / vmovl_u16 / vreinterpretq_f32_u32. Handles all normal f16 values; subnormals map to tiny positive values (acceptable for SPLADE/BM25 weights).
  - x86_64 F16C: _mm256_cvtph_ps intrinsic, 8 values per iteration. Runtime-detected via is_x86_feature_detected!("f16c").
  - Scalar fallback: Unchanged half crate conversion for unsupported platforms.
  - Used by decompress_values_into() — the bulk value decompression path for Eager cursors and ensure_forward_block.
- SIMD filter_competitive (maxscore.rs): Replaces the scalar budget-pruning compaction with SIMD-accelerated 4-wide comparison:
  - aarch64 NEON: vcgtq_f32 comparison, per-lane mask extraction, branchless scatter of survivors.
  - x86_64 SSE2: _mm_cmpgt_ps + _mm_movemask_ps, bit-scan scatter. Runtime-detected.
  - Scalar fallback: Unchanged loop for unsupported platforms.
  - Both SIMD paths handle remainder elements (lengths that are not a multiple of 4) with a scalar tail.
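
As a rough illustration of the shift+mask+bias idea behind the NEON path, here is a scalar, one-lane model of the bit manipulation. The function name `f16_bits_to_f32_fast` is hypothetical and this is a sketch, not the PR's code. It assumes the input is a raw IEEE 754 binary16 bit pattern. In this simplified model any input with an exponent field of zero (subnormals, and also `0x0000` itself) lands near 2^-15 instead of its exact value, and Inf/NaN are not treated specially; whether the PR special-cases zero is not stated in the description.

```rust
/// Scalar, one-lane model of a shift+mask+bias f16 -> f32 conversion.
/// Hypothetical helper for illustration only; not taken from the PR.
pub fn f16_bits_to_f32_fast(h: u16) -> f32 {
    let h = h as u32;
    // Move the sign from bit 15 (f16) to bit 31 (f32).
    let sign = (h & 0x8000) << 16;
    // Shift the 5-bit exponent and 10-bit mantissa into their f32 positions.
    let exp_mant = (h & 0x7FFF) << 13;
    // Rebias the exponent: the f16 bias is 15, the f32 bias is 127.
    let rebias: u32 = (127 - 15) << 23;
    // Exact for normal f16 values; exponent-zero inputs become tiny positive
    // (or negative, if the sign bit is set) values near 2^-15.
    f32::from_bits(sign | (exp_mant + rebias))
}
```

For example, `f16_bits_to_f32_fast(0x3C00)` yields `1.0` and `f16_bits_to_f32_fast(0xC000)` yields `-2.0`. A SIMD version would apply the same three integer operations to several lanes at once after widening the `u16` inputs to `u32`.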

Test plan

- convert_f16_simd_matches_scalar: Verifies SIMD f16→f32 output matches scalar at 14 different sizes (1, 3, 7, 8, 9, 15, 16, 17, 31, 63, 64, 100, 256, 1000), including remainder paths.
- filter_competitive_simd_matches_scalar: Verifies SIMD filter output matches scalar at 11 sizes, including remainder paths.
- filter_competitive_all_pass / filter_competitive_none_pass: Edge cases for budget pruning.
- All existing roundtrip and recall tests exercise the SIMD paths transparently (dispatch is automatic).
- Tests pass locally with cargo test.
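
The size sweep in the first test could look roughly like the sketch below. It is an assumption-laden illustration, not the PR's test: the names `f16_to_f32_scalar`, `convert_f16c`, `convert_f16_to_f32` (with this slice-based signature), and `dispatch_matches_scalar` are hypothetical, the scalar reference is hand-written to keep the example std-only (the PR uses the half crate), and the input generator is invented. Only finite inputs are compared so that NaN payload handling does not matter.

```rust
/// Exact scalar f16 -> f32 reference (hand-written stand-in for the `half` crate).
fn f16_to_f32_scalar(h: u16) -> f32 {
    let h = h as u32;
    let sign = (h & 0x8000) << 16;
    let (exp, man) = ((h >> 10) & 0x1F, h & 0x3FF);
    let bits = match exp {
        // Zero and subnormals: man * 2^-24 is exactly representable in f32.
        0 => sign | (man as f32 * (1.0 / 16_777_216.0)).to_bits(),
        // Infinity and NaN keep an all-ones exponent.
        31 => sign | 0x7F80_0000 | (man << 13),
        // Normal values: rebias the exponent from 15 to 127.
        e => sign | ((e + 112) << 23) | (man << 13),
    };
    f32::from_bits(bits)
}

/// F16C path: 8 values per iteration, scalar tail for the remainder.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx,f16c")]
unsafe fn convert_f16c(src: &[u16], dst: &mut [f32]) {
    use std::arch::x86_64::{__m128i, _mm256_cvtph_ps, _mm256_storeu_ps, _mm_loadu_si128};
    let mut i = 0;
    while i + 8 <= src.len() {
        // SAFETY: i + 8 <= len, and the caller guarantees equal slice lengths,
        // so the 16-byte load and the 32-byte store both stay in bounds.
        unsafe {
            let halves = _mm_loadu_si128(src.as_ptr().add(i) as *const __m128i);
            _mm256_storeu_ps(dst.as_mut_ptr().add(i), _mm256_cvtph_ps(halves));
        }
        i += 8;
    }
    for j in i..src.len() {
        dst[j] = f16_to_f32_scalar(src[j]);
    }
}

/// Runtime-dispatched conversion (hypothetical signature).
pub fn convert_f16_to_f32(src: &[u16], dst: &mut [f32]) {
    assert_eq!(src.len(), dst.len());
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") && is_x86_feature_detected!("f16c") {
            // SAFETY: the required CPU features were detected at runtime.
            unsafe { convert_f16c(src, dst) };
            return;
        }
    }
    for (d, &s) in dst.iter_mut().zip(src) {
        *d = f16_to_f32_scalar(s);
    }
}

/// True when the dispatched output is bit-identical to the scalar reference.
pub fn dispatch_matches_scalar(len: usize) -> bool {
    // Deterministic finite inputs: clearing bit 14 keeps the exponent below 31.
    let src: Vec<u16> = (0..len)
        .map(|i| ((i as u32).wrapping_mul(40503) as u16) & 0xBFFF)
        .collect();
    let mut out = vec![0.0f32; len];
    convert_f16_to_f32(&src, &mut out);
    src.iter()
        .zip(&out)
        .all(|(&s, &o)| f16_to_f32_scalar(s).to_bits() == o.to_bits())
}
```

Sizes such as 7, 9, and 17 matter because they force the loop to stop short of a full 8-wide block and hand the rest to the scalar tail. On machines without F16C the sketch falls back to the scalar loop, so the comparison still holds but no longer exercises the SIMD path.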

Migration plan

No migration needed. This is a drop-in performance optimization with no format or API changes.

Observability plan

No new instrumentation needed.

Documentation Changes

No user-facing API changes. SAFETY comments added to all unsafe SIMD blocks.

github-actions · 2026-04-10T01:44:04Z

Sicheng-Pan · 2026-04-10T01:44:12Z

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

[ENH] Add maxscore index to metadata segment #6880
[ENH] Add maxscore option in schema #6878
[ENH] Benchmark maxscore #6866
[ENH] Add SIMD for maxscore #6865 👈 (View in Graphite)
[ENH] Add maxscore lazy cursor #6829
[ENH] Add basic maxscore writer/reader #6825
[ENH] Add SparsePostingBlock #6823
main

This stack of pull requests is managed by Graphite. Learn more about stacking.

propel-code-bot · 2026-04-10T01:44:39Z

Add SIMD Acceleration for maxscore Candidate Filtering and f16 Value Decompression

This PR introduces SIMD-accelerated implementations for two hot paths in the Rust sparse query engine: candidate budget pruning in maxscore and bulk f16→f32 conversion in sparse posting block decompression. The changes add architecture-specific fast paths for x86_64 and aarch64, with runtime feature detection on x86_64 and scalar fallbacks preserved for unsupported platforms.

In rust/index/src/sparse/maxscore.rs, filter_competitive now dispatches to SIMD implementations (SSE2 on x86_64, NEON on aarch64) while retaining the original scalar logic in filter_competitive_scalar. In rust/types/src/sparse_posting_block.rs, convert_f16_to_f32 now dispatches to F16C (x86_64) or NEON (aarch64) conversion routines, with remainder handling and scalar fallback maintained. New tests validate SIMD/scalar equivalence across multiple non-aligned sizes and edge cases.
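
A minimal sketch of this dispatch pattern for the filter, under stated assumptions: the signature of `filter_competitive` below (parallel `ids` and `scores` vectors plus a single `threshold`), the helper names `compact_scalar` and `compact_sse2`, and the strict `score > threshold` rule are all invented for illustration and may differ from the PR. Only the x86_64 SSE2 path is shown; other targets take the scalar path here.

```rust
/// Scalar compaction starting at `read`, writing survivors at `write`.
/// Returns the number of kept entries. Hypothetical helper.
fn compact_scalar(
    ids: &mut [u32],
    scores: &mut [f32],
    threshold: f32,
    mut read: usize,
    mut write: usize,
) -> usize {
    while read < scores.len() {
        if scores[read] > threshold {
            ids[write] = ids[read];
            scores[write] = scores[read];
            write += 1;
        }
        read += 1;
    }
    write
}

/// SSE2 path: compare 4 scores at a time, then scatter survivors by bit-scan.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse2")]
unsafe fn compact_sse2(ids: &mut [u32], scores: &mut [f32], threshold: f32) -> usize {
    use std::arch::x86_64::{_mm_cmpgt_ps, _mm_loadu_ps, _mm_movemask_ps, _mm_set1_ps};
    let n = scores.len();
    let thr = _mm_set1_ps(threshold);
    let (mut i, mut write) = (0usize, 0usize);
    while i + 4 <= n {
        // SAFETY: i + 4 <= n, so the unaligned 16-byte load stays in bounds.
        let v = unsafe { _mm_loadu_ps(scores.as_ptr().add(i)) };
        // One bit per lane: set when scores[i + lane] > threshold.
        let mut mask = _mm_movemask_ps(_mm_cmpgt_ps(v, thr)) as u32;
        while mask != 0 {
            let lane = mask.trailing_zeros() as usize;
            // write <= i + lane, so later lanes are never overwritten early.
            ids[write] = ids[i + lane];
            scores[write] = scores[i + lane];
            write += 1;
            mask &= mask - 1; // clear the lowest set bit
        }
        i += 4;
    }
    // Scalar tail for lengths that are not a multiple of 4.
    compact_scalar(ids, scores, threshold, i, write)
}

/// Keeps only entries whose score exceeds `threshold`, compacting in place.
pub fn filter_competitive(ids: &mut Vec<u32>, scores: &mut Vec<f32>, threshold: f32) {
    assert_eq!(ids.len(), scores.len());
    #[cfg(target_arch = "x86_64")]
    let kept = if is_x86_feature_detected!("sse2") {
        // SAFETY: SSE2 support was detected at runtime.
        unsafe { compact_sse2(ids.as_mut_slice(), scores.as_mut_slice(), threshold) }
    } else {
        compact_scalar(ids.as_mut_slice(), scores.as_mut_slice(), threshold, 0, 0)
    };
    #[cfg(not(target_arch = "x86_64"))]
    let kept = compact_scalar(ids.as_mut_slice(), scores.as_mut_slice(), threshold, 0, 0);
    ids.truncate(kept);
    scores.truncate(kept);
}
```

With ten candidates and a threshold of 1.0, the first eight entries go through two 4-wide comparisons and the last two through the scalar tail. Keeping the scalar routine as both the fallback and the tail handler means the SIMD and scalar paths share one definition of "competitive", which is what makes an equivalence test meaningful.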

This summary was automatically generated by @propel-code-bot

propel-code-bot

No issues were found; the SIMD enhancements appear correct, well-tested, and low risk.

Status: No Issues Found | Risk: Low

Review Details

📁 2 files reviewed | 💬 0 comments

Sicheng-Pan mentioned this pull request Apr 10, 2026

[ENH] Add SparsePostingBlock #6823

Open

2 tasks

Sicheng-Pan marked this pull request as ready for review April 10, 2026 01:44

Sicheng-Pan mentioned this pull request Apr 10, 2026

[ENH] Add basic maxscore writer/reader #6825

Open

1 task

Sicheng-Pan mentioned this pull request Apr 10, 2026

[ENH] Add maxscore lazy cursor #6829

Open

1 task

Sicheng-Pan changed the title ~~Add SIMD f16→f32 conversion (NEON/F16C) and SIMD filter_competitive (NEON/SSE2)~~ [ENH] Add SIMD for maxscore Apr 10, 2026

propel-code-bot bot reviewed Apr 10, 2026

Sicheng-Pan force-pushed the hammad/maxscore_simd branch from 6bbf6e7 to d47e07c Compare April 10, 2026 03:11

Sicheng-Pan force-pushed the hammad/maxscore_lazy_cursor branch from b3e6c5a to fcca860 Compare April 10, 2026 03:11

Sicheng-Pan mentioned this pull request Apr 10, 2026

[ENH] Benchmark maxscore #6866

Open

5 tasks

Sicheng-Pan force-pushed the hammad/maxscore_lazy_cursor branch from fcca860 to da68907 Compare April 10, 2026 03:26

Sicheng-Pan force-pushed the hammad/maxscore_simd branch 2 times, most recently from 6920d76 to 0e9eaa3 Compare April 10, 2026 17:18

Sicheng-Pan force-pushed the hammad/maxscore_lazy_cursor branch from da68907 to 59904b4 Compare April 10, 2026 17:18

Sicheng-Pan mentioned this pull request Apr 10, 2026

[ENH] Add maxscore option in schema #6878

Open

Sicheng-Pan force-pushed the hammad/maxscore_lazy_cursor branch from 59904b4 to c76ce23 Compare April 10, 2026 20:11

Sicheng-Pan force-pushed the hammad/maxscore_simd branch from 0e9eaa3 to ac08b0c Compare April 10, 2026 20:11

Add SIMD f16→f32 conversion (NEON/F16C) and SIMD filter_competitive (NEON/SSE2)

014e4ed

Sicheng-Pan force-pushed the hammad/maxscore_simd branch from ac08b0c to 014e4ed Compare April 10, 2026 20:36

Sicheng-Pan force-pushed the hammad/maxscore_lazy_cursor branch from c76ce23 to 4b94ecf Compare April 10, 2026 20:36

Sicheng-Pan mentioned this pull request Apr 10, 2026

[ENH] Add maxscore index to metadata segment #6880

Open

Sicheng-Pan wants to merge 1 commit into hammad/maxscore_lazy_cursor from hammad/maxscore_simd
