
Hoist SIMD dispatch out of approx_topk hot loops #5073

Open
algoriddle wants to merge 1 commit into facebookresearch:main from
algoriddle:export-D100169491


Summary:
Three call sites dispatch on SIMD level inside hot loops via
HeapWithBuckets::bs_addn(), which calls with_simd_level_256bit on
every invocation. The SIMD level is constant for the process lifetime,
so the repeated dispatch is pure overhead.

  • LocalSearchQuantizer::icm_encode_step: the dispatch was inside the
    n × n_iters × M loop nest. Wrap the omp parallel for in
    with_simd_level_256bit and call HeapWithBucketsCMaxFloat<16,1,SL>
    directly.

  • residual_quantizer_encode_steps.cpp (2 call sites): the dispatch ran
    once per iteration of the n-parallel loops. Hoist it outside the omp
    loop. For the second site, merge it with the existing
    with_simd_level_256bit call for compute_cent_distances_simd.

  • Add an approx_topk_by_mode<SL>() helper to approx_topk.h to
    consolidate the duplicated switch(approx_topk_mode) blocks.

Differential Revision: D100169491
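
For readers unfamiliar with the pattern, the hoisting described in the summary can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in this PR: `SimdLevel`, `detected_level`, `with_simd_level`, `min_of_row`, `dispatch_count`, and both `row_minima_*` functions are hypothetical stand-ins, and the real `with_simd_level_256bit` and `HeapWithBuckets` interfaces may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a SIMD level enum (not the faiss definition).
enum class SimdLevel { NONE, AVX2 };

// Assumption: the level is detected once and never changes for the process.
inline SimdLevel detected_level() { return SimdLevel::NONE; }

// Counts runtime dispatches so the two variants can be compared.
static std::size_t dispatch_count = 0;

// Hypothetical dispatcher: turns the runtime level into a compile-time tag
// and passes that tag to the callable.
template <typename F>
void with_simd_level(F&& f) {
    ++dispatch_count;
    switch (detected_level()) {
        case SimdLevel::AVX2:
            f(std::integral_constant<SimdLevel, SimdLevel::AVX2>{});
            break;
        default:
            f(std::integral_constant<SimdLevel, SimdLevel::NONE>{});
            break;
    }
}

// Kernel templated on the level; a real kernel would specialize per level.
template <SimdLevel SL>
float min_of_row(const float* row, std::size_t d) {
    assert(d > 0);
    float best = row[0];
    for (std::size_t j = 1; j < d; ++j) {
        if (row[j] < best) {
            best = row[j];
        }
    }
    return best;
}

// Before: the dispatch runs once per row, inside the hot loop.
void row_minima_dispatch_per_row(const std::vector<float>& x, std::size_t n,
                                 std::size_t d, std::vector<float>& out) {
    for (std::size_t i = 0; i < n; ++i) {
        with_simd_level([&](auto tag) {
            out[i] = min_of_row<decltype(tag)::value>(x.data() + i * d, d);
        });
    }
}

// After: one dispatch wraps the whole loop. An OpenMP parallel-for pragma
// would sit on the loop inside the lambda.
void row_minima_hoisted(const std::vector<float>& x, std::size_t n,
                        std::size_t d, std::vector<float>& out) {
    with_simd_level([&](auto tag) {
        constexpr SimdLevel SL = decltype(tag)::value;
        for (std::size_t i = 0; i < n; ++i) {
            out[i] = min_of_row<SL>(x.data() + i * d, d);
        }
    });
}
```

Both variants compute the same result; the hoisted one pays for the runtime switch once per call instead of once per row, which is the saving the summary describes for the n × n_iters × M loop nest.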

@meta-cla meta-cla bot added the CLA Signed label Apr 10, 2026

meta-codesync bot commented Apr 10, 2026

@algoriddle has exported this pull request. If you are a Meta employee, you can view the originating Diff in D100169491.
