Request Description
1. Motivation & Model Examples
As discussed in PR #33633 with @mitruska and @nshchego, the current v11::TopK and earlier operations have implementation-defined NaN ordering behavior. Because different frontend frameworks handle NaNs differently (e.g., NumPy treats them as smallest, PyTorch treats them as largest), there is a need for a deterministic, configurable approach to NaN handling in OpenVINO.
Model Examples Benefiting from this:
- Multimodal AI Models (CLIP, Vision Transformers): Embeddings can occasionally produce NaN values due to numerical instabilities in FP16/BF16 projections. If TopK propagates these NaNs unpredictably, it corrupts downstream similarity searches.
- RAG (Retrieval-Augmented Generation) Pipelines: When retrieving the top K relevant document chunks, a single rogue NaN similarity score can currently push valid, highly-relevant documents out of the TopK results, breaking the retrieval chain entirely.
2. Proof of Concept (POC)
Following @mitruska's recommendation to prepare a POC to define constraints and benefits, I have built a comprehensive standalone C++ implementation here:
POC Repository & README: Lagmator22/TopK-NaN-OpenVINO
The POC demonstrates a proposed nan_mode attribute with three explicit modes:
- NAN_AS_SMALLEST (Matches NumPy behavior: NaNs are pushed to the bottom of descending sorts)
- NAN_AS_LARGEST (Matches PyTorch behavior: NaNs are pushed to the top of descending sorts)
- NONE (Undefined/Fastest path: Preserves strict backward compatibility with current OpenVINO performance footprints)
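To make the three modes concrete, here is a minimal standalone sketch of how they could map onto a comparator for a descending TopK. It is not taken from the POC or from OpenVINO: the `NanMode` enum, `ranks_before`, and `topk_indices` are hypothetical names used only for illustration, and the index tie-break is an assumption added to keep the output deterministic.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical mirror of the proposed nan_mode attribute (not an OpenVINO type).
enum class NanMode { NONE, NAN_AS_SMALLEST, NAN_AS_LARGEST };

// Returns true when `a` should be ranked ahead of `b` in a descending TopK.
inline bool ranks_before(float a, float b, NanMode mode) {
    if (mode == NanMode::NONE) {
        return a > b;  // plain comparison; ordering is unspecified if NaNs are present
    }
    const bool a_nan = std::isnan(a);
    const bool b_nan = std::isnan(b);
    if (a_nan || b_nan) {
        if (a_nan && b_nan) {
            return false;  // two NaNs are treated as equivalent
        }
        // Exactly one operand is NaN: it goes first only in NAN_AS_LARGEST.
        return mode == NanMode::NAN_AS_LARGEST ? a_nan : b_nan;
    }
    return a > b;
}

// Returns the indices of the k top-ranked values, best first.
// Callers should not pass NaNs together with NanMode::NONE: the plain
// comparison is then not a strict weak ordering, which std::partial_sort requires.
std::vector<std::size_t> topk_indices(const std::vector<float>& values,
                                      std::size_t k,
                                      NanMode mode) {
    std::vector<std::size_t> idx(values.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    k = std::min(k, idx.size());
    std::partial_sort(idx.begin(),
                      idx.begin() + static_cast<std::ptrdiff_t>(k),
                      idx.end(),
                      [&](std::size_t i, std::size_t j) {
                          if (ranks_before(values[i], values[j], mode)) return true;
                          if (ranks_before(values[j], values[i], mode)) return false;
                          return i < j;  // assumed tie-break: lower index first
                      });
    idx.resize(k);
    return idx;
}
```

For scores `{0.3, NaN, 0.9, 0.1}` and `k = 2`, this sketch selects indices `{2, 0}` under NAN_AS_SMALLEST and `{1, 2}` under NAN_AS_LARGEST, so the choice of mode decides whether a NaN can displace a valid score.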
3. Constraints & Benchmarking
To ensure no performance regressions for existing users, the POC includes micro-benchmarks comparing the NONE path against the NAN_AS_SMALLEST/NAN_AS_LARGEST paths.
- When nan_mode = NONE is selected, the sorting completely bypasses the NaN checks, ensuring identical performance to the current v11::TopK.
- When explicit handling is requested, the overhead in the POC micro-benchmarks is minimal and is confined to the NaN-aware path, so the NONE path is unaffected.
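One way to get this kind of isolation, shown here only as an assumption-laden sketch and not as the POC's actual code, is to dispatch on the mode once per call: the NONE branch runs a plain sort with no NaN inspection, while the explicit modes pay for a single linear partition pass and then reuse the same plain comparison on the non-NaN range. `NanMode` and `sort_descending` are hypothetical names.

```cpp
#include <algorithm>
#include <cmath>
#include <functional>
#include <vector>

// Hypothetical mirror of the proposed nan_mode attribute (not an OpenVINO type).
enum class NanMode { NONE, NAN_AS_SMALLEST, NAN_AS_LARGEST };

// Sorts `data` in descending order according to `mode`.
void sort_descending(std::vector<float>& data, NanMode mode) {
    if (mode == NanMode::NONE) {
        // Fast path: no NaN checks at all. With NaNs in the input the
        // comparison is not a strict weak ordering, so the result is
        // unspecified; a production path would still need to stay memory-safe.
        std::sort(data.begin(), data.end(), std::greater<float>());
        return;
    }
    if (mode == NanMode::NAN_AS_LARGEST) {
        // One linear pass moves NaNs to the front, then only numbers are sorted.
        auto first_number = std::partition(data.begin(), data.end(),
                                           [](float v) { return std::isnan(v); });
        std::sort(first_number, data.end(), std::greater<float>());
    } else {
        // One linear pass moves NaNs to the back, then only numbers are sorted.
        auto first_nan = std::partition(data.begin(), data.end(),
                                        [](float v) { return !std::isnan(v); });
        std::sort(data.begin(), first_nan, std::greater<float>());
    }
}
```

In this layout the extra cost of the explicit modes is one O(n) pass over the data, and the comparator used inside the sort is identical in all three modes, which is what makes a like-for-like benchmark against the NONE path straightforward.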
4. Next Steps
Based on these findings, I am proposing the introduction of v17::TopK with this nan_mode attribute. I would appreciate it if the maintainers could review the POC implementation and let me know if this architectural direction aligns with the project's vision!
(Note: I plan to write the upstream PR for v17::TopK as part of my ongoing open-source contributions to OpenVINO).
CC: @mitruska @nshchego @kblaszczak-intel @praasz
Feature Use Case
No response