ov-master-rebase: fix vision model divide-by-zero and tokenizer DLL loading in build-master

xzhan34 · xzhan34 · commit 911492744dc4 · 2026-03-12T13:24:13.000+08:00

Two runtime bugs fixed for build-master (ENABLE_NEW_ARCH_OPS=OFF):

1. Vision model INT4 quantization (divide-by-zero crash):
   A single shared SafetensorsWeightFinalizer applied INT4_ASYM quantization to
   both the text and the vision model. Vision encoder weights must NOT be
   quantized: with INT4 weights, the CPU plugin raises
   STATUS_INTEGER_DIVIDE_BY_ZERO (0xC0000094) during vision inference. Fix:
   pass a separate finalizer, built from a default-constructed (non-quantizing)
   QuantizationConfig, to create_qwen3_omni_vision_model.

2. Tokenizer DLL not found (core.cpp:193 exception):
   The tokenizers_dll_name CMake variable was defined only inside the
   if(ENABLE_NEW_ARCH_OPS) block. With ENABLE_NEW_ARCH_OPS=OFF, the post-build
   copy step for the always-built targets (modeling_qwen3_omni,
   modeling_qwen3_omni_tts_min) expanded an empty variable, so
   openvino_tokenizers.dll was never copied next to the executable. Fix: define
   tokenizers_dll_name, if it is not already set, before the always-built
   targets section.
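
For reference, the library naming rule that the new CMake block encodes can be
sketched in C++ as below. This is only an illustration of the rule, not code
from this repository: the Platform enum and the tokenizers_library_name and
check_examples helpers are hypothetical, and the real build resolves the name
through CMake's WIN32/APPLE checks and the $<CONFIG:Debug> generator expression.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum for illustration; the build itself branches on
// CMake's WIN32 / APPLE variables, not on anything like this.
enum class Platform { Windows, MacOS, Other };

// Hypothetical helper mirroring the naming rule in the CMake block:
// a "lib" prefix everywhere except Windows, a "d" suffix for Debug
// builds, and a platform-specific extension.
std::string tokenizers_library_name(Platform platform, bool debug) {
    const std::string prefix = (platform == Platform::Windows) ? "" : "lib";
    const std::string suffix = debug ? "d" : "";
    std::string extension = ".so";
    if (platform == Platform::Windows) {
        extension = ".dll";
    } else if (platform == Platform::MacOS) {
        extension = ".dylib";
    }
    return prefix + "openvino_tokenizers" + suffix + extension;
}

// Two sample expansions of the rule.
void check_examples() {
    assert(tokenizers_library_name(Platform::Windows, false) ==
           "openvino_tokenizers.dll");
    assert(tokenizers_library_name(Platform::Other, true) ==
           "libopenvino_tokenizersd.so");
}
```

The point of the fix is that this name is non-empty whether or not
ENABLE_NEW_ARCH_OPS is set, so the post-build copy always has a file to copy.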
diff --git a/src/cpp/src/modeling/samples/CMakeLists.txt b/src/cpp/src/modeling/samples/CMakeLists.txt
@@ -48,6 +48,17 @@ set_target_properties(modeling_qwen3_vl PROPERTIES
 
 endif() # ENABLE_NEW_ARCH_OPS -- modeling_qwen3_vl
 
+# ── Tokenizer library name (needed for post-build copy on all targets) ──
+if(NOT tokenizers_dll_name)
+    if(WIN32)
+        set(tokenizers_dll_name "openvino_tokenizers$<$<CONFIG:Debug>:d>.dll")
+    elseif(APPLE)
+        set(tokenizers_dll_name "libopenvino_tokenizers$<$<CONFIG:Debug>:d>.dylib")
+    else()
+        set(tokenizers_dll_name "libopenvino_tokenizers$<$<CONFIG:Debug>:d>.so")
+    endif()
+endif()
+
 # ── Always-built targets (needed for build-master too) ──
 
 add_executable(modeling_qwen3_omni
diff --git a/src/cpp/src/modeling/samples/modeling_qwen3_omni.cpp b/src/cpp/src/modeling/samples/modeling_qwen3_omni.cpp
@@ -688,14 +688,19 @@ int main(int argc, char* argv[]) try {
     ov::genai::safetensors::SafetensorsWeightSource source(std::move(data));
     ov::genai::safetensors::SafetensorsWeightFinalizer finalizer(quant_config);
 
+    // Vision model must NOT be quantized — INT4/INT8 weights cause divide-by-zero
+    // in the CPU plugin during vision inference.
+    ov::genai::modeling::weights::QuantizationConfig no_quant;
+    ov::genai::safetensors::SafetensorsWeightFinalizer vision_finalizer(no_quant);
+
     auto text_model = ov::genai::modeling::models::create_qwen3_omni_text_model(
         omni_cfg,
         source,
         finalizer,
         false,
         true);
 
-    auto vision_model = ov::genai::modeling::models::create_qwen3_omni_vision_model(omni_cfg, source, finalizer);
+    auto vision_model = ov::genai::modeling::models::create_qwen3_omni_vision_model(omni_cfg, source, vision_finalizer);
 
     if (precision_mode == PrecisionMode::kFP32 ||
         precision_mode == PrecisionMode::kInfFp32KvInt8 ||