"[](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp/blob/master/examples/python/transformers/Choosing_Your_Engine_ONNX_vs_OpenVINO_in_Spark_NLP.ipynb)\n",
12
+
"\n",
13
+
"# Choosing Your Inference Engine: ONNX vs OpenVINO in Spark NLP 🚀\n",
14
+
"\n",
15
+
"This notebook walks you through the `engine` parameter introduced in Spark NLP, which lets you **choose which deep learning backend** is used when downloading pretrained models.\n",
"- **`pretrainedEngine(name, lang, engine=...)`** — download a pretrained model with a specific engine backend\n",
24
+
"\n",
25
+
"Let's keep in mind a few things before we start 😊\n",
26
+
"- The engine you pick **changes the actual binary file downloaded** — ONNX models ship `.onnx` weights while OpenVINO models ship `.xml`/`.bin` weights. You can verify this with `ls` directly in the Spark NLP cache folder.\n",
27
+
"- All engines produce the **same results** for the same model — the difference is purely about runtime performance characteristics and hardware compatibility.\n",
28
+
"- ONNX is the default and works on all hardware. OpenVINO is optimized for Intel CPUs/GPUs/NPUs and can give significant speedups on those platforms."
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m772.6/772.6 kB\u001b[0m \u001b[31m39.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
76
+
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m200.5/200.5 kB\u001b[0m \u001b[31m13.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
77
+
"\u001b[?25h Building wheel for pyspark (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
78
+
"\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\n",
79
+
"dataproc-spark-connect 1.1.0 requires pyspark[connect]~=4.0.0, but you have pyspark 3.5.4 which is incompatible.\u001b[0m\u001b[31m\n",
"print(f\"\\n✅ Loaded! Engine reported by model: {model_openvino.getEngine()}\")"
225
+
],
226
+
"id": "download_openvino_cell"
227
+
},
228
+
{
229
+
"cell_type": "markdown",
230
+
"metadata": {
231
+
"id": "section8"
232
+
},
233
+
"source": [
234
+
"## 4. When to Use Which Engine\n",
235
+
"\n",
236
+
"### Quick decision guide\n",
237
+
"\n",
238
+
"| Scenario | Recommended engine |\n",
239
+
"|---|---|\n",
240
+
"| Running on a mixed or unknown hardware cluster | `onnx` (default) |\n",
241
+
"| Running on Intel CPUs (Xeon, Core) | `openvino` |\n",
242
+
"| Running on Intel integrated GPU or Arc GPU | `openvino` |\n",
243
+
"| Running on Intel NPU (Core Ultra) | `openvino` |\n",
244
+
"| Running on NVIDIA GPU | `onnx` (with CUDA EP) |\n",
245
+
"| Reproducing results from old Spark NLP models | `tensorflow` |\n",
246
+
"| Maximum portability and ecosystem compatibility | `onnx` |\n",
247
+
"\n",
248
+
"### Performance notes\n",
249
+
"\n",
250
+
"- **OpenVINO** typically gives **1.5×–4× throughput improvement** over ONNX on Intel CPUs due to model-level graph optimizations and hardware-specific kernel fusion.\n",
251
+
"- **ONNX** is the safest choice for heterogeneous clusters (a mix of Intel, AMD, ARM workers) since it runs everywhere.\n",
252
+
"- Both `onnx` and `openvino` are significantly faster than `tensorflow` for inference in Spark NLP.\n",