Commit d0de0eb

Create Choosing_Your_Engine_ONNX_vs_OpenVINO_in_Spark_NLP.ipynb

1 parent 9d004e1 commit d0de0eb

1 file changed: 282 additions & 0 deletions
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "header_cell"
},
"source": [
"![JohnSnowLabs](https://sparknlp.org/assets/images/logo.png)\n",
"\n",
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp/blob/master/examples/python/transformers/Choosing_Your_Engine_ONNX_vs_OpenVINO_in_Spark_NLP.ipynb)\n",
"\n",
"# Choosing Your Inference Engine: ONNX vs OpenVINO in Spark NLP 🚀\n",
"\n",
"This notebook walks you through the `engine` parameter introduced in Spark NLP, which lets you **choose which deep learning backend** is used when downloading pretrained models.\n",
"\n",
"Spark NLP supports multiple inference backends:\n",
"- **`tensorflow`** — the original TensorFlow backend (older models)\n",
"- **`onnx`** — high-performance cross-platform runtime via [ONNX Runtime](https://onnxruntime.ai/) *(default since Spark NLP 5.0.0)*\n",
"- **`openvino`** — Intel-optimized runtime via [OpenVINO™ Toolkit](https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/overview.html) *(since Spark NLP 5.4.0)*\n",
"\n",
"The engine parameter is exposed through:\n",
"- **`pretrainedEngine(name, lang, engine=...)`** — download a pretrained model with a specific engine backend\n",
"\n",
"Let's keep in mind a few things before we start 😊\n",
"- The engine you pick **changes the actual binary file downloaded** — ONNX models ship `.onnx` weights while OpenVINO models ship `.xml`/`.bin` weights. You can verify this with `ls` directly in the Spark NLP cache folder.\n",
"- All engines produce the **same results** for the same model — the difference is purely about runtime performance characteristics and hardware compatibility.\n",
"- ONNX is the default and works on all hardware. OpenVINO is optimized for Intel CPUs/GPUs/NPUs and can give significant speedups on those platforms."
],
"id": "header_cell"
},
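The first bullet above can be checked in practice by inspecting the Spark NLP cache folder (by default `~/cache_pretrained`) and classifying each downloaded model by the weight files it ships. This is a standalone sketch, not part of the notebook; `detect_engine` is a hypothetical helper based only on the file extensions described above.

```python
from pathlib import Path

def detect_engine(filenames):
    """Classify a downloaded Spark NLP model by its weight files:
    *.onnx -> onnx, *.xml + *.bin pair -> openvino, saved_model.pb -> tensorflow."""
    names = [f.lower() for f in filenames]
    if any(n.endswith(".onnx") for n in names):
        return "onnx"
    if any(n.endswith(".xml") for n in names) and any(n.endswith(".bin") for n in names):
        return "openvino"
    if "saved_model.pb" in names:
        return "tensorflow"
    return "unknown"

# Inspect every model in the default cache folder, if it exists.
cache = Path.home() / "cache_pretrained"
for model_dir in (sorted(cache.glob("*")) if cache.exists() else []):
    files = [p.name for p in model_dir.rglob("*") if p.is_file()]
    print(model_dir.name, "->", detect_engine(files))
```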
{
"cell_type": "markdown",
"metadata": {
"id": "toc_cell"
},
"source": [
"## Table of Contents\n",
"\n",
"1. [Install Dependencies](#1-install-dependencies)\n",
"2. [Start Spark NLP](#2-start-spark-nlp)\n",
"3. [Download the Same Model with Different Engines](#3-download-the-same-model-with-different-engines)\n",
"4. [When to Use Which Engine](#4-when-to-use-which-engine)"
],
"id": "toc_cell"
},
{
"cell_type": "markdown",
"metadata": {
"id": "section1"
},
"source": [
"## 1. Install Dependencies\n",
"\n",
"This notebook only needs `pyspark` and `spark-nlp`."
],
"id": "section1"
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"id": "install_cell",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "590dd327-671f-4018-e7b7-f8ad2d98437e"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m317.3/317.3 MB\u001b[0m \u001b[31m3.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25h  Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m772.6/772.6 kB\u001b[0m \u001b[31m39.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m200.5/200.5 kB\u001b[0m \u001b[31m13.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25h  Building wheel for pyspark (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
"\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\n",
"dataproc-spark-connect 1.1.0 requires pyspark[connect]~=4.0.0, but you have pyspark 3.5.4 which is incompatible.\u001b[0m\u001b[31m\n",
"\u001b[0m"
]
}
],
"source": [
"!pip install -q pyspark==3.5.4 spark-nlp==6.4.1rc1"
],
"id": "install_cell"
},
{
"cell_type": "markdown",
"metadata": {
"id": "section2"
},
"source": [
"## 2. Start Spark NLP\n",
"\n",
"You can build the Spark session manually with `SparkSession.builder`, or simply call our `sparknlp.start()` helper; both are shown below."
],
"id": "section2"
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"id": "start_spark_cell",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "716a0c7a-a439-48b8-bfd1-e6255a2f8739"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Spark NLP version :  6.4.1-rc1\n",
"Apache Spark version:  3.5.4\n"
]
}
],
"source": [
"from pyspark.sql import SparkSession\n",
"import sparknlp\n",
"\n",
"# Manual session setup; here Spark NLP is supplied as a local jar\n",
"spark = SparkSession.builder \\\n",
"    .appName(\"Spark NLP\") \\\n",
"    .config(\"spark.driver.memory\", \"16G\") \\\n",
"    .config(\"spark.kryoserializer.buffer.max\", \"2000M\") \\\n",
"    .config(\"spark.jars\", \"/content/sparknlp.jar\") \\\n",
"    .getOrCreate()\n",
"\n",
"print(\"Spark NLP version :\", sparknlp.version())\n",
"print(\"Apache Spark version:\", spark.version)"
],
"id": "start_spark_cell"
},
{
"cell_type": "code",
"source": [
"import sparknlp\n",
"\n",
"# The one-liner alternative: start() creates a session with Spark NLP included\n",
"spark = sparknlp.start()\n",
"\n",
"print(\"Spark NLP version :\", sparknlp.version())\n",
"print(\"Apache Spark version:\", spark.version)"
],
"metadata": {
"id": "X34E1cckR_9t"
},
"id": "X34E1cckR_9t",
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "section4"
},
"source": [
"## 3. Download the Same Model with Different Engines\n",
"\n",
"Let's download **`distilbert_base_cased`** using both the `onnx` and `openvino` engines.\n"
],
"id": "section4"
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"id": "import_cell",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "206112bd-d214-4f8f-ee9f-debbfbe2f7e5"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"distilbert_base_cased download started this may take some time.\n",
"Approximate size to download 232.5 MB\n",
"[OK!]\n",
"\n",
"✅ Loaded! Engine reported by model: onnx\n"
]
}
],
"source": [
"from sparknlp.annotator import *\n",
"\n",
"model_onnx = DistilBertEmbeddings.pretrainedEngine(\"distilbert_base_cased\", \"en\", engine=\"onnx\")\n",
"\n",
"print(f\"\\n✅ Loaded! Engine reported by model: {model_onnx.getEngine()}\")"
],
"id": "import_cell"
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"id": "download_openvino_cell",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "58e86ca0-809d-4f0a-9b90-76bfa473a9da"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"distilbert_base_cased download started this may take some time.\n",
"Approximate size to download 232.5 MB\n",
"[OK!]\n",
"\n",
"✅ Loaded! Engine reported by model: openvino\n"
]
}
],
"source": [
"model_openvino = DistilBertEmbeddings.pretrainedEngine(\"distilbert_base_cased\", \"en\", engine=\"openvino\")\n",
"\n",
"print(f\"\\n✅ Loaded! Engine reported by model: {model_openvino.getEngine()}\")"
],
"id": "download_openvino_cell"
},
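The notebook's header claims both engines produce the same results for the same model. One way to check that, after running both models over the same sentences and collecting the embedding vectors, is an element-wise comparison. The vectors `a`/`b` below are toy stand-ins for embeddings gathered from the two pipelines, not real model output; a sketch:

```python
def max_abs_diff(vecs_a, vecs_b):
    """Largest element-wise absolute difference across two lists of vectors.
    With real ONNX vs OpenVINO outputs of the same model, expect tiny float
    noise (e.g. below 1e-4) rather than bit-exact equality, since the
    runtimes may order floating-point operations differently."""
    return max(
        abs(x - y)
        for va, vb in zip(vecs_a, vecs_b)
        for x, y in zip(va, vb)
    )

# Toy stand-ins for embeddings collected from the onnx and openvino pipelines:
a = [[0.10, 0.20], [0.30, 0.40]]
b = [[0.10, 0.20], [0.30, 0.40005]]
print("max abs diff:", max_abs_diff(a, b))
```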
{
"cell_type": "markdown",
"metadata": {
"id": "section8"
},
"source": [
"## 4. When to Use Which Engine\n",
"\n",
"### Quick decision guide\n",
"\n",
"| Scenario | Recommended engine |\n",
"|---|---|\n",
"| Running on a mixed or unknown hardware cluster | `onnx` (default) |\n",
"| Running on Intel CPUs (Xeon, Core) | `openvino` |\n",
"| Running on Intel integrated GPU or Arc GPU | `openvino` |\n",
"| Running on Intel NPU (Core Ultra) | `openvino` |\n",
"| Running on NVIDIA GPU | `onnx` (with CUDA EP) |\n",
"| Reproducing results from old Spark NLP models | `tensorflow` |\n",
"| Maximum portability and ecosystem compatibility | `onnx` |\n",
"\n",
"### Performance notes\n",
"\n",
"- **OpenVINO** typically gives **1.5×–4× throughput improvement** over ONNX on Intel CPUs due to model-level graph optimizations and hardware-specific kernel fusion.\n",
"- **ONNX** is the safest choice for heterogeneous clusters (a mix of Intel, AMD, ARM workers) since it runs everywhere.\n",
"- Both `onnx` and `openvino` are significantly faster than `tensorflow` for inference in Spark NLP.\n"
],
"id": "section8"
}
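The decision table above can be condensed into a tiny lookup helper. This is an illustration only; `recommend_engine` and its informal hardware strings are inventions of this sketch, not a Spark NLP API:

```python
def recommend_engine(hardware: str) -> str:
    """Condensed form of the decision table: Intel hardware -> openvino,
    everything else (NVIDIA GPUs via the CUDA execution provider, and
    mixed/unknown clusters) -> the portable onnx default. Old
    TensorFlow-era models still need engine="tensorflow"."""
    hw = hardware.lower()
    if "intel" in hw:  # Intel Xeon/Core CPUs, iGPU/Arc GPUs, Core Ultra NPUs
        return "openvino"
    return "onnx"

for hw in ["Intel Xeon", "NVIDIA A100", "mixed cluster"]:
    print(hw, "->", recommend_engine(hw))
```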
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"pygments_lexer": "ipython3",
"version": "3.10.0"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
