36 | 36 | "source": [
37 | 37 | "## Create Elastic Cloud deployment\n",
38 | 38 | "\n",
39 | | - "If you don't have an Elastic Cloud deployment, sign up [here](https://cloud.elastic.co/registration?utm_source=github&utm_content=elasticsearch-labs-notebook) for a free trial.\n",
40 | | - "\n",
41 | | - "TODO: Instruct user to disable ML node autoscaling?"
| 39 | + "If you don't have an Elastic Cloud deployment, sign up [here](https://cloud.elastic.co/registration?utm_source=github&utm_content=elasticsearch-labs-notebook) for a free trial."
42 | 40 | ],
43 | 41 | "metadata": {
44 | 42 | "collapsed": false

94 | 92 | "from elasticsearch import Elasticsearch, exceptions\n",
95 | 93 | "from urllib.request import urlopen\n",
96 | 94 | "from getpass import getpass\n",
97 | | - "import json"
| 95 | + "import json\n",
| 96 | + "import time"
98 | 97 | ],
99 | 98 | "metadata": {
100 | 99 | "collapsed": false

207 | 206 | "\n",
208 | 207 | "Let's create the inference endpoint by using the [Create inference API](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-inference-api.html).\n",
209 | 208 | "\n",
210 | | - "For this example we'll use the [ELSER service](https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-elser.html), but the inference API also supports [many other inference services](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-inference-api.html#put-inference-api-desc).\n",
211 | | - "\n",
212 | | - "NOTE: If the inference creation request times out, wait a moment and try again"
| 209 | + "For this example we'll use the [ELSER service](https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-elser.html), but the inference API also supports [many other inference services](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-inference-api.html#put-inference-api-desc)."
213 | 210 | ],
214 | 211 | "metadata": {
215 | 212 | "collapsed": false

227 | 224 | "    # Inference endpoint does not exist\n",
228 | 225 | "    pass\n",
229 | 226 | "\n",
230 | | - "client.options(request_timeout=60).inference.put_model(\n",
231 | | - "    task_type=\"sparse_embedding\",\n",
232 | | - "    inference_id=\"my-elser-endpoint\",\n",
233 | | - "    body={\n",
234 | | - "        \"service\": \"elser\",\n",
235 | | - "        \"service_settings\": {\"num_allocations\": 1, \"num_threads\": 1},\n",
236 | | - "    },\n",
237 | | - ")"
| 227 | + "try:\n",
| 228 | + "    client.options(request_timeout=60, max_retries=3, retry_on_timeout=True).inference.put_model(\n",
| 229 | + "        task_type=\"sparse_embedding\",\n",
| 230 | + "        inference_id=\"my-elser-endpoint\",\n",
| 231 | + "        body={\n",
| 232 | + "            \"service\": \"elser\",\n",
| 233 | + "            \"service_settings\": {\"num_allocations\": 1, \"num_threads\": 1},\n",
| 234 | + "        },\n",
| 235 | + "    )\n",
| 236 | + "    print(\"Inference endpoint created successfully\")\n",
| 237 | + "except exceptions.BadRequestError as e:\n",
| 238 | + "    if e.error == \"resource_already_exists_exception\":\n",
| 239 | + "        print(\"Inference endpoint created successfully\")\n",
| 240 | + "    else:\n",
| 241 | + "        raise e\n"
238 | 242 | ],
239 | 243 | "metadata": {
240 | 244 | "collapsed": false
241 | 245 | },
242 | 246 | "id": "8ee2188ea71324f5"
243 | 247 | },
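
For readability, here is the same change with the notebook-JSON escaping removed. This mirrors the added cell (lightly reflowed, with one explanatory comment added): the client-level retry options replace the old "wait a moment and try again" note, and catching resource_already_exists_exception makes the cell safe to re-run.

    try:
        client.options(
            request_timeout=60, max_retries=3, retry_on_timeout=True
        ).inference.put_model(
            task_type="sparse_embedding",
            inference_id="my-elser-endpoint",
            body={
                "service": "elser",
                "service_settings": {"num_allocations": 1, "num_threads": 1},
            },
        )
        print("Inference endpoint created successfully")
    except exceptions.BadRequestError as e:
        # A previous run already created the endpoint; treat that as success.
        if e.error == "resource_already_exists_exception":
            print("Inference endpoint created successfully")
        else:
            raise e
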
| 248 | + { |
| 249 | + "cell_type": "markdown", |
| 250 | + "source": [ |
| 251 | + "Once the endpoint is created, we must wait until the backing ELSER service is deployed.\n", |
| 252 | + "This can take a few minutes to complete." |
| 253 | + ], |
| 254 | + "metadata": { |
| 255 | + "collapsed": false |
| 256 | + }, |
| 257 | + "id": "e94fd66761fd8087" |
| 258 | + }, |
| 259 | + { |
| 260 | + "cell_type": "code", |
| 261 | + "execution_count": null, |
| 262 | + "outputs": [], |
| 263 | + "source": [ |
| 264 | + "inference_endpoint_info = client.inference.get_model(\n", |
| 265 | + "    inference_id=\"my-elser-endpoint\",\n",
| 266 | + ")\n",
| 267 | + "model_id = inference_endpoint_info[\"endpoints\"][0][\"service_settings\"][\"model_id\"]\n",
| 268 | + "\n",
| 269 | + "while True:\n",
| 270 | + "    status = client.ml.get_trained_models_stats(\n",
| 271 | + "        model_id=model_id,\n",
| 272 | + "    )\n",
| 273 | + "\n",
| 274 | + "    deployment_stats = status[\"trained_model_stats\"][0].get(\"deployment_stats\")\n",
| 275 | + "    if deployment_stats is None:\n",
| 276 | + "        print(\"ELSER Model is currently being deployed.\")\n",
| 277 | + "        time.sleep(5)\n",
| 278 | + "        continue\n",
| 279 | + "\n",
| 280 | + "    nodes = deployment_stats.get(\"nodes\")\n",
| 281 | + "    if nodes is not None and len(nodes) > 0:\n",
| 282 | + "        print(\"ELSER Model has been successfully deployed.\")\n",
| 283 | + "        break\n",
| 284 | + "    else:\n",
| 285 | + "        print(\"ELSER Model is currently being deployed.\")\n",
| 286 | + "        time.sleep(5)"
| 287 | + ],
| 288 | + "metadata": {
| 289 | + "collapsed": false
| 290 | + },
| 291 | + "id": "adb33329ce20b2f1"
| 292 | + },
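
Unescaped, the polling cell asks the inference API which trained model backs the endpoint, then checks the trained-model stats every five seconds until the deployment reports at least one allocated node (note the sleep now runs on both not-ready paths, so the loop no longer hammers the stats API while deployment_stats is still missing). As written the cell waits indefinitely; a minimal variant that bounds the wait is sketched below. The 20-minute budget and the RuntimeError are illustrative choices, not part of this change:

    import time

    deadline = time.monotonic() + 20 * 60  # illustrative upper bound on the wait

    while True:
        if time.monotonic() > deadline:
            raise RuntimeError("Timed out waiting for the ELSER deployment to start")

        status = client.ml.get_trained_models_stats(model_id=model_id)
        deployment_stats = status["trained_model_stats"][0].get("deployment_stats")
        # "nodes" is only populated once the deployment has started on an ML node.
        nodes = deployment_stats.get("nodes") if deployment_stats else None

        if nodes:
            print("ELSER Model has been successfully deployed.")
            break

        print("ELSER Model is currently being deployed.")
        time.sleep(5)
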
244 | 293 | {
245 | 294 | "cell_type": "markdown",
246 | 295 | "source": [