[ML] Double deployment of trained model causes assertion error #105518

Open
@maxhniebergall

Description

Elasticsearch Version

8.14.0-SNAPSHOT

Installed Plugins

No response

Java Version

JBR-17.0.9+8-1166.2-nomod

OS Version

23.3.0 Darwin Kernel Version 23.3.0: Wed Dec 20 21:30:44 PST 2023; root:xnu-10002.81.5~7/RELEASE_ARM64_T6000 arm64

Problem Description

On a locally built Elasticsearch (in debug mode), attempting to perform inference fails with an assertion error and the node exits.

[2024-02-14T14:14:06,790][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [runTask-0] fatal error in thread [elasticsearch[runTask-0][ml_native_inference_comms][T#3]], exiting java.lang.AssertionError
        at org.elasticsearch.ml@8.14.0-SNAPSHOT/org.elasticsearch.xpack.ml.inference.deployment.NlpInferenceInput.extractInput(NlpInferenceInput.java:55)
        at org.elasticsearch.ml@8.14.0-SNAPSHOT/org.elasticsearch.xpack.ml.inference.deployment.InferencePyTorchAction.doRun(InferencePyTorchAction.java:104)
        at org.elasticsearch.server@8.14.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:984)
        at org.elasticsearch.server@8.14.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
        at org.elasticsearch.ml@8.14.0-SNAPSHOT/org.elasticsearch.xpack.ml.inference.pytorch.PriorityProcessWorkerExecutorService$OrderedRunnable.run(PriorityProcessWorkerExecutorService.java:58)
        at org.elasticsearch.ml@8.14.0-SNAPSHOT/org.elasticsearch.xpack.ml.job.process.AbstractProcessWorkerExecutorService.start(AbstractProcessWorkerExecutorService.java:122)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
        at org.elasticsearch.server@8.14.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:917)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)

Steps to Reproduce

  1. Create a deployment (I don't think this necessarily needs to be done through the inference service, but that's what I tried):
(base) mh@Maxs-MacBook-Pro elasticsearch % curl -X PUT "localhost:9200/_inference/text_embedding/a-deployment-id2?pretty" \
-H 'Content-Type: application/json' -u elastic-admin:elastic-password \
-d'
  {
    "service": "text_embedding",
    "service_settings": {
      "num_allocations": 1,
      "num_threads": 1,
      "model_id": ".multilingual-e5-small"
    }
  }
'
{
  "model_id" : "a-deployment-id2",
  "task_type" : "text_embedding",
  "service" : "text_embedding",
  "service_settings" : {
    "num_allocations" : 1,
    "num_threads" : 1,
    "model_id" : ".multilingual-e5-small"
  },
  "task_settings" : { }
}
  2. Put the same model:
PUT /_ml/trained_models/.multilingual-e5-small?pretty
{
  "input": {
    "field_names": ["text_field"]
  }
}
  3. Run inference. The process crashes with the assertion error whose log line and stack trace are shown under Problem Description above.
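
The report does not show the request used to run inference in the last step. A minimal sketch of one plausible request, assuming inference was run through the same `_inference` endpoint created in the first step: the endpoint ID and credentials are copied from that step, while the input text, the variable names, and the reachability check are placeholders added for illustration. The original request may have been different.

```shell
# Assumed values: endpoint ID and credentials come from the first step;
# the input text is a made-up placeholder.
ES_URL="localhost:9200"
INFERENCE_ID="a-deployment-id2"
REQUEST_URL="${ES_URL}/_inference/text_embedding/${INFERENCE_ID}?pretty"
REQUEST_BODY='{ "input": "some example text" }'

# Show the request that is about to be sent.
echo "POST ${REQUEST_URL}"
echo "${REQUEST_BODY}"

# Send the request only if the cluster answers; otherwise do nothing.
if curl -s -o /dev/null --max-time 5 -u elastic-admin:elastic-password "${ES_URL}"; then
  curl -X POST "${REQUEST_URL}" \
    -H 'Content-Type: application/json' -u elastic-admin:elastic-password \
    -d "${REQUEST_BODY}"
fi
```

If the bug reproduces, the request should not return normally and the node log should show the `AssertionError` from `NlpInferenceInput.extractInput`.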

Logs (if relevant)

No response
