Open
Description
Feature request
I am attempting to run the Qwen3 embedding model in a way similar to the code on the Qwen3-Embedding model card. When running with optimum-neuron (ON), the model class is missing two methods, `encode` and `similarity`. Could they be added so that the snippet below can run? It is nearly identical to the usual sentence-transformers code.
from optimum.neuron import NeuronModelForSentenceTransformers
# Load the model
model = NeuronModelForSentenceTransformers("Qwen/Qwen3-Embedding-0.6B")
# The queries and documents to embed
queries = [
    "What is the capital of China?",
    "Explain gravity",
]
documents = [
    "The capital of China is Beijing.",
    "Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun.",
]
# Encode the queries and documents. Note that queries benefit from using a prompt
# Here we use the prompt called "query" stored under `model.prompts`, but you can
# also pass your own prompt via the `prompt` argument
query_embeddings = model.encode(queries, prompt_name="query")
document_embeddings = model.encode(documents)
# Compute the (cosine) similarity between the query and document embeddings
similarity = model.similarity(query_embeddings, document_embeddings)
print(similarity)
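For reference, the cosine-similarity step requested above can be sketched independently of any model class. This is only an illustration of the expected behaviour, not optimum-neuron or sentence-transformers code: `cosine_similarity_matrix` is a hypothetical helper, and it assumes the embeddings are already available as plain lists of floats (the toy vectors below stand in for real model outputs).

```python
import math
from typing import List, Sequence


def cosine_similarity_matrix(
    queries: Sequence[Sequence[float]],
    documents: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Hypothetical helper: pairwise cosine similarity.

    Returns a matrix with one row per query and one column per document.
    """

    def normalize(vector: Sequence[float]) -> List[float]:
        # Scale the vector to unit length; an all-zero vector stays all zeros.
        norm = math.sqrt(sum(x * x for x in vector))
        if norm == 0.0:
            return [0.0] * len(vector)
        return [x / norm for x in vector]

    unit_queries = [normalize(v) for v in queries]
    unit_documents = [normalize(v) for v in documents]

    # The dot product of two unit vectors is their cosine similarity.
    return [
        [sum(a * b for a, b in zip(q, d)) for d in unit_documents]
        for q in unit_queries
    ]


# Toy embeddings (assumed values, not real model outputs).
toy_query_embeddings = [[1.0, 0.0], [0.0, 2.0]]
toy_document_embeddings = [[1.0, 0.0], [0.0, 1.0]]
toy_similarity = cosine_similarity_matrix(toy_query_embeddings, toy_document_embeddings)
print(toy_similarity)
```

A `similarity` method on the Neuron model class would presumably return the same kind of query-by-document matrix, so downstream code written against sentence transformers would not need to change.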
Motivation
Customers want to run embedding models on Trainium, and we want to enable them to do so through an interface similar to the sentence transformers API.
Your contribution
I can test and give feedback.