Add support for encode and similarity for sentence transformer models #975

@mmcclean-aws

Feature request

I am attempting to run the Qwen3 embedding model in a way similar to the code outlined on the Qwen3-Embedding model card. When running with Optimum Neuron (ON), the two methods encode and similarity are missing. Can they be added so that the snippet below runs? It is nearly identical to the equivalent code for regular sentence transformers.

from optimum.neuron import NeuronModelForSentenceTransformers

# Load the model
model = NeuronModelForSentenceTransformers("Qwen/Qwen3-Embedding-0.6B")

# The queries and documents to embed
queries = [
    "What is the capital of China?",
    "Explain gravity",
]
documents = [
    "The capital of China is Beijing.",
    "Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun.",
]

# Encode the queries and documents. Note that queries benefit from using a prompt
# Here we use the prompt called "query" stored under `model.prompts`, but you can
# also pass your own prompt via the `prompt` argument
query_embeddings = model.encode(queries, prompt_name="query")
document_embeddings = model.encode(documents)

# Compute the (cosine) similarity between the query and document embeddings
similarity = model.similarity(query_embeddings, document_embeddings)
print(similarity)
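
To make the requested behavior concrete, here is a minimal sketch of the two methods in plain Python. It assumes that encode prepends the selected prompt to each input before embedding, and that similarity returns a matrix of cosine similarities. The class name `EncodeSimilarityWrapper`, the `embed_fn` callable, and `toy_embed` are hypothetical stand-ins for illustration only; they are not part of optimum-neuron or sentence-transformers, and a real implementation would run the compiled Neuron model in place of `embed_fn`.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence

Vector = List[float]


class EncodeSimilarityWrapper:
    """Hypothetical adapter exposing encode() and similarity().

    `embed_fn` is an assumed callable that maps a list of strings to a
    list of embedding vectors (for example, a wrapped model forward pass).
    """

    def __init__(
        self,
        embed_fn: Callable[[List[str]], List[Vector]],
        prompts: Optional[Dict[str, str]] = None,
    ) -> None:
        self.embed_fn = embed_fn
        self.prompts = prompts or {}

    def encode(
        self,
        sentences: Sequence[str],
        prompt_name: Optional[str] = None,
        prompt: Optional[str] = None,
    ) -> List[Vector]:
        # An explicit `prompt` wins; otherwise look up `prompt_name`.
        if prompt is None and prompt_name is not None:
            prompt = self.prompts[prompt_name]
        texts = [prompt + s for s in sentences] if prompt else list(sentences)
        return self.embed_fn(texts)

    @staticmethod
    def similarity(a: Sequence[Vector], b: Sequence[Vector]) -> List[List[float]]:
        # Cosine similarity between every row of `a` and every row of `b`.
        def cosine(u: Vector, v: Vector) -> float:
            dot = sum(x * y for x, y in zip(u, v))
            norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
            return dot / norm if norm else 0.0

        return [[cosine(u, v) for v in b] for u in a]


def toy_embed(texts: List[str]) -> List[Vector]:
    # Toy stand-in for a real model: two hand-made features per text.
    return [[float(len(t)), float(t.count(" ")) + 1.0] for t in texts]


wrapper = EncodeSimilarityWrapper(toy_embed, prompts={"query": "Q: "})
query_vectors = wrapper.encode(["Explain gravity"], prompt_name="query")
document_vectors = wrapper.encode(["Gravity is a force.", "Beijing is a city."])
scores = wrapper.similarity(query_vectors, document_vectors)
```

The result has one row per query and one column per document, which matches the shape the snippet above expects from model.similarity.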

Motivation

Customers want to run embedding models on Trainium, and we want to enable them to do so through an API similar to that of sentence transformers.

Your contribution

I can test the feature and give feedback.
