Description
I’ve observed significant discrepancies in the embeddings produced by Infinity compared to SentenceTransformer for the same model:
sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2.
Example
When computing the cosine similarity between the embeddings of the two inputs `mountains` and `joyeux noel`:

```python
import numpy as np

def cosine_similarity(vector1, vector2):
    """Calculate cosine similarity between two vectors."""
    dot_product = np.dot(vector1, vector2)
    magnitude1 = np.linalg.norm(vector1)
    magnitude2 = np.linalg.norm(vector2)
    if magnitude1 == 0 or magnitude2 == 0:
        return 0
    return dot_product / (magnitude1 * magnitude2)
```

- Infinity result: 0.497474
- SentenceTransformer result: 0.354079
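Note that cosine similarity is scale-invariant, so the gap cannot come from the two backends returning differently scaled vectors; the embedding directions themselves must differ. A quick sanity check with synthetic 384-dimensional vectors (the output size of MiniLM-L12-v2):

```python
import numpy as np

def cosine_similarity(vector1, vector2):
    dot_product = np.dot(vector1, vector2)
    return dot_product / (np.linalg.norm(vector1) * np.linalg.norm(vector2))

rng = np.random.default_rng(0)
a = rng.normal(size=384)  # synthetic stand-in for a 384-dim embedding
b = rng.normal(size=384)

# Rescaling either vector leaves the score unchanged:
assert np.isclose(cosine_similarity(a, b), cosine_similarity(3.7 * a, 0.25 * b))
```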
The similarity score from SentenceTransformer matches what is reported in both:
- Hugging Face UI
- Hugging Face Text Embeddings Inference (TEI)
This suggests Infinity is producing different embeddings than the expected reference implementations.
Reproduction
Infinity (CPU):

```shell
docker run --rm -it \
  -p 8080:8080 \
  michaelf34/infinity:latest-cpu \
  v2 \
  --engine optimum \
  --port 8080 \
  --model-id sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
```

Hugging Face TEI (CPU):

```shell
docker run -p 8081:80 -v $volume:/data --pull always \
  ghcr.io/huggingface/text-embeddings-inference-cpu:1.8 \
  --model-id sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
```

SentenceTransformer code:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2')
embeddings = model.encode([text1, text2])
```
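For a one-shot comparison of both servers, something like the sketch below could work. The request/response shapes are assumptions (an OpenAI-style `POST /embeddings` on Infinity, a `POST /embed` returning a bare list of vectors on TEI), and the ports match the docker commands above:

```python
import json
import urllib.request

import numpy as np

def cosine_similarity(v1, v2):
    """Cosine similarity between two embedding vectors."""
    v1, v2 = np.asarray(v1, dtype=float), np.asarray(v2, dtype=float)
    return float(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))

def post_json(url, payload):
    """POST a JSON payload and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def compare_backends(texts=("mountains", "joyeux noel")):
    """Query both running containers and print their similarity scores."""
    model = "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
    # Infinity: assumed OpenAI-compatible response {"data": [{"embedding": [...]}, ...]}
    inf = post_json("http://localhost:8080/embeddings",
                    {"model": model, "input": list(texts)})
    e1, e2 = (item["embedding"] for item in inf["data"])
    print("Infinity:", cosine_similarity(e1, e2))
    # TEI: assumed to return a list of vectors, one per input
    tei = post_json("http://localhost:8081/embed", {"inputs": list(texts)})
    print("TEI:", cosine_similarity(tei[0], tei[1]))
```

With both containers up, calling `compare_backends()` should print roughly 0.497 for Infinity and 0.354 for TEI if the discrepancy reproduces.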