Model description
The Qwen3 models easily outperform nearly every other open-source embedding model; however, they do not work in infinity because the bundled transformers version is outdated.
My docker compose file:
```yaml
version: '3.8'
services:
  infinity:
    image: michaelf34/infinity:latest
    environment:
      DO_NOT_TRACK: 1 # Disable telemetry
      INFINITY_BETTERTRANSFORMER: True
      HF_HOME: /app/data # Use /app/data
      INFINITY_MODEL_ID: Qwen/Qwen3-Embedding-4B;Qwen/Qwen3-Reranker-4B # Model(s), semicolon separated
      INFINITY_PORT: 7997 # Port
      INFINITY_API_KEY: foo # Optional API key
      INFINITY_DEVICE: cuda
      INFINITY_VECTOR_DISK_CACHE: True
    volumes:
      - ./infinity:/app/data:rw # Persist /app/data to ./infinity in the current directory
      - ./models:/data/models:ro # Mount local models to /data/models in read-only mode
      - ./infinity:/data/hf_cache:rw # Mount cache data to /data/hf_cache in read-write mode
    ports:
      - "7997:7997" # Flexible port mapping
    command: v2
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

volumes:
  infinity: # Named volume declaration
```
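For context, once the container is running with a supported model, infinity exposes an OpenAI-compatible embeddings endpoint on the mapped port. A minimal client sketch, assuming the port (7997), API key (`foo`), and model ID from the compose file above (the `build_payload`/`embed` helper names are my own):

```python
import json
import urllib.request

def build_payload(texts, model="Qwen/Qwen3-Embedding-4B"):
    # OpenAI-compatible request body: a model name plus a list of input texts.
    return {"model": model, "input": list(texts)}

def embed(texts, base_url="http://localhost:7997", api_key="foo"):
    # POST to the /embeddings endpoint using the port and API key
    # configured in the compose file above.
    req = urllib.request.Request(
        f"{base_url}/embeddings",
        data=json.dumps(build_payload(texts)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With the Qwen3 checkpoint failing to load, this request never gets a chance to run, which is what makes the transformers pin the blocker.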
This results in the error:
infinity-1 | ValueError: The checkpoint you are trying to load has model type `qwen3` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
To fix this, updating transformers to >=4.51.0, as per the Qwen documentation, should be sufficient.
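The error is purely a version gate: transformers only recognizes the `qwen3` model type from release 4.51.0 onwards (per the Qwen documentation), so any older release fails with the ValueError above. A minimal sketch of that check (the `supports_qwen3` helper name is my own):

```python
def supports_qwen3(transformers_version: str) -> bool:
    # The `qwen3` model type was added in transformers 4.51.0 (per the Qwen docs);
    # older releases raise the "does not recognize this architecture" ValueError.
    major, minor = (int(part) for part in transformers_version.split(".")[:2])
    return (major, minor) >= (4, 51)

# Example: an older release vs. the first release that loads Qwen3 checkpoints.
print(supports_qwen3("4.44.2"))  # False
print(supports_qwen3("4.51.0"))  # True
```

At runtime the installed version is available as `transformers.__version__`, which is what the checkpoint loader effectively compares against.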
Open source status & huggingface transformers
- The model implementation is available on transformers
- The model weights are available on huggingface-hub
- I verified that the model is currently not running in the latest version (pip install infinity_emb[all] --upgrade)
- I made the authors of the model aware that I want to use it with infinity_emb & check if they are aware of the issue.
dotmobo