Describe the problem
SentenceTransformerEmbeddingFunction does not expose any of the parallelism parameters that sentence-transformers natively supports. The __call__ method hard-codes only two arguments:
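For reference, the call site looks roughly like this (a paraphrase of the chromadb wrapper, not the exact source; the free function form and the return conversion are simplifications):

```python
# Rough paraphrase of SentenceTransformerEmbeddingFunction.__call__:
# only convert_to_numpy and normalize_embeddings are passed, so
# batch_size, pool, and device options never reach .encode().
def call(model, input):
    embeddings = model.encode(
        list(input),
        convert_to_numpy=True,
        normalize_embeddings=True,
    )
    return [list(row) for row in embeddings]
```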
This means batch_size, pool, and multi-device encoding are silently unavailable. When embedding large document sets locally, users are forced either to subclass the wrapper themselves or to pre-compute embeddings outside Chroma entirely, defeating the purpose of the built-in embedding function.
Describe the proposed solution
Add two optional parameters to SentenceTransformerEmbeddingFunction:
- batch_size: int = 32. Passed directly to .encode(); controls how many documents are processed per forward pass. No behaviour change for existing users.
- multiprocess_devices: list[str] | None = None. When provided, starts a persistent sentence-transformers multi-process pool across the given devices (e.g. ["cuda:0", "cuda:1"] or ["cpu", "cpu", "cpu", "cpu"]). The pool is created once in __init__ and reused across every __call__ to avoid per-batch process-spawn overhead.
Both parameters default to existing behaviour, so the change is fully backward compatible.
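A minimal sketch of the proposed wrapper, assuming the sentence-transformers start_multi_process_pool / encode_multi_process / stop_multi_process_pool API. The class name is hypothetical, and the model is injected rather than constructed, so the sketch stays self-contained:

```python
class ParallelSTEmbeddingFunction:
    """Hypothetical sketch of the proposal. `model` is expected to be a
    sentence_transformers.SentenceTransformer (or any object exposing the
    same encode / start_multi_process_pool / encode_multi_process API)."""

    def __init__(self, model, batch_size=32, multiprocess_devices=None):
        self._model = model
        self._batch_size = batch_size
        self._pool = None
        if multiprocess_devices:
            # Created once and reused across calls to avoid
            # per-batch process-spawn overhead.
            self._pool = model.start_multi_process_pool(
                target_devices=multiprocess_devices
            )

    def __call__(self, input):
        docs = list(input)
        if self._pool is not None:
            emb = self._model.encode_multi_process(
                docs, self._pool, batch_size=self._batch_size
            )
        else:
            emb = self._model.encode(docs, batch_size=self._batch_size)
        return [list(row) for row in emb]

    def close(self):
        # Multi-process pools must be shut down explicitly.
        if self._pool is not None:
            self._model.stop_multi_process_pool(self._pool)
            self._pool = None
```

With the defaults (batch_size=32, multiprocess_devices=None) this takes the same single-process encode() path as today, which is what makes the change backward compatible.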
Alternatives considered
Using FastEmbedEmbeddingFunction: FastEmbed does expose batch_size and parallel ([ENH]: FastEmbed embedding function support #1986), but it is a different library with a different model registry, so it is not a drop-in replacement for users already on sentence-transformers models.
Pre-computing embeddings outside Chroma: users can pass pre-computed embeddings directly to .add(), but this breaks the collection-level embedding-function contract and requires managing embedding logic outside Chroma.
Importance
would make my life easier
Additional Information
No response