Support for bit precision in the Inference API text_embedding task #111747

Open
@jimczi

Description

Some inference API providers now support embedding models in which each dimension is encoded as a single bit. For example, the v3 models from Cohere offer this capability. Since we already handle the bit element type in the dense_vector field, it would be beneficial to extend this support so the text_embedding task of the inference API can output vectors with bit precision.

Typically, bit vectors are paired with float or byte vectors to improve recall by rescoring the hits from bit vectors with higher precision vectors. To support this, we suggest allowing the text_embedding task to generate multiple vectors for the same input at different precisions (e.g., bits + floats or bits + int8). While this functionality is already available in the Cohere API, implementing it in the inference API would optimize performance by eliminating the need to make two separate API calls for each precision, thereby reducing costs for users.
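The two-phase retrieval described above can be sketched outside the inference API. This is an illustrative toy example, not the Cohere or Elasticsearch implementation: float vectors are binarized by sign, candidates are retrieved cheaply by Hamming distance over the bit vectors, and the top candidates are rescored with full-precision dot products.

```python
import numpy as np

def binarize(vec):
    """Quantize a float vector to packed bits: 1 where the component is > 0."""
    return np.packbits(vec > 0)

def hamming(a, b):
    """Hamming distance between two packed bit vectors."""
    return int(np.unpackbits(a ^ b).sum())

# Toy corpus of float embeddings standing in for model output.
rng = np.random.default_rng(0)
corpus = rng.standard_normal((1000, 64)).astype(np.float32)
query = rng.standard_normal(64).astype(np.float32)

corpus_bits = np.array([binarize(v) for v in corpus])
query_bits = binarize(query)

# Phase 1: cheap candidate retrieval with Hamming distance over bit vectors.
dists = np.array([hamming(query_bits, d) for d in corpus_bits])
candidates = np.argsort(dists)[:50]

# Phase 2: rescore the candidates with full-precision dot products.
scores = corpus[candidates] @ query
reranked = candidates[np.argsort(-scores)]
```

Generating both precisions in one model call, as proposed here, is what avoids paying for two embedding requests per input.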

This would require the mapping to be defined with two fields, each corresponding to a different precision.
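As a rough sketch of such a mapping (field names are hypothetical, and the dims value is only an example), using the element types dense_vector already supports:

```json
{
  "mappings": {
    "properties": {
      "embedding_bits": {
        "type": "dense_vector",
        "element_type": "bit",
        "dims": 1024
      },
      "embedding_floats": {
        "type": "dense_vector",
        "element_type": "float",
        "dims": 1024
      }
    }
  }
}
```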

Additionally, we should evaluate whether the semantic_text field could natively support this scenario.
