[Cosmos] [Tracking] Embedding Generation in the Python SDK (V0) #46729

@ananth7592

Summary

V0 plan (parity with .NET tracking issue Azure/azure-cosmos-dotnet-v3#5830) for adding automatic embedding generation for hybrid / vector queries in the Python Cosmos SDK (azure-cosmos).

When a customer writes:

SELECT TOP 10 *
FROM c
ORDER BY VectorDistance(c.text, 'big brown cat')

the gateway rewrites it to:

SELECT TOP 10 *
FROM c
ORDER BY VectorDistance(c.embedding, @documentdb-hybridsearchquery-embedding-0)

…and returns an embeddingParameterMap mapping @documentdb-hybridsearchquery-embedding-0 → 'big brown cat'. The SDK calls the registered EmbeddingGenerator, injects the returned vectors as query parameters, and executes per-partition.
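The injection step could look roughly like the sketch below. It is an illustration under assumptions, not the SDK's implementation: `build_embedding_parameters` and `fake_generate` are hypothetical names, the map is assumed to arrive as a plain dict (the wire shape is still an open item), and the `{"name", "value"}` entries follow the shape that `query_items` already accepts for `parameters=`.

```python
from typing import Callable, Dict, List, Sequence


# Hypothetical helper (not part of azure-cosmos): turns the gateway's
# embeddingParameterMap into query parameters that carry the vectors.
def build_embedding_parameters(
    embedding_parameter_map: Dict[str, str],
    generate_embeddings: Callable[[Sequence[str]], Sequence[Sequence[float]]],
) -> List[dict]:
    names = list(embedding_parameter_map.keys())
    texts = [embedding_parameter_map[name] for name in names]
    vectors = generate_embeddings(texts)  # one batched call for all texts
    if len(vectors) != len(texts):
        raise ValueError("generator returned a different number of vectors than texts")
    return [{"name": n, "value": list(v)} for n, v in zip(names, vectors)]


# Stand-in generator for illustration only: a 2-dimensional fake "embedding".
def fake_generate(texts: Sequence[str]) -> List[List[float]]:
    return [[float(len(t)), float(t.count(" "))] for t in texts]


params = build_embedding_parameters(
    {"@documentdb-hybridsearchquery-embedding-0": "big brown cat"},
    fake_generate,
)
print(params)
```

The length check guards against a provider that silently drops or merges inputs, which would otherwise bind a vector to the wrong parameter name.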

Architecture

Protocols (core package — azure-cosmos)

from typing import Protocol, Sequence, runtime_checkable

@runtime_checkable
class EmbeddingGenerator(Protocol):
    def generate_embeddings(self, texts: Sequence[str]) -> Sequence[Sequence[float]]: ...

@runtime_checkable
class AsyncEmbeddingGenerator(Protocol):
    async def generate_embeddings_async(self, texts: Sequence[str]) -> Sequence[Sequence[float]]: ...

Passed via embedding_generator= keyword on query_items (sync + async).
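For customers bringing their own provider, a conforming class only needs the one method. The sketch below is a toy, deterministic generator intended for local tests; `HashingEmbeddingGenerator` is a hypothetical name and its output is not a meaningful embedding. The protocol is repeated so the block runs on its own.

```python
import hashlib
from typing import List, Protocol, Sequence, runtime_checkable


@runtime_checkable
class EmbeddingGenerator(Protocol):
    def generate_embeddings(self, texts: Sequence[str]) -> Sequence[Sequence[float]]: ...


class HashingEmbeddingGenerator:
    """Toy generator for tests; NOT a real embedding model."""

    def __init__(self, dimensions: int = 8) -> None:
        if not 1 <= dimensions <= 32:  # a SHA-256 digest has 32 bytes
            raise ValueError("dimensions must be between 1 and 32")
        self.dimensions = dimensions

    def generate_embeddings(self, texts: Sequence[str]) -> List[List[float]]:
        vectors = []
        for text in texts:
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            # Map the first `dimensions` bytes to floats in [0, 1].
            vectors.append([b / 255.0 for b in digest[: self.dimensions]])
        return vectors


generator = HashingEmbeddingGenerator(dimensions=4)
# runtime_checkable only verifies that the method exists, not its signature.
assert isinstance(generator, EmbeddingGenerator)
vectors = generator.generate_embeddings(["big brown cat", "small grey dog"])
print(len(vectors), len(vectors[0]))
```

Under this proposal, such an object would be the value passed as embedding_generator= on query_items. Because the isinstance check is structural and shallow, the SDK would presumably still need to validate the returned vectors at call time.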

Provider package (azure-cosmos-embedding)

Separate optional PyPI package — analogous to the .NET Microsoft.Azure.Cosmos.Embedding. Ships AzureOpenAIEmbeddingGenerator (implements both protocols). Customers who use a different provider implement the protocol directly. azure-cosmos has no dependency on this package.

VectorEmbeddingPolicy extension

The embeddingSource block (new in V0) is added to each entry in vectorEmbeddings. It carries the source paths, deployment name, model name, endpoint, and auth type:

"embeddingSource": {
  "sourcePaths": ["/title", "/abstract"],
  "deploymentName": "text-embedding-3-small",
  "modelName": "text-embedding-3-small",
  "endpoint": "https://embedding-south-central.cognitiveservices.azure.com/",
  "authType": "ApiKey"
}

A new EmbeddingSource TypedDict, along with typed VectorEmbedding / VectorEmbeddingPolicy equivalents, is added to azure-cosmos to support typed construction of AzureOpenAIEmbeddingGenerator.from_embedding_source(...).
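One possible shape for these typed dictionaries is sketched below. The EmbeddingSource keys mirror the JSON above; everything else is an assumption for illustration: the final definitions may differ, the VectorEmbedding keys are taken from the existing vector embedding policy format, and the path, dimensions, and distance function values are examples only.

```python
from typing import List, TypedDict


class EmbeddingSource(TypedDict):
    sourcePaths: List[str]
    deploymentName: str
    modelName: str
    endpoint: str
    authType: str


class VectorEmbedding(TypedDict, total=False):
    # Keys of the existing vector embedding policy entry.
    path: str
    dataType: str
    dimensions: int
    distanceFunction: str
    # New in V0 (optional here, since existing policies do not carry it).
    embeddingSource: EmbeddingSource


class VectorEmbeddingPolicy(TypedDict):
    vectorEmbeddings: List[VectorEmbedding]


policy: VectorEmbeddingPolicy = {
    "vectorEmbeddings": [
        {
            "path": "/embedding",          # illustrative value
            "dataType": "float32",         # illustrative value
            "dimensions": 1536,            # illustrative value
            "distanceFunction": "cosine",  # illustrative value
            "embeddingSource": {
                "sourcePaths": ["/title", "/abstract"],
                "deploymentName": "text-embedding-3-small",
                "modelName": "text-embedding-3-small",
                "endpoint": "https://embedding-south-central.cognitiveservices.azure.com/",
                "authType": "ApiKey",
            },
        }
    ]
}
print(policy["vectorEmbeddings"][0]["embeddingSource"]["deploymentName"])
```

TypedDicts are plain dicts at runtime, so a policy built this way serializes to the same JSON as an untyped one; the types only help static checkers and editors.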

Flow

  1. SDK → Gateway: query plan request for raw user query.
  2. Gateway → SDK: rewritten plan + embeddingParameterMap (key → original text).
  3. SDK → EmbeddingGenerator: one batched call with all texts.
  4. SDK → backend: per-partition execution with embeddings injected as parameters.
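The four steps can be simulated end to end with stubs. Everything below is hypothetical scaffolding for discussion: `fake_query_plan`, `fake_execute`, `run_query`, and the `rewrittenQuery` key are invented for the sketch, and running the partitions concurrently with `asyncio.gather` is an illustrative choice rather than a statement about how the aggregator schedules work.

```python
import asyncio
from typing import Dict, List, Sequence


class FakeAsyncGenerator:
    """Stand-in for an AsyncEmbeddingGenerator; counts calls to show batching."""

    def __init__(self) -> None:
        self.calls = 0

    async def generate_embeddings_async(self, texts: Sequence[str]) -> List[List[float]]:
        self.calls += 1
        return [[float(len(t))] for t in texts]


async def fake_query_plan(query: str) -> dict:
    # Steps 1-2 (stubbed): the gateway returns the rewritten query plus the map.
    return {
        "rewrittenQuery": "SELECT TOP 10 * FROM c ORDER BY "
        "VectorDistance(c.embedding, @documentdb-hybridsearchquery-embedding-0)",
        "embeddingParameterMap": {
            "@documentdb-hybridsearchquery-embedding-0": "big brown cat"
        },
    }


async def fake_execute(partition: str, query: str, parameters: List[dict]) -> dict:
    # Step 4 (stubbed): a real SDK would send the query to one partition.
    return {"partition": partition, "parameterCount": len(parameters)}


async def run_query(query: str, generator: FakeAsyncGenerator, partitions: List[str]) -> List[dict]:
    plan = await fake_query_plan(query)
    param_map: Dict[str, str] = plan["embeddingParameterMap"]
    names = list(param_map)
    # Step 3: a single batched call covering every text in the map.
    vectors = await generator.generate_embeddings_async([param_map[n] for n in names])
    parameters = [{"name": n, "value": v} for n, v in zip(names, vectors)]
    # Step 4: fan the rewritten query out to each partition.
    return await asyncio.gather(
        *(fake_execute(p, plan["rewrittenQuery"], parameters) for p in partitions)
    )


gen = FakeAsyncGenerator()
raw_query = "SELECT TOP 10 * FROM c ORDER BY VectorDistance(c.text, 'big brown cat')"
results = asyncio.run(run_query(raw_query, gen, ["0", "1", "2"]))
print(gen.calls, results)
```

The point of the call counter is that the embedding provider is hit once per query, not once per partition; the same parameters list is reused for every partition.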

Hard GW dependency: Python has no serviceinterop DLL. All query plans come from the gateway. Python SDK changes can land immediately, but end-to-end testing requires the gateway change to be deployed.

Sub-issues

Core package (azure-cosmos)

Provider package (azure-cosmos-embedding)

Open items

  1. Confirm wire shape of embeddingParameterMap in GW JSON response — object vs array of {key,value}.
  2. Confirm the exact string for the EmbeddingGeneration supported-feature token.
  3. azure-cosmos-embedding version: ship as 1.0.0b1 (beta) at Build, or stage post-Build?
  4. Auth: confirm which auth types (ApiKey, Entra/RBAC) the Foundry / AOAI endpoint supports in V0.
  5. Python SDK owner for this feature: Aayush.
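Until open item 1 is settled, the SDK-side parser could tolerate both candidate wire shapes. The sketch below is one way to do that; `normalize_embedding_parameter_map` is a hypothetical helper, and the `key` / `value` field names for the array form are taken from the wording of the open item rather than from a confirmed gateway contract.

```python
from typing import Any, Dict


def normalize_embedding_parameter_map(raw: Any) -> Dict[str, str]:
    """Accept either candidate wire shape and return {parameter name: text}."""
    if isinstance(raw, dict):
        # Object form: {"@param": "text", ...}
        return {str(k): str(v) for k, v in raw.items()}
    if isinstance(raw, list):
        # Array form: [{"key": "@param", "value": "text"}, ...]
        result: Dict[str, str] = {}
        for entry in raw:
            result[str(entry["key"])] = str(entry["value"])
        return result
    raise TypeError(f"unexpected embeddingParameterMap shape: {type(raw).__name__}")


as_object = {"@documentdb-hybridsearchquery-embedding-0": "big brown cat"}
as_array = [{"key": "@documentdb-hybridsearchquery-embedding-0", "value": "big brown cat"}]
print(normalize_embedding_parameter_map(as_object) == normalize_embedding_parameter_map(as_array))
```

Once the gateway contract is confirmed, the unused branch can be dropped so that an unexpected shape fails loudly instead of being silently accepted.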

Notes

References

  • .NET tracking issue: Azure/azure-cosmos-dotnet-v3#5830
  • Python files touched (core):
    • sdk/cosmos/azure-cosmos/azure/cosmos/documents.py: _QueryFeature token
    • sdk/cosmos/azure-cosmos/azure/cosmos/_execution_context/hybrid_search_aggregator.py: sync aggregator
    • sdk/cosmos/azure-cosmos/azure/cosmos/_execution_context/aio/hybrid_search_aggregator.py: async aggregator
    • sdk/cosmos/azure-cosmos/azure/cosmos/container.py: query_items keyword
    • sdk/cosmos/azure-cosmos/azure/cosmos/aio/_container.py: async query_items keyword

Metadata

Labels

Cosmos, feature-request (this issue requires a new behavior in the product in order to be resolved)
