Description
We are continuously addressing and improving the SDK. If possible, make sure the problem persists in the latest SDK version.
Describe the bug
When a LINQ query filters on a nested collection (e.g. using .Any() or .All()) and is combined with an OrderByRank operation (such as for hybrid or vector search), the SDK translates the expression into a SQL query that joins on a nested SELECT. The query fails because ORDER BY RANK does not support joins. Ideally, the queryable provider should emit an EXISTS clause inline in the WHERE clause, rather than through a join alias, so that the query works as expected.
To Reproduce
Steps to reproduce the behavior:
- Define a document model with a nested collection:

    public class Product
    {
        public string id { get; set; }
        public List<Tag> Tags { get; set; }
        public float[] Embedding { get; set; }
    }

    public class Tag
    {
        public string Name { get; set; }
    }
- Run a LINQ query performing a hybrid search and filtering with .Any() on the nested collection:

    var query = container.GetItemLinqQueryable<Product>()
        .Where(p => p.Tags.Any(t => t.Name == "Electronics"))
        .OrderByRank(p => RRF(p.Embedding.VectorDistance(someQueryVector), ...));
- View the SQL produced by the SDK (simplified for illustration):

    SELECT *
    FROM c
    JOIN (SELECT VALUE EXISTS (SELECT VALUE t FROM t IN c.Tags WHERE t.Name = "Electronics")) AS t
    WHERE t
    ORDER BY RANK RRF(VectorDistance(c.Embedding, @queryVector), FullTextScore(...))
This fails with an error similar to:
The JOIN operator is not allowed with the ORDER BY RANK clause.
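To capture the generated SQL from step 3 without executing the query, something like the following sketch may help. It relies on the SDK's `ToQueryDefinition()` LINQ extension and the `QueryText` property; the `QueryInspection` class and `GetGeneratedSql` method names are hypothetical and exist only for this illustration. It assumes the queryable was created by `GetItemLinqQueryable`, since the extension is not expected to work on other LINQ providers.

```csharp
using System.Linq;
using Microsoft.Azure.Cosmos;
using Microsoft.Azure.Cosmos.Linq;

// Hypothetical helper for this issue; not part of the SDK.
public static class QueryInspection
{
    // Returns the SQL text the LINQ provider generates for a Cosmos queryable.
    // The query is only translated here, not sent to the service.
    public static string GetGeneratedSql<T>(IQueryable<T> query)
    {
        QueryDefinition definition = query.ToQueryDefinition();
        return definition.QueryText;
    }
}
```

Passing the `query` variable from step 2 to `QueryInspection.GetGeneratedSql(query)` should return the translated SQL, which makes it possible to check whether the provider emitted a JOIN or an inline EXISTS.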
Expected behavior
The query provider should emit the EXISTS clause for the collection filter directly in the WHERE clause, without wrapping it in a joined SELECT, so that the filter can be combined with OrderByRank. For example:
SELECT * FROM c
WHERE EXISTS (SELECT VALUE t FROM t IN c.Tags WHERE t.Name = "Electronics")
ORDER BY RANK RRF(VectorDistance(c.Embedding, @queryVector), FullTextScore(...))
Actual behavior
The SDK currently generates a join when using .Any() or .All() on a nested collection, leading to unsupported queries if combined with OrderByRank.
Environment summary
SDK Version: latest version
OS Version: Windows 11 (also tested and confirmed on Ubuntu Linux in our deployment environment)
Additional context
We use OData to build our expression trees, so we cannot write the queries by hand. This blocks us from combining lambda filters on collections with semantic/hybrid search in our APIs.