Skip to content

Cosmos Full Text Search support #35868

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 15, 2025
Merged

Cosmos Full Text Search support #35868

merged 2 commits into from
Apr 15, 2025

Conversation

maumar
Copy link
Contributor

@maumar maumar commented Mar 29, 2025

  • Adding model building API to configure property as full-text search enabled, as well as setup the index for it,
  • Adding model validation (e.g. FTS index not matching FTS property),
  • Adding EF.Functions stubs and translations for FullTextContains, FullTextContainsAll, FullTextContainsAny, FullTextScore and RRF (for hybrid),
  • Adding logic in SelectExpression to produce ORDER BY RANK when necessary,
  • Adding validation when attempting to mix with ORDER BY RANK with regular ORDER BY,
  • Rewrite OFFSET/LIMIT from parameter to constant when ORDER BY RANK is present.
  • Adding model building support for default language on the container level

Also fixed / added support for vector search on owned types (since it shares logic with FTS) and added some tests.

outstanding work:

  • support for FTS Container building using Azure.ResourceManager.CosmosDb (currently blocked on updated package being released)

Fixes #35476
Fixes #35851
Fixes #35852
Fixes #35853
Fixes #35867

@maumar maumar requested a review from Copilot March 29, 2025 02:22
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for full-text search in Cosmos by introducing new full-text index types, translators, model validations, and extensions.

  • Introduces full-text indexing support in container creation and client wrappers.
  • Adds a new CosmosFullTextSearchTranslator and corresponding DbFunctions extensions.
  • Updates model validation and Fluent API extensions to include full-text search configuration.

Reviewed Changes

Copilot reviewed 23 out of 26 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/EFCore.Cosmos/Storage/Internal/CosmosClientWrapper.cs Adds support for full-text indexing in container creation, but the new full-text index assignment may override vector index settings.
src/EFCore.Cosmos/Query/Internal/Translators/CosmosVectorSearchTranslator.cs Modifies method checking logic that may prematurely reject valid method calls by changing && to
src/EFCore.Cosmos/Query/Internal/Translators/CosmosFullTextSearchTranslator.cs Implements a new translator for full-text search functions.
src/EFCore.Cosmos/Query/Internal/Expressions/SqlFunctionExpression.cs Introduces an extra constructor parameter to flag scoring functions.
src/EFCore.Cosmos/Query/Internal/Expressions/SelectExpression.cs Adds a RankOrdering property and methods to apply rank-based ordering.
src/EFCore.Cosmos/Query/Internal/CosmosQueryableMethodTranslatingExpressionVisitor.cs Adjusts ordering logic to support scoring functions and rank ordering.
src/EFCore.Cosmos/Query/Internal/CosmosQuerySqlGenerator.cs Updates SQL generation to include rank ordering handling.
src/EFCore.Cosmos/Metadata/Internal/CosmosAnnotationNames.cs Defines new annotations for full-text search configuration.
src/EFCore.Cosmos/Infrastructure/Internal/CosmosModelValidator.cs Validates full-text index configuration, ensuring only single-property indexes with language annotation are defined.
src/EFCore.Cosmos/Extensions/CosmosPropertyExtensions.cs, CosmosPropertyBuilderExtensions.cs, CosmosIndexExtensions.cs, CosmosIndexBuilderExtensions.cs, CosmosDbFunctionsExtensions.cs Adds Fluent API and extension methods for configuring full-text search on properties and indexes.
Files not reviewed (3)
  • Directory.Packages.props: Language not supported
  • src/EFCore.Cosmos/EFCore.Cosmos.csproj: Language not supported
  • src/EFCore.Cosmos/Properties/CosmosStrings.Designer.cs: Language not supported
Comments suppressed due to low confidence (2)

src/EFCore.Cosmos/Query/Internal/Translators/CosmosVectorSearchTranslator.cs:33

  • Changing the logical operator from '&&' to '||' alters the intended logic and causes the translator to return null for methods declared on CosmosDbFunctionsExtensions that are not named 'VectorDistance'. Revisit the condition to ensure it only returns null when both conditions are unmet.
if (method.DeclaringType != typeof(CosmosDbFunctionsExtensions) || method.Name != nameof(CosmosDbFunctionsExtensions.VectorDistance))

src/EFCore.Cosmos/Extensions/CosmosPropertyExtensions.cs:137

  • [nitpick] The Experimental attribute for full-text search methods is using 'CosmosVectorSearchExperimental' instead of the more appropriate 'CosmosFullTextSearchExperimental'. This may cause confusion regarding the feature's experiment status.
[Experimental(EFDiagnostics.CosmosVectorSearchExperimental)]

@maumar maumar force-pushed the cosmos_fts_take_omega branch 2 times, most recently from 0f39e44 to a836b63 Compare March 31, 2025 23:41
@maumar maumar force-pushed the cosmos_fts_take_omega branch from a836b63 to 898cc5f Compare April 1, 2025 00:17
@roji
Copy link
Member

roji commented Apr 1, 2025

@maumar just to note that I'm holding back from reviewing since this is still in draft (you seemed to maybe want people to take a look in yesterday's meeting). Let me know if you want a review etc.

@maumar
Copy link
Contributor Author

maumar commented Apr 1, 2025

@roji feel free to hold off on reviewing a draft. I'm pretty close to marking it for review I think

@maumar maumar force-pushed the cosmos_fts_take_omega branch from a45f839 to 245066a Compare April 1, 2025 23:20
Copy link
Member

@roji roji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ended up doing a quick review even though the PR is still in draft... Here are some comments.

@maumar maumar force-pushed the cosmos_fts_take_omega branch 3 times, most recently from 73b0297 to 30e0bd9 Compare April 2, 2025 02:27
@maumar maumar force-pushed the cosmos_fts_take_omega branch from 30e0bd9 to ebebcfc Compare April 2, 2025 03:18
@maumar maumar force-pushed the cosmos_fts_take_omega branch from ebebcfc to 68ca715 Compare April 3, 2025 01:20
Copy link
Member

@roji roji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/cc @AndriySvyryd in case you want to take a quick look for the metadata side and Cosmos container building part.

Copy link
Member

@roji roji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maumar see batch of comments.

I think we should maybe discuss the FTS property vs. index question in design...

@maumar maumar force-pushed the cosmos_fts_take_omega branch 2 times, most recently from 64a2808 to 0933839 Compare April 5, 2025 01:27
maumar added a commit that referenced this pull request Apr 7, 2025
Fixes #35476
Fixes #35853 (need to fix this one, otherwise vector translator will try to translate full text methods and fail)

Description
This PR enables full-text search queries using EF Core 9 when targeting Azure Cosmos Db. This is one of the flagship new features for Cosmos and the aim here is to help with it's adoption. This is very limited port of the full feature we are adding in EF 10 Preview 4. We are only adding querying capabilities: function stubs for FullTextContains, FullTextContainsAll, FullTextContainsAny, FullTextScore and RRF as well as logic translating these signatures to built-in Cosmos functions.
No model building or data manipulation - containers need to be created outside EF (using Cosmos SDK or in the Data Explorer).

Customer impact
Customers will be able to use the upcoming full text search capabilities when working with Azure Cosmos Db without the need to upgrade to EF 10 preview.

How found
Partner team ask.

Regression
No

Testing
Extensively tested on EF 10, manual testing on EF9. End-to-end testing is not possible because we can't create containers programmatically (no support for it inside EF Core itself, and the Cosmos SDK which supports it is currently only available in beta, so we can't take dependency on it). Instead, we created containers and data using EF 10, ported all the query tests from EF 10 and ran them using the EF9 bits.

Risk
Low. Code here is purely additive and actually localized to only a handful of places in the code: validation in ApplyOrdering/AppendOrdering, FTS method translator, parameter inliner and sql generator. Feature is marked as experimental and quirks have been added.
@maumar maumar force-pushed the cosmos_fts_take_omega branch 3 times, most recently from d9fa4f3 to b554e29 Compare April 9, 2025 01:50
@maumar
Copy link
Contributor Author

maumar commented Apr 9, 2025

@roji @AndriySvyryd I updated the PR addressing the Property/Index API and fixing a few small errors. You can review just the second commit - all the changes are there.

  • I put the HasDefaultFullTextLanguage setting on the entity type builder, rather than creating a container builder just for it. We already have some container-level settings (e.g. TTL) and we defined them on entity type builder, so going for consistency here.
  • Added tests to make sure values are propagated correctly (if property specifies language, use that, otherwise use default, otherwise use en-US). These tests need to be adjusted once cosmos supports more languages, for now they all just throw Cosmos exception complaining about not supported language. I manually verified that we wire-up things correctly when creating the container.
  • Added test for using full text property without the index and removed validation preventing this scenario.
  • Conditionally skipping the tests when running on live service due to Azure.ResourceManager.CosmosDb not supporting FTS.

@maumar maumar force-pushed the cosmos_fts_take_omega branch from b554e29 to 65cc1e5 Compare April 9, 2025 02:05
@maumar maumar force-pushed the cosmos_fts_take_omega branch 2 times, most recently from 8d8771f to 993bc56 Compare April 9, 2025 10:56
@maumar
Copy link
Contributor Author

maumar commented Apr 9, 2025

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@maumar maumar force-pushed the cosmos_fts_take_omega branch 3 times, most recently from 5c2c1ea to 49bedd3 Compare April 14, 2025 10:08
maumar added a commit that referenced this pull request Apr 14, 2025
…35909)

* Limited port of #35868

Fixes #35476
Fixes #35853 (need to fix this one, otherwise vector translator will try to translate full text methods and fail)

Description
This PR enables full-text search queries using EF Core 9 when targeting Azure Cosmos Db. This is one of the flagship new features for Cosmos and the aim here is to help with it's adoption. This is very limited port of the full feature we are adding in EF 10 Preview 4. We are only adding querying capabilities: function stubs for FullTextContains, FullTextContainsAll, FullTextContainsAny, FullTextScore and RRF as well as logic translating these signatures to built-in Cosmos functions.
No model building or data manipulation - containers need to be created outside EF (using Cosmos SDK or in the Data Explorer).

Customer impact
Customers will be able to use the upcoming full text search capabilities when working with Azure Cosmos Db without the need to upgrade to EF 10 preview.

How found
Partner team ask.

Regression
No

Testing
Extensively tested on EF 10, manual testing on EF9. End-to-end testing is not possible because we can't create containers programmatically (no support for it inside EF Core itself, and the Cosmos SDK which supports it is currently only available in beta, so we can't take dependency on it). Instead, we created containers and data using EF 10, ported all the query tests from EF 10 and ran them using the EF9 bits.

Risk
Low. Code here is purely additive and actually localized to only a handful of places in the code: validation in ApplyOrdering/AppendOrdering, FTS method translator, parameter inliner and sql generator. Feature is marked as experimental and quirks have been added.

* fix FTContainsAll/Any for constants,
change TFScore signature to accept params,
fixed SqlFunctionExpression to use the new ctor internally,
added mandatory property overrides for FragmentExpression (vector search)
fixed parameter inliner to not match parameter of type string, but only string[] when processing ContainsAll/Any
@maumar maumar force-pushed the cosmos_fts_take_omega branch from 49bedd3 to 31e670c Compare April 15, 2025 18:02
- Adding model building API to configure property as full-text search enabled, as well as setup the index for it,
- Adding model validation (e.g. FTS index not matching FTS property),
- Adding EF.Functions stubs and translations for FullTextContains, FullTextContainsAll, FullTextContainsAny, FullTextScore and RRF (for hybrid),
- Adding logic in SelectExpression to produce ORDER BY RANK when necessary,
- Adding validation when attempting to mix with ORDER BY RANK with regular ORDER BY,
- Rewrite OFFSET/LIMIT from parameter to constant when ORDER BY RANK is present.
- Adding model building support for default language on the container level

Also fixed / added support for vector search on owned types (since it shares logic with FTS) and added some tests.

outstanding work:
- support for FTS Container building using Azure.ResourceManager.CosmosDb (currently blocked on updated package being released)

Fixes #35476
Fixes #35853
Fixes #35867
Fixes #35852
@maumar maumar force-pushed the cosmos_fts_take_omega branch from 31e670c to b987e7f Compare April 15, 2025 18:51
@maumar maumar enabled auto-merge (squash) April 15, 2025 19:44
@maumar maumar merged commit 0024b2b into main Apr 15, 2025
7 checks passed
@maumar maumar deleted the cosmos_fts_take_omega branch April 15, 2025 20:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment