
Match operator language evolution #125665

Open
@ioanatia

Search functionality in ES|QL is driven primarily through search functions. Functions allow a lot of flexibility and customisation of options, but ultimately do not feel as delightful as other language constructs, e.g. operators or commands. This issue explores how we want to evolve full-text search as a language construct in ES|QL.

tracked in #123043

Current state

We have already delivered match and qstr as ES|QL functions, and we are now working on adding multi_match and match_phrase.
We also have a dedicated operator, :, which translates to a match query.
The operator has a short and delightful syntax, while the functions are more verbose but support configurable options:

FROM books | WHERE title:"Harry Potter"

is equivalent to

FROM books | WHERE match(title, "Harry Potter")
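Since both forms translate to a match query, it can help to picture the Query DSL body they correspond to. A minimal Python sketch, assuming a hypothetical helper named match_query (it is not part of any client library, and the exact body ES|QL produces internally may differ):

```python
import json


def match_query(field: str, text: str) -> dict:
    """Build a Query DSL `match` query body (hypothetical helper, for illustration only)."""
    return {"match": {field: {"query": text}}}


# Both `title:"Harry Potter"` and `match(title, "Harry Potter")` would map to this shape.
body = match_query("title", "Harry Potter")
print(json.dumps(body, indent=2))
```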

Future evolution of the match operator

We see that users naturally gravitate towards using the match operator, rather than the match function.
Right now we ask them to switch to using functions if they want something more than just doing a simple match.
We should be looking at how we can bridge the gap between the match operator and the functionality of full-text functions, while preserving the simplicity of the operator syntax.

Match on all fields

In KQL or Lucene query syntax it is very easy to match on all fields e.g. *:Error.
To achieve this in ES|QL, we require using the qstr or kql function:

FROM logs*
| WHERE qstr("*:Error")

With the match operator this could be expressed in a much simpler way:

FROM logs*
| WHERE *:"Error"

Similar to qstr, we should use the fields that are defined in index.query.default_field.

Match phrase

We are planning on adding support for match_phrase through functions.
We should also be looking at how we can add match phrase support in the match operator.

This could be through a special syntax:

FROM logs*
| WHERE message:`Internal Server Error`

Boosting

Boosting is a popular option that has a well-known notation in the Query DSL using ^, which we can reuse for the match operator:

FROM books METADATA _score
| WHERE title^0.8:"Harry Potter" OR plot^0.2:"Harry Potter"
| SORT _score DESC
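For comparison, the same intent can already be written in the Query DSL with per-clause boosts. The Python sketch below builds that body; the helper names boosted_match and any_of are hypothetical, and treating the ES|QL OR as a bool/should is an assumption about how the proposed syntax might be translated:

```python
def boosted_match(field: str, text: str, boost: float) -> dict:
    """A `match` query on one field with an explicit boost (hypothetical helper)."""
    return {"match": {field: {"query": text, "boost": boost}}}


def any_of(*clauses: dict) -> dict:
    """Combine clauses so that a document matching any of them is returned."""
    return {"bool": {"should": list(clauses)}}


# Assumed counterpart of: title^0.8:"Harry Potter" OR plot^0.2:"Harry Potter"
query = any_of(
    boosted_match("title", "Harry Potter", 0.8),
    boosted_match("plot", "Harry Potter", 0.2),
)
```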

Fuzziness

Fuzziness is another popular option when doing lexical search that also has its own notation in the Query DSL.
For the match operator:

FROM books
| WHERE title:"Harry Potter"~AUTO OR plot:"Harry Potter"~10
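The Query DSL match query already exposes fuzziness as an option, so the first clause above has an obvious counterpart. A rough Python sketch of that body; the helper name fuzzy_match is hypothetical and the mapping from the proposed ~ suffix is an assumption, not a settled design:

```python
def fuzzy_match(field: str, text: str, fuzziness: str = "AUTO") -> dict:
    """A `match` query with the `fuzziness` option set (hypothetical helper)."""
    return {"match": {field: {"query": text, "fuzziness": fuzziness}}}


# Assumed counterpart of: title:"Harry Potter"~AUTO
query = fuzzy_match("title", "Harry Potter")
```

One thing to pin down: in Lucene/query_string syntax, ~ after a quoted phrase denotes proximity (slop) rather than fuzziness, and as far as I know the match query's fuzziness only accepts AUTO or an edit distance of 0 to 2, so the meaning of "Harry Potter"~10 would need to be defined explicitly.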

Multi match

The match operator currently supports only a single field.
We could add support to query multiple fields (and translate to something like multi-match):

FROM books
| WHERE title,plot,tagline:"Harry Potter"

In combination with boosting:

FROM books
| WHERE title^0.7,plot^0.1,tagline^0.2:"Harry Potter"
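To make the proposed translation concrete, here is a small Python sketch that parses the left-hand side of the operator and builds a multi_match body. The function names parse_field_spec and to_multi_match are hypothetical, and the default boost of 1.0 for fields without ^ is an assumption; the field^boost notation inside "fields" is the one multi_match already accepts:

```python
from typing import List, Tuple


def parse_field_spec(spec: str) -> List[Tuple[str, float]]:
    """Parse 'title^0.7,plot^0.1' into [('title', 0.7), ('plot', 0.1)].

    Fields without an explicit boost get 1.0 (an assumption for this sketch).
    """
    fields = []
    for part in spec.split(","):
        name, _, boost = part.strip().partition("^")
        if not name:
            raise ValueError(f"empty field name in spec: {spec!r}")
        fields.append((name, float(boost) if boost else 1.0))
    return fields


def to_multi_match(spec: str, text: str) -> dict:
    """Build a `multi_match` body from the proposed operator left-hand side."""
    fields = [
        name if boost == 1.0 else f"{name}^{boost}"
        for name, boost in parse_field_spec(spec)
    ]
    return {"multi_match": {"query": text, "fields": fields}}


query = to_multi_match("title^0.7,plot^0.1,tagline^0.2", "Harry Potter")
```

Keeping the parsing step separate from the query construction means the same field list could also feed other translations if multi_match turns out not to be the right target.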

cc @carlosdelest
