Description
Search functionality in ES|QL is currently driven primarily through search functions. Functions allow a lot of flexibility and customisation through options, but they do not feel as delightful as other language constructs, e.g. operators or commands. This issue explores how we want to evolve full text search as a language construct in ES|QL.
tracked in #123043
Current state
We have already delivered match and qstr as ES|QL functions and we are now working on adding multi_match and match_phrase.
We also have a dedicated operator (:) which translates to a match query.
The operator has a short and delightful syntax, while the functions are more verbose but support configurable options:
FROM books | WHERE title:"Harry Potter"
is equivalent to
FROM books | WHERE match(title, "Harry Potter")
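As a rough illustration of the Query DSL shape that both forms map to, the sketch below builds a match query body as a plain Python dict. The helper name match_query is hypothetical and only used for illustration; the real translation happens inside ES|QL and may set additional parameters.

```python
import json


def match_query(field, text, **options):
    """Build a Query DSL match query body for one field (hypothetical helper).

    Extra keyword arguments stand in for the configurable options that the
    match function accepts and the operator currently does not.
    """
    body = {"query": text}
    body.update(options)
    return {"match": {field: body}}


# Roughly what  title:"Harry Potter"  and  match(title, "Harry Potter")  express.
simple = match_query("title", "Harry Potter")

# Options are only reachable through the function form today.
with_options = match_query("title", "Harry Potter", operator="and")

print(json.dumps(simple))
print(json.dumps(with_options))
```

The second call shows the gap discussed below: anything beyond the plain field and query pair has no operator syntax yet.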
Future evolution of the match operator
We see that users naturally gravitate towards using the match operator, rather than the match function.
Right now we ask them to switch to using functions if they want something more than just doing a simple match.
We should look at how we can bridge the gap between the match operator and the functionality of the full text functions, while preserving the simplicity of the operator syntax.
Match on all fields
In KQL or Lucene query syntax it is very easy to match on all fields, e.g. *:Error.
To achieve this in ES|QL, we currently have to use the qstr or kql function:
FROM logs*
| WHERE qstr("*:Error")
With the match operator this could be expressed in a much simpler way:
FROM logs*
| WHERE *:"Error"
Similar to qstr, we should use the fields that are defined in index.query.default_field.
Match phrase
We are planning on adding support for match_phrase through functions.
We should also be looking at how we can add match phrase support in the match operator.
This could be through a special syntax:
FROM logs*
| WHERE message:`Internal Server Error`
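One way to picture the special syntax is a translation step that picks the query type from the quoting style of the right-hand side. The sketch below is only an assumption about how that could work: the function name operator_query is hypothetical, and backticks meaning "phrase" is the proposal above, not existing ES|QL behaviour.

```python
def operator_query(field, literal):
    """Choose a Query DSL query type from the quoting style (hypothetical).

    "..."  -> match query (current behaviour of the match operator)
    `...`  -> match_phrase query (proposed special syntax)
    """
    if len(literal) >= 2 and literal[0] == literal[-1] == "`":
        return {"match_phrase": {field: {"query": literal[1:-1]}}}
    if len(literal) >= 2 and literal[0] == literal[-1] == '"':
        return {"match": {field: {"query": literal[1:-1]}}}
    raise ValueError("expected a double-quoted or backtick-quoted literal")


phrase = operator_query("message", "`Internal Server Error`")
terms = operator_query("message", '"Internal Server Error"')
```

With this split, the double-quoted form keeps its current meaning and only the backtick form changes the query type.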
Boosting
Boosting is a popular option that has a well-known notation in the Query DSL using ^ that we can reuse for the match operator:
FROM books METADATA _score
| WHERE title^0.8:"Harry Potter" OR plot^0.2:"Harry Potter"
| SORT _score DESC
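To make the intended semantics concrete, the following sketch parses the proposed field^boost notation and builds what the WHERE clause above might translate to: a bool query with two boosted match clauses. All helper names are hypothetical, and the exact translation (for example how OR is mapped) is an assumption.

```python
from typing import Tuple


def parse_boosted_field(spec: str) -> Tuple[str, float]:
    """Split 'title^0.8' into ('title', 0.8); the boost defaults to 1.0."""
    field, sep, boost = spec.partition("^")
    return field, float(boost) if sep else 1.0


def boosted_match(spec: str, text: str) -> dict:
    """Build a match query with a boost taken from the field spec."""
    field, boost = parse_boosted_field(spec)
    return {"match": {field: {"query": text, "boost": boost}}}


def any_of(*clauses: dict) -> dict:
    """Assumed translation of OR: a bool query with should clauses."""
    return {"bool": {"should": list(clauses), "minimum_should_match": 1}}


query = any_of(
    boosted_match("title^0.8", "Harry Potter"),
    boosted_match("plot^0.2", "Harry Potter"),
)
```

A field without a ^ suffix keeps the neutral boost of 1.0, so existing operator expressions would be unaffected.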
Fuzziness
Fuzziness is another popular option when doing lexical search that also has its own notation in the Query DSL.
For the match operator:
FROM books
| WHERE title:"Harry Potter"~AUTO OR plot:"Harry Potter"~10
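A minimal sketch of how the ~ suffix could be carried over to the fuzziness parameter of the match query follows. The helper names are hypothetical and the suffix is passed through unchanged. One open point, stated as an assumption rather than a decision: the match query documents AUTO and small edit distances (0 to 2) as fuzziness values, so a suffix such as ~10 would likely need validation or a different meaning.

```python
def split_fuzziness(term):
    """Split '"Harry Potter"~AUTO' into ('Harry Potter', 'AUTO').

    The ~ only counts as a suffix marker when it directly follows the
    closing double quote, so a ~ inside the quoted text is left alone.
    """
    head, sep, suffix = term.rpartition("~")
    if sep and head.endswith('"'):
        return head.strip('"'), suffix
    return term.strip('"'), None


def fuzzy_match(field, term):
    """Build a match query, adding fuzziness only when a suffix is present."""
    text, fuzziness = split_fuzziness(term)
    body = {"query": text}
    if fuzziness is not None:
        body["fuzziness"] = fuzziness  # passed through without validation
    return {"match": {field: body}}


title_clause = fuzzy_match("title", '"Harry Potter"~AUTO')
plain_clause = fuzzy_match("title", '"Harry Potter"')
```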
Multi match
The match operator currently supports a single field.
We could add support to query multiple fields (and translate to something like multi-match):
FROM books
| WHERE title,plot,tagline:"Harry Potter"
In combination with boosting:
FROM books
| WHERE title^0.7,plot^0.1,tagline^0.2:"Harry Potter"
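The comma-separated form maps fairly directly onto a multi_match query, whose fields list already accepts the field^boost notation. The sketch below shows that mapping under stated assumptions: multi_field_match is a hypothetical helper, it splits on the first colon only, and it does not handle field names that themselves contain a colon or a comma.

```python
def multi_field_match(expression):
    """Translate 'f1^b1,f2^b2:"text"' into a multi_match query body.

    Hypothetical helper: boosts stay in caret notation because the
    multi_match fields list accepts entries such as 'title^0.7'.
    """
    fields_part, _, text = expression.partition(":")
    fields = [f.strip() for f in fields_part.split(",") if f.strip()]
    return {
        "multi_match": {
            "query": text.strip().strip('"'),
            "fields": fields,
        }
    }


query = multi_field_match('title^0.7,plot^0.1,tagline^0.2:"Harry Potter"')
unboosted = multi_field_match('title,plot,tagline:"Harry Potter"')
```

Because the boosts are forwarded as-is, the boosted and unboosted examples above share the same translation path.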