Description
One way to think about how an sltr query functions is that it is a bool query with a custom scoring function.
For example, given the following featureset definition:
{
  "featureset": {
    "features": [
      {
        "name": "title_text_match",
        "params": [
          "query_text"
        ],
        "template_language": "mustache",
        "template": {
          "match": {
            "title": "{{query_text}}"
          }
        }
      },
      {
        "name": "description_text_match",
        "params": [
          "query_text"
        ],
        "template_language": "mustache",
        "template": {
          "match": {
            "description": "{{query_text}}"
          }
        }
      },
      {
        "name": "description_knn_match",
        "params": [
          "query_embedding"
        ],
        "template_language": "mustache",
        "template": "{\"knn\":{\"field\":\"description_vector\",\"k\":10,\"query_vector\":{{#toJson}}query_embedding{{/toJson}}}}"
      }
    ]
  }
}
and a model example_model that was created using the above featureset, the following sltr query:
{
  "sltr": {
    "model": "example_model",
    "params": {
      "query_text": "the text query",
      "query_embedding": [1.0, 0.4, ...]
    }
  }
}
can be thought of conceptually as:
{
  "bool": {
    "filter": {
      "match_all": {}
    },
    "should": [
      {
        "match": {
          "title": "the text query"
        }
      },
      {
        "match": {
          "description": "the text query"
        }
      },
      {
        "knn": {
          "field": "description_vector",
          "k": 10,
          "query_vector": [1.0, 0.4, ...]
        }
      }
    ],
    "minimum_should_match": 0,
    // plus also use a special scoring function defined by example_model
  }
}
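To make the mapping above a bit more concrete, here is a rough Python sketch of how the featureset templates turn into the should clauses. It is only an illustration of the mental model, not how the plugin is implemented: `render` and `conceptual_bool_query` are hypothetical helpers, the placeholder substitution is a simplified stand-in for mustache, and the knn feature is left out because its template relies on the toJson helper, which this sketch does not handle.

```python
from typing import Any, Dict, List


def render(template: Any, params: Dict[str, Any]) -> Any:
    """Replace {{name}} placeholders in a dict-style template.

    Simplified stand-in for mustache rendering (assumption: only plain
    {{name}} placeholders are used, no sections or helpers).
    """
    if isinstance(template, dict):
        return {key: render(value, params) for key, value in template.items()}
    if isinstance(template, list):
        return [render(item, params) for item in template]
    if isinstance(template, str):
        for name, value in params.items():
            template = template.replace("{{" + name + "}}", str(value))
        return template
    return template


def conceptual_bool_query(features: List[Dict[str, Any]],
                          params: Dict[str, Any]) -> Dict[str, Any]:
    """Build the bool query that an sltr query conceptually behaves like."""
    should = [render(feature["template"], params) for feature in features]
    return {
        "bool": {
            # Every document matches; the features only contribute scores.
            "filter": {"match_all": {}},
            "should": should,
            "minimum_should_match": 0,
        }
    }


# The two text features from the featureset above.
features = [
    {"name": "title_text_match",
     "template": {"match": {"title": "{{query_text}}"}}},
    {"name": "description_text_match",
     "template": {"match": {"description": "{{query_text}}"}}},
]
query = conceptual_bool_query(features, {"query_text": "the text query"})
```

The model's scoring function is not represented here; the sketch only covers which clauses take part in matching.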
It would be great if the features used by the model could have a minimum-should-match requirement, so that an sltr query could be written as:
{
  "sltr": {
    "model": "example_model",
    "params": {
      "query_text": "the text query",
      "query_embedding": [1.0, 0.4, ...]
    },
    "minimum_should_match": 1
  }
}
which would translate roughly to the following:
{
  "bool": {
    "should": [
      {
        "match": {
          "title": "the text query"
        }
      },
      {
        "match": {
          "description": "the text query"
        }
      },
      {
        "knn": {
          "field": "description_vector",
          "k": 10,
          "query_vector": [1.0, 0.4, ...]
        }
      }
    ],
    "minimum_should_match": 1,
    // plus also use a special scoring function defined by example_model
  }
}
This would make sltr queries more viable as part of the initial query, rather than only as part of a rescore phase. The use case is to use non-linear models (such as a LambdaMART model) to deal with query clauses whose scoring distributions differ, which makes them difficult to combine using a linear combination.
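The requested per-document behaviour could be sketched roughly as follows. This is a toy Python model of the semantics only, under a few assumptions: a feature that does not match is represented as `None` and is fed to the model as 0.0, `sltr_score` is a hypothetical name, and `toy_model` is a trivial stand-in for a real model such as LambdaMART.

```python
from typing import Callable, List, Optional, Sequence


def sltr_score(feature_scores: Sequence[Optional[float]],
               model: Callable[[List[float]], float],
               minimum_should_match: int = 0) -> Optional[float]:
    """Return the model score, or None if the document should not match.

    feature_scores holds one entry per feature; None means the feature's
    query did not match the document (assumption for this sketch).
    """
    matched = sum(score is not None for score in feature_scores)
    if matched < minimum_should_match:
        # Too few features matched: the document is excluded from the hits.
        return None
    # Features that did not match contribute 0.0 to the feature vector.
    vector = [0.0 if score is None else score for score in feature_scores]
    return model(vector)


def toy_model(vector: List[float]) -> float:
    """Stand-in scoring function; a real model would be non-linear."""
    return sum(vector)


# No feature matches: excluded when at least one match is required.
excluded = sltr_score([None, None, None], toy_model, minimum_should_match=1)
# Same document with minimum_should_match 0: still returned and scored.
kept = sltr_score([None, None, None], toy_model, minimum_should_match=0)
# Title and knn features match: the document is scored by the model.
scored = sltr_score([1.5, None, 0.5], toy_model, minimum_should_match=1)
```

With minimum_should_match set to 0 every document is still scored, which is the current behaviour described above; any higher value turns the sltr query into a filter as well as a scorer.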