Single query on the uri wildcard field caused OOM #128201

Closed
@jilldoty-elastic

Description

Elasticsearch Version

Serverless

Installed Plugins

No response

Java Version

bundled

OS Version

Serverless

Problem Description

A query on the uri wildcard field was expanded into 260k term queries, which resulted in 2.4GB of heap being consumed by a single query.

Looking at the heap dump, we can see that the source contained more than 9K terms on a couple of url.* fields, both of which are defined as wildcard fields in the mappings.

    ...
    "url": {
      "properties": {
        ...,
        "full": {
          "type": "wildcard",
          "fields": {
            "text": { "type": "match_only_text" }
          }
        },
        "original": {
          "type": "wildcard",
          "fields": {
            "text": { "type": "match_only_text" }
          }
        },
        ...
      }
    }
    ...

This query generates ~27K BinaryDvConfirmedAutomatonQuery queries, each of which in turn holds 10 TermQueries (based on MAX_CLAUSES_IN_APPROXIMATION_QUERY), each ~0.01 MB in size. So we end up with a single query consuming ~2GB of heap.
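The estimate above can be checked with a quick back-of-the-envelope calculation. This is only a sketch using the approximate figures quoted in this issue; the variable names are illustrative, and the per-TermQuery size is read from the heap dump numbers above rather than measured independently.

```python
# Approximate figures quoted in the issue; none of these are exact measurements.
automaton_queries = 27_000        # ~27K BinaryDvConfirmedAutomatonQuery instances
term_queries_per_automaton = 10   # value of MAX_CLAUSES_IN_APPROXIMATION_QUERY per the issue
mb_per_term_query = 0.01          # ~0.01 MB per TermQuery (assumed to be per-instance size)

# Total TermQuery instances held by the single top-level query.
total_term_queries = automaton_queries * term_queries_per_automaton

# Rough heap footprint in GB (1 GB = 1024 MB).
estimated_heap_gb = total_term_queries * mb_per_term_query / 1024

print(f"term queries: {total_term_queries}")
print(f"estimated heap: {estimated_heap_gb:.2f} GB")
```

With these inputs the calculation gives 270K term queries and a footprint of around 2.6 GB, which is in the same range as the 260k term queries and 2 to 2.4GB reported above.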

There is a failover mechanism for the individual wildcard queries to prevent "expensive" ones, but these are all batched underneath a should clause, so even though each one could be fine individually, collectively they can cause OOMs.

We should identify in advance potential issues with the queries and fail early (to avoid OOMs).
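As a rough illustration of what such a fail-early check could look like, here is a minimal sketch that walks a query DSL body, counts the clauses targeting wildcard-typed fields, and rejects the request when the projected number of term queries exceeds a budget. This is not how Elasticsearch implements or plans to implement it: `check_query_budget`, `count_pattern_clauses`, and `MAX_PROJECTED_TERM_QUERIES` are hypothetical names, the budget value is arbitrary, and counting only `wildcard` clauses is an assumption, since the exact shape of the offending query is not available.

```python
from typing import Any, Set

# Value quoted in the issue; the real constant lives inside Elasticsearch.
MAX_CLAUSES_IN_APPROXIMATION_QUERY = 10
# Hypothetical budget chosen for illustration; not an Elasticsearch setting.
MAX_PROJECTED_TERM_QUERIES = 50_000
# Assumption: only `wildcard` clauses are counted in this sketch.
PATTERN_QUERY_TYPES = {"wildcard"}


def count_pattern_clauses(node: Any, wildcard_fields: Set[str]) -> int:
    """Recursively count pattern clauses that target a wildcard-typed field."""
    if isinstance(node, list):
        return sum(count_pattern_clauses(item, wildcard_fields) for item in node)
    if not isinstance(node, dict):
        return 0
    total = 0
    for key, value in node.items():
        if key in PATTERN_QUERY_TYPES and isinstance(value, dict):
            # A leaf clause looks like {"wildcard": {"<field>": {...}}}.
            total += sum(1 for field in value if field in wildcard_fields)
        else:
            total += count_pattern_clauses(value, wildcard_fields)
    return total


def check_query_budget(query: dict, wildcard_fields: Set[str]) -> int:
    """Return the projected term-query count, or raise if it exceeds the budget."""
    clauses = count_pattern_clauses(query, wildcard_fields)
    projected = clauses * MAX_CLAUSES_IN_APPROXIMATION_QUERY
    if projected > MAX_PROJECTED_TERM_QUERIES:
        raise ValueError(
            f"query would expand to ~{projected} term queries "
            f"(budget: {MAX_PROJECTED_TERM_QUERIES})"
        )
    return projected


def build_should_query(n_clauses: int) -> dict:
    """Build an illustrative bool/should query alternating between the two url.* fields."""
    fields = ["url.full", "url.original"]
    return {
        "bool": {
            "should": [
                {"wildcard": {fields[i % 2]: {"value": f"*example-{i}*"}}}
                for i in range(n_clauses)
            ]
        }
    }
```

For example, `check_query_budget(build_should_query(9000), {"url.full", "url.original"})` raises, because 9,000 clauses project to 90,000 term queries, while a two-clause query passes. The point of the sketch is that the check looks at the whole should clause rather than at each wildcard query in isolation.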

Steps to Reproduce

Unavailable

Logs (if relevant)

No response

Labels

:Search Relevance/Search, >bug, Team:Search Relevance, priority:high
