Single query on the uri wildcard field caused OOM #128201

Open
@jilldoty-elastic

Description

Elasticsearch Version

Serverless

Installed Plugins

No response

Java Version

bundled

OS Version

Serverless

Problem Description

A single query on the uri wildcard field was expanded into ~260k term queries, which resulted in 2.4GB of heap being consumed by that one query.

Looking at the heap dump, we can see that the source contained more than 9K terms on a couple of url.* fields, both of which are defined as wildcard fields in the mappings.

    ...
    "url": {
      "properties": {
        ...,
        "full": {
          "type": "wildcard",
          "fields": { "text": { "type": "match_only_text" } }
        },
        "original": {
          "type": "wildcard",
          "fields": { "text": { "type": "match_only_text" } }
        },
        ...
      }
    }
    ...

This query generates ~27K BinaryDvConfirmedAutomatonQuery queries, each of which holds 10 TermQueries (based on MAX_CLAUSES_IN_APPROXIMATION_QUERY) of roughly 0.01 MB each. So we end up with a single query consuming ~2GB of heap.
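As a rough sanity check, the arithmetic behind these figures can be sketched as below. The counts are the approximate values quoted above, the reading that ~0.01 MB applies to each TermQuery is an assumption, and the variable names are illustrative only.

```python
# Back-of-the-envelope heap estimate from the approximate heap-dump figures.
automaton_queries = 27_000         # ~27K BinaryDvConfirmedAutomatonQuery instances
term_queries_per_automaton = 10    # MAX_CLAUSES_IN_APPROXIMATION_QUERY
term_query_size_mb = 0.01          # assumed: ~0.01 MB per TermQuery

total_term_queries = automaton_queries * term_queries_per_automaton
estimated_heap_mb = total_term_queries * term_query_size_mb

print(f"term queries: {total_term_queries:,}")
print(f"estimated heap: {estimated_heap_mb / 1024:.1f} GiB")
```

The result lands in the same range as the ~260k term queries and the 2GB+ of heap observed, which suggests the TermQuery instances account for most of the memory.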

There is a failover mechanism that prevents individual wildcard queries from becoming too "expensive", but here they are all batched underneath a single should clause. Each one may be fine on its own, yet collectively they can still cause an OOM.

We should identify in advance potential issues with the queries and fail early (to avoid OOMs).
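One possible shape for such a pre-flight check, as a minimal sketch and not Elasticsearch's actual implementation: walk the query DSL before execution, count the `wildcard` clauses that target wildcard-typed fields, and reject the request when the projected number of approximation term queries exceeds a budget. The function names and the `MAX_TOTAL_APPROXIMATION_TERMS` budget are hypothetical; only the factor of 10 and the field names come from this issue.

```python
from typing import Any

CLAUSES_PER_WILDCARD = 10               # mirrors MAX_CLAUSES_IN_APPROXIMATION_QUERY
MAX_TOTAL_APPROXIMATION_TERMS = 50_000  # hypothetical budget, not a real setting
WILDCARD_FIELDS = {"url.full", "url.original"}  # fields mapped as `wildcard`


def count_wildcard_clauses(query: Any) -> int:
    """Recursively count `wildcard` clauses on wildcard-typed fields."""
    if isinstance(query, list):
        return sum(count_wildcard_clauses(q) for q in query)
    if not isinstance(query, dict):
        return 0
    total = 0
    for key, value in query.items():
        if key == "wildcard" and isinstance(value, dict):
            # The clause body is keyed by field name.
            total += sum(1 for field in value if field in WILDCARD_FIELDS)
        else:
            total += count_wildcard_clauses(value)
    return total


def check_query_budget(query: dict) -> int:
    """Raise before execution if the projected term-query count is too high."""
    projected = count_wildcard_clauses(query) * CLAUSES_PER_WILDCARD
    if projected > MAX_TOTAL_APPROXIMATION_TERMS:
        raise ValueError(
            f"query would expand to ~{projected} term queries "
            f"(limit {MAX_TOTAL_APPROXIMATION_TERMS})"
        )
    return projected


# A single wildcard clause stays well within the budget.
small = {"bool": {"should": [{"wildcard": {"url.full": {"value": "*example*"}}}]}}
print(check_query_budget(small))
```

The point of the sketch is that the budget applies to the whole query tree, not to each clause, so thousands of individually cheap clauses under one should clause are rejected together. It only counts `wildcard` clauses; a real check would also need to cover other query types that expand on wildcard fields.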

Steps to Reproduce

Unavailable

Logs (if relevant)

No response
