Too many cancel tasks causing Circuit breaker #120582

Open
@douglli

Description

Elasticsearch Version

7.14.2

Installed Plugins

No response

Java Version

openjdk version "1.8.0_232"

OS Version

CentOS Linux release 7.9 (Final)

Problem Description

A query containing one hundred terms caused a memory meltdown on node a. Node a then cancelled the related tasks on the other nodes, and two of those nodes (b, c) hit circuit breaker exceptions. Their logs look like this:
[2025-01-22T11:38:51,422][WARN ][o.e.t.InboundHandler ] [xxx] Circuit breaker exception[transport_response], bytes wanted[30950235606], bytes limited[30893565542], status[TOO_MANY_REQUESTS]
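To put the numbers in that log line in perspective, here is a minimal Python sketch that extracts the two byte counts and computes how far the request overshot the limit. It assumes that "bytes wanted" is the total the breaker would have had to account for and "bytes limited" is the configured limit, which is how the field names read; the helper name `breaker_overage` is made up for illustration.

```python
import re

# Log line copied from the report above (node name is redacted as "xxx").
LOG_LINE = (
    "[2025-01-22T11:38:51,422][WARN ][o.e.t.InboundHandler ] [xxx] "
    "Circuit breaker exception[transport_response], "
    "bytes wanted[30950235606], bytes limited[30893565542], "
    "status[TOO_MANY_REQUESTS]"
)

# Matches the two numeric fields exactly as they appear in the log line.
PATTERN = re.compile(r"bytes wanted\[(\d+)\], bytes limited\[(\d+)\]")


def breaker_overage(line):
    """Return (wanted, limit, wanted - limit) in bytes for one log line.

    Hypothetical helper, not part of Elasticsearch.
    """
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("no circuit breaker byte counts found in line")
    wanted = int(match.group(1))
    limit = int(match.group(2))
    return wanted, limit, wanted - limit


wanted, limit, over = breaker_overage(LOG_LINE)
GIB = 1024 ** 3
MIB = 1024 ** 2
print(f"limit:   {limit} bytes ({limit / GIB:.2f} GiB)")
print(f"wanted:  {wanted} bytes ({wanted / GIB:.2f} GiB)")
print(f"overage: {over} bytes ({over / MIB:.2f} MiB)")
```

The overage is 56670064 bytes, small compared with the limit itself, so by these numbers the node was already sitting very close to the breaker limit when this response arrived.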

The GC situation of node a is as follows:

Image

Many errors like the following occur on nodes b and c:
[2025-01-22T11:38:48,950][WARN ][o.e.t.TaskCancellationService] [xxx] failed to remove ban for tasks with the parent [xxx:4469158904] on connection [NodeChannels[{xxx}{jBzstAr8T_OdeyQa7K8lHw}{xl6G7wcfQCez-1kCF37Q6Q}{xxx}{xxx:9300}{hilmrst}]]: [xxx][xxx:9300][internal:admin/tasks/ban]

The GC situation of nodes b and c is as follows:

Image

Steps to Reproduce

Why does a large number of cancel tasks cause memory usage to rise so quickly that it ends in a circuit breaker exception?

Logs (if relevant)

No response
