Skip to content

Perf regression, too many small bulk requests  #13024

Closed
@StephanErb

Description

@StephanErb

APM Server version (apm-server version):
8.13.2

Description of the problem including expected versus actual behavior:

I believe there is a regression in the number of bulk requests/s emitted by APM Server towards Elastic. I don't have all data for it, but I believe the issue started in 8.12.x and then got worse with the upgrade to 8.13.x.

This is from the Elastic cloud console: On otherwise identical clients/agents, we see that with the upgrade from 8.12 to 8.13 the number of indexing requests/s jumped noticeably.

Screenshot_2024-04-16_at_23 59 50

The number of documents/s that get ingested didn't really change much, but our hot nodes started to run under higher load. Fortunately, we observed that setting index.translog.durability=async on the all APM indices drops IOPS per hot node back to acceptable levels:

Screenshot_2024-04-21_at_21 35 36

On a decently tuned bulk_max_size setting I'd not expect such a big drop in IOPS/s. I am thus wondering if the APM server is misconfigured in newer versions.

It almost feels like the server uses the agent defaults instead of its custom bulk size option? Was it maybe forgotten to set preset: custom into the server config once elastic/elastic-agent#3797 got merged into the agent?

Metadata

Metadata

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions