[Enhancement] Improve Pulsar Broker cache defaults to get better out-of-the-box performance #23466

@lhotari

Search before asking

  • I searched in the issues and found nothing similar.

Mailing list discussion thread: https://lists.apache.org/thread/5od69114jfrzo9dkbllxycq8o7ns341y

Motivation

It's crucial to tune the Pulsar broker cache since the defaults in Pulsar are not optimal. Besides poor performance for Pulsar use cases, this leads to wasted CPU and unnecessary network transfer costs in cloud environments.
Tuning the Pulsar broker cache improves performance and reduces costs, especially with high fan-out use cases, Key_Shared subscriptions, and tiered storage.

Solution

Here are some settings which would be better defaults.

  • maxMessagePublishBufferSizeInMB - not broker cache related, but it must be set to an explicit value when fine-tuning broker cache settings so that direct memory OOM can be avoided. The default is 50% of direct memory. Set to 500.
  • managedLedgerCacheSizeMB - the default is 20% of direct memory. It's better to set it to an explicit value to avoid direct memory OOM. Set to 512.
  • managedLedgerMaxReadsInFlightSizeInMB - this feature is disabled by default. It's useful for avoiding direct memory OOM, which is a known issue with the default dispatcherDispatchMessagesInSubscriptionThread=true setting unless managedLedgerMaxReadsInFlightSizeInMB is set. Set to 500. The value should be higher than dispatcherMaxReadBatchSize * maxMessageSize.
  • managedLedgerCacheEvictionTimeThresholdMillis - the default 1000 is too low. Set to 10000.
  • managedLedgerCacheEvictionIntervalMs - the default 10 is too low. Set to 5000 to avoid spending a lot of CPU on cache eviction.
  • managedLedgerMinimumBacklogCursorsForCaching - the default 0 disables caching for backlog cursors (catch-up reads). Set to 3.
  • managedLedgerMinimumBacklogEntriesForCaching - the default 1000 is way too high. Set to 1.
  • managedLedgerMaxBacklogBetweenCursorsForCaching - the default 10000 is way too low. Set to 2147483647 to disable the limit completely.
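Since three of the settings above are given explicit values to avoid direct memory OOM, it can help to check how much of the broker's direct memory they add up to. The following is a minimal Python sketch of that arithmetic. The 4096 MB direct memory size is a hypothetical example value, not something from this issue, and other direct memory consumers in the broker are not accounted for, so the result is only a lower bound.

```python
# Explicit limits (in MB) taken from the list above.
SUGGESTED_LIMITS_MB = {
    "maxMessagePublishBufferSizeInMB": 500,
    "managedLedgerCacheSizeMB": 512,
    "managedLedgerMaxReadsInFlightSizeInMB": 500,
}


def direct_memory_budget(limits_mb, direct_memory_mb):
    """Return (total_mb, fraction_of_direct_memory) for the given limits."""
    total = sum(limits_mb.values())
    return total, total / direct_memory_mb


# Hypothetical broker with 4096 MB of direct memory (assumption for illustration).
total_mb, fraction = direct_memory_budget(SUGGESTED_LIMITS_MB, 4096)
print(f"{total_mb} MB reserved by explicit limits, {fraction:.0%} of direct memory")
```

If the fraction comes close to 100% for the actual direct memory size of the broker, the explicit values would need to be scaled down accordingly.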

Sample settings for broker cache tuning:

yaml format:

  maxMessagePublishBufferSizeInMB: 500
  managedLedgerCacheSizeMB: 512
  managedLedgerMaxReadsInFlightSizeInMB: 500
  managedLedgerCacheEvictionTimeThresholdMillis: 10000
  managedLedgerCacheEvictionIntervalMs: 5000
  managedLedgerMinimumBacklogCursorsForCaching: 3
  managedLedgerMinimumBacklogEntriesForCaching: 1
  managedLedgerMaxBacklogBetweenCursorsForCaching: 2147483647

broker.conf format:

maxMessagePublishBufferSizeInMB=500
managedLedgerCacheSizeMB=512
managedLedgerMaxReadsInFlightSizeInMB=500
managedLedgerCacheEvictionTimeThresholdMillis=10000
managedLedgerCacheEvictionIntervalMs=5000
managedLedgerMinimumBacklogCursorsForCaching=3
managedLedgerMinimumBacklogEntriesForCaching=1
managedLedgerMaxBacklogBetweenCursorsForCaching=2147483647
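
The two formats above carry the same keys and values. As a rough illustration, this Python sketch parses the broker.conf fragment into a dict and renders it back as the indented "key: value" lines. The helper names parse_broker_conf and to_yaml_lines are made up for this example and are not part of Pulsar.

```python
def parse_broker_conf(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def to_yaml_lines(settings, indent="  "):
    """Render settings as indented 'key: value' lines, keeping insertion order."""
    return [f"{indent}{key}: {value}" for key, value in settings.items()]


BROKER_CONF = """\
maxMessagePublishBufferSizeInMB=500
managedLedgerCacheSizeMB=512
managedLedgerMaxReadsInFlightSizeInMB=500
managedLedgerCacheEvictionTimeThresholdMillis=10000
managedLedgerCacheEvictionIntervalMs=5000
managedLedgerMinimumBacklogCursorsForCaching=3
managedLedgerMinimumBacklogEntriesForCaching=1
managedLedgerMaxBacklogBetweenCursorsForCaching=2147483647
"""

settings = parse_broker_conf(BROKER_CONF)
yaml_lines = to_yaml_lines(settings)
print("\n".join(yaml_lines))
```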

managedLedgerMaxReadsInFlightSizeInMB must be set to a value that is higher than dispatcherMaxReadBatchSize * maxMessageSize. Otherwise, reads could fail with the error "Time-out elapsed while acquiring enough permits on the memory limiter to read from ledger [ledgerid], [topic], estimated read size [read size] bytes for [dispatcherMaxReadBatchSize] entries (check managedLedgerMaxReadsInFlightSizeInMB)".
dispatcherMaxReadBatchSize defaults to 100 and maxMessageSize defaults to 5MB (expressed in bytes).
A separate issue, #23482, addresses the problem that occurs when managedLedgerMaxReadsInFlightSizeInMB < dispatcherMaxReadBatchSize * maxMessageSize.
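
The dispatcherMaxReadBatchSize * maxMessageSize product can be worked out with a few lines of Python. This sketch assumes that the 5MB default means 5 * 1024 * 1024 bytes and that the "InMB" setting is counted in units of 1024 * 1024 bytes. Both are assumptions to verify against the Pulsar version in use.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: "InMB" settings are counted in MiB


def worst_case_read_mb(dispatcher_max_read_batch_size, max_message_size_bytes):
    """Size in MB of a read of a full batch of maximum-size messages."""
    return dispatcher_max_read_batch_size * max_message_size_bytes / BYTES_PER_MB


DEFAULT_BATCH_SIZE = 100                  # dispatcherMaxReadBatchSize default
DEFAULT_MAX_MESSAGE_SIZE = 5 * 1024 * 1024  # maxMessageSize default, assumed 5 MiB

default_mb = worst_case_read_mb(DEFAULT_BATCH_SIZE, DEFAULT_MAX_MESSAGE_SIZE)
# Hypothetical cluster that raised maxMessageSize to 10 MiB.
larger_mb = worst_case_read_mb(DEFAULT_BATCH_SIZE, 10 * 1024 * 1024)
print(default_mb, larger_mb)
```

Under these assumptions the product for the default values is 500 MB, the same as the suggested setting, so raising either dispatcherMaxReadBatchSize or maxMessageSize calls for a proportionally higher managedLedgerMaxReadsInFlightSizeInMB.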

Alternatives

No response

Anything else?

The broker cache hit rate can be monitored with the Grafana dashboard found at https://github.com/datastax/pulsar-helm-chart/blob/master/helm-chart-sources/pulsar/grafana-dashboards/broker-cache-by-broker.json (Apache 2.0 license). The broker cache also impacts offloading. Offloading can be monitored with the dashboard available at https://github.com/apache/pulsar/blob/master/grafana/dashboards/offloader.json .

Are you willing to submit a PR?

  • I'm willing to submit a PR!
