Confusion between pekko.cluster.failure-detector and pekko.remote.watch-failure-detector #2657

regiskuckaertz · 2026-02-07T16:01:52Z

Hi. We recently had an incident in which a long GC pause caused the heartbeat to fail. I was surprised, because after a similar incident a long time ago we had set the property pekko.cluster.failure-detector.acceptable-heartbeat-pause for exactly that reason (the default is 3s).

However, the value logged in the message "Previous heartbeat was sent [10738] ms ago" did not come close to the acceptable pause we had set. After re-reading the docs, it seems the pause must instead be specified at pekko.remote.watch-failure-detector.acceptable-heartbeat-pause (the default is 10s).

Both properties appear in reference.conf, under the cluster and remote projects respectively. Having looked at the code, I am no longer sure the former is used anywhere, except in a deprecated ClusterClient module under cluster-tools.

Can you please help me understand if and when pekko.cluster.failure-detector and pekko.remote.watch-failure-detector are each used? Would it make sense to drop one of them, or at least to update the documentation to make the distinction clearer?

regiskuckaertz · 2026-02-10T10:03:40Z

@pjfanning @He-Pin gentle ping ☝️

hanishi · 2026-02-17T13:33:57Z

pekko.cluster.failure-detector is used by Pekko Cluster to determine whether a cluster member (node) is reachable or unreachable, based on heartbeat statistics. It influences cluster membership state and downing decisions.

pekko.remote.watch-failure-detector is used by Pekko Remoting's DeathWatch mechanism to decide when a remote ActorSystem (address) should be considered unreachable, so that actors watching remote actors receive Terminated.

The two are not redundant because they operate at different layers (cluster membership vs. remote watch semantics). A log line like "Previous heartbeat was sent … ms ago" just means that the last heartbeat left this JVM about 10.7 seconds ago.

pekko.cluster.failure-detector.acceptable-heartbeat-pause
https://pekko.apache.org/docs/pekko/1.0/typed/failure-detector.html

pekko.remote.watch-failure-detector.acceptable-heartbeat-pause
https://pekko.apache.org/docs/pekko/1.3/remoting-artery.html
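
As a rough sketch of how the two settings sit side by side in application.conf: the property paths are the ones quoted in this thread, but the values below are illustrative assumptions, not recommendations. The defaults mentioned above are 3s for the cluster detector and 10s for the remote watch detector.

```hocon
# Illustrative values only (assumptions); tune them to the GC pauses you actually observe.
pekko {
  cluster.failure-detector {
    # Cluster membership layer: reachability of cluster members.
    acceptable-heartbeat-pause = 15s
  }
  remote.watch-failure-detector {
    # Remote DeathWatch layer: Terminated for watched remote actors.
    acceptable-heartbeat-pause = 20s
  }
}
```

Raising one of these does not change the other, so a deployment that uses both clustering and remote watch would likely need to review each value separately.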
