Confusion between pekko.cluster.failure-detector and pekko.remote.watch-failure-detector #2657
Replies: 2 comments
@pjfanning @He-Pin gentle ping ☝️
Hi. We recently had an incident where a long GC pause caused the heartbeat to fail. I was surprised, because we had a similar incident a long time ago and set the property `pekko.cluster.failure-detector.acceptable-heartbeat-pause` for that specific reason (default is 3s).

However, the value logged in the message `Previous heartbeat was sent [10738] ms ago` did not come close to the acceptable pause we had set. After re-reading the docs, it seems the pause must be specified at `pekko.remote.watch-failure-detector.acceptable-heartbeat-pause` (default is 10s).

Both properties appear in `reference.conf`, under the `cluster` and `remote` projects respectively. I've been looking at the code and I am no longer sure the former is used anywhere ... except in a deprecated `ClusterClient` module under `cluster-tools`.

Can you please help me understand if/when each of `pekko.cluster.failure-detector` and `pekko.remote.watch-failure-detector` is used? Would it make sense to drop one or the other, or at least update the documentation to make it clearer?
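For reference, here is a minimal sketch of where the two settings discussed above would sit in an `application.conf`. The values are just the defaults quoted in this post (3s and 10s), not a recommendation, and the sketch makes no claim about which of the two the heartbeat warning actually depends on.

```hocon
pekko {
  cluster {
    failure-detector {
      # Setting we had originally tuned; default quoted above is 3s
      acceptable-heartbeat-pause = 3s
    }
  }
  remote {
    watch-failure-detector {
      # Setting the docs seem to point to; default quoted above is 10s
      acceptable-heartbeat-pause = 10s
    }
  }
}
```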