FailoverHeartbeatTTL should be configurable #10341

Open
@tommyalatalo

Description

Proposal

The FailoverHeartbeatTTL used when the leader node goes down is hard-coded to 5 minutes.
That is a very long time for our use case, where clients and servers are colocated on 3 nodes.
We need this parameter to be configurable so that it can be lowered to speed up recovery on colocated server+client nodes: tasks from the lost client node are not replaced until the FailoverHeartbeatTTL expires.

Short: make FailoverHeartbeatTTL configurable in the server config file.

Use-cases

Faster recovery when running a cluster with colocated server and client nodes, for instance 3 VMs with a Nomad server+client on each and no other nodes in the cluster.

Attempted Solutions

None available. FailoverHeartbeatTTL is hard-coded and not exposed as a configuration parameter.

Metadata

Assignees: No one assigned
Type: No type
Projects status: Needs Roadmapping
Milestone: No milestone
Relationships: None yet
Development: No branches or pull requests