
Problems with connection to a Nomad master (non-leader) cause all allocations to restart #17974


Description

@beninghton

Nomad version

Nomad v1.5.6
BuildDate 2023-05-19T18:26:13Z
Revision 8af7088

Cluster structure

3 master nodes:
10.1.15.21 - leader
10.1.15.22
10.1.15.23
2 client nodes:
10.1.15.31
10.1.15.32
3 consul cluster nodes:
10.1.15.11
10.1.15.12
10.1.15.13

Operating system and Environment details

Fedora release 35 (Thirty Five)

Issue

This issue is related to #17973.
In #17973, our leader Node1 had CSI/cpu/mem problems, so we initially rebooted it.
The cluster then lost leadership and Node2 became the new leader. The cluster worked fine.
Node1 later came back online and rejoined the cluster, but its CSI state was corrupted and it hung right after joining.
The cluster then lost leadership again, and no new leader was elected.
This time the client nodes were not down, yet all allocations across the whole cluster were restarted.

Reproduction steps

After we removed CSI, the only way to reproduce the issue quickly is to block the server RPC port 4647 on a non-leader node:
iptables -A INPUT -p tcp --destination-port 4647 -j DROP
We assume this imitates the issue we had with CSI because it blocks part of the node's functionality rather than all of it.
In this case we block nomad-server-2 (the non-leader). A fuller sketch of the reproduction is shown below.
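
For reference, a minimal sketch of the reproduction and cleanup, assuming the default ports (4647 for server RPC, 4646 for the HTTP API) and that the rule is applied on nomad-server-2; /v1/status/leader is the standard Nomad API endpoint for the current leader:

# On nomad-server-2 (non-leader): drop inbound server RPC traffic.
# Serf on 4648 is untouched, so the node still shows as "alive".
iptables -A INPUT -p tcp --destination-port 4647 -j DROP

# From any server: watch leadership while the rule is active;
# /v1/status/leader returns the address of the current leader.
watch -n 5 'curl -s http://127.0.0.1:4646/v1/status/leader'

# Cleanup: delete the rule to restore RPC connectivity.
iptables -D INPUT -p tcp --destination-port 4647 -j DROP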

Expected Result

We expected the cluster not to fail: the two remaining server nodes are healthy and one of them is the leader, so allocations on the client nodes should not be restarted.

Actual Result

No new leader was elected, and all allocations on the client nodes were restarted.
nomad server members output on the leader node:

nomad-server-1.global  10.1.15.21  4648  alive   true    3             1.5.6  dc1         global
nomad-server-2.global  10.1.15.22  4648  alive   false   3             1.5.6  dc1         global
nomad-server-3.global  10.1.15.23  4648  alive   false   3             1.5.6  dc1         global

nomad server members output on the non-leader node nomad-server-2 (where port 4647 is blocked):

nomad-server-1.global  10.1.15.21  4648  alive   false   3             1.5.6  dc1         global
nomad-server-2.global  10.1.15.22  4648  alive   false   3             1.5.6  dc1         global
nomad-server-3.global  10.1.15.23  4648  alive   false   3             1.5.6  dc1         global

Error determining leaders: 1 error occurred:
        * Region "global": Unexpected response code: 500 (rpc error: failed to get conn: rpc error: lead thread didn't get connection)

Nomad logs

client1.log
client2.log
server1-leader.log
server2.log
server3.log
