Observed behavior
In NATS 2.12.1, with a five-node cluster and replication factor 5, JetStream streams can end up in split-brain when a single node experiences a simulated power failure. Acknowledged records are visible to consumers connected to some nodes, but not to others. This problem is readily reproducible with as little as a single process crash plus some careful process pauses before and after the crash, which help ensure that specific nodes are the first to form a quorum. I suspect this behavior is also possible without process pauses, just significantly less frequently.
For example, take this test run, which killed node n3 at approximately 45 seconds, leaving n1, n2, n4, and n5 intact. Before killing n3 we paused n1 and n2, and when n3 was restarted, we resumed those nodes and paused n4 and n5.
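In case it helps, here is a rough sketch of that schedule expressed as plain process signals from a control host. The host names, pkill/systemctl invocations, and timings are illustrative only; the actual test drives this through Jepsen's nemesis, with lazyfs dropping n3's unsynced writes when it is killed.

```java
import java.util.concurrent.TimeUnit;

// Illustrative driver for the pause/kill/resume schedule described above.
public class PauseKillSchedule {
    // Run a shell command on a node over ssh (hypothetical setup).
    static void on(String host, String cmd) throws Exception {
        new ProcessBuilder("ssh", host, cmd).inheritIO().start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        on("n1", "pkill -STOP nats-server");     // pause n1 and n2 so they fall behind
        on("n2", "pkill -STOP nats-server");
        TimeUnit.SECONDS.sleep(30);              // let n3, n4, n5 accept acknowledged writes
        on("n3", "pkill -KILL nats-server");     // crash n3 (~45s); lazyfs also discards its unsynced data
        on("n3", "systemctl start nats-server"); // restart n3
        on("n1", "pkill -CONT nats-server");     // resume n1 and n2 ...
        on("n2", "pkill -CONT nats-server");
        on("n4", "pkill -STOP nats-server");     // ... and pause n4 and n5, so n1, n2, n3 form the next quorum
        on("n5", "pkill -STOP nats-server");
    }
}
```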
Note that there is a brief stall in operations (presumably for leader election) when we first pause n1 and n2. Upon killing and restarting n3 at ~45 seconds, operations fail to complete for a short time while n3 starts up. Then the cluster begins accepting writes again.
Unfortunately, this was unsafe: n1 and n2 were missing acknowledged writes because they had been paused, and n3 lost acknowledged writes, thanks to NATS' choice not to sync records to disk. Allowing n1, n2, and n3 to continue accepting writes after this point caused nodes n1 and n2 to lose roughly five seconds' worth of writes from immediately prior to the crash, even though those records were present on other nodes! Writes were lost both before and after acknowledged writes performed by the same process, which tells us the log has holes in it: a violation of NATS' linearizability claim.
This was meant as a specific example of #7564, which noted that JetStream will acknowledge publish calls even when nodes have not written those records to disk. This is generally unsafe in consensus systems because of scenarios like the one outlined above. However, this case actually led to replica divergence, which is even weirder--I'm filing a separate issue for it.
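Throughout, "acknowledged" means the server returned a positive publish acknowledgment to the client. A minimal jnats sketch of what such a replicated publish looks like--the URL, stream name, and subject are my own illustration, not the test's actual configuration:

```java
import io.nats.client.*;
import io.nats.client.api.*;

public class AckedPublish {
    public static void main(String[] args) throws Exception {
        try (Connection nc = Nats.connect("nats://n1:4222")) {
            // File-backed stream with replication factor 5 (names illustrative).
            nc.jetStreamManagement().addStream(StreamConfiguration.builder()
                    .name("jepsen")
                    .subjects("jepsen.queue")
                    .storageType(StorageType.File)
                    .replicas(5)
                    .build());

            JetStream js = nc.jetStream();
            // Synchronous publish: the returned PublishAck is the server's claim
            // that this record is durably stored in the stream.
            PublishAck ack = js.publish("jepsen.queue", "42".getBytes());
            System.out.println("acked at stream sequence " + ack.getSeqno());
        }
    }
}
```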
Sometimes records are missing on n3. In this particular case, they were missing from n1 and n4, which is particularly odd given the structure of the process pauses. Also, a suffix of records was missing on n5!
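To make the divergence concrete: the same stream, read through different nodes, returns different contents. A rough jnats sketch of the kind of per-node read-back one might use to compare what consumers on each node can see (again, the URL, stream, and subject names are illustrative; this is not the test's actual checker):

```java
import java.time.Duration;
import java.util.List;
import io.nats.client.*;

public class ReadBack {
    public static void main(String[] args) throws Exception {
        // Connect to one specific node and read everything visible there;
        // repeat against each node's URL to compare.
        try (Connection nc = Nats.connect("nats://n5:4222")) {
            JetStream js = nc.jetStream();
            JetStreamSubscription sub = js.subscribe("jepsen.queue",
                    PullSubscribeOptions.builder().stream("jepsen").build());
            List<Message> batch;
            do {
                batch = sub.fetch(100, Duration.ofSeconds(1));
                for (Message m : batch) {
                    System.out.println(m.metaData().streamSequence()
                            + " " + new String(m.getData()));
                }
            } while (!batch.isEmpty());
        }
    }
}
```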
The critical phenomenon here is the process crash. The pauses are used to control which nodes can become leaders, and to broaden the windows of concurrency--both of which raise our chances of observing data loss. You can get the same effect with network partitions, and I suspect variable network latencies might be sufficient to cause this too.
Expected behavior
NATS replicas should definitely not diverge, and should also not lose data when a single node crashes.
Server and client version
NATS 2.12.1, jnats 2.24.0
Host environment
This is a cluster of LXC nodes running under Jepsen.
Steps to reproduce
With the NATS Jepsen test at 1d295bbc93620522087400660e187c1a733450c6, try:
lein run test --nemesis pause-kill --time-limit 60 --version 2.12.1 --rate 1000 --final-time-limit 30 --sync-interval 10 --lazyfs
This works with any positive sync-interval, but the default of two minutes means we would have to wait two minutes before NATS does its initial sync--shortening the interval lets us run a shorter test.
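For reference, I believe the server-side knob behind this flag is the sync_interval setting in the jetstream configuration block; a sketch with illustrative values:

```
jetstream {
  store_dir: "/var/lib/nats"  # illustrative path
  # How often JetStream syncs stream data to disk. The server default is 2m;
  # it can also be set to 'always' to fsync on every write.
  sync_interval: "10s"
}
```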