
Nomad client after restart loses track of CSI volumes with staging still in use #25813


Description

@ygg-drop

Nomad version

Output from nomad version

Nomad v1.8.9+ent
BuildDate 2025-01-14T19:11:47Z
Revision fc0f34f5b196ce9fcd0c62b6a5e7ce23934826d4

Operating system and Environment details

Alpine 3.20

Issue

When using a CSI plugin with staging (e.g. CephCSI for CephFS volumes), the Nomad client correctly keeps track of which mounted CSI volumes with the multi-node-multi-writer access mode are in use by more than one allocation. When two allocations use the same volume, the staging mount is shared between them; when one is stopped, the staging mount is kept as long as the other allocation is running (i.e. only NodeUnpublishVolume is called on the volume, not NodeUnstageVolume).

However, when the Nomad client is restarted, it loses track of which CSI volumes are still in use by other allocations. In the scenario above, with two allocations using the same volume, when one allocation is stopped Nomad calls NodeUnpublishVolume followed by NodeUnstageVolume, even though the volume is still in use by the other allocation.

From the allocation's perspective, the consequences apparently depend on which task drivers are in use and how mount propagation is configured on the host. In my environment, with the Docker driver and the parent mount having shared propagation, it took a while to notice this bug: after the staging mount was unmounted in the host mount namespace, the allocation that was left running still kept the mount in its own mount namespace, so nothing actually broke from its perspective. When the second allocation was stopped, the Nomad client was able to unmount the volume properly. However, I'd expect more disastrous consequences in less favorable environments.

Reproduction steps

  1. Set up a Nomad cluster with a CSI plugin that uses staging (e.g. Ceph-CSI).
  2. Create a CSI volume with the multi-node-multi-writer access mode (e.g. a CephFS volume); see the volume sketch after this list.
  3. Run two jobs that use the same volume on the same node.
  4. Restart the Nomad client.
  5. Stop one of the jobs. Nomad will call NodeUnstageVolume on the volume despite the other job still using it.
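
For step 2, a volume specification along these lines should work, registered with nomad volume create (a minimal sketch; the volume ID and plugin ID are placeholders, not my actual config):

    # volume.hcl -- placeholder IDs, adjust for your cluster
    id        = "cephfs-shared"
    name      = "cephfs-shared"
    type      = "csi"
    plugin_id = "cephfs-csi"

    capability {
      access_mode     = "multi-node-multi-writer"
      attachment_mode = "file-system"
    }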

Expected Result

After a restart, the Nomad client should continue to track which allocations are using the same CSI volumes with staging.

Actual Result

After a restart, when more than one allocation is using the same CSI volume with staging, stopping one of the allocations can break the volume mount for the other by unmounting the staging path.

Job file (if appropriate)
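
A minimal job sketch that exercises this setup (the volume ID, node name, and image are placeholders, not my actual config):

    job "writer-a" {
      datacenters = ["dc1"]

      # Pin to a single node so both jobs land on it and share the
      # volume's staging mount ("node-1" is a placeholder node name).
      constraint {
        attribute = "${node.unique.name}"
        value     = "node-1"
      }

      group "app" {
        volume "shared" {
          type            = "csi"
          source          = "cephfs-shared"
          access_mode     = "multi-node-multi-writer"
          attachment_mode = "file-system"
        }

        task "write" {
          driver = "docker"

          config {
            image   = "alpine:3.20"
            command = "sh"
            args    = ["-c", "while true; do date >> /data/out; sleep 5; done"]
          }

          volume_mount {
            volume      = "shared"
            destination = "/data"
          }
        }
      }
    }

Running a second copy under a different job name (e.g. "writer-b") gives two allocations sharing the volume's staging mount on the same node.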

Nomad Server logs (if appropriate)

Nomad Client logs (if appropriate)
