Description
System information
| Type | Version/Name |
|---|---|
| Distribution Name | Proxmox |
| Distribution Version | 9.1.5 |
| Kernel Version | Linux 6.17.9-1-pve |
| Architecture | x86_64 |
| OpenZFS Version | zfs-2.4.0-pve1 |
Summary
Since upgrading from OpenZFS 2.3.x to 2.4.0, metadata/status queries such as:

```
zpool status
zfs get -Hp available,used <pool>
```

can cause an otherwise completely idle HDD-backed pool to spin up.
This did not occur on OpenZFS 2.3.x under the same workload and configuration.
Previously, the HDD pool would only spin up when explicit data reads or writes were directed to it.
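One way to confirm that a given command is what wakes the drives is to compare the drive power state immediately before and after running it. A minimal sketch (the parsing assumes typical `hdparm -C` output; `/dev/sda` and `hdd-pool` below are placeholders for the real devices/pool):

```shell
# drive_state: reduce "hdparm -C" output on stdin to a single word
# such as "standby" or "active/idle" (assumes the usual hdparm -C
# output line " drive state is:  <state>").
drive_state() {
  awk -F': *' '/drive state is/ {print $2}'
}

# On the affected host one would run, for example:
#   hdparm -C /dev/sda | drive_state   # expect "standby"
#   zpool status hdd-pool > /dev/null
#   hdparm -C /dev/sda | drive_state   # did the query wake it?
```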
Environment Description
- Pool name: `hdd-pool`
- Backed solely by spinning hard drives
- Used as cold storage
- Typically accessed once per month to copy data from SSD-backed pools
- Otherwise fully idle and allowed to spin down
Drives are spun down automatically using hd-idle.
On OpenZFS 2.3.x, they would remain spun down indefinitely while idle.
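For context, hd-idle is typically configured along these lines on Debian-based systems (illustrative values only, not my exact configuration):

```
# /etc/default/hd-idle (illustrative)
# -i 0: never spin down by default
# -a sda -i 600: spin down sda after 10 minutes of inactivity
HD_IDLE_OPTS="-i 0 -a sda -i 600"
```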
Additional Information
For approximately two years, I have been running a Prometheus exporter (https://github.com/pdf/zfs_exporter) with the following collectors enabled:

```
--collector.dataset-filesystem
--properties.dataset-filesystem="available,logicalused,quota,referenced,used,usedbydataset,usedsnap,written"
--collector.dataset-volume
--properties.dataset-volume="available,logicalused,referenced,used,usedbydataset,usedsnap,volsize,written"
--collector.pool
```
Under OpenZFS 2.3.x, the exporter scraped all pools (including hdd-pool) without triggering HDD spin-ups.
I also frequently ran:

```
zpool status
zfs get -Hp available,used hdd-pool
```

These commands did not previously cause spin-ups.
Behaviour Since OpenZFS 2.4
After upgrading to 2.4.0:
- The Prometheus ZFS exporter occasionally spins up the HDD pool.
- `pvestatd` queries (e.g. `zfs get`) occasionally spin up the pool.
- Manual `zpool status` and `zfs get -Hp available,used` sometimes spin up the pool.
Spin-up does not always happen immediately, but may occur after sufficient idle time. On average I've been seeing intervals of 6-12 hours between spin-ups, i.e. about 2-3 spin-ups per day.
This suggests the metadata required to answer these queries is no longer satisfied purely from in-memory state (SPA/ARC), and now requires vdev I/O under some conditions.
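One way to check this hypothesis without tracing is to watch the per-disk read counters in `/proc/diskstats` around a query; if the counter moves, the query issued real vdev reads. A sketch (field 4 of the diskstats format is "reads completed"; `sda` is a placeholder):

```shell
# reads_for: extract the completed-read count (field 4 of the
# /proc/diskstats format) for the named device from stdin.
reads_for() {
  awk -v dev="$1" '$3 == dev {print $4}'
}

# On the affected host, compare counters around a query, e.g.:
#   before=$(reads_for sda < /proc/diskstats)
#   zfs get -Hp available,used hdd-pool > /dev/null
#   after=$(reads_for sda < /proc/diskstats)
# If "after" is larger, the query issued real reads to the vdev.
```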
My current workaround
I have added hdd-pool to the exclusion list of the Prometheus ZFS exporter, and disabled the Proxmox storage on the pool to avoid the pvestatd queries. This is inconvenient, however, as it removes the pool from my Grafana views.
I've also stopped running zpool status and zfs get without specifying a pool explicitly, to avoid querying hdd-pool state. However, due to muscle memory I often forget, causing an unnecessary spin-up.
The only way to avoid this entirely seems to be to export the pool when not in use, so that it does not show up in zpool status and similar commands. However, that means my existing cron jobs for automatic monthly backups/scrubs would stop working, and would need updating to import the pool at the start and export it after completion, which is inconvenient.
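The import/export dance could be folded into the existing cron jobs with a small wrapper; a sketch (`with_pool` and `monthly-backup.sh` are hypothetical names, and the `ZPOOL` variable is overridable purely so the sketch can be exercised without real hardware):

```shell
# with_pool: run a command with the pool imported, exporting it again
# afterwards so the pool stays invisible between runs.
ZPOOL=${ZPOOL:-zpool}

with_pool() {
  pool=$1; shift
  $ZPOOL import "$pool" || return 1
  "$@"
  rc=$?
  $ZPOOL export "$pool"
  return $rc
}

# A monthly cron entry could then wrap the existing backup job, e.g.:
#   with_pool hdd-pool /usr/local/bin/monthly-backup.sh
```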
Related issue
This is distinct from #18082 but likely related.
I encountered that issue as well (HDDs not spinning down due to TXG time flushing) and mitigated it by increasing /sys/module/zfs/parameters/spa_note_txg_time to 31557600, i.e. one year.
After working around that issue, metadata/status queries still occasionally cause spin-ups. This report concerns that separate behavior change.
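If the sysfs write needs to survive reboots, ZFS module parameters can also be set persistently via modprobe.d (a sketch, assuming the parameter name above; an initramfs refresh may be required for it to take effect at boot):

```
# /etc/modprobe.d/zfs.conf
options zfs spa_note_txg_time=31557600
```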
Steps to Reproduce
1. Create a ZFS pool backed only by HDDs.
2. Ensure no datasets are actively accessed.
3. Work around #18082 ([2.4] TXG timestamp DB sync if idle causes unnecessary disk access/prevent spin down) by doing e.g.:

   ```
   echo 31557600 > /sys/module/zfs/parameters/spa_note_txg_time
   ```

4. Spin down the drives:

   ```
   hdparm -y /dev/sdX
   ```

5. Wait until the pool is idle.
6. After some time has passed, run:

   ```
   zpool status
   ```

   or

   ```
   zfs get -Hp available,used <pool>
   ```

Repeat the last step until your drives spin up. In my experience, it can take multiple hours.
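Since the spin-up can take hours to appear, it is easier to log power-state transitions than to watch manually. A sketch (`log_transitions` is a hypothetical helper; timestamps could be added with `date` in the polling loop):

```shell
# log_transitions: read one power-state word per line on stdin and
# print only the lines where the state changed from the previous one,
# so each spin-up/spin-down produces exactly one log line.
log_transitions() {
  awk 'NR == 1 || $0 != prev {print} {prev = $0}'
}

# On the real host, feed it a periodic hdparm poll, e.g.:
#   while sleep 60; do
#     hdparm -C /dev/sda | awk -F': *' '/drive state is/ {print $2}'
#   done | log_transitions
```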
Questions
- Was there an intentional change in 2.4 affecting this functionality?
- Is this related to internal TXG time/statistics changes discussed in #18082 ([2.4] TXG timestamp DB sync if idle causes unnecessary disk access/prevent spin down)?
- Is there a tunable to restore previous behavior?
- Is this considered a regression?