
virtio-block needs VIRTIO_BLK_F_FLUSH / VIRTIO_BLK_T_FLUSH support. #492

Open
@luqmana


While investigating seemingly much worse performance for the NVMe device compared to virtio-block, @rmustacc pointed out that we set the Volatile Write Cache (VWC) bit for NVMe devices but do not advertise the analogous flush capability for virtio-block.

As a quick test, I tried clearing the VWC bit and rerunning pgbench with an NVMe device, and got similar (if not very slightly better) results compared to virtio-block (both using the file backend). Given that the file backend also doesn't use any sync flags on open, the speedup we were seeing on virtio-block makes sense: it turns out it's faster to just assume writes are synchronous (when in actuality they're not) and never call flush/fsync.
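To make the difference concrete, here is a minimal sketch of a file-backed store that is opened without any sync flags, so a write may still sit in the host page cache until an explicit flush. `FileBackend`, `write_at`, and `flush` are hypothetical names for illustration, not the actual backend API; the sketch assumes a Unix host.

```rust
use std::fs::{File, OpenOptions};
use std::io;
use std::os::unix::fs::FileExt;
use std::path::Path;

/// Hypothetical file-backed store (illustrative only).
struct FileBackend {
    file: File,
}

impl FileBackend {
    /// Open without any sync flags: completed writes are not
    /// guaranteed to be on stable storage.
    fn open(path: &Path) -> io::Result<Self> {
        let file = OpenOptions::new()
            .read(true)
            .write(true)
            .create(true)
            .open(path)?;
        Ok(Self { file })
    }

    /// Write at a byte offset. Returns once the data has been handed to
    /// the OS, which is not the same as being durable.
    fn write_at(&self, buf: &[u8], offset: u64) -> io::Result<()> {
        self.file.write_all_at(buf, offset)
    }

    /// Ask the OS to commit previously written data to the device.
    /// This is the step that is skipped when the guest is never told
    /// that a flush is needed.
    fn flush(&self) -> io::Result<()> {
        self.file.sync_data()
    }
}
```

Skipping the `flush` call is what makes the benchmark look faster, at the cost of durability across a host crash.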

The issue is that we don't ever try to negotiate the VIRTIO_BLK_F_FLUSH feature today. Per the VIRTIO spec [1]:

An implementation that does not offer VIRTIO_BLK_F_FLUSH and does not commit completed writes will not be resilient to data loss in case of crashes.

In addition to advertising and negotiating VIRTIO_BLK_F_FLUSH, we also need to support VIRTIO_BLK_T_FLUSH commands and forward them to the backend.
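The two pieces described above could look roughly like the following sketch. The constant values (feature bit 9, request type 4, and the status bytes) come from the VIRTIO block device specification; the `Backend` trait, `Device` struct, and `CountingBackend` are hypothetical stand-ins, not the existing device model. Rejecting a flush when the feature was not negotiated is an assumption made for this sketch.

```rust
use std::io;

/// Feature bit 9 of the VIRTIO block device.
const VIRTIO_BLK_F_FLUSH: u64 = 1 << 9;

/// Request types from the VIRTIO block request header.
const VIRTIO_BLK_T_IN: u32 = 0;
const VIRTIO_BLK_T_OUT: u32 = 1;
const VIRTIO_BLK_T_FLUSH: u32 = 4;

/// Status bytes reported back to the guest.
const VIRTIO_BLK_S_OK: u8 = 0;
const VIRTIO_BLK_S_IOERR: u8 = 1;
const VIRTIO_BLK_S_UNSUPP: u8 = 2;

/// Hypothetical backend interface (illustrative only).
trait Backend {
    fn flush(&mut self) -> io::Result<()>;
}

/// Hypothetical device state: the backend plus the negotiated feature bits.
struct Device<B: Backend> {
    backend: B,
    negotiated: u64,
}

impl<B: Backend> Device<B> {
    fn new(backend: B) -> Self {
        Self { backend, negotiated: 0 }
    }

    /// Features the device offers to the driver.
    fn offered_features(&self) -> u64 {
        VIRTIO_BLK_F_FLUSH
    }

    /// Keep only the bits that both sides support.
    fn set_driver_features(&mut self, driver: u64) {
        self.negotiated = driver & self.offered_features();
    }

    fn flush_negotiated(&self) -> bool {
        self.negotiated & VIRTIO_BLK_F_FLUSH != 0
    }

    /// Dispatch on the request type and return the status byte.
    fn handle_request(&mut self, req_type: u32) -> u8 {
        match req_type {
            // Forward a flush to the backend only if it was negotiated.
            VIRTIO_BLK_T_FLUSH if self.flush_negotiated() => match self.backend.flush() {
                Ok(()) => VIRTIO_BLK_S_OK,
                Err(_) => VIRTIO_BLK_S_IOERR,
            },
            // Read and write handling is elided in this sketch.
            VIRTIO_BLK_T_IN | VIRTIO_BLK_T_OUT => VIRTIO_BLK_S_OK,
            _ => VIRTIO_BLK_S_UNSUPP,
        }
    }
}

/// Test double that counts how many flushes reached the backend.
struct CountingBackend {
    flushes: usize,
}

impl Backend for CountingBackend {
    fn flush(&mut self) -> io::Result<()> {
        self.flushes += 1;
        Ok(())
    }
}
```

With this shape, a backend that cannot flush can surface the failure to the guest as VIRTIO_BLK_S_IOERR rather than silently reporting success.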

Note: implementation-wise, we could also just choose to always commit writes, even without flush support:

If VIRTIO_BLK_F_FLUSH was not offered by the device, the device MAY also commit writes to persistent device backend storage before reporting their completion.

But this relies on better support on the backend's side as well.
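As a rough illustration of that alternative, a backend could commit each write before reporting completion. `write_committed` is a hypothetical helper for a Unix file backend, not an existing function; opening the file with a sync flag would be another way to get similar behavior.

```rust
use std::fs::File;
use std::io;
use std::os::unix::fs::FileExt;

/// Hypothetical helper: perform a guest write and commit it to stable
/// storage before the request is reported as complete.
fn write_committed(file: &File, buf: &[u8], offset: u64) -> io::Result<()> {
    // Hand the data to the OS at the requested offset.
    file.write_all_at(buf, offset)?;
    // Do not return (and so do not complete the request) until the
    // data has been synced.
    file.sync_data()
}
```

This trades throughput for durability on every write, which is why it depends on the backend being able to sync efficiently.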

Footnotes

  1. 5.2.6.2 Device Requirements: Device Operation

Labels: storage (Related to storage devices/backends.)