H5FD__direct_write() makes an unnecessary copy #2714

Open
@weikra

Describe the bug
Buffers that are already aligned are copied to another aligned buffer, which leads to slow writes.

Expected behavior
An already aligned buffer should be written directly to the file.

Platform (please complete the following information)

  • HDF5 version (if building from a maintenance branch, please include the commit hash)
    1.14.0
  • OS and version
    Ubuntu 20.04
  • Compiler and version
    gcc 9.4
  • Build system (e.g. CMake, Autotools) and version
    cmake
  • Any configure options you specified
    HDF5_ENABLE_DIRECT_VFD=ON
  • MPI library and version (parallel HDF5)
    No MPI

Additional context
The issue is taken from this forum post, where I posted a solution that I have tested:
https://forum.hdfgroup.org/t/write-perfomance-of-raw-data/10963

I do not understand the purpose of "_must_align". It seems to me that it can never become false when the O_DIRECT flag is enabled, which it is in this case. So a solution could be to skip the "_must_align" check and do a run-time check in the H5FD__direct_write() function, like this:


diff --git a/src/H5FDdirect.c b/src/H5FDdirect.c
index ccb8c85cc0..19169b34d5 100644
--- a/src/H5FDdirect.c
+++ b/src/H5FDdirect.c
@@ -1090,7 +1090,8 @@ H5FD__direct_write(H5FD_t *_file, H5FD_mem_t H5_ATTR_UNUSED type, hid_t H5_ATTR_
      * write it directly to the file.  If not, read a bigger and aligned data
      * first, update buffer with user data, then write the data out.
      */
-    if (!_must_align || ((addr % _fbsize == 0) && (size % _fbsize == 0) && ((size_t)buf % _boundary == 0))) {
+    if ((size % _fbsize == 0) && ((size_t)buf % _boundary == 0)) {
+        addr  = (addr / _fbsize) * _fbsize;
         /* Seek to the correct location */
         if ((addr != file->pos || OP_WRITE != file->op) && HDlseek(file->fd, (HDoff_t)addr, SEEK_SET) < 0)
             HSYS_GOTO_ERROR(H5E_IO, H5E_SEEKERROR, FAIL, "unable to seek to proper position")
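
For readers who want to see the run-time check in isolation, here is a minimal sketch of it. The function name `is_direct_io_aligned` and its parameters are hypothetical and not part of HDF5. `fbsize` and `boundary` stand in for the `_fbsize` and `_boundary` values used in H5FD__direct_write(). The sketch keeps all three conditions from the original expression and only drops the "_must_align" shortcut.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an HDF5 function.
 *
 * Returns true when a write request already satisfies the usual O_DIRECT
 * constraints, so it could be passed to the file descriptor without first
 * being copied into an aligned bounce buffer:
 *   addr     - file offset of the write
 *   size     - number of bytes to write
 *   buf      - caller's buffer
 *   fbsize   - file system block size (stands in for _fbsize)
 *   boundary - required memory alignment (stands in for _boundary)
 */
bool
is_direct_io_aligned(uint64_t addr, size_t size, const void *buf, size_t fbsize, size_t boundary)
{
    /* Both divisors must be non-zero for the modulo checks to be defined */
    assert(fbsize > 0 && boundary > 0);

    return (addr % fbsize == 0)                /* file offset on a block boundary */
           && (size % fbsize == 0)             /* length is a whole number of blocks */
           && ((uintptr_t)buf % boundary == 0); /* memory address is aligned */
}
```

Note that the patch above differs from this sketch in one respect: it removes the `addr % _fbsize` test and rounds `addr` down to a block boundary instead. Whether that rounding is safe for every caller is something the maintainers would need to confirm.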

Metadata

Labels

  • Component - C Library: Core C library issues (usually in the src directory)
  • Priority - 1. High: These are important issues that should be resolved in the next release

Status

Scheduled/On-Deck
