
Avoiding extra byte copy inside FileSystemImpl.readFileInternal #5677

Closed
doxlik wants to merge 3 commits into eclipse-vertx:master from doxlik:refactor/optimizing-blocking-file-read

Conversation


@doxlik doxlik commented Aug 13, 2025

Motivation:

In this short PR I made the optimization that was previously discussed with @franz1981 in issue #5665, regarding avoiding an extra byte copy in the FileSystemImpl.readFileInternal method.

Additional notes:

  1. I could not run either "mvn package" or "mvn test", as multiple Http2-related files in master are currently not compiling.
  2. This is my very first contribution, sorry if I missed something.

Conformance:

Everything is signed, and my Eclipse account is identical to my GitHub one.
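
For illustration, a minimal sketch of the idea (the actual diff is quoted further down in the review thread; using BufferInternal.buffer(ByteBuf) as the wrapping factory is my assumption here, not necessarily the final shape of the patch):

    Path target = resolveFile(path).toPath();
    byte[] bytes = Files.readAllBytes(target);   // already a fresh byte[] that we own
    ByteBuf bb = Unpooled.wrappedBuffer(bytes);  // wraps the array, no second copy
    return BufferInternal.buffer(bb);            // assumption: wrap the ByteBuf into a Vert.x buffer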

@doxlik doxlik force-pushed the refactor/optimizing-blocking-file-read branch 3 times, most recently from 40030e7 to 0ee2869 Compare August 13, 2025 16:53
    Path target = resolveFile(path).toPath();
    byte[] bytes = Files.readAllBytes(target);
    - return Buffer.buffer(bytes);
    + ByteBuf bb = Unpooled.wrappedBuffer(bytes);
@franz1981
Contributor

@vietj: is this okay? Or would you prefer to add a new constructor to `BufferImpl` which still uses the existing Vert.x heap types, e.g. VertxUnsafeHeapByteBuf/VertxHeapByteBuf?
The reason I'm asking is:

  • Unpooled::wrappedBuffer still has Netty reference-count checks under the hood, which have a cost
  • IIRC Vert.x itself contains checks to make sure the allocator of a Vert.x buffer is "known" to Vert.x, e.g.:
    public static ByteBuf safeBuffer(ByteBuf byteBuf) {
      Class<?> allocClass;
      if (byteBuf != Unpooled.EMPTY_BUFFER &&
          ((allocClass = byteBuf.alloc().getClass()) == AdaptiveByteBufAllocator.class
            || allocClass == PooledByteBufAllocator.class
            || byteBuf instanceof CompositeByteBuf)) {
        try {
          if (byteBuf.isReadable()) {
            ByteBuf buffer = VertxByteBufAllocator.DEFAULT.heapBuffer(byteBuf.readableBytes());
            buffer.writeBytes(byteBuf, byteBuf.readerIndex(), byteBuf.readableBytes());
            return buffer;
          } else {
            return Unpooled.EMPTY_BUFFER;
          }
        } finally {
          byteBuf.release();
        }
      }
      return byteBuf;
    }

One pain point is that, AFAIK, Netty's heap buffers (one of the two variants) don't publicly expose a way to directly set the wrapped array, which is why Netty itself uses only the non-Unsafe variant in Unpooled::wrappedBuffer(byte[]). If we want to do this right in Vert.x, we could stick to a single implementation (i.e. the non-Unsafe one) as well, or modify Netty to expose it for the Unsafe variants (I can do that if needed).
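
For reference, Unpooled::wrappedBuffer(byte[]) essentially does the following (paraphrased from memory of Netty 4.1, so treat the exact code as approximate):

    public static ByteBuf wrappedBuffer(byte[] array) {
      if (array.length == 0) {
        return EMPTY_BUFFER;
      }
      // only the non-Unsafe heap variant has a constructor that can adopt the array directly
      return new UnpooledHeapByteBuf(ALLOC, array, array.length);
    }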

@doxlik
Author

doxlik commented Aug 14, 2025

@franz1981 thank you for your comments, I removed the unrelated changes. They were actually made by IntelliJ IDEA's code auto-formatting and I thought they might be okay, but I agree that they create unnecessary noise, so next time I will try to avoid them.

@franz1981
Contributor

Fyi @vietj

Member

@tsegismont tsegismont left a comment

Thanks @doxlik and @franz1981

How about changing the perform method body to:

      public Buffer perform() {
        try {
          Path target = resolveFile(path).toPath();
          try (FileChannel fc = FileChannel.open(target, StandardOpenOption.READ)) {
            BufferInternal res = BufferInternal.buffer(); // create with file size hint?
            long length = fc.size();
            res.getByteBuf().writeBytes(fc, 0, (int) length); // check actual size before the cast
            return res;
          }
        } catch (IOException e) {
          throw new FileSystemException(getFileAccessErrorMessage("read", path), e);
        }
      }

It's ok to use BufferInternal in Vert.x impl classes and Netty can take care of loading and writing the file bytes appropriately.

@franz1981
Contributor

@tsegismont

It looks like

https://github.com/netty/netty/blob/f922892f44743411ce89e8259d705ed9c9d3763e/buffer/src/main/java/io/netty/buffer/UnpooledHeapByteBuf.java#L292

that's going to allocate the internal NIO buffer, which is usually not used in the Vert.x heap ones (the internal NIO buffer is something Netty uses while sending data over the wire when communicating with the NIO JDK layers...).
But it's still a good suggestion and still an improvement over what we have, while retaining the same behaviour as now

@doxlik wdyt?

@doxlik
Author

doxlik commented Sep 3, 2025

@franz1981 Regarding the suggestion, what if the file is really big, bigger than 2 GB? Then here, "res.getByteBuf().writeBytes(fc, 0, (int) length); // check actual size before the cast", even if we check the size, we still need to fall back to something that can handle it, either wrapping or some "while" loop until the file is fully read into the buffer.
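
For illustration, the "while" loop mentioned above could look roughly like this (a sketch only, reusing fc and res from the suggested perform body; it assumes the writeBytes(FileChannel, long, int) overload returns the number of bytes actually transferred, and note a single ByteBuf is still limited to about 2 GB either way):

    long size = fc.size();
    long position = 0;
    while (position < size) {
      // a single call is not guaranteed to transfer everything, so read in chunks
      int chunk = (int) Math.min(size - position, 8 * 1024 * 1024); // arbitrary 8 MiB chunk size
      int transferred = res.getByteBuf().writeBytes(fc, position, chunk);
      if (transferred < 0) {
        break; // unexpected end of file
      }
      position += transferred;
    }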

Regarding the low-level ByteBuf stuff in general, it is super interesting, but I have almost no solid knowledge of its internals, so in terms of buffer/allocator optimization I fully trust you.

@tsegismont
Member

@franz1981 Regarding the suggestion, what if the file is really big, bigger than 2 GB? Then here, "res.getByteBuf().writeBytes(fc, 0, (int) length); // check actual size before the cast", even if we check the size, we still need to fall back to something that can handle it, either wrapping or some "while" loop until the file is fully read into the buffer.

I think it's safe to stick to the behavior of java.nio.file.Files#readAllBytes

If the size is bigger than java.nio.file.Files#MAX_BUFFER_SIZE, an OutOfMemoryError is thrown
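
In code, mirroring that behavior would boil down to a check before the cast; a sketch, assuming MAX_BUFFER_SIZE still equals Integer.MAX_VALUE - 8 (it is a private JDK constant, so the value has to be inlined):

    long length = fc.size();
    if (length > Integer.MAX_VALUE - 8) { // same limit as java.nio.file.Files#MAX_BUFFER_SIZE
      throw new OutOfMemoryError("Required array size too large"); // same error as readAllBytes
    }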

@doxlik
Author

doxlik commented Sep 19, 2025

@tsegismont then if we want to stick to the behavior of java.nio.file.Files#readAllBytes, should we move to the channel approach as you suggested, or can we simply leave the current code that already uses readAllBytes?

@doxlik
Author

doxlik commented Sep 19, 2025

@franz1981 @vietj I also wanted to ask, what will the further review process for this PR be? I mean, is it in the direct queue of pending PRs waiting for Julien's review/approval, or are we waiting for Vert.x 5.1.0 to proceed with this PR?

@tsegismont
Member

@tsegismont then if we want to stick to the behavior of java.nio.file.Files#readAllBytes, should we move to the channel approach as you suggested, or can we simply leave the current code that already uses readAllBytes?

We should move to the channel approach but, for the error cases, we can reproduce the behavior of java.nio.file.Files#readAllBytes
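
Putting the two together, a sketch of what that could look like (assumptions: BufferInternal.buffer(int) accepts an initial size hint, and the limit check mirrors java.nio.file.Files#readAllBytes as discussed above; this is not the final patch):

    public Buffer perform() {
      try {
        Path target = resolveFile(path).toPath();
        try (FileChannel fc = FileChannel.open(target, StandardOpenOption.READ)) {
          long length = fc.size();
          if (length > Integer.MAX_VALUE - 8) { // mimic readAllBytes for oversized files
            throw new OutOfMemoryError("Required array size too large");
          }
          BufferInternal res = BufferInternal.buffer((int) length); // size hint avoids intermediate resizes
          res.getByteBuf().writeBytes(fc, 0, (int) length); // a loop as discussed above could guard against short reads
          return res;
        }
      } catch (IOException e) {
        throw new FileSystemException(getFileAccessErrorMessage("read", path), e);
      }
    }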

@doxlik doxlik force-pushed the refactor/optimizing-blocking-file-read branch from 735560e to fabebae Compare September 23, 2025 08:53
@doxlik
Author

doxlik commented Sep 23, 2025

@tsegismont I published a solution where I tried to go with the FileChannel option, but with the error logic of readAllBytes. Please check when you have time.

tsegismont added a commit to tsegismont/vert.x that referenced this pull request Sep 23, 2025
Closes eclipse-vertx#5677

The improvement consists in creating the buffer upfront and using Netty API to read the file channel.

Signed-off-by: Thomas Segismont <tsegismont@gmail.com>
@tsegismont
Member

Thank you for the updates @doxlik

I've created #5730, which supersedes this one.

@tsegismont tsegismont closed this Sep 23, 2025
tsegismont added a commit to tsegismont/vert.x that referenced this pull request Oct 15, 2025
Closes eclipse-vertx#5677

The improvement consists in creating the buffer upfront and using Netty API to read the file channel.

Signed-off-by: Thomas Segismont <tsegismont@gmail.com>