[Feature Request] Add more options to load models at InferenceSession constructor #23940

Open
@vpenades

Describe the feature request

Right now, the InferenceSession constructor is able to load models from two sources:

  • string (path to model)
  • byte[] (the actual model)

When using the byte[] constructor, the model is typically read from a Stream first, and in many cases that involves a MemoryStream whose .ToArray() returns the loaded file as a byte[] array.

The problem is that MemoryStream.ToArray() allocates a new array and copies the loaded file into it, because the internal buffer of the memory stream is usually larger than the data it holds.

Certainly, the model could be loaded straight into a byte[] array if the file length is known beforehand, but that is not always possible or reliable with certain Streams. So the safe approach to load a stream is to copy it to a MemoryStream and then extract the bytes from it.
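As an illustration of why the known-length route is fragile, here is a minimal sketch of reading a stream straight into an exactly sized array. The helper name ReadAllKnownLength is hypothetical, not part of any library; the approach relies on Stream.Length, which is only available when CanSeek is true, and network streams are often not seekable.

```csharp
using System;
using System.IO;

// Hypothetical helper: read the rest of a seekable stream into an exactly sized array.
static byte[] ReadAllKnownLength(Stream source)
{
    // Stream.Length throws NotSupportedException on non-seekable streams,
    // so the length cannot be known up front in that case.
    if (!source.CanSeek)
        throw new NotSupportedException("Length is unavailable for non-seekable streams.");

    long remaining = source.Length - source.Position;
    var result = new byte[remaining];

    // Read may return fewer bytes than requested, so loop until the array is full.
    int offset = 0;
    while (offset < result.Length)
    {
        int read = source.Read(result, offset, result.Length - offset);
        if (read == 0) throw new EndOfStreamException();
        offset += read;
    }
    return result;
}

byte[] data = ReadAllKnownLength(new MemoryStream(new byte[] { 1, 2, 3, 4 }));
Console.WriteLine(data.Length); // prints 4
```

When the stream is not seekable, this helper fails, which is the case where copying into a MemoryStream becomes the fallback.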

MemoryStream has a way to avoid creating a copy: TryGetBuffer(), which exposes the internal buffer as an ArraySegment<Byte> through an out parameter. That type is what I think the constructors should accept instead of Byte[].
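The difference between the two calls can be seen with a small sketch. The 100-byte payload is an arbitrary stand-in for a model file, and the exact capacity of the internal buffer is an implementation detail, so it is printed rather than assumed.

```csharp
using System;
using System.IO;

// Write 100 bytes into an expandable MemoryStream.
var stream = new MemoryStream();
stream.Write(new byte[100], 0, 100);

// ToArray() allocates a new array of exactly Length bytes and copies the data into it.
byte[] copy = stream.ToArray();

// TryGetBuffer() exposes the internal buffer: no allocation, no copy.
bool exposed = stream.TryGetBuffer(out ArraySegment<byte> segment);
byte[] buffer = stream.GetBuffer();

Console.WriteLine($"copy.Length       = {copy.Length}");
Console.WriteLine($"segment.Count     = {segment.Count} (exposed: {exposed})");
Console.WriteLine($"internal capacity = {buffer.Length}");
Console.WriteLine($"segment wraps the internal buffer: {ReferenceEquals(segment.Array, buffer)}");
Console.WriteLine($"ToArray() returned a separate array: {!ReferenceEquals(copy, buffer)}");
```

Note that TryGetBuffer() returns false for a MemoryStream created over a caller-supplied array without the publiclyVisible flag, so its return value is worth checking.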

So my request is to add additional constructors to InferenceSession:

 public InferenceSession(ArraySegment<Byte> model);
 public InferenceSession(Stream model);
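Until such constructors exist, one possible workaround with the existing byte[] constructor is sketched below. The helper ToExactArray is hypothetical: it returns the underlying array as-is when the segment covers all of it, and copies only otherwise. The sketch assumes the expected size is known in advance (for example from an HTTP Content-Length header) so the MemoryStream can be pre-sized; when that assumption does not hold, the copy still happens.

```csharp
using System;
using System.IO;

// Hypothetical helper: produce a byte[] for the existing byte[] constructor,
// copying only when the segment does not span its whole underlying array.
static byte[] ToExactArray(ArraySegment<byte> segment)
{
    if (segment.Array == null) return new byte[0];

    // Exact fit: hand back the underlying array, no copy.
    if (segment.Offset == 0 && segment.Count == segment.Array.Length)
        return segment.Array;

    // Oversized buffer: one copy, the same cost as MemoryStream.ToArray().
    var result = new byte[segment.Count];
    Buffer.BlockCopy(segment.Array, segment.Offset, result, 0, segment.Count);
    return result;
}

// Stand-in for a downloaded model; the size is assumed to be known up front.
var payload = new byte[1000];
var m = new MemoryStream(payload.Length);   // pre-sized to the expected length
new MemoryStream(payload).CopyTo(m);

m.TryGetBuffer(out ArraySegment<byte> seg);
byte[] model = ToExactArray(seg);

Console.WriteLine(ReferenceEquals(model, m.GetBuffer())); // True: no copy was made
```

This only avoids the copy when the buffer happens to fit exactly, which is why constructors accepting ArraySegment<Byte> or Stream would still be the more general solution.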

Describe scenario use case

To be able to load models from sources other than a file system path or a plain byte[] array, and to avoid creating memory copies.

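ArraySegment<Byte> modelBytes;

using (var m = new MemoryStream())
{
   using (var s = await httpClient.GetStreamAsync("model url"))
   {
      await s.CopyToAsync(m);
   }

   // Get the buffer without creating a copy; the segment stays valid after m is disposed.
   if (!m.TryGetBuffer(out modelBytes))
      throw new InvalidOperationException("The MemoryStream buffer is not exposed.");
}

// Proposed constructor from this request; it does not exist yet.
session = new InferenceSession(modelBytes);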

Indirectly, this will help lower the memory pressure when using large models on devices with little memory.

Metadata

Assignees

No one assigned

Labels

  • api:CSharp (issues related to the C# API)
  • feature request (request for unsupported feature or enhancement)
