[GPU] use ParallelReadStreamBuf to boost file reading latency #34679

Draft

riverlijunjie wants to merge 2 commits into openvinotoolkit:master from riverlijunjie:river/mmap_parallel_io_opt
Conversation

@riverlijunjie
Contributor

@riverlijunjie riverlijunjie commented Mar 13, 2026

Details:

Create a custom std::streambuf subclass that internally uses parallel I/O for large reads, exposing a standard std::istream-compatible interface. Any code that currently accepts a std::istream& gets the speedup transparently — no per-plugin changes required.

Tickets:

AI Assistance:

  • AI assistance used: no / yes
  • If yes, summarize how AI was used and what human validation was performed (build/tests/manual checks).

@github-actions github-actions bot added category: Core OpenVINO Core (aka ngraph) category: GPU OpenVINO GPU plugin labels Mar 13, 2026
@riverlijunjie riverlijunjie changed the title [GPU] mmap io parallel for model cache loading latency [GPU] Memory Mapped (mmap) Tensor Parallel Reading Mar 13, 2026
@riverlijunjie riverlijunjie changed the title [GPU] Memory Mapped (mmap) Tensor Parallel Reading [GPU] use ParallelReadStreamBuf to boost file reading latency Mar 13, 2026
@github-actions github-actions bot added the category: inference OpenVINO Runtime library - Inference label Mar 13, 2026