Using separate cuda streams for one session #23319

@cozeybozey

Describe the issue

I have multiple threads calling session.Run on a single session. I recently switched to pinned memory and asynchronous memory copies, which works well. To do this, I use separate CUDA streams for the memory copies. However, I noticed that session.Run does not work with these CUDA streams. I can link one CUDA stream to a session via the options, but I want to use multiple CUDA streams, with a different one for every Run call. How can I achieve this? Or should I use multiple sessions instead? That would put multiple instances of the same model in memory, which does not seem ideal.
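For reference, a minimal sketch of the setup described above, not the actual code from my project. It assumes the stream is linked through the `has_user_compute_stream` and `user_compute_stream` fields of `OrtCUDAProviderOptions`; the model path, buffer size, and the `UploadAndWait` helper are hypothetical placeholders. It only shows one stream attached to the session plus a separate copy stream that is synchronized by hand. It does not give each Run call its own stream, which is what I am asking about.

```cpp
#include <cstddef>
#include <iostream>

#include <cuda_runtime.h>
#include <onnxruntime_cxx_api.h>

// Hypothetical helper: copies `bytes` from pinned host memory to device memory
// on a caller-owned copy stream, then blocks the calling thread until the copy
// has finished, so the device buffer is safe to use from a different stream.
bool UploadAndWait(void* device_dst, const void* pinned_src, std::size_t bytes,
                   cudaStream_t copy_stream) {
  if (cudaMemcpyAsync(device_dst, pinned_src, bytes, cudaMemcpyHostToDevice,
                      copy_stream) != cudaSuccess) {
    return false;
  }
  return cudaStreamSynchronize(copy_stream) == cudaSuccess;
}

int main() {
  // One stream that the session computes on, one stream used only for copies.
  cudaStream_t compute_stream = nullptr;
  cudaStream_t copy_stream = nullptr;
  if (cudaStreamCreateWithFlags(&compute_stream, cudaStreamNonBlocking) != cudaSuccess ||
      cudaStreamCreateWithFlags(&copy_stream, cudaStreamNonBlocking) != cudaSuccess) {
    std::cerr << "failed to create CUDA streams\n";
    return 1;
  }

  const std::size_t bytes = 1024 * sizeof(float);  // placeholder input size
  void* pinned_host = nullptr;
  void* device_buf = nullptr;
  if (cudaMallocHost(&pinned_host, bytes) != cudaSuccess ||
      cudaMalloc(&device_buf, bytes) != cudaSuccess) {
    std::cerr << "failed to allocate buffers\n";
    return 1;
  }

  int status = 0;
  try {
    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "stream-sketch");
    Ort::SessionOptions session_options;

    // Link exactly one user-owned stream to the session via the CUDA EP options.
    OrtCUDAProviderOptions cuda_options{};
    cuda_options.device_id = 0;
    cuda_options.has_user_compute_stream = 1;
    cuda_options.user_compute_stream = compute_stream;
    session_options.AppendExecutionProvider_CUDA(cuda_options);

    // Placeholder model path; the session lives only inside this scope so it
    // is destroyed before the stream it was given.
    Ort::Session session(env, ORT_TSTR("model.onnx"), session_options);

    // Asynchronous upload on the copy stream, followed by an explicit wait.
    if (!UploadAndWait(device_buf, pinned_host, bytes, copy_stream)) {
      std::cerr << "upload failed\n";
      status = 1;
    }
    // session.Run(...) would follow here; its work is issued on compute_stream.
  } catch (const Ort::Exception& e) {
    std::cerr << "ONNX Runtime error: " << e.what() << "\n";
    status = 1;
  }

  cudaFree(device_buf);
  cudaFreeHost(pinned_host);
  cudaStreamDestroy(copy_stream);
  cudaStreamDestroy(compute_stream);
  return status;
}
```

The blocking `cudaStreamSynchronize` call is the simplest way to make the copied data visible before Run, at the cost of stalling the calling thread. Recording a CUDA event on the copy stream and calling `cudaStreamWaitEvent` on the compute stream would avoid the host-side wait, but everything would still be ordered on the single stream attached to the session.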

To reproduce

--

Urgency

No response

Platform

Windows

OS Version

10

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

16.2

ONNX Runtime API

C++

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

No response

Labels: ep:CUDA (issues related to the CUDA execution provider); stale (issues that have not been addressed in a while; categorized by a bot)
