Adding CIG context creation in OrtFactory#27267
Adding CIG context creation in OrtFactory#27267praneshgo wants to merge 2 commits intomicrosoft:mainfrom
Conversation
- Added Init and Deinit API which are to be called from application before calling any interop or ORT APIs
| } | ||
|
|
||
| // Create stream on the CIG context | ||
| CUDA_RETURN_IF_ERROR(cudaStreamCreateWithFlags(&stream, cudaStreamNonBlocking)); |
There was a problem hiding this comment.
I think whenever we get the CreateSyncStream call - we have to create a new stream (and when a CiG context is enabled, this stream should get created on the CiG context). I don't think you can re-use the same previously created stream (btw, CUDA allows creating multiple streams in the same context).
There was a problem hiding this comment.
The stream here is a new stream that is created every time CreateSyncStream is called. The current design seems in line with the expectation in the comment.
@ankan-ban Am I missing something here?
There was a problem hiding this comment.
yeah.. sorry, I missed it. You have exactly same code for stream creation in if and else part (maybe move it out of the conditional code).
|
@gaugarg-nv @gedoensmax |
| * For D3D12: ID3D12CommandQueue* | ||
| * For Vulkan: VkQueue (cast to void*) | ||
| * | ||
| * The factory stores this and uses it for synchronization with inference streams. |
There was a problem hiding this comment.
nit: factory can do whatever it wants so I would expect storing or not is an implementation detail and not a requirement.
| * bound to the provided graphics command queue. Once initialized, any OrtSyncStream created for this | ||
| * ep_device via CreateSyncStreamForEpDevice will be created on the interop context, enabling efficient | ||
| * GPU-side synchronization between ONNX Runtime inference and graphics workloads. | ||
| * | ||
| * This must be called BEFORE CreateSyncStreamForEpDevice for the same ep_device. |
There was a problem hiding this comment.
These seem like they're potentially implementation specific details. Isn't it up to the EP factory what it does in the init and how that affects later calls?
Would also be good to explicitly call out that as this is tied to the OrtEpDevice instance it applies across all sessions the OrtEpDevice is used in (i.e. it's global not per-session).
| /** \brief Graphics device handle (optional, may be inferred from command_queue). | ||
| * | ||
| * For D3D12: ID3D12Device* (optional, can be obtained from command queue) | ||
| * For Vulkan: VkDevice (cast to void*) | ||
| */ | ||
| void* device; |
There was a problem hiding this comment.
Can the device be retrieved using info from the OrtEpDevice or does it need to be passed in? Prefer the former as there's no potential mismatch.
|
|
||
| # --- DX Interop Feature --- | ||
| dx_interop_group = parser.add_argument_group("DX Interop Feature") | ||
| dx_interop_group.add_argument("--use_dx_interop", action="store_true", help="Enable DX Interop feature for graphics API synchronization.") |
There was a problem hiding this comment.
nit: --enable_dx_interop would match the help and I think would describe what's it's doing slightly more clearly.
| // bound to the provided graphics command queue/device. | ||
| // This follows Scott's suggestion to pass in everything required so we don't end up with multiple | ||
| // init function signatures, and the factory stores the queue to utilize in stream creation. | ||
| static OrtStatus* ORT_API_CALL InitGraphicsInteropImpl(OrtEpFactory* this_ptr, |
There was a problem hiding this comment.
Not sure where you typically put unit tests for your code but I would have expected them to be in this PR if it's in this repo.
| ORT_API2_STATUS(InitGraphicsInteropForEpDevice, _In_ const OrtEpDevice* ep_device, | ||
| _In_ const OrtGraphicsInteropConfig* config); |
There was a problem hiding this comment.
Can we please add these two functions to the interop API struct instead of the global ORT API so we keep all the interop related pieces in one place?
There was a problem hiding this comment.
Right, could this be a configuration that is "applied to" an OrtEpDevice's InteropApi? Rather than it being an independent thing.
I'm not sure we really need to qualify it as "Graphics" either, that is a bit more domain specific than this would afford.
Can you elaborate on the specific scenarios where an app developer needs this infrastructure as I would like to understand when the existing API is insufficient and it would be good to capture that info in the PR? |
| */ | ||
| void* command_queue; | ||
|
|
||
| /** \brief Graphics device handle (optional, may be inferred from command_queue). |
There was a problem hiding this comment.
We shouldn't absolutely require a command_queue either right? That is more of a performance optimization.
This config shouldn't be required for the InteropApi to work
Description
Motivation and Context