Refactor providers into separate libraries #1190

Merged: 39 commits, Feb 13, 2025

Conversation

RyanUnderhill
Contributor

This removes most of the `#if USE_CUDA` and `#if USE_DML` blocks from the model-handling code. Device memory management is now handled through the `DeviceSpan` structure, and all data copying is done in a device-independent manner.

It's a huge change, and there will be some rough edges when it is submitted. The goal is to unblock other people who need these changes and then to make larger improvements in future PRs.
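
To illustrate what device-independent copying through a span type can look like, here is a minimal sketch. It is not the code from this PR: `DeviceBufferSketch`, `CpuBufferSketch`, `DeviceSpanSketch`, `ToHost`, `FromHost`, and `CopySpan` are hypothetical names chosen for illustration, and only a CPU backend is shown. The idea is that callers copy through one interface, so no `#if USE_CUDA` style branch is needed at the call site.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical backend interface: each device would provide one implementation.
struct DeviceBufferSketch {
  virtual ~DeviceBufferSketch() = default;
  virtual void CopyToHost(void* dst, std::size_t bytes) const = 0;
  virtual void CopyFromHost(const void* src, std::size_t bytes) = 0;
  virtual std::size_t SizeInBytes() const = 0;
};

// CPU backend: "device" memory is ordinary host memory.
class CpuBufferSketch final : public DeviceBufferSketch {
 public:
  explicit CpuBufferSketch(std::size_t bytes) : data_(bytes) {}
  void CopyToHost(void* dst, std::size_t bytes) const override {
    std::memcpy(dst, data_.data(), bytes);
  }
  void CopyFromHost(const void* src, std::size_t bytes) override {
    std::memcpy(data_.data(), src, bytes);
  }
  std::size_t SizeInBytes() const override { return data_.size(); }

 private:
  std::vector<unsigned char> data_;
};

// Typed view over a backend buffer (an assumption of this sketch, not the real DeviceSpan).
template <typename T>
class DeviceSpanSketch {
 public:
  explicit DeviceSpanSketch(std::shared_ptr<DeviceBufferSketch> buffer)
      : buffer_(std::move(buffer)) {}

  std::size_t size() const { return buffer_->SizeInBytes() / sizeof(T); }

  // Copies the device contents into a freshly allocated host vector.
  std::vector<T> ToHost() const {
    std::vector<T> host(size());
    buffer_->CopyToHost(host.data(), host.size() * sizeof(T));
    return host;
  }

  // Overwrites the device contents from a host vector of the same length.
  void FromHost(const std::vector<T>& host) {
    assert(host.size() == size());
    buffer_->CopyFromHost(host.data(), host.size() * sizeof(T));
  }

 private:
  std::shared_ptr<DeviceBufferSketch> buffer_;
};

// Device-independent copy: stages through host memory, so it works for any
// pair of backends that implement the interface above.
template <typename T>
void CopySpan(const DeviceSpanSketch<T>& src, DeviceSpanSketch<T>& dst) {
  dst.FromHost(src.ToHost());
}
```

Staging through host memory keeps the sketch short; a real backend would likely add a direct device-to-device path when both spans live on the same device.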

@natke natke self-requested a review January 27, 2025 18:31
Contributor

@natke natke left a comment

Can you enumerate the places where any `#ifdef`s remain and why they need to be there, please?

And what impact will the rough edges have, and can they be smoothed before you merge this PR?

Collaborator

@baijumeswani baijumeswani left a comment

First pass through the code

@RyanUnderhill
Contributor Author

> Can you enumerate the places where any `#ifdef`s remain and why they need to be there, please?
>
> And what impact will the rough edges have, and can they be smoothed before you merge this PR?

- There are some `#if USE_CUDA` blocks in our tests; this shouldn't be a problem.
- There are two `#if USE_DML` blocks: one in generators.cpp, because DML is not a shared library, and a second in model.cpp for a similar reason. Making DML into a shared library should factor those out and remove the `#if`s. (The shared library's existence takes the place of the `#if`; while it is static, the build fails without the `#if`.)
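
As a rough illustration of how a runtime lookup can take the place of a compile-time `#if`, the sketch below uses a registry of factories. Everything in it is hypothetical (`DeviceInterfaceSketch`, `ProviderRegistry`, `TryCreateProvider`, `CpuProviderSketch`), and the registry only stands in for real shared-library loading, which is omitted here. The point is that an absent provider becomes a null result at runtime instead of a build failure.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical device interface that every provider implements.
struct DeviceInterfaceSketch {
  virtual ~DeviceInterfaceSketch() = default;
  virtual std::string Name() const = 0;
};

using ProviderFactory = std::function<std::unique_ptr<DeviceInterfaceSketch>()>;

// Registry of available providers. In a real build, loading a provider's
// shared library would be what adds an entry here (assumption of this sketch).
inline std::map<std::string, ProviderFactory>& ProviderRegistry() {
  static std::map<std::string, ProviderFactory> registry;
  return registry;
}

// Returns nullptr when the provider is not present, so the caller needs no
// `#if USE_...` guard around code that asks for it.
inline std::unique_ptr<DeviceInterfaceSketch> TryCreateProvider(const std::string& name) {
  auto it = ProviderRegistry().find(name);
  if (it == ProviderRegistry().end()) return nullptr;
  return it->second();
}

// Example provider that is always available.
struct CpuProviderSketch final : DeviceInterfaceSketch {
  std::string Name() const override { return "cpu"; }
};
```

With this shape, core code calls `TryCreateProvider("dml")` unconditionally and handles a null result, rather than compiling the call in or out.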

The rough edges are just the simple bugs I expect we'll find and easily fix after merging; I can't identify them in advance.

baijumeswani previously approved these changes Feb 4, 2025
@RyanUnderhill RyanUnderhill merged commit 79f3970 into main Feb 13, 2025
14 checks passed
@RyanUnderhill RyanUnderhill deleted the ryanunderhill/providers branch February 13, 2025 23:26
baijumeswani added a commit that referenced this pull request Jun 27, 2025
Before the device interface was introduced in
#1190, the dml
objects were tied to the model. The device interface abstraction
decoupled the device specific objects and the `OgaModel`.

For dml, this meant that the dml objects now lived in a global scope
(they were previously owned by the `OgaModel` and hence had the Model
scope). Upon instantiation, these dml objects create background threads
that retain hardware resources and prevent the driver threads from
terminating. Since the objects are now in a global scope, the background
threads outlive the Model, and the driver threads may not be able to
terminate correctly, leading to issues in application layers.

Another pull request,
#1378, made it so that
device allocators are cached and tied to a global ort session. As a
result, this device allocator is also linked to the dml objects, making
it hard to control the lifetime of the dml objects.

This pull request special-cases the dml device type so that it destroys
all linked globally scoped variables when the model is destroyed and
re-creates them when a new model is initialized. This way, the dml
threads terminate when the model is destroyed and release the driver
threads, which can then terminate on their own.
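
The destroy-and-re-create behavior described above can be sketched roughly as follows. This is not the code from the referenced commit: `DeviceObjectsSketch`, `g_device_objects`, `g_model_count`, and `ModelSketch` are hypothetical names, and a live-instance counter stands in for the background threads so the lifetime is observable. The reference count is an addition of this sketch so that overlapping models do not tear down state that is still in use; the source does not say how the real change handles that case.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the dml objects. A real object would own
// background threads; here a counter tracks how many instances are alive.
struct DeviceObjectsSketch {
  static inline int live = 0;
  DeviceObjectsSketch() { ++live; }
  ~DeviceObjectsSketch() { --live; }
  DeviceObjectsSketch(const DeviceObjectsSketch&) = delete;
  DeviceObjectsSketch& operator=(const DeviceObjectsSketch&) = delete;
};

// Globally scoped state, as in the situation the commit message describes.
inline std::unique_ptr<DeviceObjectsSketch> g_device_objects;
inline int g_model_count = 0;  // assumption: models are created on one thread

class ModelSketch {
 public:
  // The first model (re-)creates the global device objects.
  ModelSketch() {
    if (g_model_count++ == 0) g_device_objects = std::make_unique<DeviceObjectsSketch>();
  }
  // The last model to go away destroys them, releasing their resources.
  ~ModelSketch() {
    if (--g_model_count == 0) g_device_objects.reset();
  }
  ModelSketch(const ModelSketch&) = delete;
  ModelSketch& operator=(const ModelSketch&) = delete;
};
```

Tying the global objects back to model construction and destruction restores the scoping they had when the model owned them, while leaving them reachable from global code in between.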
5 participants