Refactor providers into separate libraries #1190

RyanUnderhill · 2025-01-16T07:05:41Z

This removes most of the #if USE_CUDA and #if USE_DML blocks from the model handling code. Device memory is now managed through the DeviceSpan structure, and all data copying is done in a device-independent manner.

It's a huge change, and there will be some rough edges when it is submitted. The goal is to unblock other people who need these changes, then make larger improvements in future PRs.

Details: Add a DML DeviceInterface and DML DeviceBuffer handler. Remove #if blocks that are doing memory copies between device/cpu memory and use the DeviceSpan interface.

Remove as many #if USE_CUDA/USE_DML as possible
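As a rough illustration of what "device independent" copying through a span-like wrapper can look like, here is a minimal sketch. It is not the DeviceSpan or DeviceInterface code from this PR: every name in it (SketchDeviceBuffer, FakeDeviceBuffer, SketchDeviceSpan, RoundTripSum) is hypothetical, and the "device" is simulated with a second host allocation so the example runs anywhere.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical backend-neutral buffer; each provider would supply a subclass.
template <typename T>
struct SketchDeviceBuffer {
  virtual ~SketchDeviceBuffer() = default;
  virtual void CopyCpuToDevice() = 0;
  virtual void CopyDeviceToCpu() = 0;
  std::vector<T> cpu;  // host-side mirror that model code reads and writes
};

// Stand-in "device" whose memory is just a second host allocation.
template <typename T>
struct FakeDeviceBuffer : SketchDeviceBuffer<T> {
  explicit FakeDeviceBuffer(std::size_t count) : device(count) { this->cpu.resize(count); }
  void CopyCpuToDevice() override { std::copy(this->cpu.begin(), this->cpu.end(), device.begin()); }
  void CopyDeviceToCpu() override { std::copy(device.begin(), device.end(), this->cpu.begin()); }
  std::vector<T> device;
};

// Typed view over a buffer; callers never see which backend is behind it.
template <typename T>
class SketchDeviceSpan {
 public:
  explicit SketchDeviceSpan(std::shared_ptr<SketchDeviceBuffer<T>> buffer) : buffer_(std::move(buffer)) {}
  std::vector<T>& Cpu() { return buffer_->cpu; }
  void CopyCpuToDevice() { buffer_->CopyCpuToDevice(); }
  void CopyDeviceToCpu() { buffer_->CopyDeviceToCpu(); }

 private:
  std::shared_ptr<SketchDeviceBuffer<T>> buffer_;
};

// Model-side code with no #if blocks: fill the host mirror, upload,
// wipe the mirror, download again, and sum what came back.
inline int RoundTripSum(SketchDeviceSpan<int>& span) {
  int next = 1;
  for (int& v : span.Cpu()) v = next++;
  span.CopyCpuToDevice();
  std::fill(span.Cpu().begin(), span.Cpu().end(), 0);
  span.CopyDeviceToCpu();
  int sum = 0;
  for (int v : span.Cpu()) sum += v;
  return sum;
}
```

The point of the shape is that RoundTripSum only talks to the span, so swapping FakeDeviceBuffer for a CUDA- or DML-backed subclass would not touch the calling code.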

src/cuda/interface.cpp

src/dml/interface.cpp

src/models/input_ids.cpp

natke

Can you enumerate the places where any #ifdefs remain and why they need to be there, please?

And what impact will the rough edges have, and can they be smoothed before you merge this PR?

baijumeswani

First pass through the code

test/c_api_tests.cpp

src/models/captured_graph_pool.cpp

src/models/input_ids.cpp

src/generators.h

src/generators.cpp

src/cuda/beam_search_scorer_cuda.cuh

src/models/whisper.cpp

RyanUnderhill · 2025-01-28T19:56:45Z

> Can you enumerate the places where any #ifdefs remain and why they need to be there, please?
>
> And what impact will the rough edges have, and can they be smoothed before you merge this PR?

There are some #if USE_CUDA blocks in our tests; this shouldn't be a problem.
There are two #if USE_DML blocks: one in generators.cpp, because DML is not built as a shared library, and a second in model.cpp for a similar reason. Making DML a shared library should factor those out and remove the #ifs. The shared library's existence takes the place of the #if, whereas with a static library the build fails without the #if.

The rough edges are just the simple bugs we expect to find and fix easily, but that I can't find in advance.
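To make the "shared library's existence takes the place of the #if" point concrete, here is a minimal sketch of a runtime lookup replacing a compile-time guard. It is an assumption-laden illustration, not code from this PR: ProviderLoader, SketchDeviceInterface, FakeDmlInterface, and SelectDevice are hypothetical names, and the loader simulates "is the provider library present?" with an in-memory table instead of a real dlopen/LoadLibrary call.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical device interface; real providers would expose far more.
struct SketchDeviceInterface {
  virtual ~SketchDeviceInterface() = default;
  virtual std::string Name() const = 0;
};

struct CpuInterface : SketchDeviceInterface {
  std::string Name() const override { return "cpu"; }
};

struct FakeDmlInterface : SketchDeviceInterface {
  std::string Name() const override { return "dml"; }
};

using Factory = std::function<std::unique_ptr<SketchDeviceInterface>()>;

// Simulates the set of provider libraries found at runtime. In a real build
// the lookup would be a dynamic-library load (assumption, not shown here).
class ProviderLoader {
 public:
  void Install(const std::string& name, Factory factory) { installed_[name] = std::move(factory); }

  // Returns nullptr when the provider is absent, instead of failing to link.
  std::unique_ptr<SketchDeviceInterface> TryLoad(const std::string& name) const {
    auto it = installed_.find(name);
    if (it == installed_.end()) return nullptr;
    return it->second();
  }

 private:
  std::map<std::string, Factory> installed_;
};

// Caller code has no #if USE_DML: it asks at runtime and falls back to CPU.
inline std::string SelectDevice(const ProviderLoader& loader, const std::string& requested) {
  if (auto device = loader.TryLoad(requested)) return device->Name();
  return CpuInterface{}.Name();
}
```

With a statically linked provider the symbol must exist at link time, which is why the guard is needed; with a runtime lookup like this, a missing provider is just a nullptr.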

src/models/model.cpp

Lint

src/models/model.h

Co-authored-by: aciddelgado <[email protected]>

Before the device interface was introduced in #1190, the DML objects were tied to the model. The device interface abstraction decoupled the device-specific objects from the `OgaModel`. For DML, this meant that the DML objects now lived in global scope (they were previously owned by the `OgaModel` and hence had Model scope). On instantiation, these DML objects create background threads that retain hardware resources and prevent the driver threads from terminating. Because the objects are now global, the background threads outlive the Model, so driver threads may not be able to terminate correctly, which causes issues in application layers. Another pull request, #1378, made device allocators cached and tied to a global ort session. As a result, the device allocator is also linked to the DML objects, which makes their lifetime hard to control.

This pull request special-cases the DML device type so that it destroys all linked globally scoped variables when the model is destroyed and re-creates them when a new model is initialized. This way, the DML threads terminate when the model is destroyed and release the driver threads so they can shut down on their own.
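The lifetime fix described above can be sketched roughly as follows. This is not the code from #1590: SketchModel, FakeDmlObjects, g_dml_objects, and g_live_background_workers are hypothetical, a counter stands in for the background threads, and the sketch assumes only one DML model is alive at a time.

```cpp
#include <cassert>
#include <memory>

// Stand-in for the background threads that the real DML objects start.
static int g_live_background_workers = 0;

// Hypothetical bundle of globally scoped DML state.
struct FakeDmlObjects {
  FakeDmlObjects() { ++g_live_background_workers; }
  ~FakeDmlObjects() { --g_live_background_workers; }
};

// Global scope, as after the device interface refactor.
static std::unique_ptr<FakeDmlObjects> g_dml_objects;

enum class SketchDeviceType { CPU, DML };

class SketchModel {
 public:
  explicit SketchModel(SketchDeviceType type) : type_(type) {
    // Special case: (re-)create the globals when a DML model is initialized.
    if (type_ == SketchDeviceType::DML) g_dml_objects = std::make_unique<FakeDmlObjects>();
  }

  ~SketchModel() {
    // Special case: tear the globals down with the model so workers stop.
    if (type_ == SketchDeviceType::DML) g_dml_objects.reset();
  }

  SketchModel(const SketchModel&) = delete;
  SketchModel& operator=(const SketchModel&) = delete;

 private:
  SketchDeviceType type_;
};
```

The globals stay global, so cached allocators can still reach them, but their lifetime is bracketed by the model's constructor and destructor. Supporting several concurrent DML models would need reference counting, which this sketch leaves out.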

RyanUnderhill added 13 commits November 22, 2024 17:24

Use DeviceInterface for debugging

7e4668b

Merge remote-tracking branch 'origin/main' into ryanunderhill/providers

34381af

Merge remote-tracking branch 'origin/main' into ryanunderhill/providers

35e79ce

Summary: Remove #ifdefs for providers and go through device interface.

3823664

Details: Add a DML DeviceInterface and DML DeviceBuffer handler. Remove #if blocks that are doing memory copies between device/cpu memory and use the DeviceSpan interface.

Finish refactoring model processing

41b462a

Remove as many #if USE_CUDA/USE_DML as possible

Merge with main

237fb1e

Fix merge build issues

bdbb09c

Formatting

66321dd

Build fixes

0bc39a5

Merge with main

0f2ea36

Build fix

5244049

Build fix

49b51ef

Fix input_ids issue from merge

d3db2f6

aciddelgado reviewed Jan 21, 2025

aciddelgado reviewed Jan 22, 2025

src/models/input_ids.cpp (outdated, resolved)

aciddelgado reviewed Jan 22, 2025

src/models/input_ids.cpp (resolved)

RyanUnderhill added 14 commits January 21, 2025 16:56

Fix C# unit tests

133d5a0

Try again to fix C# test

b079b74

Merge with main

0e7064c

Test theory

afecf1d

Test instrumenting

1734f5c

Crash investigation

2bc83eb

Extra debug logging

fd788d7

Merge with main

67d914c

Undefined behavior fix in startup

0303592

Don't load cuda library outside of linux & windows

d87807c

Fix iOS break

2df5fe1

Android tweak

6736517

Leftover #ifdef fix

a011fe0

Type tweak

c11704f

natke self-requested a review January 27, 2025 18:31

natke reviewed Jan 27, 2025

natke requested review from hanbitmyths, baijumeswani and ajindal1 January 27, 2025 18:34

baijumeswani reviewed Jan 27, 2025

Review feedback

45dad2b

edgchen1 reviewed Jan 29, 2025

src/models/model.cpp (outdated, resolved)

RyanUnderhill added 5 commits January 29, 2025 12:51

Edward gave me ideas.

53c666c

Clean up allocators, now everything is through p_device_* interfaces.

e804697

Previous change also added device interfaces for webgpu & qnn

f8ed9ce

Lint

Remove accidental change

198e8f8

Device check simplifications

e6b77f2

aciddelgado reviewed Jan 30, 2025

src/models/model.h (outdated, resolved)

RyanUnderhill and others added 3 commits January 30, 2025 16:02

Refactor device_type

4f2f084

Merge remote-tracking branch 'origin/main' into ryanunderhill/providers

0765339

Update src/models/model.h

acba52c

Co-authored-by: aciddelgado <[email protected]>

baijumeswani previously approved these changes Feb 4, 2025

RyanUnderhill added 2 commits February 12, 2025 18:26

Merge with main

4bcfa33

Fix merge conflicts

68a6ea7

RyanUnderhill dismissed baijumeswani’s stale review via 68a6ea7 February 13, 2025 02:26

Formatting

12e2f76

baijumeswani approved these changes Feb 13, 2025

RyanUnderhill merged commit 79f3970 into main Feb 13, 2025
14 checks passed

RyanUnderhill deleted the ryanunderhill/providers branch February 13, 2025 23:26

baijumeswani mentioned this pull request Jun 27, 2025

[DML] Bind the dml global objects to the Model #1590

Merged
