Prefer DirectML for Windows ONNX transcription models #985
ferologics wants to merge 2 commits into cjpais:main
Conversation
Force-pushed from 3b27bd4 to 32b0150
🧪 Test Build Ready: Build artifacts for PR #985 are available for testing. Download artifacts from the workflow run. Artifacts expire after 30 days.
@ferologics can you see if this still helps the inference speed for you? I'd be quite curious whether it just falls back to CPU or works out of the box. I kind of think that since it's DirectML it might just work on Win 11; would be curious about Win 10 too.
Tested the CI-built Windows artifact locally and it looks good. What I checked:
Result:
So on my Windows 11 machine the CI-built artifact is still taking the intended GPU path, not silently dropping back to CPU.
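For context, the behavior being verified here (keep the DirectML path when registration succeeds, drop to CPU explicitly rather than silently when it fails) follows this general pattern. This is a minimal sketch; `register_directml` and `select_provider` are hypothetical stand-ins, not the actual transcribe-rs or ONNX Runtime API:

```rust
// Sketch of the try-GPU-then-CPU provider selection under test.
// `register_directml` is a hypothetical stand-in for real execution
// provider registration, not the actual transcribe-rs/ort API.
fn register_directml(dml_available: bool) -> Result<&'static str, String> {
    if dml_available {
        Ok("DirectML")
    } else {
        Err("DirectML execution provider registration failed".into())
    }
}

/// Pick the execution provider: DirectML when registration succeeds,
/// otherwise an explicit (logged) CPU fallback.
fn select_provider(dml_available: bool) -> &'static str {
    match register_directml(dml_available) {
        Ok(provider) => provider,
        Err(err) => {
            eprintln!("{err}; falling back to CPU");
            "CPU"
        }
    }
}

fn main() {
    // On a machine where DirectML registers, the GPU path is kept.
    assert_eq!(select_provider(true), "DirectML");
    // If registration fails, the fallback to CPU is explicit, not silent.
    assert_eq!(select_provider(false), "CPU");
    println!("ok");
}
```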
Solid. This is amazing news. I will test on my Windows machine when I can and see how it goes as well. I'm curious how this will play with integrated GPUs; I'm slightly wondering if we will need to provide an option to disable this, just in case CPU is faster for someone. I know another PR in transcribe-rs had something like this. Might be worth considering.
Summary
- Updates the `transcribe-rs` dependency to a forked git revision with Windows DirectML support for ONNX models
- Registers the `DirectMLExecutionProvider` on Windows, with explicit CPU fallback if provider registration fails

Validation
- `cargo check`
- `cargo check --release`
- `handy.log` shows successful DirectML registration for the Parakeet ONNX sessions
- Transcription of 244.38s of audio: 35.368s before vs 6.99s after (~5.1x faster, ~35x realtime)

Dependency patch
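A git pin like the one described below typically looks as follows in `Cargo.toml`. This is a sketch: the repository URL is inferred from the fork name and the dependency table layout is assumed; only the revision hash comes from the PR.

```toml
# Sketch: pin transcribe-rs to the DirectML fork revision.
# Git URL inferred from the fork name (assumption); rev is from the PR.
[dependencies]
transcribe-rs = { git = "https://github.com/ferologics/transcribe-rs", rev = "c56480687127070f456ae462d73c5defe964d807" }
```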
Pins `transcribe-rs` to this git revision: ferologics/transcribe-rs@c56480687127070f456ae462d73c5defe964d807, pending upstreaming to `transcribe-rs` mainline.
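As a quick sanity check on the validation numbers above (35.368s before, 6.99s after, on 244.38s of audio), the reported speedup and realtime factors work out:

```python
# Verify the speedup figures reported in the validation section.
audio_s = 244.38    # length of the test audio
before_s = 35.368   # CPU transcription time
after_s = 6.99      # DirectML transcription time

speedup = before_s / after_s     # how much faster the DirectML path is
realtime = audio_s / after_s     # audio seconds transcribed per wall-clock second

print(f"~{speedup:.1f}x faster, ~{realtime:.0f}x realtime")
# → ~5.1x faster, ~35x realtime
```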