Description
Hi there, you've made a fantastic framework for LLMs. But what I find very confusing is how to run it on CUDA and DirectML. I simply don't know how to do it in C#.
Is there any example? My second question: do I have to provide a different model for each of CUDA, CPU, and DirectML, or can one model run seamlessly on all of them? Or is there a way to convert a model to support all providers, or some combination of them? As far as I know, ONNX itself provides seamless support across providers, which is why this is a bit confusing.
My use case is to deploy a model to the user's device and, based on the device's capabilities, choose the provider that gives the best performance. I don't want it to work the other way around (with the user choosing), because I expect my users to know nothing about ML itself.
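To make it concrete, here is a minimal sketch of the kind of fallback logic I have in mind, written against the base `Microsoft.ML.OnnxRuntime` C# API (I'm assuming that's the relevant layer here; `AppendExecutionProvider_CUDA` and `AppendExecutionProvider_DML` only work with the corresponding CUDA/DirectML builds of the runtime, and the model path is just a placeholder):

```csharp
using System;
using Microsoft.ML.OnnxRuntime;

class ProviderSelection
{
    // Try the fastest available execution provider first, then fall back.
    // CPU is the implicit default when no other provider is appended.
    static InferenceSession CreateSession(string modelPath)
    {
        var options = new SessionOptions();
        try
        {
            // Requires a CUDA-enabled ONNX Runtime package; may throw
            // if CUDA isn't available on this machine.
            options.AppendExecutionProvider_CUDA(0);
        }
        catch (Exception)
        {
            try
            {
                // Requires the DirectML build (e.g. Microsoft.ML.OnnxRuntime.DirectML);
                // DirectML typically needs memory pattern optimization disabled.
                options.EnableMemoryPattern = false;
                options.AppendExecutionProvider_DML(0);
            }
            catch (Exception)
            {
                // Neither GPU provider is available; ONNX Runtime falls back to CPU.
            }
        }
        return new InferenceSession(modelPath, options);
    }

    static void Main()
    {
        // "model.onnx" is a hypothetical path; the same .onnx file is used
        // regardless of which provider ends up running it.
        using var session = CreateSession("model.onnx");
        Console.WriteLine("Session created with the best available provider.");
    }
}
```

Is this roughly the right approach with your framework, or does it expose its own way to pick a provider?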
Thank you ✌️