
Cuda / DirectML question  #1037

Open

Description

@janjanusek

Hi there, you've made a fantastic framework for LLMs. But what I find very confusing is how to run this on CUDA and DirectML. I simply don't know how to do it in C#.
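For context, this is roughly the shape I'm imagining, based on the plain ONNX Runtime C# API (the Microsoft.ML.OnnxRuntime and Microsoft.ML.OnnxRuntime.DirectML packages; `"model.onnx"` is a placeholder path) — I don't know whether this framework exposes an equivalent:

```csharp
using Microsoft.ML.OnnxRuntime;

// Choose an execution provider via SessionOptions before creating the session.
var options = new SessionOptions();

// CUDA (needs the Microsoft.ML.OnnxRuntime.Gpu package plus CUDA/cuDNN installed):
options.AppendExecutionProvider_CUDA(0);

// ...or DirectML instead (needs the Microsoft.ML.OnnxRuntime.DirectML package):
// options.AppendExecutionProvider_DML(0);

// The CPU provider is always registered last as the built-in fallback.
using var session = new InferenceSession("model.onnx", options);
```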

Is there any example? Second question: do I have to provide a different model per CUDA, CPU, and DirectML, or can the same model run seamlessly on all of them? Or is there a way to convert a model to support all providers, or some combination of them? As far as I know, ONNX itself provides seamless support across providers, which is why this is a bit confusing.

My use case is to deploy a model to the user's device and, based on the device's capabilities, choose the provider that gives the best performance. The choice has to happen automatically, not the other way around (the user picking a provider), because I expect my users to know nothing about ML itself.
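To make that concrete, here's a minimal sketch of the selection logic I have in mind, again using the plain ONNX Runtime C# API. The `CreateBestSession` helper is hypothetical, and I'm assuming that appending a provider (or creating the session) throws when the provider's native libraries are missing on the machine:

```csharp
using System;
using Microsoft.ML.OnnxRuntime;

using var session = CreateBestSession("model.onnx"); // placeholder path

static InferenceSession CreateBestSession(string modelPath)
{
    // Try providers from fastest to slowest; the catch below handles
    // machines where a provider's native libraries aren't installed.
    var candidates = new Action<SessionOptions>[]
    {
        o => o.AppendExecutionProvider_CUDA(0), // NVIDIA GPUs
        o => o.AppendExecutionProvider_DML(0),  // any DirectX 12 GPU on Windows
        o => { }                                // CPU is the built-in default
    };

    foreach (var configure in candidates)
    {
        var options = new SessionOptions();
        try
        {
            configure(options);
            return new InferenceSession(modelPath, options);
        }
        catch (Exception)
        {
            options.Dispose(); // provider unavailable on this machine; try next
        }
    }
    throw new InvalidOperationException("No usable execution provider.");
}
```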

Thank you ✌️
