Genny CUDA memory exception #939

Open
@lilhoser

Describe the bug
Genny, using the onnxruntimegenai.cuda package, fails to load https://huggingface.co/microsoft/Phi-3-mini-128k-instruct-onnx with a CUDA memory exception.

To Reproduce
Steps to reproduce the behavior:

  1. Run Genny with the Debug_Cuda configuration selected.
  2. Open either C:\Downloads\models\Phi-3-mini-128k-instruct-onnx\cuda\cuda-int4-rtn-block-32 or C:\Downloads\models\Phi-3-mini-128k-instruct-onnx\cuda\cuda-fp16.
  3. Type a message and press Enter.
  4. See the error.
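
Before step 2, it may help to rule out an incomplete download. The following is only a rough sanity check, not part of Genny or of the onnxruntime-genai API. It assumes that a usable model folder contains a `genai_config.json` file and at least one `.onnx` file; the helper name `check_model_folder` is hypothetical.

```python
import json
import sys
from pathlib import Path


def check_model_folder(folder):
    """Return a list of problems found in a model folder (empty if none)."""
    root = Path(folder)
    if not root.is_dir():
        return [f"not a directory: {root}"]

    problems = []

    # Assumption: the runtime reads its settings from genai_config.json.
    config = root / "genai_config.json"
    if not config.is_file():
        problems.append("missing genai_config.json")
    else:
        try:
            json.loads(config.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"genai_config.json is not valid JSON: {exc}")

    # Assumption: the model weights ship as one or more .onnx files.
    if not any(root.glob("*.onnx")):
        problems.append("no .onnx file found")

    return problems


if __name__ == "__main__":
    # Pass one or more model folders on the command line.
    for folder in sys.argv[1:]:
        issues = check_model_folder(folder)
        print(folder, "OK" if not issues else issues)
```

If both folders from step 2 pass this check, the files are at least present, and the exception is more likely related to the CUDA setup or available GPU memory than to a missing file.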

Expected behavior

Is this model supported?

Screenshots

image

Desktop:

  • OS: Windows 11
