microsoft / onnxruntime-genai Public

Notifications You must be signed in to change notification settings
Fork 162
Star 656

Code
Issues 88
Pull requests 27
Discussions
Actions
Projects
Wiki
Security
Insights

Additional navigation options

Code
Issues
Pull requests
Discussions
Actions
Projects
Wiki
Security
Insights

Issues: microsoft/onnxruntime-genai

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clear current search query, filters, and sorts

9 Open 28 Closed

Author

Filter by author

Label

Filter by label

Use alt + click/return to exclude labels

or ⇧ + click/return for logical OR

Projects

Filter by project

Milestones

Filter by milestone

Assignee

Filter by who’s assigned

Assigned to nobody

Sort

Sort by

Newest Oldest Most commented Least commented Recently updated Least recently updated Best match

Most reactions

Issues list

Microsoft.ML.OnnxRuntimeGenAI.OnnxRuntimeGenAIException: 'CUDA execution provider is not enabled in this build.' ep:CUDA

#1191 opened Jan 16, 2025 by karayakar

[Linux] [C#] [0.5.2] Publish/Build Binaries with Microsoft.ML.OnnxRuntimeGenAI.Cuda on Linux containing unnecessary large DLL files ep:CUDA

#1135 opened Dec 10, 2024 by jiaxuwu2021

Cuda / DirectML question ep:CUDA ep:DML

#1037 opened Nov 6, 2024 by janjanusek

Genny CUDA memory exception crash ep:CUDA

#939 opened Sep 27, 2024 by lilhoser

Building from Source for Jetson ep:CUDA platform:jetson

#818 opened Aug 20, 2024 by SRai22

Setting specific device_id with set_current_gpu_device_id not working bug

Something isn't working

ep:CUDA

#730 opened Jul 29, 2024 by MadMenHitBooker

Inference with batching is significantly slower than without batching. ep:CUDA

#714 opened Jul 20, 2024 by Jester6136

ONNXRuntime-genai doesn't release GPU memory after first inference ep:CUDA performance

#526 opened May 28, 2024 by Positronx

How to release GPU memory after each inference? enhancement

New feature or request

ep:CUDA performance

#446 opened May 13, 2024 by nguyenthekhoig7

ProTip! no:milestone will show everything without a milestone.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly