feat: Expose BF16 precision in TensorRT #328
base: main
Adding @yinggeh to review this. I'm not as familiar w/ ONNX->TRT as I should be.
Not familiar either but looks like a small change. I will take a look today.
@dwyatte, please make sure you've completed the contribution requirements: https://github.com/triton-inference-server/server?tab=readme-ov-file#contributing. Thank you.
@whoisj Block (my corporate entity) has previously completed the CLA here, but let me know if I need to personally submit something too |
Co-authored-by: Yingge He <157551214+yinggeh@users.noreply.github.com>
LGTM if the change is indeed this simple.
Please update the PR title and description using the template https://github.com/triton-inference-server/server/blob/main/.github/PULL_REQUEST_TEMPLATE/pull_request_template_external_contrib.md. Fill in n/a for any field that doesn't apply.
@yinggeh Done!
@whoisj Is our client eligible for contributing now? |
What does the PR do?
BF16 support was added to the ONNX Runtime TensorRT execution provider (EP) in microsoft/onnxruntime#24743. This PR exposes it through Triton's ONNX Runtime backend.
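For reviewers less familiar with the ONNX->TRT path, here is a minimal sketch of the kind of mapping this change extends. It is not the code in src/onnxruntime.cc: `MapPrecisionMode` and `ProviderOption` are hypothetical names, the `"BF16"` mode string is an assumption, and the `trt_bf16_enable` option key is assumed from the upstream ONNX Runtime PR (`trt_fp16_enable` and `trt_int8_enable` are existing TensorRT EP options).

```cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical type for illustration: one TensorRT EP option as a key/value pair.
using ProviderOption = std::pair<std::string, std::string>;

// Hypothetical helper, not the backend's actual function. Translates a
// Triton-style precision mode string into a TensorRT EP option.
// Assumption: BF16 is enabled with "trt_bf16_enable", mirroring the
// existing "trt_fp16_enable" and "trt_int8_enable" flags.
std::optional<ProviderOption>
MapPrecisionMode(const std::string& mode)
{
  if (mode == "FP32") {
    // Default precision: no extra option is needed.
    return std::nullopt;
  }
  if (mode == "FP16") {
    return ProviderOption{"trt_fp16_enable", "1"};
  }
  if (mode == "BF16") {
    return ProviderOption{"trt_bf16_enable", "1"};
  }
  if (mode == "INT8") {
    return ProviderOption{"trt_int8_enable", "1"};
  }
  // Reject anything else rather than silently falling back to FP32.
  throw std::invalid_argument("unsupported precision mode '" + mode + "'");
}
```

The sketch only adds one more branch next to the existing precision modes, which matches the "small change" noted in the review; whether the option takes effect depends on the ONNX Runtime build including the upstream BF16 support.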
Checklist
Agreement
`<commit_type>: <Title>`
(`pre-commit install, pre-commit run --all`)
Commit Type:
Check the conventional commit type box here and add the label to the GitHub PR.
Related PRs:
n/a
Where should the reviewer start?
src/onnxruntime.cc
Test plan:
n/a
Caveats:
n/a
Background
See microsoft/onnxruntime#24743 for more info
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)