int8 vs int8_float16 compute type #173
palladium123 started this conversation in General
Replies: 2 comments
No, they are different.
See also the corresponding CTranslate2 documentation: https://opennmt.net/CTranslate2/quantization.html#supported-types
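Since the two names are separate compute types, one way to avoid guessing is to ask the runtime which types the device supports and pick from a preference list. The following is only a minimal sketch: `pick_compute_type` and its preference order are hypothetical helpers for illustration, not part of CTranslate2, and the fallback set is an assumption used when CTranslate2 or a CUDA device is not available. The only library call it relies on is `ctranslate2.get_supported_compute_types`.

```python
from typing import Iterable, Optional


def pick_compute_type(
    supported: Iterable[str],
    preferences: Iterable[str] = ("int8_float16", "int8", "float16", "float32"),
) -> Optional[str]:
    """Return the first preferred compute type the device reports as supported.

    Hypothetical helper: the preference order is an assumption for
    illustration, not a recommendation from the CTranslate2 documentation.
    """
    supported_set = set(supported)
    for candidate in preferences:
        if candidate in supported_set:
            return candidate
    return None


if __name__ == "__main__":
    try:
        import ctranslate2

        # Ask CTranslate2 which compute types the first CUDA device supports.
        supported = ctranslate2.get_supported_compute_types("cuda")
    except Exception:
        # Illustrative fallback when CTranslate2 or CUDA is unavailable.
        supported = {"int8", "float32"}

    print("Supported:", sorted(supported))
    print("Chosen:", pick_compute_type(supported))
```

The chosen string can then be passed as the `compute_type` argument when loading the model. Whether one type is faster than the other depends on the GPU and the model, so it is worth benchmarking both on the target hardware.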
Is int8_float16 faster for inference?
When running on a GPU, are the int8 and int8_float16 compute types identical?