1 parent f75951e commit 53392e6
guides/int4_quantization_in_keras.py
@@ -37,7 +37,7 @@
## Benefits
* Memory / bandwidth bound models: When the implementation spends most of its time on memory I/O,
- reducing the computation time does not reduce their overall runtime. INT4 reduces bytes
+ reducing the computation time does not reduce its overall runtime. INT4 reduces bytes
moved by ~8x vs `float32`, improving cache behavior and reducing memory stalls;
this often helps more than increasing raw FLOPs.
* Accuracy: Many architectures retain acceptable accuracy with INT4; encoder-only models
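For context on the "~8x vs `float32`" figure in the hunk above, here is a minimal NumPy sketch of where that ratio comes from. It is not the Keras implementation: the helper names `pack_int4` and `unpack_int4`, the single per-tensor scale, and the low-nibble-first layout are all assumptions made for illustration. The ratio also ignores the small overhead of storing scales.

```python
import numpy as np


def pack_int4(values: np.ndarray) -> np.ndarray:
    """Pack signed 4-bit integers in [-8, 7] two per byte (hypothetical layout: low nibble first)."""
    flat = values.astype(np.int8).ravel()
    if flat.size % 2:
        # Pad with a zero so the values pair up evenly.
        flat = np.append(flat, np.int8(0))
    # Keep only the low 4 bits (two's complement) of each value.
    nibbles = (flat & 0x0F).astype(np.uint8)
    return nibbles[0::2] | (nibbles[1::2] << 4)


def unpack_int4(packed: np.ndarray, count: int) -> np.ndarray:
    """Inverse of pack_int4: recover `count` signed 4-bit values as int8."""
    out = np.empty(packed.size * 2, dtype=np.int8)
    out[0::2] = (packed & 0x0F).astype(np.int8)
    out[1::2] = (packed >> 4).astype(np.int8)
    # Nibbles 8..15 represent the negative values -8..-1.
    out = np.where(out > 7, out - 16, out)
    return out[:count]


rng = np.random.default_rng(0)
weights = rng.standard_normal((256, 256)).astype(np.float32)

# Assumption: one symmetric scale for the whole tensor, chosen only to keep the sketch short.
scale = np.max(np.abs(weights)) / 7.0
q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)

packed = pack_int4(q)
ratio = weights.nbytes / packed.nbytes
print(f"float32: {weights.nbytes} bytes, packed int4: {packed.nbytes} bytes, ratio: {ratio}")
```

Each `float32` weight occupies 4 bytes while two INT4 weights share one byte, so the bytes moved per weight drop by a factor of 8 before accounting for scale storage.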