1 parent fde7275, commit 5cd9e26
README.md
@@ -75,6 +75,13 @@ $ cargo install gpu-fryer
GPU fryer creates two 8192x8192 matrices and multiplies them using cuBLAS.
The test allocates 95% of the GPU memory and writes results into it in a ring-buffer fashion.
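To make the ring-buffer idea concrete, here is a minimal sketch of how the number of result slots could be derived from the memory budget, and how an iteration index wraps around once every slot has been written. The function names, the 80 GiB capacity, and the assumption that the whole 95% budget holds only output matrices are illustrative and not taken from gpu-fryer's source.

```rust
/// Matrix dimension stated in the README (8192x8192).
const N: u64 = 8192;

/// Number of N x N result matrices that fit in `percent` percent of `total_bytes`.
/// `elem_size` is the size of one element in bytes (4 for FP32, 2 for BF16).
/// Integer arithmetic is used to avoid floating-point rounding at the boundary.
/// Hypothetical helper: assumes the whole budget is used for outputs only.
fn ring_slots(total_bytes: u64, percent: u64, elem_size: u64) -> u64 {
    let budget = total_bytes * percent / 100;
    let matrix_bytes = N * N * elem_size;
    budget / matrix_bytes
}

/// Slot that iteration `i` writes into. Once all slots have been used,
/// the index wraps and the oldest result is overwritten.
fn slot_for_iteration(i: u64, slots: u64) -> u64 {
    i % slots
}
```

Under these assumptions, `ring_slots(80 * 1024 * 1024 * 1024, 95, 4)` gives the slot count for FP32 outputs on an 80 GiB device, and halving `elem_size` to 2 doubles it.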
+If the GPU is BF16 capable, it uses BF16 precision instead of FP32 to stress the Tensor Cores.
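For readers unfamiliar with the format, BF16 keeps the sign bit and 8-bit exponent of FP32 but only 7 mantissa bits, so a BF16 value is the upper 16 bits of an FP32 value. The sketch below shows that relationship on the CPU with round-to-nearest-even. It is only an illustration of the number format: the helper names are hypothetical, NaN inputs are not handled, and it says nothing about how gpu-fryer detects BF16 support or prepares its matrices.

```rust
/// Convert an f32 to BF16 bits using round-to-nearest-even.
/// Illustrative helper; NaN inputs are not treated specially.
fn f32_to_bf16_bits(x: f32) -> u16 {
    let bits = x.to_bits();
    // Least significant bit that survives the truncation, used for tie-breaking.
    let lsb = (bits >> 16) & 1;
    let rounded = bits.wrapping_add(0x7FFF + lsb);
    (rounded >> 16) as u16
}

/// Expand BF16 bits back to f32 by zero-filling the dropped mantissa bits.
fn bf16_bits_to_f32(b: u16) -> f32 {
    f32::from_bits((b as u32) << 16)
}
```

Round-tripping `3.14159265_f32` through these helpers yields `3.140625`, which shows the reduced precision, while each element takes 2 bytes instead of 4.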
+
+With an 8x NVIDIA H100 80GB HBM3 system, we get the following results:
+
+
## Acknowledgements
The awesome [GPU Burn](https://github.com/wilicc/gpu-burn), a very similar tool that focuses on detecting computational errors.