✅ Test results for PR #1047

Test logs

=== STATUS: Programs completed successfully: main_matrix_transpose, main_matrix_multiply ===

=== main_matrix_transpose stdout (exit code: -11 (segfault after execution)) ===
Found 1 GPUs in 8.72174 sec (CUDA: 0.11048 sec, OpenCL: 0.8089 sec, Vulkan: 7.80229 sec)
Available devices:
Device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using OpenCL API...
Matrix size: rows=H=8192 x cols=W=16384 (512 MB)
______________________________________________________
Evaluating algorithm #1/2: 01 naive transpose (non-coalesced)
Kernels compilation done in 3.46203 seconds
algorithm times (in seconds) - 10 values (min=0.028699 10%=0.0297832 median=0.0298146 90%=3.4951 max=3.4951)
median effective algorithm bandwidth: 33.5407 GB/s
______________________________________________________
Evaluating algorithm #2/2: 02 transpose via local memory (coalesced)
Kernels compilation done in 0.0991758 seconds
algorithm times (in seconds) - 10 values (min=0.00847714 10%=0.00847844 median=0.00849441 90%=0.107754 max=0.107754)
median effective algorithm bandwidth: 117.724 GB/s

=== main_matrix_multiply stdout (exit code: -11 (segfault after execution)) ===
Found 1 GPUs in 0.320729 sec (CUDA: 0.125026 sec, OpenCL: 0.0377321 sec, Vulkan: 0.157902 sec)
Available devices:
Device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using OpenCL API...
C = A x B, matrices size: C (rows=H=2048 x cols=W=4096) = A (rows=H=2048 x cols=K=1024) x B (rows=K=1024 x cols=W=4096)
matrices data size: A - 8 MB, B - 16 MB, C - 16 MB
______________________________________________________
Evaluating algorithm #1/3: CPU with OpenMP
algorithm times (in seconds) - 1 values (min=11.9457 10%=11.9457 median=11.9457 90%=11.9457 max=11.9457)
algorithm GFlops: 1.43746 GFlops
algorithm effective memory bandwidth: 0.00457799 GB/s
______________________________________________________
Evaluating algorithm #2/3: 01 naive
Kernels compilation done in 0.107665 seconds
algorithm times (in seconds) - 10 values (min=0.061319 10%=0.0617409 median=0.0631913 90%=0.172276 max=0.172276)
algorithm GFlops: 271.738 GFlops
algorithm effective memory bandwidth: 0.865428 GB/s
relative differences with CPU: 8388608 values (min=0 10%=0 median=2.21073e-07 90%=1.12363e-06 max=2.77294)
median relative difference with CPU: 2.21073e-07
99% percentile relative difference with CPU: 1.09303e-05
______________________________________________________
Evaluating algorithm #3/3: 02 using local memory
Kernels compilation done in 0.111219 seconds
algorithm times (in seconds) - 10 values (min=0.0288905 10%=0.0315843 median=0.031748 90%=0.136915 max=0.136915)
algorithm GFlops: 540.869 GFlops
algorithm effective memory bandwidth: 1.72255 GB/s
relative differences with CPU: 8388608 values (min=0 10%=0 median=2.21073e-07 90%=1.12363e-06 max=2.77294)
median relative difference with CPU: 2.21073e-07
99% percentile relative difference with CPU: 1.09303e-05
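As a sanity check on the figures above, the sketch below recomputes the throughput numbers from the logged median times. The formulas are inferred from the numbers, not taken from the benchmark source, so treat them as assumptions: the transpose bandwidth seems to count one read plus one write of every 4-byte element and to report the result in binary gigabytes (2^30 bytes), while the GFlops figure seems to count `2*K - 1` operations per output element (K multiplications and K - 1 additions) in decimal giga. The function names are hypothetical.

```python
GIB = 2 ** 30  # assumption: the log's "GB" is a binary gigabyte


def transpose_bandwidth(rows, cols, seconds, elem_bytes=4):
    """Effective bandwidth, assuming each element is read once and written once."""
    total_bytes = 2 * rows * cols * elem_bytes
    return total_bytes / GIB / seconds


def matmul_gflops(h, w, k, seconds):
    """GFlops, assuming K multiplies and K - 1 adds per element of C."""
    return h * w * (2 * k - 1) / 1e9 / seconds


if __name__ == "__main__":
    # Median times are copied from the log; compare the output with the logged values.
    print("naive transpose:", transpose_bandwidth(8192, 16384, 0.0298146))
    print("local-memory transpose:", transpose_bandwidth(8192, 16384, 0.00849441))
    print("CPU matmul:", matmul_gflops(2048, 4096, 1024, 11.9457))
    print("naive matmul:", matmul_gflops(2048, 4096, 1024, 0.0631913))
    print("local-memory matmul:", matmul_gflops(2048, 4096, 1024, 0.031748))
```

Under these assumptions the recomputed values agree with the log to within rounding of the printed medians, which suggests the local-memory kernels are roughly 3.5x (transpose) and 2x (multiply) faster than the naive ones.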
Local output
GitHub CI output