
Task04 Amir Batyrov ITMO #1049

Open
c5xheavy wants to merge 1 commit into GPGPUCourse:task04 from c5xheavy:task04


c5xheavy commented Feb 27, 2026

Local output

$ ./main_prefix_sum 1
Found 3 GPUs in 0.0447905 sec (OpenCL: 0.0246663 sec, Vulkan: 0.0200764 sec)
Available devices:
  Device #0: API: Vulkan. iGPU. Intel(R) Arc(tm) Graphics (MTL). Free memory: 3399/7750 Mb.
  Device #1: API: OpenCL. CPU. Intel(R) Core(TM) Ultra 5 125H. Intel(R) Corporation. Total memory: 15501 Mb.
  Device #2: API: Vulkan. CPU. llvmpipe (LLVM 20.1.2, 256 bits). Free memory: 15501/15501 Mb.
Using device #1: API: OpenCL. CPU. Intel(R) Core(TM) Ultra 5 125H. Intel(R) Corporation. Total memory: 15501 Mb.
Using OpenCL API...
Kernels compilation done in 0.0753197 seconds
Kernels compilation done in 0.0248302 seconds
Kernels compilation done in 0.0234447 seconds
prefix sum times (in seconds) - 10 values (min=0.402294 10%=0.402478 median=0.41817 90%=0.524951 max=0.524951)
prefix sum median effective VRAM bandwidth: 1.78171 GB/s
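The reported bandwidth can be cross-checked against the median time. The sketch below is a sanity check under stated assumptions, not the benchmark's actual code: it assumes the run reads and writes 100,000,000 32-bit values once each (800,000,000 bytes in total) and that the "GB/s" label means 2^30 bytes per second. Neither figure appears in the log; they are inferred because they reproduce the printed numbers. The function and constant names are hypothetical.

```python
# Sanity check of "median effective VRAM bandwidth" against the median time.
# ASSUMPTION (not stated in the log): 100M 32-bit elements, read once and
# written once, and "GB" meaning 2**30 bytes.
GIB = 1024 ** 3
ASSUMED_ELEMENTS = 100_000_000
ASSUMED_BYTES = 2 * 4 * ASSUMED_ELEMENTS  # one read + one write of 4-byte values


def effective_bandwidth(total_bytes: int, seconds: float) -> float:
    """Return bytes moved per second, expressed in units of 2**30 bytes."""
    return total_bytes / GIB / seconds


# Median time taken from the local log above.
local_bw = effective_bandwidth(ASSUMED_BYTES, 0.41817)
print(f"local: {local_bw:.5f}")
```

Under these assumptions the result agrees with the 1.78171 GB/s line in the log to the printed precision, which suggests the metric is simply total bytes moved divided by the median time.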

GitHub CI output


@GPUcourseBOT
Collaborator

Test results for PR #1049

Test logs
=== STATUS: programs completed successfully: main_prefix_sum ===
=== main_prefix_sum stdout (exit code: -11 (segfault after execution)) ===
Found 1 GPUs in 0.333403 sec (CUDA: 0.122386 sec, OpenCL: 0.0376186 sec, Vulkan: 0.173332 sec)
Available devices:
Device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using OpenCL API...
Kernels compilation done in 0.0679301 seconds
Kernels compilation done in 0.0980662 seconds
Kernels compilation done in 0.104359 seconds
prefix sum times (in seconds) - 10 values (min=0.0627238 10%=0.0627245 median=0.06275 90%=0.333298 max=0.333298)
prefix sum median effective VRAM bandwidth: 11.8734 GB/s
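The exit code -11 in the log header is worth decoding. A minimal sketch, assuming the CI harness reports termination by signal the way Python's `subprocess` does (a negative return code equal to minus the signal number); the log itself only says "segfault after execution", so this convention is an assumption, and `describe_exit_code` is a hypothetical helper.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret a subprocess-style return code (negative means killed by a signal)."""
    if code < 0:
        try:
            return f"killed by {signal.Signals(-code).name}"
        except ValueError:
            # Signal number not known on this platform.
            return f"killed by unknown signal {-code}"
    return f"exited with status {code}"


print(describe_exit_code(-11))
```

On Linux, signal 11 is SIGSEGV, which matches the "segfault" note. Since the timing and bandwidth lines are printed before the crash, the fault most likely occurs during teardown rather than in the measured kernels, but the log alone does not confirm where.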


@GPUcourseBOT
Collaborator

Test results for PR #1049

Test logs
=== STATUS: programs completed successfully: main_prefix_sum ===
=== main_prefix_sum stdout (exit code: -11 (segfault after execution)) ===
Found 1 GPUs in 0.307123 sec (CUDA: 0.122269 sec, OpenCL: 0.0375879 sec, Vulkan: 0.147207 sec)
Available devices:
Device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using OpenCL API...
Kernels compilation done in 0.0519588 seconds
Kernels compilation done in 0.0495533 seconds
Kernels compilation done in 0.037794 seconds
prefix sum times (in seconds) - 10 values (min=0.0627217 10%=0.0627227 median=0.0627329 90%=0.202204 max=0.202204)
prefix sum median effective VRAM bandwidth: 11.8767 GB/s


@GPUcourseBOT
Collaborator

Test results for PR #1049

Test logs
=== STATUS: programs completed successfully: main_prefix_sum ===
=== main_prefix_sum stdout (exit code: -11 (segfault after execution)) ===
Found 1 GPUs in 0.305365 sec (CUDA: 0.12256 sec, OpenCL: 0.0391297 sec, Vulkan: 0.143619 sec)
Available devices:
Device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using OpenCL API...
Kernels compilation done in 0.0518519 seconds
Kernels compilation done in 0.0400152 seconds
Kernels compilation done in 0.0450623 seconds
prefix sum times (in seconds) - 10 values (min=0.0627202 10%=0.0627236 median=0.0627445 90%=0.199869 max=0.199869)
prefix sum median effective VRAM bandwidth: 11.8745 GB/s


@GPUcourseBOT
Collaborator

Test results for PR #1049

Test logs
=== STATUS: programs completed successfully: main_prefix_sum ===
=== main_prefix_sum stdout (exit code: -11 (segfault after execution)) ===
Found 1 GPUs in 0.313777 sec (CUDA: 0.121988 sec, OpenCL: 0.037515 sec, Vulkan: 0.154205 sec)
Available devices:
Device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using device #0: API: CUDA+OpenCL+Vulkan. GPU. Tesla T4 (CUDA 12020). Free memory: 14822/14930 Mb.
Using OpenCL API...
Kernels compilation done in 0.0513643 seconds
Kernels compilation done in 0.0502555 seconds
Kernels compilation done in 0.0654138 seconds
prefix sum times (in seconds) - 10 values (min=0.0627095 10%=0.0627123 median=0.062737 90%=0.229958 max=0.229958)
prefix sum median effective VRAM bandwidth: 11.8759 GB/s

