Update performant_matmul.jl #590
Fix the case where N and M are not integer multiples of TILE_DIM
Your PR requires formatting changes to meet the project's style guidelines. Suggested changes:
diff --git a/examples/performant_matmul.jl b/examples/performant_matmul.jl
index 02b25523..be72e76f 100644
--- a/examples/performant_matmul.jl
+++ b/examples/performant_matmul.jl
@@ -37,7 +37,7 @@ const TILE_DIM = 32
else
@inbounds tile1[i, j] = 0.0
end
-
+
if t * TILE_DIM + i <= R && J <= M
@inbounds tile2[i, j] = input2[t * TILE_DIM + i, J]
else
@@ -63,18 +63,18 @@ const TILE_DIM = 32
end
end
-@testset "dims for $N, $R, $M" for (N,R,M) in [rand(500:1000,3) for _ in 1:10]
+@testset "dims for $N, $R, $M" for (N, R, M) in [rand(500:1000, 3) for _ in 1:10]
A = rand!(allocate(backend, Float32, N, R))
B = rand!(allocate(backend, Float32, R, M))
C = KernelAbstractions.zeros(backend, Float32, N, M)
-
+
kern = coalesced_matmul_kernel!(backend, (TILE_DIM, TILE_DIM))
-
+
group_size_x = cld(N, TILE_DIM)
group_size_y = cld(M, TILE_DIM)
-
+
kern(C, A, B, N, R, M, ndrange = (group_size_x * TILE_DIM, group_size_y * TILE_DIM))
KernelAbstractions.synchronize(backend)
-
+
@test isapprox(A * B, C)
end
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #590       +/-   ##
==========================================
- Coverage   47.52%   0.00%   -47.53%
==========================================
  Files          22      21        -1
  Lines        1715    1571      -144
==========================================
- Hits          815       0      -815
- Misses        900    1571      +671
Have you seen JuliaGPU/GPUArrays.jl#590?
I’m sorry for not noticing that you had already found the same bug.
No worries! It made me realize that my workarounds might not be necessary once 0.10 gets released.
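For readers following the thread, here is a minimal CPU-only sketch of the boundary handling this PR is about. It is not the code from `performant_matmul.jl`: the function `tiled_matmul` and the constant `TILE` are hypothetical names used for illustration, and plain `Matrix{Float32}` values stand in for backend arrays. It assumes the same idea as the guarded loads in the diff: the grid is padded with `cld`, out-of-range tile entries are zero-filled, and stores outside `C` are skipped.

```julia
# Hypothetical CPU emulation of the tiled kernel's bounds handling (not the PR's code).
const TILE = 32  # stands in for TILE_DIM

function tiled_matmul(A::Matrix{Float32}, B::Matrix{Float32})
    N, R = size(A)
    R2, M = size(B)
    @assert R == R2
    C = zeros(Float32, N, M)
    tile1 = zeros(Float32, TILE, TILE)
    tile2 = zeros(Float32, TILE, TILE)
    # cld rounds up, so the padded grid covers every row and column of C.
    for gi in 0:cld(N, TILE)-1, gj in 0:cld(M, TILE)-1
        acc = zeros(Float32, TILE, TILE)
        for t in 0:cld(R, TILE)-1
            for j in 1:TILE, i in 1:TILE
                I = gi * TILE + i
                J = gj * TILE + j
                # Out-of-range entries are zero-filled so they add nothing to the sum.
                tile1[i, j] = (I <= N && t * TILE + j <= R) ? A[I, t * TILE + j] : 0.0f0
                tile2[i, j] = (t * TILE + i <= R && J <= M) ? B[t * TILE + i, J] : 0.0f0
            end
            acc .+= tile1 * tile2
        end
        for j in 1:TILE, i in 1:TILE
            I = gi * TILE + i
            J = gj * TILE + j
            # Guard the store: padded positions must not write outside C.
            if I <= N && J <= M
                C[I, J] = acc[i, j]
            end
        end
    end
    return C
end

# Dimensions that are not multiples of TILE, as in the randomized test set.
A = rand(Float32, 70, 45)
B = rand(Float32, 45, 50)
@assert isapprox(tiled_matmul(A, B), A * B)
```

With `N = 70` and `TILE = 32`, `cld(70, 32)` is 3, so the padded range spans 96 rows. The guards keep the 26 extra rows from reading or writing out of bounds.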