Add a hostcall interface #1140
Codecov Report
@@            Coverage Diff             @@
##           master    #1140      +/-   ##
==========================================
+ Coverage   66.97%   75.94%   +8.97%
==========================================
  Files         118      119       +1
  Lines        7955     7737     -218
==========================================
+ Hits         5328     5876     +548
+ Misses       2627     1861     -766
Hmm, one problem is that the following deadlocks:

# hostcall watcher task/thread
Threads.@spawn begin
    while true
        println(1)
        sleep(1)
    end
end

# the application, possibly getting stuck in a CUDA API call that needs the kernel to finish
while true
    ccall(:sleep, Cuint, (Cuint,), 1)
end

I had expected this when running with a single thread, because the main task isn't preemptible, but even with multiple threads the main task getting stuck apparently blocks the scheduler, keeping the hostcall watcher thread from making progress. That would cause a deadlock. @vchuravy any thoughts? How does AMDGPU.jl solve this?
And for some preliminary time measurements:
So 2.25 µs 'per' hostcall (uncontended, and non-blocking since the call doesn't return anything). That's not great, but it's a start. I also don't want to build on this before I'm sure this won't deadlock applications. And for reference,
Are you sure you are blocking the scheduler, or are you blocking GC? You need at least a safepoint in the loop.
In which loop? The first does a sleep, so that's a yield point. The second loop doesn't need to be a loop; it could just as well be an API call that blocks 'indefinitely'.
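To illustrate the safepoint suggestion in the exchange above: a pure-Julia loop that does no allocation, task switching, or I/O gives the garbage collector no point at which to stop that thread, and `GC.safepoint()` (Julia 1.4 and later) adds one explicitly. The sketch below is only an illustration; `sum_with_safepoints` is a hypothetical helper, not code from this PR. As the reply above points out, it does not help when the thread is stuck inside a blocking foreign call.

```julia
# Hypothetical helper: a non-allocating loop with an explicit GC safepoint.
function sum_with_safepoints(n::Int)
    acc = 0
    for i in 1:n
        acc += i
        GC.safepoint()  # a pending garbage collection may stop this thread here
    end
    return acc
end

println(sum_with_safepoints(1_000))
```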
Seems to deadlock regularly on CI, so I guess this will have to wait unless we have either application threads, or a way to make CUDA's blocking API calls yield.
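One existing mechanism for keeping a blocking C call from stalling other Julia tasks is Base's `@threadcall`, which runs the call on a separate thread pool while the calling task waits cooperatively. The sketch below reuses the `sleep` call from the reproducer and assumes a POSIX libc that provides it; the counter and flag names are made up for illustration. Whether this approach is suitable for CUDA's blocking API calls is not established in this thread, and `@threadcall` requires that the called function never call back into Julia.

```julia
# Count how often a background task runs while the main task is in a blocking C call.
ticks = Threads.Atomic{Int}(0)
stop = Threads.Atomic{Bool}(false)

# Stand-in for the hostcall watcher: bump a counter every 100 ms until told to stop.
watcher = @async begin
    while !stop[]
        Threads.atomic_add!(ticks, 1)
        sleep(0.1)
    end
end

# Same C call as in the reproducer, but executed off the Julia scheduler thread,
# so the calling task waits without blocking other tasks.
@threadcall(:sleep, Cuint, (Cuint,), 1)

stop[] = true
wait(watcher)
println("watcher iterations during the blocking call: ", ticks[])
```

With a plain `ccall` in place of `@threadcall`, the watcher task in this single-threaded setup would not run until the call returned.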
Fixes #440
Initial, simple implementation. I still need to steal ideas from AMDGPU.jl and optimizations from #567, but the initial goal is a simple but correct implementation that we can use for unlikely code paths such as error reporting.
Demo:
Depends on #1110.
Probably requires Base support like JuliaLang/julia#42302
cc @jpsamaroo
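For readers unfamiliar with the idea, the general shape of a hostcall mechanism can be sketched as a shared mailbox that the device writes to and a host-side watcher polls. The following is a CPU-only simulation under that assumption, not the implementation in this PR: `Mailbox`, `watch`, `hostcall!`, `IDLE` and `PENDING` are all hypothetical names, and the "device" is played by an ordinary host task.

```julia
# States of the hypothetical mailbox shared between "device" and host.
const IDLE = 0
const PENDING = 1

struct Mailbox
    state::Threads.Atomic{Int}
    arg::Threads.Atomic{Int}
end
Mailbox() = Mailbox(Threads.Atomic{Int}(IDLE), Threads.Atomic{Int}(0))

# Host side: poll the mailbox and run `handler` for each pending request.
function watch(handler, box::Mailbox, stop::Threads.Atomic{Bool})
    while !stop[]
        if box.state[] == PENDING
            handler(box.arg[])
            box.state[] = IDLE   # mark the request as handled
        end
        yield()                  # let other tasks run between polls
    end
end

# "Device" side (simulated by a host task): post a request, then wait until handled.
function hostcall!(box::Mailbox, arg::Int)
    box.arg[] = arg
    box.state[] = PENDING
    while box.state[] != IDLE
        yield()
    end
end

box = Mailbox()
stop = Threads.Atomic{Bool}(false)
received = Int[]
watcher = @async watch(x -> push!(received, x), box, stop)

hostcall!(box, 42)
hostcall!(box, 7)

stop[] = true
wait(watcher)
println(received)
```

The watcher only makes progress while the scheduler can run it, which is why a main task stuck in a blocking API call, as discussed in the comments, would leave a pending request unanswered.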