Open
* add a `runtests.py` parallel option following kokkosgh-59
Contributor
Author
|
I'm still studying this a bit. Here is a plot suggesting a potential race condition on compilation vs. opening compiled objects when using
--- a/pykokkos/core/runtime.py
+++ b/pykokkos/core/runtime.py
@@ -1,3 +1,4 @@
+import time
import importlib.util
import sys
from typing import Any, Callable, Dict, Optional, Tuple, Type, Union
@@ -145,6 +146,7 @@ class Runtime:
return sys.modules[module_name]
spec = importlib.util.spec_from_file_location(module_name, module_path)
+ time.sleep(15)
module = importlib.util.module_from_spec(spec)
sys.modules[module_name] = module
spec.loader.exec_module(module)
Contributor
Author
|
A hack like this helps reduce the parallel error rate, but not completely:
--- a/pykokkos/core/runtime.py
+++ b/pykokkos/core/runtime.py
@@ -1,3 +1,5 @@
+import time
+from pathlib import Path
import importlib.util
import sys
from typing import Any, Callable, Dict, Optional, Tuple, Type, Union
@@ -145,6 +147,17 @@ class Runtime:
return sys.modules[module_name]
spec = importlib.util.spec_from_file_location(module_name, module_path)
+ # poll for the compiled file
+ while not Path(spec.origin).exists():
+ # NOTE: what is a reasonable delay time
+ # here? unfortunately, short times appear
+ # to be able to cause a segfault, perhaps
+ # because the shared object is only partially
+ # written when a load attempt is made?
+ time.sleep(0.3)
+ # even if the file exists, it may not have
+ # been fully flushed?
+ time.sleep(1.0)
module = importlib.util.module_from_spec(spec)
sys.modules[module_name] = module
spec.loader.exec_module(module)
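If the partial-write hypothesis in the comment above is right, one common way to avoid that class of race (rather than tuning sleep times) is to have the writer produce the file under a temporary name and then rename it into place, so a reader either sees no file or a complete one. The following is only a minimal sketch of that general technique, not what pykokkos does; `publish_atomically` is a hypothetical helper, and in a real build the compiler would write the temporary file instead of Python.

```python
import os
import tempfile
from pathlib import Path


def publish_atomically(data: bytes, final_path: Path) -> None:
    """Hypothetical helper: write `data` so readers never see a partial file."""
    final_path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the destination directory so that the final
    # rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=str(final_path.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before publishing
        # os.replace swaps the complete file into place in one step
        # (atomic on POSIX when source and target share a filesystem).
        os.replace(tmp_name, str(final_path))
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

With this pattern, an existence check like the polling loop in the diff above would only succeed once the file is complete, assuming every writer goes through the same rename step.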
tylerjereddy added a commit to tylerjereddy/pykokkos that referenced this pull request Aug 22, 2022
* the current `develop` branch appears to not be parallel safe as described at kokkos#60 (comment)
* this branch allows pykokkos to compile/run code both in serial and in parallel by providing genuinely unique identifiers (file paths) to each "compilation unit"; careful though, this will slow down the serial execution time for the testsuite substantially, probably because it removes reuse in favor of safety from a compilation standpoint--there's probably an approach that is both fast and safe, and I'm certainly open to that, but I'd also argue that safe and slow > (parallel) unsafe and fast
* combined with kokkosgh-60, this allows:
  - `OMP_NUM_THREADS=1 python runtests.py -n 10`
  - `123 passed, 9 skipped, 9 xfailed, 16 warnings in 92.10s (0:01:32)`
  - that's more than twice as fast as the serial test run on `develop`
  - `python runtests.py`
  - `123 passed, 9 skipped, 9 xfailed, 16 warnings in 212.40s (0:03:32)`
  - however, this branch slows down the serial test run a lot, to 11.5 minutes!
* of course, you'd probably have to thoroughly test `OMP_NUM_THREADS` values and so on to benchmark the hierarchical parallelism situation to determine scenarios where you'd even want to use "parallel pykokkos," but I certainly think we should try to be "safe" for concurrent usage so that we don't impose a certain model of concurrency on our consumers
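The "genuinely unique identifiers (file paths)" idea in the commit message above can be illustrated roughly as follows. This is a sketch under stated assumptions, not the code from the referenced commit: `unique_build_dir`, its parameters, and the directory naming scheme are all hypothetical. It simply folds the process ID into the output path so that two concurrently running test workers never compile into, or load from, the same location.

```python
import hashlib
import os
from pathlib import Path


def unique_build_dir(base_dir: Path, source_path: str, kernel_name: str) -> Path:
    """Hypothetical helper: a build directory that no other process shares."""
    # A short digest of the source location keeps same-named kernels from
    # different files apart.
    digest = hashlib.sha256(f"{source_path}:{kernel_name}".encode()).hexdigest()[:12]
    # Including the PID means concurrent workers never touch the same files.
    return base_dir / f"{kernel_name}_{digest}_pid{os.getpid()}"
```

The trade-off matches the one described in the commit message: because the path differs per process, nothing compiled by one process is reused by another, which is safe but gives up caching.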
Contributor
Author
|
gh-61 has a more solid explanation I think, and does allow |
Collaborator
|
Is this PR fixed by #93?

* add a `runtests.py` parallel option following "Compiler: call ModuleSetup.get_main_dir() instead of method" (#59)
Unfortunately, a decent number of tests fail with, e.g., `python runtests.py -n 10`, so this is still a WIP. Sample traceback below; I'll try to check a bit later today.