Fix two scenarios of RDataFrame JIT and explicit multithreading #15588

Open
wants to merge 3 commits into
base: master
from

Conversation

vepadulano
Member

Re-opening this PR to check if the clad failures were real.

@vepadulano vepadulano added the in:RDataFrame and clean build labels May 21, 2024
@vepadulano vepadulano requested a review from dpiparo May 21, 2024 16:19
@vepadulano vepadulano self-assigned this May 21, 2024
@vepadulano vepadulano requested a review from martamaja10 as a code owner May 21, 2024 16:19

github-actions bot commented May 21, 2024

Test Results

    14 files      14 suites   2d 19h 1m 0s ⏱️
 2 713 tests  2 712 ✅ 0 💤 1 ❌
35 754 runs  35 747 ✅ 0 💤 7 ❌

For more details on these failures, see this check.

Results for commit 0662195.

♻️ This comment has been updated with latest results.

@vepadulano vepadulano force-pushed the rdf-threadsafe-jit branch 2 times, most recently from afc5aba to 3249cec Compare May 22, 2024 20:10
@vepadulano vepadulano force-pushed the rdf-threadsafe-jit branch from 3249cec to 10ea292 Compare May 23, 2024 17:51
@vepadulano
Member Author

This PR is related to issue #15080.

@vepadulano vepadulano force-pushed the rdf-threadsafe-jit branch from 10ea292 to 7186526 Compare May 29, 2024 12:49
@vepadulano vepadulano changed the title Fix one scenario of RDataFrame JIT and explicit multithreading Fix two scenarios of RDataFrame JIT and explicit multithreading May 29, 2024
@vepadulano
Member Author

@pcanal A second test was added for a different type of scenario (see the commit description). In my local tests, the changes in this PR already cover the second scenario as well.

@pcanal pcanal closed this May 29, 2024
@pcanal pcanal reopened this May 29, 2024
@vepadulano vepadulano closed this May 30, 2024
@vepadulano vepadulano reopened this May 30, 2024
Test a program in which the user explicitly manages multiple threads and each thread runs an RDataFrame with JITting. In particular, this test probes the following situation:

* Threads start building the computation graph; the code to JIT goes into a global accumulating string.
* Thread 1 reaches the `RLoopManager::Jit` method. It checks the global string and finds it is not empty.
* Thread 2 arrives as well, checks the string, and also finds it is not empty.
* Thread 1 now takes a write lock on the string and moves it into the local function stack, thus leaving it empty.
* Thread 2 wants to move the string as well, but just moves an empty string.
* Thread 2 is faster than Thread 1 at passing its (empty) string to the `InterpreterCalc` function.
* The `InterpreterCalc` function receives an empty string and therefore raises an exception, since `gInterpreter->Calc("")` returns a `kDangerous` interpreter error code.

In an explicit multithreading scenario, a thread inside the `RLoopManager::Jit` method might move the global string holding the code to JIT into its own function stack while another thread still believes that string is not empty. The second thread would then call `InterpreterCalc` with an empty string, which eventually leads to an exception. Check whether the string is empty twice: once at the beginning of the function, and again after taking the write lock, so that if another thread has already moved the string, the current thread does not continue with JITting.

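
The double check described above can be sketched in plain C++ without ROOT. This is only an illustration under stated assumptions, not the code in this PR: `gCodeToJit`, `gCodeMutex`, `FakeInterpreterCalc`, `JitWithDoubleCheck` and `RunDoubleCheckDemo` are hypothetical names, and a `std::shared_mutex` stands in for whatever lock ROOT actually takes.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the global state described in the commit message.
std::string gCodeToJit;          // global string accumulating the code to JIT
std::shared_mutex gCodeMutex;    // assumption: a read/write lock guards the string
std::atomic<int> gCalls{0};      // how many times the fake interpreter was called
std::atomic<int> gEmptyCalls{0}; // how many of those calls received an empty string

// Hypothetical replacement for InterpreterCalc: it only records what it receives.
void FakeInterpreterCalc(const std::string &code)
{
   ++gCalls;
   if (code.empty())
      ++gEmptyCalls;
}

void JitWithDoubleCheck()
{
   {
      // First check, at the beginning of the function.
      std::shared_lock<std::shared_mutex> readLock(gCodeMutex);
      if (gCodeToJit.empty())
         return;
   }
   std::string code;
   {
      std::unique_lock<std::shared_mutex> writeLock(gCodeMutex);
      // Second check, after taking the write lock: another thread may have
      // moved the string between our first check and this point.
      if (gCodeToJit.empty())
         return;
      code = std::move(gCodeToJit);
      gCodeToJit.clear(); // a moved-from string is valid but unspecified, so empty it explicitly
   }
   FakeInterpreterCalc(code);
}

// Runs nThreads threads racing on the same global string and returns the
// number of calls that received an empty string.
int RunDoubleCheckDemo(int nThreads)
{
   gCodeToJit = "int x = 42;";
   gCalls = 0;
   gEmptyCalls = 0;
   std::vector<std::thread> threads;
   for (int i = 0; i < nThreads; ++i)
      threads.emplace_back(JitWithDoubleCheck);
   for (auto &t : threads)
      t.join();
   assert(gCalls.load() == 1); // exactly one thread wins the move
   return gEmptyCalls.load();
}
```

Without the second check, a thread that passed the first check before the winner took the write lock would move an already emptied string and forward it to the interpreter, which is the failure mode listed in the bullets above.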
For the following scenario:

* Multiple threads, multiple different RDF computation graphs (one per thread).
* Some threads run a valid, working computation graph, others have some
  problems.
* Two classes of problems are used. The first is when the declaration
  of some code to JIT triggers an error in the interpreter. The second is when
  an exception is thrown at runtime.
* All problems are explicitly caught in a try-catch scope.
* Threads running a working computation graph should not be impacted by the
  threads that are running a flawed computation graph.
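
The shape of this scenario can be sketched without ROOT as follows. Everything here is a hypothetical stand-in: `RunGraph` replaces a per-thread RDataFrame computation graph, its two `throw` statements mimic the two classes of problems, and `RunIsolatedGraphs` and `CheckIsolation` are illustrative names. The sketch shows only the threading and try-catch structure, not the JIT behaviour that the real test exercises.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <thread>
#include <vector>

// Hypothetical stand-in for one computation graph per thread.
// Graphs with id % 3 == 0 work; the others fail in one of two ways.
int RunGraph(int id)
{
   if (id % 3 == 1)
      throw std::runtime_error("stand-in for an interpreter declaration error");
   if (id % 3 == 2)
      throw std::runtime_error("stand-in for an exception at runtime");
   return id * 10;
}

// Runs one graph per thread. Each thread writes only to its own slot, so no
// locking is needed; -1 marks a graph whose failure was caught.
std::vector<int> RunIsolatedGraphs(int nThreads)
{
   std::vector<int> results(static_cast<std::size_t>(nThreads), -1);
   std::vector<std::thread> threads;
   for (int i = 0; i < nThreads; ++i) {
      threads.emplace_back([i, &results] {
         try {
            results[static_cast<std::size_t>(i)] = RunGraph(i);
         } catch (const std::exception &) {
            // The problem is hidden explicitly, as in the scenario above.
         }
      });
   }
   for (auto &t : threads)
      t.join();
   return results;
}

// Working graphs must produce their result regardless of failing neighbours.
void CheckIsolation()
{
   const auto results = RunIsolatedGraphs(6);
   assert(results[0] == 0 && results[3] == 30);
   assert(results[1] == -1 && results[2] == -1);
}
```

In the real test the interesting part is that the failing threads share interpreter state with the working ones; this sketch has no shared state, so it only documents the expected outcome.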
@vepadulano vepadulano force-pushed the rdf-threadsafe-jit branch from 7186526 to 0662195 Compare June 26, 2024 08:45
@vepadulano vepadulano closed this Sep 27, 2024
@vepadulano vepadulano reopened this Sep 27, 2024
@vepadulano vepadulano closed this Oct 15, 2024
@vepadulano vepadulano reopened this Oct 15, 2024
Labels
clean build (Ask CI to do non-incremental build on PR), in:RDataFrame