
Use Base.Semaphore to control test execution parallelism #119

Open

giordano wants to merge 5 commits into main from mg/semaphore

Conversation

@giordano (Collaborator)

This is a refactoring of the code to enable #77 in a follow-up PR (CC @christiangnrd).

The idea of using a semaphore is mine and the initial implementation is from Claude, but I made a few manual refinements afterwards (some of them folded into the first commit). Overall summary of the changes:

Replace the fixed worker-task-per-slot model with a semaphore-based
approach: one task per test, with a Base.Semaphore(jobs) limiting
concurrency and a Channel-based worker pool for reuse. This decouples
the number of tasks from the parallelism level and simplifies the
control flow (no inner while loop, tests array is immutable).

I'm mostly happy with the result: it's very close to what I had in mind, and the net diff is relatively small (+48/−22), so this shouldn't be too hard to review.
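For readers unfamiliar with the pattern, here is a minimal self-contained sketch of the model described above: one task per test, a `Base.Semaphore` bounding concurrency, and a `Channel` acting as a pool of reusable workers. The names and the integer "worker ids" are illustrative stand-ins, not the actual ParallelTestRunner internals (the real code passes `PTRWorker` objects through the pool).

```julia
# Illustrative sketch only: semaphore-bounded tasks plus a Channel worker pool.
jobs = 2
sem = Base.Semaphore(jobs)
worker_pool = Channel{Union{Nothing, Int}}(jobs)
for _ in 1:jobs
    put!(worker_pool, nothing)  # `nothing` marks a slot whose worker hasn't been spawned yet
end

next_worker_id = Threads.Atomic{Int}(0)  # stand-in for spawning real workers
done = String[]
lk = ReentrantLock()

@sync for test in ["a", "b", "c", "d"]
    Threads.@spawn Base.acquire(sem) do
        w = take!(worker_pool)   # grab a slot; blocks if all `jobs` slots are in use
        if w === nothing         # lazily "spawn" a worker the first time a slot is used
            w = Threads.atomic_add!(next_worker_id, 1) + 1
        end
        lock(lk) do
            push!(done, test)    # stand-in for running `test` on worker `w`
        end
        put!(worker_pool, w)     # recycle the worker for the next test
    end
end

@show sort(done) next_worker_id[]
```

Because the `Channel` is FIFO and there are more tests than slots, both `nothing` slots get converted, so exactly `jobs` workers are ever created no matter how many tests run: the worker count is decoupled from the task count.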

#118 was useful because it detected that the case of no tests to run wasn't handled correctly, so yay for the extra tests.

giordano and others added 4 commits March 25, 2026 18:49
Replace the fixed worker-task-per-slot model with a semaphore-based
approach: one task per test, with a Base.Semaphore(jobs) limiting
concurrency and a Channel-based worker pool for reuse. This decouples
the number of tasks from the parallelism level and simplifies the
control flow (no inner while loop, tests array is immutable).

Co-authored-by: Claude <noreply@anthropic.com>
Made-with: Cursor
@giordano giordano requested review from maleadt and vchuravy March 26, 2026 01:19
Comment on lines +1020 to +1027
for p in workers
tests_to_start = Threads.Atomic{Int}(length(tests))
for test in tests

The core change I was interested in was turning this for loop from an iteration over the workers into an iteration over the tests: the idea for #77 is to have different semaphores for different subsets of tests, and here we would iterate over the different subsets (and their associated semaphores). The whole execution cycle would then require only this change on this single line; the rest would remain as is.

I still need to fully work out the user-facing changes for designating the serial tests for #77. I have some bad ideas in mind, but I want to decouple them from this refactoring, which I believe is still worthwhile in its own right for making the code easier to follow.


giordano commented Mar 26, 2026

Uhm, the test failure

default workers stopped at end: Test Failed at /Users/runner/work/ParallelTestRunner.jl/ParallelTestRunner.jl/test/runtests.jl:472
  Expression: after == before
   Evaluated: 1 == 2
  Stacktrace:
   [1] top-level scope
     @ ~/work/ParallelTestRunner.jl/ParallelTestRunner.jl/test/runtests.jl:10
   [2] macro expansion
     @ ~/hostedtoolcache/julia/nightly/aarch64/share/julia/stdlib/v1.14/Test/src/Test.jl:2243 [inlined]
   [3] macro expansion
     @ ~/work/ParallelTestRunner.jl/ParallelTestRunner.jl/test/runtests.jl:424 [inlined]
   [4] macro expansion
     @ ~/hostedtoolcache/julia/nightly/aarch64/share/julia/stdlib/v1.14/Test/src/Test.jl:2243 [inlined]
   [5] macro expansion
     @ ~/work/ParallelTestRunner.jl/ParallelTestRunner.jl/test/runtests.jl:472 [inlined]
   [6] macro expansion
     @ ~/hostedtoolcache/julia/nightly/aarch64/share/julia/stdlib/v1.14/Test/src/Test.jl:781 [inlined]

is interesting:

  1. it happened only on macOS with Julia nightly, but that's precisely what I use locally, and tests passed for me several times 😅
  2. the fact that there were fewer subprocesses still alive after the tests finished than before is quite surprising: if something had gone wrong with the termination of the subprocesses I'd expect more processes to be still around, not fewer.

The error disappeared after a rerun; I'd say it was a glitch? 😬 Edit: tests on all platforms passed in the following 3 full reruns.


giordano commented Mar 26, 2026

I pushed a change to use a Threads.@spawn + @sync mechanism, which more closely matches what I had in mind for addressing #77.

As a standalone demo of my idea for running the serial tests, followed by the fully concurrent tests, start Julia with 4 threads and run the following code:

@time "outer loop" for (tests, semaphore) in ((1:4, Base.Semaphore(1)), (5:12, Base.Semaphore(4)))
    @info "running batch" tests semaphore
    @time "inner loop - semaphore size $(semaphore.sem_size)" @sync for test in tests
        Threads.@spawn Base.acquire(semaphore) do
            @show test
            sleep(1)
        end
    end
end

You should observe something like

┌ Info: running batch
│   tests = 1:4
└   semaphore = Base.Semaphore(1, 0, Base.GenericCondition(ReentrantLock()))
test = 4
test = 3
test = 2
test = 1
inner loop - semaphore size 1: 4.016979 seconds (15.27 k allocations: 845.766 KiB, 0.73% compilation time)
┌ Info: running batch
│   tests = 5:12
└   semaphore = Base.Semaphore(4, 0, Base.GenericCondition(ReentrantLock()))
test = 5
test = 8
test = 6
test = 12
test = 9
test = 11
test = 10
test = 7
inner loop - semaphore size 4: 2.005876 seconds (163 allocations: 8.047 KiB, 7 lock conflicts)
outer loop: 6.088579 seconds (73.02 k allocations: 3.853 MiB, 7 lock conflicts, 1.54% compilation time)

The first batch of "tests" (1 to 4), associated with the semaphore of size 1, runs serially and takes 4 seconds; the second batch of "tests" (5 to 12), associated with the semaphore of size 4, runs concurrently and takes 2 seconds, for a total of 6 seconds for the overall "test suite".

With this design, we could even have multiple semaphores of intermediate sizes between 1 and njobs, but I'm not planning to do that at the moment.
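Just to illustrate what such intermediate levels would look like (this is not part of the PR, and the batch boundaries below are made up), an extra concurrency tier is simply another `(tests, semaphore)` pair in the outer iteration:

```julia
# Illustrative only: three batches with increasing concurrency limits.
batches = (
    (1:2,  Base.Semaphore(1)),  # strictly serial
    (3:6,  Base.Semaphore(2)),  # at most 2 at a time
    (7:10, Base.Semaphore(4)),  # fully parallel, up to njobs = 4
)

order = Int[]
lk = ReentrantLock()
for (tests, sem) in batches
    @sync for t in tests
        Threads.@spawn Base.acquire(sem) do
            lock(lk) do
                push!(order, t)  # stand-in for running test `t`
            end
        end
    end
end

# Each @sync completes a whole batch before the next one starts, so
# batches finish in order even though tests within a batch may interleave.
@show order
```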

Edit: as a complete demo of my proposal for #77, with the following hacky change

diff --git a/src/ParallelTestRunner.jl b/src/ParallelTestRunner.jl
index 31f77f9..106c428 100644
--- a/src/ParallelTestRunner.jl
+++ b/src/ParallelTestRunner.jl
@@ -826,11 +826,6 @@ function runtests(mod::Module, args::ParsedArgs;
     jobs = clamp(jobs, 1, length(tests))
     println(stdout, "Running $(length(tests)) tests using $jobs parallel jobs. If this is too many concurrent jobs, specify the `--jobs=N` argument to the tests, or set the `JULIA_CPU_THREADS` environment variable.")
     !isnothing(args.verbose) && println(stdout, "Available memory: $(Base.format_bytes(available_memory()))")
-    sem = Base.Semaphore(max(1, jobs))
-    worker_pool = Channel{Union{Nothing, PTRWorker}}(jobs)
-    for _ in 1:jobs
-        put!(worker_pool, nothing)
-    end
 
     t0 = time()
     results = []
@@ -1022,9 +1017,15 @@ function runtests(mod::Module, args::ParsedArgs;
     #
     # execution
     #
+    worker_pool = Channel{Union{Nothing, PTRWorker}}(jobs)
+    for _ in 1:jobs
+        put!(worker_pool, nothing)
+    end
 
     tests_to_start = Threads.Atomic{Int}(length(tests))
-    @sync for test in tests
+    tests_semaphores = ((tests[1:4], Base.Semaphore(1)), (tests[5:end], Base.Semaphore(max(1, jobs))))
+    for (batch, sem) in tests_semaphores
+    @sync for test in batch
         push!(worker_tasks, Threads.@spawn begin
             local p = nothing
             acquired = false
@@ -1123,6 +1124,7 @@ function runtests(mod::Module, args::ParsedArgs;
             end
         end)
     end
+    end
 
     #
     # finalization

run

using ParallelTestRunner
testsuite = Dict(string(l) => :(sleep(1)) for l in 'a':'l');
runtests(ParallelTestRunner, ["--verbose", "--jobs=4"]; testsuite);

you should get

Running 12 tests using 4 parallel jobs. If this is too many concurrent jobs, specify the `--jobs=N` argument to the tests, or set the `JULIA_CPU_THREADS` environment variable.
Available memory: 8.825 GiB
               │   Test   │   Init   │ Compile │ ──────────────── CPU ──────────────── │
Test  (Worker) │ time (s) │ time (s) │   (%)   │ GC (s) │ GC % │ Alloc (MB) │ RSS (MB) │
l          (1) │        started at 2026-03-26T12:06:56.175
l          (1) │     1.06 │     2.09 │    5.80 │   0.00 │  0.0 │       2.01 │   321.14 │
k          (2) │        started at 2026-03-26T12:06:59.336
k          (2) │     1.08 │     1.88 │    7.09 │   0.00 │  0.0 │       2.01 │   319.59 │
i          (3) │        started at 2026-03-26T12:07:02.369
i          (3) │     1.08 │     1.82 │    7.36 │   0.00 │  0.0 │       2.01 │   321.47 │
j          (4) │        started at 2026-03-26T12:07:05.431
j          (4) │     1.08 │     1.82 │    7.10 │   0.00 │  0.0 │       2.01 │   321.00 │
d          (1) │        started at 2026-03-26T12:07:07.880
h          (2) │        started at 2026-03-26T12:07:07.880
e          (3) │        started at 2026-03-26T12:07:07.880
g          (4) │        started at 2026-03-26T12:07:07.881
e          (3) │     1.00 │     0.23 │    0.00 │   0.00 │  0.0 │       0.00 │   322.14 │
c          (3) │        started at 2026-03-26T12:07:09.210
d          (1) │     1.00 │     0.23 │    0.00 │   0.00 │  0.0 │       0.00 │   322.28 │
b          (1) │        started at 2026-03-26T12:07:09.210
h          (2) │     1.00 │     0.23 │    0.00 │   0.00 │  0.0 │       0.00 │   323.28 │
a          (2) │        started at 2026-03-26T12:07:09.210
g          (4) │     1.00 │     0.23 │    0.00 │   0.00 │  0.0 │       0.00 │   322.50 │
f          (4) │        started at 2026-03-26T12:07:09.211
f          (4) │     1.00 │     0.22 │    0.00 │   0.00 │  0.0 │       0.00 │   322.66 │
b          (1) │     1.00 │     0.22 │    0.00 │   0.00 │  0.0 │       0.00 │   322.98 │
a          (2) │     1.00 │     0.23 │    0.00 │   0.00 │  0.0 │       0.00 │   325.30 │
c          (3) │     1.00 │     0.22 │    0.00 │   0.00 │  0.0 │       0.00 │   323.34 │

Test Summary: | Total   Time
  Overall     |     0  15.5s
    l         |     0   1.1s
    k         |     0   1.1s
    i         |     0   1.1s
    j         |     0   1.1s
    e         |     0   1.0s
    d         |     0   1.0s
    h         |     0   1.0s
    g         |     0   1.0s
    f         |     0   1.0s
    b         |     0   1.0s
    a         |     0   1.0s
    c         |     0   1.0s
    SUCCESS

The first batch of 4 tests was run serially; the rest of the tests were run concurrently. This also shows that the workers are still recycled correctly.
