Fix worktree counting on Julia 1.13 #64
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #64 +/- ##
==========================================
- Coverage 99.20% 95.20% -4.01%
==========================================
Files 4 4
Lines 126 125 -1
==========================================
- Hits 125 119 -6
- Misses 1 6 +5
The invalidations check itself seems broken, rather than anything to do with this change.
@oscardssmith you seem like you might be a good person to poke about this.
ChrisRackauckas left a comment
Verified: 1.13+ branch uses Base.Workqueues() which returns the current thread's IntrusiveLinkedListSynchronized{Task}, and Base.Workqueues[tidp1] continues to work via OncePerThread's getindex. CI green on nightly across linux/macos/windows including aarch64.
ThreadingUtilities 0.5.6 (JuliaSIMD/ThreadingUtilities.jl#64) fixes the Julia 1.13+ OncePerThread MethodError in wake_thread! that was causing every pre/nightly job to red-flag part1 and part4.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Unbreak Apple ARM tests that now pass
Several `@test_broken` / `@test_skip` gates on Apple ARM (M-series) no
longer apply with current LoopVectorization and the VectorizationBase
nested-W=1 `_vstore_unroll!` fix.
- `condstore!` masked-store tests in `ifelsemasks.jl` (lines ~626-655)
now produce matching results on Apple ARM — drop the Apple branch and
test unconditionally for both Float32 and Float64.
- `Bernoulli_logitavx`/`Bernoulli_logit_avx` with `Vector{Bool}` and an
`Int` α (`ifelsemasks.jl` line ~736) was `@test_skip`-ed but actually
passes — convert to `@test`.
- Issue #543 W=1 nested VecUnroll store test in `staticsize.jl` was
`@test_skip`-ed for v=1 on Apple ARM; with the VectorizationBase fix
it now passes for all v=1..4, n=2..8.
The remaining ARM-gated breakage in `ifelsemasks.jl` (Bernoulli with a
`BitVector` mask + Float64/Int α at lines ~715-722) and the
`tullio_issue_131` pattern in `shuffleloadstores.jl` are deeper SIMD
issues left as `@test_broken` with TODOs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Unbreak BitVector Bernoulli_logit tests on Apple ARM
With the companion VectorizationBase fix for dynamic-index BitArray
loads with sub-byte alignment, `Bernoulli_logitavx` and
`Bernoulli_logit_avx` now produce correct results for both
`BitVector` and `Vector{Bool}` masks on Apple M-series. The
Apple-aarch64 `@test_skip` / `@test_broken` branches are dropped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Fix unroll-cleanup tail bound for strided loads (tullio_issue_131)
`pointermax_index` builds the limit pointer that the unroll-cleanup
termination check is compared against. The `sub > 0` branch already
applies `incr` (when not statically known) and `stride` (when ≠ 1) to
scale the loop length into a byte/element offset, but the `sub == 0`
branch was pushing the raw `stophint` / `stopsym` straight through. For
any strided load on the unrolled axis (e.g. `arr[2i, ...]`) the cleanup
bound came out `stride×` too small, so the final tail iteration was
skipped whenever `looplen mod (UF*W) != 0`.
On Apple ARM with W=2 for Float64, this dropped the last `out_i`
iteration for every odd `out_i ≥ 3` in the tullio_issue_131 shape grid,
and analogously for Float32 with W=4. The cleanup never ran for the
1–3 trailing elements, leaving them at whatever the output array was
initialized to. Confirmed correct after fix for all
`(M, N) ∈ 4:24 × 2:8` on the tullio reproducer; `test/shuffleloadstores.jl`
goes from 4255 pass / 686 broken to 4941 pass / 0 broken on Apple M-series.
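To illustrate why an unscaled limit can skip the tail, here is a toy model of the termination check. It is not LoopVectorization's generated code: `tail_runs`, its arguments, and the element-offset bookkeeping are hypothetical, and the model assumes the cleanup runs only while the offset reached by the main loop is still below the limit.

```julia
# Toy model (hypothetical helper, not part of LoopVectorization).
# looplen: number of loop iterations
# stride:  element stride of the load on the unrolled axis (e.g. 2 for arr[2i])
# chunk:   iterations consumed per main-loop trip (UF * W)
function tail_runs(looplen, stride, chunk; scale_by_stride::Bool)
    main_iters = (looplen ÷ chunk) * chunk   # iterations covered by the main loop
    offset = main_iters * stride             # element offset reached after the main loop
    # The limit either ignores the stride (old `sub == 0` behavior, as described
    # above) or scales the loop length by it (the fix).
    limit = scale_by_stride ? looplen * stride : looplen
    return offset < limit                    # cleanup executes only below the limit
end

tail_runs(5, 2, 4; scale_by_stride = false)  # false: offset 8 is not below limit 5
tail_runs(5, 2, 4; scale_by_stride = true)   # true:  offset 8 is below limit 10
tail_runs(5, 1, 4; scale_by_stride = false)  # true:  unit stride is unaffected
```

In this model the fifth iteration is lost only for the strided case with the unscaled limit, which matches the shape of the failure described above.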
Drop the matching `@test_broken` gate and the `tullio_issue_131` comment
in `test/shuffleloadstores.jl`.
Fixes #570.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Loosen condstore == to ≈; re-gate VB-dependent tests until VB#127 release
Two CI regressions on the previous commits:
1. `condstore!` tests in `ifelsemasks.jl` (lines 626-637) use `==` to
compare a SIMD-masked-store result against the scalar reference. On
Apple ARM the two paths can differ by a 1-ULP rounding even though
`@show`-printed values look identical (the original gate predates
that observation). Switch to `≈` — the test still catches anything
meaningful, just not artifacts of operation reordering.
2. The BitVector `Bernoulli_logit{,_}avx` tests in `ifelsemasks.jl`, the
`Vector{Bool}` + Int α variants in the same block, and the W=1
nested-VecUnroll Issue #543 testset in `staticsize.jl` all depend on
the JuliaSIMD/VectorizationBase.jl#127 fixes being available at
runtime. That PR isn't tagged yet, so CI's stock VectorizationBase
doesn't have it and the tests fail. Restore the
`Sys.ARCH === :aarch64 && Sys.isapple()` gate (as `@test_broken` /
`@test_skip`) with a comment pointing at VB#127. Once that release
lands and LV's compat is bumped, the branches can be dropped.
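The difference between `==` and `≈` in point 1 can be seen with plain floating-point values. This is a minimal sketch using only Base Julia; the variable names are illustrative and the values are not taken from the `condstore!` tests.

```julia
# Two results that differ only by rounding.
a = 0.1 + 0.2            # 0.30000000000000004
b = 0.3

exact_ab  = (a == b)     # false: bitwise-different Float64 values
approx_ab = (a ≈ b)      # true: isapprox with its default relative tolerance

# A difference of exactly one ULP behaves the same way.
x = 1.0
y = nextfloat(x)         # the next representable Float64 above 1.0
exact_xy  = (x == y)     # false
approx_xy = (x ≈ y)      # true

# A meaningful discrepancy is still rejected by ≈.
approx_far = (1.0 ≈ 1.001)   # false
```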
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Use @test_skip for BitVector Bernoulli gates (Julia-version dependent)
`@test_broken` errors on "Unexpected Pass", which makes the BitVector
+ Int α Bernoulli test fail in Julia LTS macOS aarch64 CI even though
the test happens to give the correct result there. The underlying bug
(VectorizationBase BitVector load misalignment, fixed in VB#127) is
present in some configurations but not others — Julia 1.10's older
LLVM appears to dodge it for the test inputs in question.
Switch to `@test_skip` so the gate is loose either way: when the
underlying bug bites, the test is skipped; when it doesn't, no error.
After VB#127 is released and LV's compat is bumped, the entire branch
can be dropped.
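A sketch of the gate shape described above, using the `Test` standard library. `kernel` is a hypothetical stand-in for the function under test; the platform condition is the one named in the earlier commit.

```julia
using Test

# Hypothetical stand-in for a kernel that is only wrong on some configurations.
kernel(x) = 2x

@testset "platform-gated check" begin
    if Sys.ARCH === :aarch64 && Sys.isapple()
        # The expression is not evaluated; the result is recorded as Broken
        # whether or not the underlying bug would have triggered.
        @test_skip kernel(3) == 6
    else
        @test kernel(3) == 6
    end
    # By contrast, `@test_broken kernel(3) == 6` would record an Error
    # ("Unexpected Pass") here, because the expression evaluates to true.
end
```

The trade-off is that `@test_skip` no longer signals when the gated test starts passing, so the gate has to be removed by hand once the fix ships.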
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Skip W=1 issue #543 test on all platforms (not just Apple aarch64)
The nested W=1 VecUnroll store path is picked by LoopVectorization on
different (arch, julia version) combinations than originally assumed —
the Julia nightly x86_64 macOS CI also hit it, not just Apple aarch64.
The fix is in JuliaSIMD/VectorizationBase.jl#127 and not yet in a
tagged release, so skip the v == 1 sub-case on every platform until
LV's VectorizationBase compat is bumped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Rerun CI on top of bumped downstream releases
* Rerun CI on top of SLEEFPirates v0.6.46+
* Bump VectorizationBase compat to 0.21.74; drop @test_skip gates
VectorizationBase v0.21.74 ships the two fixes JuliaSIMD/VectorizationBase.jl#127 added:
- `_vstore_unroll!` for the nested W=1 (scalar lane) VecUnroll path,
which `staticsize.jl`'s Issue #543 testset exercises with `v == 1`.
- The dynamic-index BitArray load misalignment fix that
`ifelsemasks.jl`'s `Bernoulli_logitavx`/`Bernoulli_logit_avx` with
`BitVector` masks depends on.
Bump LV's lower bound to `"0.21.74"` and drop the
`@test_skip ... else @test ... end` branches I added while VB#127 was
still in flight:
- `test/ifelsemasks.jl`: Bernoulli BitVector + Int α (4 tests),
Vector{Bool} + Int α (2 tests), BitVector + Float64 α (2 tests).
- `test/staticsize.jl`: the `v == 1` Issue #543 sub-case (7 entries).
Local sweep on Apple M-series with the dev'd v0.21.74:
- `test/ifelsemasks.jl`: 435/435 pass (was 430/5 broken).
- `test/staticsize.jl` Issue #543 testset: 84/84 pass (was 70/77).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Retrigger CI to pick up ThreadingUtilities 0.5.6
ThreadingUtilities 0.5.6 (JuliaSIMD/ThreadingUtilities.jl#64)
fixes the Julia 1.13+ OncePerThread MethodError in wake_thread! that was
causing every pre/nightly job to red-flag part1 and part4.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Remove Invalidations CI workflow
The SnoopCompileCore-based invalidations check has been broken since the
SCPrettyTablesExt FieldError upstream regression and has been red across
all recent PRs. The signal it produced (regressions in method-table
invalidation count) hasn't been actionable for this repo; removing the
workflow rather than keeping a perma-red check.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rather trivial. Fixes #63.
Can confirm that all tests, and Polyester.jl tests, now pass on 1.13-rc1.