Conversation
Force-pushed from 4fe7197 to bf663be
Force-pushed from 6c19acd to bcbab64
Force-pushed from bcbab64 to 23e1a72
@b-pass how hard would it be to work around […]?
Force-pushed from 5b568e8 to 9875955
The easiest thing would be to just exclude the platform from […]. Does CPython's […]? If so, then I can probably make something that uses that…
This might just be a problem with it defaulting to iOS 1; hopefully a new build with the upstream fix will be out soon. What I did for now was allow this to be manually overridden. 13 is what CPython itself is compiled against. If 13+ turns out to be fine, that's okay. If it does require 17+, we can at least document it, and develop a workaround if it isn't too hard.
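Purely as an illustration of what such a workaround might look like, here is a minimal sketch of a compile-time guard, assuming the constraint is `thread_local` availability on older iOS deployment targets. The macro name `EXT_HAS_THREAD_LOCAL` and the 13.0 cutoff are placeholders taken from the deployment-target numbers discussed above, not pybind11's actual policy (this PR in fact removes the old `PYBIND11_CAN_USE_THREAD_LOCAL` macro):

```
// Illustrative sketch only: fall back to pthread TLS when the iOS deployment
// target is assumed to be too old for C++11 thread_local. The macro name and
// the 130000 (13.0) cutoff are placeholders, not pybind11's policy.
#if defined(__APPLE__)
#    include <Availability.h> // defines __IPHONE_OS_VERSION_MIN_REQUIRED on iOS builds
#endif

#if defined(__IPHONE_OS_VERSION_MIN_REQUIRED) && __IPHONE_OS_VERSION_MIN_REQUIRED < 130000
// Deployment target predates the assumed cutoff: use pthread_key_create /
// pthread_getspecific instead of C++11 thread_local.
#    define EXT_HAS_THREAD_LOCAL 0
#else
#    define EXT_HAS_THREAD_LOCAL 1
#endif
```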
iOS tests pass! (Minus the subinterpreter ones, which I disabled when building.)
I think it must handle […]
Force-pushed from eec1d51 to d5ef814
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
Force-pushed from d5ef814 to cb6ef95
It would probably be easier to set up a […]
Signed-off-by: Henry Schreiner <henryschreineriii@gmail.com>
* Use thread_local for loader_life_support to improve performance

  As explained in a new code comment, `loader_life_support` needs to be `thread_local` but does not need to be isolated to a particular interpreter, because any given function call only ever happens on a single interpreter by definition.

  Performance before:

  - M4 Max, using the unmodified pybind/pybind11_benchmark repo:

    ```
    > python -m timeit --setup 'from pybind11_benchmark import collatz' 'collatz(4)'
    5000000 loops, best of 5: 63.8 nsec per loop
    ```

  - Linux server:

    ```
    python -m timeit --setup 'from pybind11_benchmark import collatz' 'collatz(4)'
    (pytorch) 2000000 loops, best of 5: 120 nsec per loop
    ```

  After:

  - M4 Max:

    ```
    python -m timeit --setup 'from pybind11_benchmark import collatz' 'collatz(4)'
    5000000 loops, best of 5: 53.1 nsec per loop
    ```

  - Linux server:

    ```
    > python -m timeit --setup 'from pybind11_benchmark import collatz' 'collatz(4)'
    (pytorch) 2000000 loops, best of 5: 101 nsec per loop
    ```

  A quick profile with perf shows that the pthread_setspecific and pthread_getspecific calls are gone.

  Open questions:

  - How do we determine whether we can safely use `thread_local`? I see concerns about old iOS versions on #5705 (comment) and #5709; is there anything else?
  - Do we have a test that covers "a function called in one interpreter calls a C++ function that causes a function call in another interpreter"? I think it's fine, but can it happen?
  - Are we happy with what we expect to happen when multiple extensions compiled with and without this PR interoperate? I think it's fine -- each dispatch pushes and cleans up its own state -- but a second opinion is certainly welcome.

* Remove PYBIND11_CAN_USE_THREAD_LOCAL

* clarify comment

* Simplify loader_life_support TLS storage

  Replace the `fake_thread_specific_storage` struct with a direct thread-local pointer managed via a function-local static: `static loader_life_support *&tls_current_frame()`. This retains the "stack of frames" behavior via the `parent` link. It also reduces indirection and clarifies intent.

  Note: this form is C++11-compatible; once pybind11 requires C++17, the helper can be simplified to `inline static thread_local loader_life_support *tls_current_frame = nullptr;`.

* loader_life_support: avoid duplicate tls_current_frame() calls

  Replace repeated calls with a single local reference: `auto &frame = tls_current_frame();`. This ensures the thread_local initialization guard is checked only once per constructor/destructor call site, avoids potential clang-tidy complaints, and makes the code more readable. Functional behavior is unchanged.

* Add REMINDER for next version bump in internals.h

---------

Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
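For readers who want the shape of the change without opening the diff, here is a minimal, self-contained sketch of the pattern described in the commit list above: a per-thread stack of frames linked through a `parent` pointer, with the head of the stack stored behind a C++11-compatible function-local static `thread_local`. The class name `life_support_frame` and its members are simplified stand-ins for illustration, not pybind11's actual `loader_life_support` implementation:

```
#include <cassert>

// Simplified stand-in for pybind11's loader_life_support; names and members
// are illustrative, not the real class.
class life_support_frame {
public:
    life_support_frame() {
        auto &frame = tls_current_frame(); // single TLS lookup, as in the commit above
        parent = frame;                    // remember the enclosing frame
        frame = this;                      // push this frame
    }

    ~life_support_frame() {
        auto &frame = tls_current_frame();
        assert(frame == this && "frames must be destroyed in LIFO order");
        frame = parent;                    // pop back to the enclosing frame
    }

    static life_support_frame *current() { return tls_current_frame(); }

private:
    // C++11-compatible accessor: a function-local static thread_local pointer.
    // Once C++17 is required, this could instead become an
    //   inline static thread_local life_support_frame *tls_current_frame = nullptr;
    // data member.
    static life_support_frame *&tls_current_frame() {
        static thread_local life_support_frame *frame = nullptr;
        return frame;
    }

    life_support_frame *parent = nullptr; // link forming the per-thread stack
};

int main() {
    assert(life_support_frame::current() == nullptr);
    {
        life_support_frame outer;
        {
            life_support_frame inner;
            assert(life_support_frame::current() == &inner);
        }
        assert(life_support_frame::current() == &outer);
    }
    assert(life_support_frame::current() == nullptr);
}
```

As the first commit's rationale notes, per-thread (rather than per-interpreter) storage is sufficient because a given call only ever runs on one interpreter, and the `parent` links preserve correct nesting when dispatches recurse on the same thread.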
Description
Builds on #5708.
Suggested changelog entry: