Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BLOG] Add libsf_error_state: SciPy's first shared library #892

Open
wants to merge 25 commits into
base: develop
Choose a base branch
from

Conversation

steppi
Copy link

@steppi steppi commented Dec 17, 2024

Sorry for accidentally targeting main with my first PR. Copied from before:

This is nearly done so I figured it was time to make a PR here. Irwin Zaid, who was involved in the work described has read this draft and given a thumbs up (after making a couple small suggestions).

I still need to add footnotes and write the conclusion section at the end, which currently has section header ##Reflections. It looks like I'll also need to go through and confirm the text styling bullet points.

Text styling

  • The blog is written with plain language (where relevant).
  • If there are headers, they use the proper header tags in order to do so (with only one level-one header).
  • All links describe where they link to (for example, check the Quansight labs website).
  • Any kind of styling that the author uses (for example, bold for emphasis) is consistent throughout the blog.

Non-text contents

  • Blog post featured image is in PNG or JPEG format, not SVG.
  • All content is represented as text (for example, images need alt text and videos need captions or descriptive transcripts).
  • If there are emojis, there are not more than three in a row.
  • Don't use flashing gifs or videos.
  • If it were to be read as plain text, the blog still makes sense and no information is missing.

Copy link

vercel bot commented Dec 17, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
labs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jan 11, 2025 7:09pm

Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great write-up Albert! I like the structure of the story.

code, since one need not worry about the thread-safety of wrapped compiled libraries.

By wrapping, filling in gaps, and providing user friendly Python APIs to a wide range of battle tested, permissively
licensed and public domain scientific software tools—SciPy's founding developers [] were able to kickstart the growth of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some spurious brackets here

Copy link
Author

@steppi steppi Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Ralf, those were the places where I meant to go back and add footnotes. They're added now

complexity SciPy tries to hide from users, while previewing some exciting developments in
[`scipy.special`](https://docs.scipy.org/doc/scipy/reference/special.html).

Python has become wildly popular as a scripting language for scientific and other data intensive applications. It owes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd remove "scripting" - it doesn't apply well to how many devs use Python, and is often read as a negative adjective.


By wrapping, filling in gaps, and providing user friendly Python APIs to a wide range of battle tested, permissively
licensed and public domain scientific software tools—SciPy's founding developers [] were able to kickstart the growth of
the Scientific Python ecosystem. There are now worlds of scientific and data intensive software available in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be "data-intensive" with a dash I think?


## NumPy Universal functions

A NumPy [`ndarray`](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html) represents an arbitrary
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This transition is a little abrupt. I'd suggest guiding the reader a bit by putting a sentence above the section header like "Our simple feature here is correct error handling for SciPy's special functions. We'll start with how special functions are implemented with NumPy ufuncs, and from there to how error handling is designed and finally implemented with the help of a shared library." (quick sketch, the actual flow may be a bit different, I haven't read to the end yet).


To create a ufunc for a mathematical function, one needs a scalar implementation of this function, known as a scalar
kernel, that's written in a compiled language. Until recently, `scipy.special` had scalar kernels written in all of C,
C++, Fortran 77, and Cython. In August of 2023, Irwin Zaid (izaid) at Oxford proposed rewriting all of the scalar
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change "izaid" to @izaid with a link to his Github profile?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea

free-threaded Python across the ecosystem for much of this past year.

In a curious reversal, it is now (the `win32` flavor of) MinGW that is causing trouble due to lack of proper threading
support [].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

empty brackets. I think the sentence should be continued with something like "since MinGW does not support thread local variables".

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was basing it on this from Edgar

It seems that using std::mutex also causes the same MinGW threading issues to surface.

and part of your reply

  • MinGW-w64 comes in two build flavors: posix and win32. The posix builds have threading support, the win32 ones don't*

It seemed like something deeper might be going on, since even the mutex isn't working. I don't really understand the underlying issue though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's a need to explain in detail, just the basic reason (MinGW doesn't have support).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't claim to fully understand it either, MinGW stuff is usually quite arcane and requires digging through lots of old email threads.

Now that the dust has mostly settled, it's valuable to look back and try to judge whether we made the right choice. Over
the entire timeframe, the primary goal for Irwin and I was always to make as much of SciPy special available on GPUs as
possible—with secondary goals of simplifying SciPy special's build process and improving the scalar kernel codebase. The
story of `libsf_error_state` is that of a side quest—a story of needing to fix something we broke in pursuit of bigger
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spaces around brackets here

story of `libsf_error_state` is that of a side quest—a story of needing to fix something we broke in pursuit of bigger
things—and the challenges we faced and brought to others because we underestimated the difficulty of our chosen
solution. Having a single array for managing shared state between the extension modules appeals to a desire to have a
single source of truth [], but looking back, it would have taken much less time to have each extension module carry its
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

empty brackets

[`scipy.special.cython_special`](https://docs.scipy.org/doc/scipy/reference/special.cython_special.html), a module for
calling scalar versions of special function from Cython. To quote:

"To make the necessary modifications to cython_special error handling we need to instead share a single global variable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typeset cython_special as code here?

calling scalar versions of special function from Cython. To quote:

"To make the necessary modifications to cython_special error handling we need to instead share a single global variable
between the two extension modules. On gcc we could do this with -rpath, but coming up with a portable solution seems
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GCC in capitals and -rpath as code?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a verbatim quote, so I didn't want to edit it.

@steppi
Copy link
Author

steppi commented Dec 19, 2024

Thanks @rgommers for all of the suggestions. Me not understanding why the inplace build works but not dev.py seems to be most in need of fixing. I'll look into understanding what's really going on. The empty brackets were places to remind me where to put the footnotes later. I should have mentioned that in my PR message. I've added footnotes now, and a conclusion section, and will go through and implement all of your suggestions.

@steppi
Copy link
Author

steppi commented Dec 19, 2024

@rgommers, I think I have almost everything addressed now. A couple questions though:

I have a bunch of footnotes that are just links. I did this to better follow this advice above "All links describe where they link to (for example, check the Quansight labs website)." This is for cases where I just want to point to a place to get more information. In each case I did this, the url clearly describes where it links to. Is this the right thing to do? I could also just remove the links entirely and trust readers to look things up if needed, but my thought was that explicitly linking is also vouching for the linked content.

There's another issue where I was putting links in places where something should be formatted as code. This seems to not actually work, they end up not really looking like links to me. I think I'll just take those out.

An example:

[`ndarray`](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html)

this looks fine in github

ndarray

but in the labs blog:

image

and the color doesn't change when hovering.

This also doesn't describe what is linked to, but I thought maybe the convention of having all such links point to the official documentation would be clear enough.

@steppi
Copy link
Author

steppi commented Dec 19, 2024

I have a bunch of footnotes that are just links. I did this to better follow this advice above "All links describe where they link to (for example, check the Quansight labs website)."

Looking at some other posts, it seems pretty common just to have links directly in the text associated to a term or phrase without explicitly saying what is linked to, and also to just have link text that is formatted with backticks, so I guess it's fine to go back to that. Having links as footnotes seems like a worse user experience.

@steppi
Copy link
Author

steppi commented Dec 19, 2024

I think all of the feedback so far has been addressed now.

@rgommers
Copy link
Member

Yes, I think direct links are fine - footnotes are a bit more awkward imho.

@rgommers
Copy link
Member

Thanks Albert. I'll try to have another look tomorrow.

@steppi
Copy link
Author

steppi commented Dec 19, 2024

Thanks Ralf!

Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there! A few more comments before the weekend.

## NumPy Universal functions

The simple feature in question is error handling for mathematical functions in `scipy.special` that operate
elementwise over NumPy arrays. Let's look into this more closely.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: looks like this should be above the section header


Here's the ufunc [`np.sqrt`](https://numpy.org/doc/stable/reference/generated/numpy.sqrt.html) in action:

```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add python here for syntax highlighting (also in the next code blocks)

```

For large arrays, the speedup from applying `np.sqrt` over an `ndarray` rather than
[`math.sqrt`](https://docs.python.org/3/library/math.html#math.sqrt) over each element of a list is significant. On my
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: I would not use links for common things like math.sqrt, float, ValueError - that's more distracting than helpful in my opinion. Less well known things like np.errstate do seem fine as links.

We saw three options:

1. Have the Python interface for modifying the error handling state (`special.errstate` and
[`special.seterr`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.special.seterr.html)) update the state
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

special.seterr was already linked higher up, so please remove the link here


Even after this, I was still seeing the same error. I asked Ralf about it, and found what happens is that `pip`
goes through [meson-python](https://github.com/mesonbuild/meson-python) while `dev.py` uses Meson directly.
`meson-python` takes care to set the [`RPATH`](https://en.wikipedia.org/wiki/Rpath) for each extension module.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to rephrase this before merging, since it's a little more subtle and I'd prefer for it to be as correct as possible.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

a Clang compiler driver that aims for compatability with MSVC. Clang-cl is used for Windows builds in conda-forge and we had run into
another platform specific difference.

Fortunately, h-vetinari knew what the problem was. On Linux, Mac OS (and Windows while using MinGW), symbols from shared
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "Mac OS" -> "macOS"


He had a recipe ready to use: defining and using macros which conditionally compiled to the right thing depending on their context.

```C++
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

highlighting doesn't work, so this probably should be cpp

[free-threaded Python team](https://labs.quansight.org/blog/free-threaded-python-rollout) — to ensure
thread safety by declaring the array `sf_error_actions` thread local. This eliminates the data-race by making it so each
thread gets its own separate copy of the array. Edgar and the others on the free-threaded Python team have been doing
great work improving support for free-threaded Python across the ecosystem for much of this past year.[^9]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This footnote 9 is missing

computing tools and the greatest challenge turned out to be getting things to work on each of them. Several times, just
as we thought everything was finally working, another quirk would pop up that needed to be addressed.

### Path troubles
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found one more checking off the checkboxes in the PR description: from here on the headings switch from ## to ### - probably not intentional?

@steppi
Copy link
Author

steppi commented Dec 20, 2024

Thanks @rgommers. I think I've addressed everything except for the inaccuracy regarding why the pip install worked but python dev.py build didn't; much appreciation for your offer to rephrase it.


---

[^1]: The first regular shared library, not including Python extension modules.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also not including the vendored libraries? i.e. Everything in scipy.libs, libscipy_openblas-*.so, libgfortran-*.so.

Side note: I kinda expected that libsf_error_state.so would also be in scipy.libs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out the ambiguity @thomasjpfan. Let me know if the update looks better to you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants