BLD: add option to specify numpy header location #61095

Closed
wants to merge 1 commit into from

OldManYellsAtCloud

In some cases the numpy module might not be usable during build-time, especially when cross-compiling. (E.g. when compiling for arm32 on an x86-64 machine, the arm32 module is not usable at build time.)

This makes meson fail, as it isn't able to figure out the location of numpy headers.

To allow an alternative way to find these headers, introduce a meson build option, where the location of the numpy headers can be specified.

In case numpy module cannot be loaded for some reason to query the include folder location, fall back to the value of this meson option.
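The fallback described above could be sketched roughly as follows. This is an illustration in Python only, not the code from this PR: the function name `find_numpy_include` and the `fallback_dir` parameter are hypothetical stand-ins for the proposed Meson option.

```python
import importlib
import os


def find_numpy_include(fallback_dir=None, module_name="numpy"):
    """Return the NumPy include directory, or fallback_dir if NumPy cannot be imported.

    fallback_dir is a hypothetical stand-in for the value of the proposed build option.
    """
    try:
        # Works on a native build, where the installed numpy matches the interpreter.
        np = importlib.import_module(module_name)
        return np.get_include()
    except ImportError as exc:
        # When cross-compiling, the target's numpy usually cannot be imported by the
        # build machine's interpreter, so use the user-supplied location instead.
        if fallback_dir and os.path.isdir(fallback_dir):
            return fallback_dir
        raise RuntimeError(
            "numpy is not importable and no usable fallback include dir was given"
        ) from exc
```

The design choice here is that the explicit option is consulted only after the import fails, so native builds behave exactly as before.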

@@ -4,17 +4,24 @@ incdir_numpy = run_command(
'-c',
'''
import os
import numpy as np

Member

This custom code to get the include path from NumPy was implemented as a workaround prior to Meson adding first class support, which I believe landed in 1.4

Rather than continue to patch this, I'd prefer if we bump our minimum Meson to 1.4 and use Meson's built-in NumPy resolution mechanisms

Member

I'm also wary of pandas itself adding an option for this; you may want to check upstream if Meson supports NumPy for cross compilation, and if not ask for it to be implemented there rather than here

Author

I started to rework it to use meson's numpy support - it does look nicer.

I need a day or two to stare at this a bit more though before committing. Meson supports numpy via pkgconfig and numpy-config. Numpy's pkgconfig file is not usable for cross-compiling. numpy-config looks more promising, but I couldn't make it work yet for cross-compiling - at this time I think it's numpy's issue, but want to touch some grass to think it through, maybe I find something if I stop looking at it for a short time. If it's a numpy issue, then I would contact numpy about this before changing the meson scripts in pandas, otherwise it could make things more difficult than they are. Will be back.

Author

Nope, unfortunately it requires a change in numpy to make it better - a way to get the includes folder without actually importing the numpy module (it's closely related to numpy/numpy#18209 ).
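To illustrate what "getting the includes folder without importing the module" could look like, here is a minimal Python sketch. It is not an existing NumPy API: the function name is hypothetical, and it assumes the headers live in `_core/include` (NumPy 2.x) or `core/include` (NumPy 1.x) inside the package directory. It also only finds packages visible on the running interpreter's `sys.path`, so it does not by itself solve the cross-compiling case.

```python
import importlib.util
import os


def include_dir_without_import(package="numpy"):
    """Locate the package's header directory without executing the package.

    find_spec() on a top-level name only locates the package on sys.path;
    it does not run the package's __init__.py.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    for pkg_dir in spec.submodule_search_locations:
        # Assumed layouts: "_core" for NumPy 2.x, "core" for NumPy 1.x.
        for sub in ("_core", "core"):
            candidate = os.path.join(pkg_dir, sub, "include")
            if os.path.isdir(candidate):
                return candidate
    return None
```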

I think that currently, using meson's numpy resolution support would cause a regression for cross-compiling, it would make things harder. If this PR isn't merged, that's fine - the ultimate goal was to inform you about the root cause of the linked bug, and that this is one (but certainly not the only) possible solution.

Member

@rgommers do you have any thoughts on this issue?

Contributor

But it's the same absolute path as you need to pass in as a build option, like in this PR, right? I was suggesting generating this numpy.pc on the fly. It's maybe a couple lines more code in your build script, but it avoids adding this build option to every package that depends on NumPy - it seems like a win to me.

If that really doesn't work then this PR is okay too (it's what we've done in SciPy too after all).

Author

Hmmm... technically it is possible. Let me do a poc, and see what the Yocto folks say about it. Will be back.

Author

Going around the same idea, eventually I found an even simpler solution, which I think won't find a lot of resistance in Yocto - we actually don't necessarily need the full absolute path for cross-compiling, the path relative to the sysroot is perfectly enough, and that's a static path. That 1-liner sed needs to be done only once - I don't expect this to be seen as controversial at all, will just submit it during the week as a run-of-the-mill patch. Thanks for the hint.

I have pushed the version that uses Meson's own dependency resolution instead of the custom code.

Contributor

Sounds like good news, thanks!

> we actually don't necessarily need the full absolute path for cross-compiling, the relative path to the sysroot is perfectly enough

Out of interest: how does the translation from sysroot path to numpy header location happen?

Author

Yocto has a standard variable that points to the current site-packages dir (without the sysroot path). The change I have just tested is essentially
sed -i "s:\${pcfiledir}:${PYTHON_SITEPACKAGES_DIR}/numpy/_core/lib/pkgconfig:g" [...]/numpy.pc
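For readers less familiar with sed, the same substitution can be sketched in Python. The function name and the sample `.pc` text are illustrative assumptions; the real `numpy.pc` may contain different lines. The idea is only that every literal `${pcfiledir}` is replaced by a fixed path below site-packages.

```python
def rewrite_pcfiledir(pc_text, site_packages_dir):
    """Replace every literal ${pcfiledir} with a static path below site-packages."""
    pkgconfig_dir = site_packages_dir.rstrip("/") + "/numpy/_core/lib/pkgconfig"
    return pc_text.replace("${pcfiledir}", pkgconfig_dir)


# Illustrative .pc content only; the actual file shipped by NumPy may differ.
sample = (
    "prefix=${pcfiledir}/../..\n"
    "includedir=${prefix}/include\n"
    "Cflags: -I${includedir}\n"
)
print(rewrite_pcfiledir(sample, "/usr/lib/python3.12/site-packages"))
```

Other variables such as `${prefix}` are left untouched, so pkg-config still expands them itself.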

@WillAyd
Member

WillAyd commented Mar 16, 2025

I don't think the build failures showing now are related to this change, but rather a longstanding issue where build warnings were being suppressed with our Meson 1.2.1 pin. #60681 should have the more comprehensive fix for that

For now, you can just remove any setup calls with -Csetup-args="--werror"

@OldManYellsAtCloud
Author

Sure. Though there are still at least 1 or 2 that were broken by this - though if the list would get a bit cleaner, that wouldn't be a problem of course.

OldManYellsAtCloud force-pushed the main branch 2 times, most recently from c514975 to 49dac18 on March 16, 2025 20:35
@WillAyd
Member

WillAyd commented Mar 16, 2025

For the pre-commit error I think you just need to run ./scripts/generate_pip_deps_from_conda.py and commit the result.

I thought the pre-commit hook would do this for you, so maybe it was skipped. If that doesn't work, I can help take a closer look

@OldManYellsAtCloud
Author

You are right, that did the trick. But I'm not really sure why the wheel* workflows weren't kicked off

OldManYellsAtCloud force-pushed the main branch 5 times, most recently from 83cf21d to 74e5be0 on March 17, 2025 10:22
@OldManYellsAtCloud
Author

I see the problem - it's about the PKG_CONFIG_PATH env var, which should be visible to every process that builds pandas (or wants to find numpy via meson for any other reason). But GitHub Actions is fighting against me. I will experiment with how to pass env vars back and forth between actions and workflows, but in a test repo, and hopefully will be back.

@rgommers
Contributor

Ah yes, cibuildwheel-within-GHA is odd layering. I think this does what you want:

https://github.com/scipy/scipy/blob/689ebb9d0332b025d604b755febb68fb0d017a40/.github/workflows/wheels.yml#L125

OldManYellsAtCloud force-pushed the main branch 2 times, most recently from 1338a90 to d9fb595 on March 18, 2025 12:07
Instead of querying the include folder's location from NumPy during
building, use Meson's built-in dependency resolution for NumPy.

With this change during build-time Meson first tries to query the
dependency details using numpy-config (which, in turn essentially
uses the same method as the original code this commit replaces),
and in case that fails for some reason, it tries to discover
NumPy resources using pkg-config - which, beside being a good
fail-over mechanism, has the added benefit of somewhat simpler
cross-compiling, as querying the include folder location from the
NumPy module is usable for cross-compiling only in some corner
cases, while pkg-config is a bit more universal.

Signed-off-by: Gyorgy Sarvari <[email protected]>
@OldManYellsAtCloud
Author

I dunno. I tried it, but it doesn't seem to have any effect. It looks like the environment variables are passed correctly, but they disappear into thin air by the time meson is executed. At this time I suspect that I might be running into the same issue as mesonbuild/meson-python#604 - and it would need a clean build folder (but today I'm not brave enough to do rm -rf on random build machines), but that's only a desperate last thought, and I'm just afraid to accept the fact that it is cursed.

If you have any ideas, suggestions, please go for it, I'm all ears. Personally I have to put this to rest on my end for at least a day.

@WillAyd
Member

WillAyd commented Mar 18, 2025

I might have missed a prior conversation but does NumPy definitely distribute the pkg-config file when installed via conda? Checking my local machine in an environment with 1.26.4 there is no directory for <site_packages>/numpy/_core/lib
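A quick way to check a given environment for the file is sketched below in Python. The function name is hypothetical, and the `numpy/_core/lib/pkgconfig/numpy.pc` layout is the one used elsewhere in this thread; other NumPy versions or packagings may place it differently or not ship it at all.

```python
import os
import site


def find_numpy_pc(site_dirs=None):
    """Return the numpy.pc files found under the given site-packages directories."""
    if site_dirs is None:
        # Defaults to the running interpreter's global site-packages directories.
        site_dirs = site.getsitepackages()
    found = []
    for directory in site_dirs:
        pc_file = os.path.join(
            directory, "numpy", "_core", "lib", "pkgconfig", "numpy.pc"
        )
        if os.path.isfile(pc_file):
            found.append(pc_file)
    return found


print(find_numpy_pc())
```

An empty list means that no `numpy.pc` exists at that location for the interpreter running the script.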

@@ -40,11 +40,18 @@ jobs:
with:
fetch-depth: 0

- name: Determine NumPy pkg-config location
run: |
echo "PKG_CONFIG_PATH=$(python -c 'import site; print(site.getsitepackages()[0])')/numpy/_core/lib/pkgconfig" >> $GITHUB_ENV
Member

@lithomas1 lithomas1 Mar 18, 2025

I feel like sprinkling PKG_CONFIG_PATH in every workflow seems a bit hacky.
Is it possible for us to upstream something to meson/numpy to autodetect this better so that we can avoid this?

Member

Can you also add a comment (if this is necessary and not upstreamable) explaining what this environment variable does?

(At a quick glance, it is not entirely obvious to me what the ordering should be - i.e. should this go before or after the conda env is activated, and the significance of the hard-coded path inside numpy, so if you could add an explanation that would help me and the other core team members that aren't as familiar with pkgconfig a lot)

Member

> Is it possible for us to upstream something to meson/numpy to autodetect this better so that we can avoid this?

As far as Meson is concerned providing the pkg_config_path is a standard argument to setup/configure, so I don't think there is anything left that could be done in that library. For NumPy, Ralf would know best, but I am not sure that it would be able to install the pkg-config file to the normal location on a user's disk given that would be outside the typical installation architecture of a Python package

Contributor

> I feel like sprinkling PKG_CONFIG_PATH in every workflow seems a bit hacky

It's only needed for cross-compiling, so it should not be present/needed in other workflows.

With numpy 2.0, which provides numpy-config, this is all you need in meson.build (and nothing in CI jobs):

dependency('numpy')

@OldManYellsAtCloud
Author

When using meson's numpy resolution, there are two ways one can go in my understanding:

  1. Use pkg-config. NumPy uses a non-standard location for the .pc file, I guess mostly because there is no standard location for .pc files provided by Python modules. This needs the PKG_CONFIG_PATH to be set to this custom path, otherwise pkg-config can't find it. I believe that there is always a numpy.pc file, though maybe depending on the version, the location might be different? (thinking this because of your comment) That unfortunately wouldn't make things easier. On the other hand I'm not too familiar with NumPy, I'm just trying to make it cross-compile - if you tell me that this is not right, I will believe it without second guessing.
  2. Use /usr/bin/numpy-config. For this, meson needs to know where this executable/script is - it could be added as a binary to a meson*native file, which could then be passed to meson (at least in some workflows this file seems to exist already, though I didn't check where it is coming from exactly), which would make the pkg-config method superfluous.
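The two lookup routes in the list above can be tried by hand with a small Python sketch. This is not how Meson implements its NumPy dependency; it only mimics the order (numpy-config first, then pkg-config) from the outside. The function name and its parameters are hypothetical, and `numpy-config` is only available with NumPy 2.0 or newer.

```python
import shutil
import subprocess


def numpy_cflags(numpy_config="numpy-config", pkg_config="pkg-config"):
    """Return NumPy's compiler flags via numpy-config or pkg-config, else None."""
    commands = [
        [numpy_config, "--cflags"],         # route 2: the numpy-config script
        [pkg_config, "--cflags", "numpy"],  # route 1: needs PKG_CONFIG_PATH set
    ]
    for cmd in commands:
        if shutil.which(cmd[0]) is None:
            continue  # tool not on PATH, try the next route
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode == 0 and result.stdout.strip():
            return result.stdout.strip()
    return None


print(numpy_cflags())
```

If both routes fail the function returns `None`, which roughly corresponds to the situation where Meson cannot resolve the dependency.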

I do realize that currently it looks like a hack, and in the past days I only wanted to see it work at least once, to have at least a baseline - I suspect that half of the env-declarations are useless, and could be removed. But unfortunately I still have no idea why it doesn't work - and unless some miracle happens and I have some revelation, I think I might have to give this up - unfortunately I am not able to learn the ins and outs of a non-trivial CI setup due to time constraints :(

@rgommers
Contributor

> in an environment with 1.26.4 there is no directory

It's new in 2.0.0

default_options: ['buildtype=release', 'c_std=c11', 'warning_level=2'],
)

fs = import('fs')
py = import('python').find_installation(pure: false)
tempita = files('generate_pxi.py')
versioneer = files('generate_version.py')

numpy_dep = dependency('numpy', method: 'auto')
Contributor

This should use required: false as long as numpy 1.26.x is still supported

@rgommers
Contributor

@OldManYellsAtCloud thank you for the effort, this was helpful to understand the limitations of the .pc file. This PR is largely still in the right direction. Let's see if we can finish it off. @WillAyd do you want to take a stab at it, or do you want me to?

@WillAyd
Member

WillAyd commented Mar 19, 2025

I don't have the time personally to work on this at the moment, but am happy to review any changes that come through. I do think this is close

Labels
Build Library building on various platforms
Development

Successfully merging this pull request may close these issues.

BUG: ValueError: Buffer dtype mismatch, expected 'intp_t' but got 'long long' on ARMv7 32 bit
5 participants