@daedric daedric commented Jul 29, 2025

Hi folks,
We're developing a BLAKE3 extension for DuckDB. While doing so, we realised that the build on macOS was broken in two places:

  • cross-compilation for amd64 on an ARM host was not detected, so the build tried to use NEON extensions
  • CMake was not using the -mfloat-abi compilation option

Let us know whether this is something that you would consider integrating,

Cheers,

@BurningEnlightenment
Collaborator

Hi,
thanks for reaching out! Given that I usually don't touch Apple hardware, I might just need some more information regarding the use case. Are you trying to build fat binaries targeting both amd64 and armv8? AFAICT this would require you to use the generic non-SIMD implementation, because CMake / the Xcode generator have to use the same set of source files for both architectures, right? In this case I believe you'd have to configure and build for amd64 and armv8 separately and merge the artifacts manually using lipo or something like that.
However, if you are just cross-compiling for a single different architecture, you should still set CMAKE_SYSTEM_PROCESSOR to amd64 in your cross-compiling toolchain.cmake. Do you have some official documentation suggesting otherwise? On the same note, it is perfectly fine to override the automatic SIMD detection by setting the cache entry BLAKE3_SIMD_TYPE either on the command line or in your toolchain.cmake.
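For readers following along, the suggestion above might look roughly like the toolchain file below. This is a sketch under assumptions, not a file from the BLAKE3 repository: the `x86_64` value for `CMAKE_OSX_ARCHITECTURES` and the `amd64-asm` value for `BLAKE3_SIMD_TYPE` are my reading of the CMake and BLAKE3 conventions and should be checked against the `CMakeLists.txt` actually in use.

```cmake
# toolchain.cmake: sketch for building amd64 binaries on an Apple Silicon host.
# Assumption: the host toolchain (Apple clang) can already target x86_64.

# Setting CMAKE_SYSTEM_NAME in a toolchain file puts CMake in cross-compiling mode.
set(CMAKE_SYSTEM_NAME Darwin)

# The processor name the project's SIMD detection will see.
set(CMAKE_SYSTEM_PROCESSOR amd64)

# Ask Apple's compiler driver to emit x86_64 code (passed on as -arch x86_64).
set(CMAKE_OSX_ARCHITECTURES "x86_64" CACHE STRING "Target architecture for Apple builds")

# Optional: bypass automatic SIMD detection entirely.
# Assumption: "amd64-asm" is one of the values BLAKE3_SIMD_TYPE accepts.
set(BLAKE3_SIMD_TYPE "amd64-asm" CACHE STRING "SIMD backend override")
```

The file would then be selected at configure time with `-DCMAKE_TOOLCHAIN_FILE=toolchain.cmake`; the same cache entry can instead be passed directly as `-DBLAKE3_SIMD_TYPE=...` on the command line.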

CMake was not using the -mfloat-abi compilation option

I really try to avoid hardcoding compiler flags. From a quick glance this option seems well suited to be set in your toolchain.cmake.
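As a sketch of what that could look like, a toolchain file can seed the initial C flags through `CMAKE_C_FLAGS_INIT`. The value `hard` is only an assumption for illustration; the thread does not say which float ABI the build needs.

```cmake
# toolchain.cmake fragment: sketch only.
# CMAKE_C_FLAGS_INIT seeds CMAKE_C_FLAGS the first time a build tree is configured.
# Assumption: "hard" is a placeholder; use whichever float ABI your target requires.
set(CMAKE_C_FLAGS_INIT "-mfloat-abi=hard")
```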

@daedric
Author

daedric commented Aug 26, 2025

Hey,
Sorry, I only got back from vacation today.

Are you trying to build fat binaries targeting both amd64 and armv8?

Nope. I'm trying to make sure we can build for both amd64 and armv8 from armv8 (M* CPU) hardware.

AFAICT this would require you to use the generic non-SIMD implementation, because CMake / the Xcode generator have to use the same set of source files for both architectures, right?

I'm not well versed in Apple fat binaries, but I would expect a fat binary to be able to use SIMD as well. In any case, that is not the subject of my pull request :)

However, if you are just cross-compiling for a single different architecture, you should still set CMAKE_SYSTEM_PROCESSOR to amd64 in your cross-compiling toolchain.cmake. Do you have some official documentation suggesting otherwise? On the same note, it is perfectly fine to override the automatic SIMD detection by setting the cache entry BLAKE3_SIMD_TYPE either on the command line or in your toolchain.cmake.

Unfortunately, for my use case, I need to integrate BLAKE3 within another CMake project (DuckDB), so I have much less leeway than I would have liked. I'll check whether I can integrate a toolchain.cmake, BUT ...

I really try to avoid hardcoding compiler flags. From a quick glance this option seems well suited to be set in your toolchain.cmake.

...I would still try to make sure that, for relatively standard use cases, CMake can be used without a toolchain.cmake. I'd reserve toolchain.cmake for things that CMake cannot determine itself; here it can determine whether it needs to add the flags, whether it is compiling for amd64 from armv8, etc. A toolchain.cmake sounds like overkill to me, especially given that the same kind of checks are done in build.rs, and one could argue that build.rs and CMake target the same use cases.
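To make the embedding scenario concrete, here is a rough sketch of how a parent project could pick the SIMD backend itself, without a toolchain file, before pulling in BLAKE3. It is not the code from this pull request. The project name, the `third_party/BLAKE3/c` path, and the helper variable `_blake3_simd` are hypothetical, and the `BLAKE3_SIMD_TYPE` values (`amd64-asm`, `neon-intrinsics`, `none`, `auto`) are assumed from the BLAKE3 CMake build and should be verified.

```cmake
# Parent project CMakeLists.txt: sketch only, not the code from this PR.
cmake_minimum_required(VERSION 3.9)
project(blake3_embedding_sketch C)  # hypothetical project name

# Hypothetical location of the vendored BLAKE3 C sources.
set(BLAKE3_C_DIR "${CMAKE_CURRENT_SOURCE_DIR}/third_party/BLAKE3/c")

# On Apple platforms the requested target architecture lives in
# CMAKE_OSX_ARCHITECTURES, which may differ from the host processor.
if(APPLE AND CMAKE_OSX_ARCHITECTURES)
  list(LENGTH CMAKE_OSX_ARCHITECTURES _arch_count)
  if(_arch_count GREATER 1)
    # Fat binary: one source set for all slices, so fall back to portable code.
    set(_blake3_simd "none")
  elseif(CMAKE_OSX_ARCHITECTURES STREQUAL "x86_64")
    set(_blake3_simd "amd64-asm")
  elseif(CMAKE_OSX_ARCHITECTURES STREQUAL "arm64")
    set(_blake3_simd "neon-intrinsics")
  else()
    set(_blake3_simd "auto")
  endif()
  # No FORCE: a value the user passed with -DBLAKE3_SIMD_TYPE=... still wins.
  set(BLAKE3_SIMD_TYPE "${_blake3_simd}" CACHE STRING "BLAKE3 SIMD backend")
endif()

add_subdirectory("${BLAKE3_C_DIR}" blake3)
```

Configuring with `-DCMAKE_OSX_ARCHITECTURES=x86_64` on an Apple Silicon machine would then take the `amd64-asm` branch instead of relying on the host processor name.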
