
Replace hand-maintained CPU tables with cpufeatures library #61292

Open
gbaraldi wants to merge 13 commits into master from gb/cpufeatures


@gbaraldi
Member

Summary

  • Replaces processor_x86.cpp, processor_arm.cpp, processor_fallback.cpp (~5000 lines of hand-maintained CPU/feature tables) with a unified processor_cpufeatures.cpp that uses the cpufeatures library
  • CPU/feature data is extracted from LLVM's TableGen at build time and shipped as standalone C headers — no LLVM runtime dependency for the tables themselves
  • cpuid.jl now queries feature sets from the C library instead of hardcoding them
  • Debug output available via JULIA_DEBUG=cpufeatures

What is cpufeatures?

A standalone library that extracts CPU names, feature sets, and feature dependencies from LLVM's MCSubtargetInfo at build time into generated C headers. The generated headers are committed to the repo, so a normal build only needs a C++17 compiler — no LLVM required. Supports x86_64, aarch64, and riscv64.

Changes

New files:

  • deps/cpufeatures.mk, deps/cpufeatures.version — build system integration
  • src/processor_cpufeatures.cpp — unified processor implementation

Modified files:

  • src/processor.cpp — includes new file instead of arch-specific ones
  • src/processor.h — feature enum → uint32_t typedef (indices from cpufeatures)
  • src/Makefile — updated dependencies, link -ltarget_parsing
  • src/crc32c.c — hardcode HWCAP bit instead of using removed enum
  • base/cpuid.jl — ISA sets from C queries, not hardcoded
  • base/Makefile — features_h.jl generated from cpufeatures headers
  • deps/Makefile — add cpufeatures to DEP_LIBS

Files to delete (follow-up):

  • src/processor_x86.cpp, src/processor_arm.cpp, src/processor_fallback.cpp
  • src/features_x86.h, src/features_aarch32.h, src/features_aarch64.h

Test plan

  • make clean && make -j succeeds (downloads, builds, links cpufeatures)
  • CPU detection: correct name and features on znver4
  • Multiversioned sysimage (generic;haswell;skylake-avx512): correct target selection
  • -C haswell selects haswell target, -C generic selects generic
  • CPU name aliases (skx, corei7, atom, etc.) resolve correctly
  • BinaryPlatforms ISA matching works
  • FMA, sin, sqrt, basic math all work
  • JULIA_DEBUG=cpufeatures shows target selection details

Related

🤖 Generated with Claude Code

gbaraldi and others added 4 commits March 12, 2026 16:48
Replace processor_x86.cpp, processor_arm.cpp, and processor_fallback.cpp
with a unified processor_cpufeatures.cpp that uses the cpufeatures library
(github.com/gbaraldi/cpufeatures) for CPU detection and feature management.

The cpufeatures library extracts CPU/feature tables from LLVM's TableGen
data at build time and provides them as standalone C headers with no LLVM
runtime dependency. This eliminates ~5000 lines of hand-maintained processor
tables and feature definitions.

Key changes:
- New dep: cpufeatures library (downloaded and built as part of deps/)
- processor_cpufeatures.cpp: unified replacement for all arch-specific files
- processor.h: feature enum replaced with uint32_t typedef (indices from cpufeatures)
- cpuid.jl: ISA feature sets queried from C library instead of hardcoded
- base/Makefile: features_h.jl generated from cpufeatures headers
- JULIA_DEBUG=cpufeatures enables debug output for target selection
- Tuning features filtered from sysimage serialization (hw_feature_mask)

Supported architectures: x86_64, aarch64, riscv64 (same as before).

This was written with the assistance of generative AI (Claude).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
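
As a rough illustration of the last bullet in the list above (tuning features filtered from sysimage serialization), masking a feature set can be sketched as follows. Only the name hw_feature_mask comes from the commit message. The word count, the FeatureBits type, and the keep_hw_features helper are assumptions made for this sketch, not the layout used by cpufeatures.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical layout: a feature set is a fixed-size array of 32-bit words,
// one bit per feature index. The real size would come from the cpufeatures tables.
constexpr std::size_t kFeatureWords = 4;
using FeatureBits = std::array<std::uint32_t, kFeatureWords>;

// Keep only the bits that are also set in hw_feature_mask (assumed to mark
// hardware features); bits for tuning-only features are cleared.
FeatureBits keep_hw_features(FeatureBits features, const FeatureBits &hw_feature_mask)
{
    for (std::size_t i = 0; i < kFeatureWords; i++)
        features[i] &= hw_feature_mask[i];
    return features;
}
```

Under this assumption, a feature that is missing from the mask never reaches the serialized target, whether or not it was enabled on the build machine.
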
The cpufeatures library switched from extern "C" tp_* functions to
a namespace tp with C++ return types. Update the four call sites:
get_host_cpu_name, get_host_features, build_feature_string.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use Julia's standard C++ flags (which include -std=c++17) instead of
bare $(CXXFLAGS). The cpufeatures public API uses std::string_view
which requires C++17.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The non-template base case is only called by template overloads used in
the old processor backends. The cpufeatures backend doesn't use these
templates, so clang -Werror,-Wunused-function flags it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The analyzer needs cpufeatures headers installed to compile
processor.cpp, same as the other runtime dependencies.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

#if defined(_CPU_X86_64_) || defined(_CPU_X86_)
// KNL/KNM special case
if (!(t.dis.flags & JL_TARGET_CLONE_ALL)) {
Member Author

I don't think we care about knl

Member

We do not.

gbaraldi and others added 2 commits March 12, 2026 19:19
Compile-time check that TARGET_TABLES_LLVM_VERSION_MAJOR (from the
cpufeatures generated headers) matches LLVM_VERSION_MAJOR (from Julia's
LLVM). Catches version mismatches early instead of at runtime.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
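
The compile-time check described in this commit could look roughly like the sketch below. The two macro names are taken from the commit message. The placeholder values and the tables_match_llvm helper are hypothetical and exist only so the snippet compiles on its own. In the real build the macros would come from the cpufeatures generated headers and from LLVM's own headers.

```cpp
// Placeholder values so the sketch is self-contained (assumption: in the real
// build these are defined by the cpufeatures headers and by LLVM).
#ifndef TARGET_TABLES_LLVM_VERSION_MAJOR
#define TARGET_TABLES_LLVM_VERSION_MAJOR 20
#endif
#ifndef LLVM_VERSION_MAJOR
#define LLVM_VERSION_MAJOR 20
#endif

// Hypothetical helper: true when both major versions agree.
constexpr bool tables_match_llvm(int tables_major, int llvm_major)
{
    return tables_major == llvm_major;
}

// Fails the build, rather than misbehaving at run time, on a mismatch.
static_assert(tables_match_llvm(TARGET_TABLES_LLVM_VERSION_MAJOR, LLVM_VERSION_MAJOR),
              "cpufeatures tables were generated from a different LLVM major version");
```
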
Use the cpufeatures cross_lookup_cpu API so ISAs_by_family has real
feature data for all architectures regardless of host. Previously
non-host arch entries were empty or used synthetic fallbacks.

New Julia-side APIs:
- CPUID._cross_lookup_cpu(arch, name) — look up any CPU on any arch
- CPUID.feature_names(arch, isa) — map ISA feature bits to names

New C exports (jl_cpufeatures_cross_*):
- cross_lookup, cross_nbytes, cross_num_features/cpus
- cross_feature_name, cross_feature_bit, cross_cpu_name

Also:
- Install cross_arch.h in cpufeatures.mk
- static_assert that cpufeatures tables match Julia's LLVM version
- Add cross-arch and feature name tests to binaryplatforms.jl

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Keno
Member

Keno commented Mar 13, 2026

Haven't reviewed in detail, but directionally, this is exactly what I wanted.

@giordano
Member

giordano commented Mar 13, 2026

This makes the aarch64-linux-gnu tests error (more than usual, that is), because it is now literally spamming

'-contextidrel2' is not a recognized feature for this target (ignoring feature)

everywhere. Edit: same for aarch64-darwin, which at least errors more loudly.

@giordano
Member

giordano commented Mar 13, 2026

Also, on an aarch64-linux system with big.LITTLE architecture Cortex-X925 + A725, this PR detects the CPU as the little variant:

$ julia +nightly -E 'Sys.CPU_NAME'
"cortex-x925"
$ julia +pr61292 -E 'Sys.CPU_NAME'
'-contextidrel2' is not a recognized feature for this target (ignoring feature)
'-contextidrel2' is not a recognized feature for this target (ignoring feature)
'-contextidrel2' is not a recognized feature for this target (ignoring feature)
'-contextidrel2' is not a recognized feature for this target (ignoring feature)
'-contextidrel2' is not a recognized feature for this target (ignoring feature)
'-contextidrel2' is not a recognized feature for this target (ignoring feature)
'-contextidrel2' is not a recognized feature for this target (ignoring feature)
"cortex-a725"

@christiangnrd
Contributor

Is there a way to handle aliases? M1, M2, and M3 are aliases of their mobile a1x counterparts.

julia> Base.BinaryPlatforms.CPUID._lookup_cpu("apple-a15")
Base.BinaryPlatforms.CPUID.ISA(Set(UInt32[0x00000002, 0x00000067, 0x00000073, 0x00000077, 0x0000005b, 0x0000002a, 0x000000a1, 0x0000003f, 0x00000057, 0x000000d1  …  0x0000009e, 0x0000000f, 0x000000e7, 0x000000e1, 0x00000047, 0x00000081, 0x000000e8, 0x000000da, 0x00000033, 0x00000009]))

julia> Base.BinaryPlatforms.CPUID._lookup_cpu("apple-m2")
Base.BinaryPlatforms.CPUID.ISA(Set{UInt32}())
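
One way to handle the aliases asked about above is to resolve the name to a canonical CPU before the table lookup. This is a minimal sketch and not the cpufeatures implementation. The table kAliases and the function resolve_alias are hypothetical names. The apple-m2 to apple-a15 pairing follows the example above, and the pairings for apple-m1 and apple-m3 are assumptions.

```cpp
#include <array>
#include <string_view>
#include <utility>

// Illustrative alias table: {alias, canonical name}. Contents are assumptions.
constexpr std::array<std::pair<std::string_view, std::string_view>, 3> kAliases{{
    {"apple-m1", "apple-a14"},
    {"apple-m2", "apple-a15"},
    {"apple-m3", "apple-a16"},
}};

// Hypothetical helper: map an alias to its canonical CPU name.
// Names that are not aliases are returned unchanged.
constexpr std::string_view resolve_alias(std::string_view name)
{
    for (const auto &entry : kAliases)
        if (entry.first == name)
            return entry.second;
    return name;
}
```

With such a step in front of the lookup, a query for "apple-m2" would be answered from the "apple-a15" entry instead of returning an empty ISA.
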

Knights Landing/Mill are discontinued and not worth special-casing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@inkydragon inkydragon added the building Build system, or building Julia or its dependencies label Mar 13, 2026
gbaraldi and others added 2 commits March 13, 2026 14:08
…iases)

Bump version to 0.2.0 and pin to specific commit to bust CI cache.
Previous builds cached the old tarball from refs/heads/main.

Fixes:
- CONTEXTIDREL2 warnings on aarch64
- big.LITTLE detecting the little core instead of big
- apple-m1/m2/m3 alias resolution

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fixes hw_feature_mask also excluding CONTEXTIDREL2 (the previous pin
only fixed is_hw but not the mask itself).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
gbaraldi and others added 3 commits March 13, 2026 18:10
Includes: hw_feature_mask fix for CONTEXTIDREL2, big.LITTLE detection,
Apple M-series aliases, aarch64 CPU table baseline for host features,
and cross-arch architecture version tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New method feature_names(arch, cpu) for direct CPU name queries:
  feature_names("x86_64", "haswell")
  feature_names("aarch64", "cortex-x925")

Tests cover all CPUID APIs:
- _cross_lookup_cpu: cross-arch, aliases, normalization, negatives
- feature_names: by CPU name, by ISA, host defaults, cross-arch
- _build_bit_to_name: mapping completeness
- Architecture version features (v8.1a, v9a) on ARM cores
- x86 psABI level features (avx2, avx512f)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pin to commit with find_cpu alias resolution (fixes macOS M-series
detection). Copy all include/*.h instead of listing individual files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@christiangnrd
Contributor

My m2 is now properly detected and displayed on macOS and Linux!

@christiangnrd
Contributor

Should the cpufeatures library include a fallback feature like we currently do for those building against an older version of LLVM?

Also, when I opened gbaraldi/cpufeatures#1, I had forgotten that the apple-a18 alias was added in LLVM 21, so if we don't re-add fallback, we'll have to update the library to not report apple-a18 until then.

Also also, should +CONTEXTIDREL2 be included in the JIT target features? Would it be better to display the target features as reported by the JIT after it's initialized?

ChrichriMBP:~$ JULIA_DEBUG=cpufeatures j +pr61292 -C apple-a18
[cpufeatures] sysimg_init_cb: cpu_target='apple-a18'
[cpufeatures]   host CPU: 'apple-m2'
[cpufeatures]   cmdline has 1 target(s)
[cpufeatures] arg_target_data: name='apple-a18' require_host=1
[cpufeatures]   found CPU 'apple-a18' in database
[cpufeatures]   JIT target: name='apple-a18' features=+CONTEXTIDREL2,+aes,+alternate-sextload-cvt-f32-pattern,+altnzcv,+am,+amvs,+arith-bcc-fusion,+arith-cbz-fusion,+bf16,+bti,+ccdp,+ccpp,+complxnum,+crc,+disable-latency-sched-heuristic,+dit,+dotprod,+ecv,+el2vmsa,+el3,+fgt,+flagm,+fp-armv8,+fp16fml,+fpac,+fptoint,+fullfp16,+fuse-address,+fuse-adrp-add,+fuse-aes,+fuse-arith-logic,+fuse-crypto-eor,+fuse-csel,+fuse-literals,+i8mm,+jsconv,+lor,+lse,+lse2,+mpam,+neon,+nv,+pan,+pan-rwv,+pauth,+perfmon,+predres,+ras,+rcpc,+rcpc-immo,+rdm,+sb,+sel2,+sha2,+sha3,+specrestrict,+tlb-rmi,+tracev8.4,+uaops,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a,+vh,+zcm,+zcz,+zcz-gp
[cpufeatures]   sysimg has 2 target(s):
[cpufeatures]     [0] name='generic' flags=0x1 features=+fp-armv8,+neon
[cpufeatures]     [1] name='apple-m1' flags=0x1 features=+aes,+altnzcv,+am,+ccdp,+ccpp,+complxnum,+crc,+dit,+dotprod,+el2vmsa,+el3,+flagm,+fp-armv8,+fp16fml,+fptoint,+fullfp16,+jsconv,+lor,+lse,+lse2,+mpam,+neon,+nv,+pan,+pan-rwv,+pauth,+perfmon,+predres,+ras,+rcpc,+rcpc-immo,+rdm,+sb,+sel2,+sha2,+sha3,+specrestrict,+ssbs,+tlb-rmi,+tracev8.4,+uaops,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8a,+vh
[cpufeatures]   selected target 0 'generic' (vreg_size=16)
'apple-a18' is not a recognized processor for this target (ignoring processor)
'apple-a18' is not a recognized processor for this target (ignoring processor)
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.14.0-DEV.1899 (2026-03-13)
 _/ |\__'_|_|_|\__'_|  |  gb/cpufeatures/d2834a4730c (fork: 13 commits, 2 days)
|__/
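
One plausible reading of the log above is that the 'apple-m1' sysimage target lists +ssbs, which does not appear in the JIT feature list for 'apple-a18', so only 'generic' is compatible. The sketch below shows that kind of subset-based selection. It is an assumption about the general idea, not the selection code in this PR. The Target struct and select_target are hypothetical names, and the real logic also takes flags and vreg_size into account.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of one sysimage target: a name plus its required features.
struct Target {
    std::string name;
    std::set<std::string> features;
};

// Return the index of the compatible target with the most features, or -1 if
// none is compatible. A target is compatible when every feature it requires
// is also available on the JIT/host side.
int select_target(const std::vector<Target> &targets, const std::set<std::string> &available)
{
    int best = -1;
    for (std::size_t i = 0; i < targets.size(); i++) {
        const std::set<std::string> &need = targets[i].features;
        // std::includes needs sorted ranges; std::set iterates in sorted order.
        bool compatible = std::includes(available.begin(), available.end(),
                                        need.begin(), need.end());
        if (compatible &&
            (best < 0 || need.size() > targets[static_cast<std::size_t>(best)].features.size()))
            best = static_cast<int>(i);
    }
    return best;
}
```
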



6 participants