LLVM and SPIRV-LLVM-Translator pulldown (WW15 2026) by iclsrc · Pull Request #21723 · intel/llvm

iclsrc · 2026-04-10T03:01:43Z

LLVM: llvm/llvm-project@7a3b7f1
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@b241000


… for risc-v (#110690) The code generated for calls with FPCC-eligible structs as arguments doesn't consider the bitfield, which results in a store crossing the boundary of the memory allocated using alloca. For example, for the code:

```
struct __attribute__((packed, aligned(1))) S {
  const float f0;
  unsigned f1 : 1;
};
unsigned func(struct S arg) {
  return arg.f1;
}
```

The generated IR is:

```
define dso_local signext i32 @func(
    float [[TMP0:%.*]], i32 [[TMP1:%.*]]) #[[ATTR0:[0-9]+]] {
  [[ENTRY:.*:]]
    [[ARG:%.*]] = alloca [[STRUCT_S:%.*]], align 1
    [[TMP2:%.*]] = getelementptr inbounds nuw { float, i32 }, ptr [[ARG]], i32 0, i32 0
    store float [[TMP0]], ptr [[TMP2]], align 1
    [[TMP3:%.*]] = getelementptr inbounds nuw { float, i32 }, ptr [[ARG]], i32 0, i32 1
    store i32 [[TMP1]], ptr [[TMP3]], align 1
    [[F1:%.*]] = getelementptr inbounds nuw [[STRUCT_S]], ptr [[ARG]], i32 0, i32 1
    [[BF_LOAD:%.*]] = load i8, ptr [[F1]], align 1
    [[BF_CLEAR:%.*]] = and i8 [[BF_LOAD]], 1
    [[BF_CAST:%.*]] = zext i8 [[BF_CLEAR]] to i32
    ret i32 [[BF_CAST]]
```

Here, `store i32 [[TMP1]], ptr [[TMP3]], align 1` crosses the boundary of the allocated memory. After optimizations (EarlyCSEPass), the remaining IR is:

```
define dso_local noundef signext i32 @func(
    float [[TMP0:%.*]], i32 [[TMP1:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
  [[ENTRY:.*:]]
    ret i32 0
```

The patch trims the second member of the struct, taking the bit width into consideration to decide the appropriate integer type, and the test shows the results of this patch. Note that the bug is seen only when the `f` extension is enabled for FPCC eligibility.

Co-authored-by: muhammad.kamran4 <muhammad.kamran@esperantotech.com>
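The out-of-bounds store described in this commit message comes down to size arithmetic. The Python sketch below is illustrative only. It assumes the packed struct occupies 5 bytes (4 for the float plus 1 byte holding the 1-bit field) and that the fix selects the narrowest standard integer type that fits the bit width. The helper names are hypothetical and are not part of Clang.

```python
# Illustrative size arithmetic; all names here are hypothetical.

ALLOCA_SIZE = 5        # assumed size of packed struct S: 4-byte float + 1 byte
SECOND_FIELD_OFFSET = 4  # the bitfield starts right after the float


def store_overruns(alloca_size: int, offset: int, store_size: int) -> bool:
    """Return True if a store of store_size bytes at offset leaves the alloca."""
    return offset + store_size > alloca_size


def narrowest_int_bits(bitfield_width: int) -> int:
    """Pick the smallest standard integer width that holds the bitfield."""
    for bits in (8, 16, 32, 64):
        if bitfield_width <= bits:
            return bits
    raise ValueError("bitfield wider than 64 bits")


# Storing the second member as i32 writes bytes 4..7 of a 5-byte object.
i32_overruns = store_overruns(ALLOCA_SIZE, SECOND_FIELD_OFFSET, 4)

# Storing it as the narrowest type for a 1-bit field stays in bounds.
trimmed_bytes = narrowest_int_bits(1) // 8
trimmed_overruns = store_overruns(ALLOCA_SIZE, SECOND_FIELD_OFFSET, trimmed_bytes)
```

Under these assumptions the i32 store overruns the allocation by three bytes, while a one-byte store does not.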


Add tests targeting assembly printing and miscellaneous CodeGen areas with low coverage:
- asm-printer-cpool.ll: HexagonAsmPrinter exercising constant pool entry emission.
- asm-operand-modifiers.ll: Inline asm operand modifier printing paths (lo/hi/mem).
- target-objfile-sdata.ll, split-double-volatile.ll, reg-info-types.ll: Miscellaneous CodeGen coverage for HexagonTargetObjectFile small data classification, HexagonSplitDouble volatile load handling, and HexagonRegisterInfo register class queries.
- constext-store-imm.ll: HexagonConstExtenders store-immediate optimization paths.

This removes dyn_cast invocations where the argument is already of the target type (including through subtyping). This was created by adding a static assert in dyn_cast and letting an LLM iterate until the code base compiled. I then went through each example and cleaned it up. This does not commit the static assert in dyn_cast, because it would prevent a lot of uses in templated code. To prevent backsliding we should instead add an LLVM aware version of https://clang.llvm.org/extra/clang-tidy/checks/readability/redundant-casting.html (or expand the existing one).


The test appeared to pass, but it was not checking what it was meant to. A WeakVH simply nulls itself after the value it points to is replaced, so a zero value was used because VarIndex became null, and the test checks still looked fine. Only WeakTrackingVH can be updated to the new value. Change the test slightly so that using a zero index gives a wrong result.


… device (#189140) [Driver][HIP] Fix bundled -S emitting bitcode instead of assembly for device

PR #188262 added support for bundling HIP -S output under the new offload driver, but the device backend still entered the bitcode-emitting path in ConstructPhaseAction. The condition at the Backend phase checked for the new offload driver and directed device code to emit TY_LLVM_BC, without excluding the -S case. This caused the device section in the bundled .s to contain LLVM bitcode instead of textual AMDGPU assembly.

This broke the HIP UT CheckCodeObjAttr test, which greps copyKernel.s for "uniform_work_group_size", a string that only appears in textual assembly, not in bitcode.

Fix by excluding -S (without -emit-llvm) from the new-driver bitcode path, so the device backend falls through to emit TY_PP_Asm (textual assembly). Also add a missing lit test check that the device backend produces assembler output for the bundled -S case.

Fixes: LCOMPILER-553
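The condition change described in this commit message can be modelled as a small boolean decision. The Python sketch below is a deliberately simplified model and not the driver's actual code: it assumes the new offload driver is in use, covers only the two output types named in the message, and uses hypothetical function and parameter names.

```python
# Simplified model of the Backend-phase choice for device code under the
# new offload driver. Function and parameter names are hypothetical.


def device_output_before_fix(save_asm: bool, emit_llvm: bool) -> str:
    """Before the fix: the new-driver path always chose bitcode."""
    return "TY_LLVM_BC"


def device_output_after_fix(save_asm: bool, emit_llvm: bool) -> str:
    """After the fix: -S without -emit-llvm is excluded from the bitcode path."""
    wants_textual_asm = save_asm and not emit_llvm
    if not wants_textual_asm:
        return "TY_LLVM_BC"
    # Falls through to textual assembly, as the commit message describes.
    return "TY_PP_Asm"


# The bundled -S case that broke the CheckCodeObjAttr test.
before = device_output_before_fix(save_asm=True, emit_llvm=False)
after = device_output_after_fix(save_asm=True, emit_llvm=False)
```

In this model only the `-S` without `-emit-llvm` combination changes behaviour; every other combination still takes the bitcode path.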


…ardOperands API to BranchOpInterface (#187864) To simplify the output of the reduction-tree pass, this PR introduces `eraseRedundantBlocksInRegion`. For regions containing multiple execution paths, this functionality selects the shortest 'interesting' path. Additionally, this PR adds the `getSuccessorForwardOperands` API to BranchOpInterface. This allows us to extract the ForwardOperands for a specific path chosen from multiple alternatives, enabling the creation of a cf.br operation for the redirected jump.


…ssorForwardOperands API to BranchOpInterface" (#189150) Reverts llvm/llvm-project#187864 because it is causing build bot failures. See https://lab.llvm.org/buildbot/#/builders/138/builds/27662 and https://lab.llvm.org/buildbot/#/builders/169/builds/21376/steps/11/logs/stdio for memory leak issues.

…on index (#188508) When a dynamic index of -1 (the kPoisonIndex sentinel) was folded into the static position of a vector.insert op, foldDenseElementsAttrDestInsertOp would proceed to call calculateInsertPosition, which returned -1. The subsequent iterator arithmetic (allValues.begin() + (-1)) was undefined behaviour, causing an assertion in DenseElementsAttr::get. Fix by bailing out early in foldDenseElementsAttrDestInsertOp when any static position equals kPoisonIndex, consistent with how InsertChainFullyInitialized already guards this case. Fixes #188404 Assisted-by: Claude Code
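The early bail-out described in this commit message can be sketched as follows. This is a Python illustration of the guard, not the MLIR implementation: only the sentinel name `kPoisonIndex` and its value -1 come from the message, while the function names and the row-major offset helper are assumptions. In Python a negative index would silently wrap to the end of the list rather than trigger undefined behaviour, which is a further reason to reject it explicitly.

```python
from typing import List, Optional, Sequence

K_POISON_INDEX = -1  # sentinel value named in the commit message


def linear_offset(position: Sequence[int], shape: Sequence[int]) -> int:
    """Row-major offset of a full-rank position within the given shape."""
    offset = 0
    for index, dim in zip(position, shape):
        offset = offset * dim + index
    return offset


def fold_insert(values: List[int], shape: Sequence[int],
                position: Sequence[int], new_value: int) -> Optional[List[int]]:
    """Fold a scalar insert into constant data, or return None to skip folding."""
    # Bail out before any offset arithmetic if a poison index is present.
    if any(index == K_POISON_INDEX for index in position):
        return None
    folded = list(values)
    folded[linear_offset(position, shape)] = new_value
    return folded


ok = fold_insert([0, 0, 0, 0], [2, 2], [1, 0], 7)
skipped = fold_insert([0, 0, 0, 0], [2, 2], [K_POISON_INDEX, 0], 7)
```

Returning `None` here stands in for declining to fold, so the original operation is left untouched.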

…nt (#189163) When invoking `-test-bytecode-roundtrip=test-dialect-version=X.Y` on a module that contains no test dialect operations, the reader type callback in `runTest0` called `reader.getDialectVersion<test::TestDialect>()` and then immediately asserted that it succeeded. However, if the test dialect was never referenced in the bytecode (because no test dialect types appear in the module), the dialect's version information is not stored in the bytecode, so `getDialectVersion` legitimately returns failure. When the test dialect version is unavailable in the bytecode being read, the module contains no test dialect types, so no "funky"-group overrides are needed and the callback can safely skip by returning `success()`. A regression test is added with a module that has no test dialect ops, exercising the `test-dialect-version=2.0` path that previously crashed. Fixes #128321 Fixes #128325 Assisted-by: Claude Code

… (#188064) This PR adds two new field specifiers (`operand` and `attribute`) and extends the existing one (`result`):
- A `default_factory` parameter is added for `result` and `attribute` to specify a default value via a lambda/function.
- A `kw_only` parameter is added for all three specifiers, to make a field a keyword-only parameter (without giving a default value).

```python
def result(
    *,
    infer_type: bool = False,
    default_factory: Optional[Callable[[], Any]] = None,
    kw_only: bool = False,
) -> Any: ...

def operand(
    *,
    kw_only: bool = False,
) -> Any: ...

def attribute(
    *,
    default_factory: Optional[Callable[[], Any]] = None,
    kw_only: bool = False,
) -> Any: ...
```

Examples of how to use them:

```python
class OperandSpecifierOp(TestFieldSpecifiers.Operation, name="operand_specifier"):
    a: Operand[IntegerType[32]] = operand()
    b: Optional[Operand[IntegerType[32]]] = None
    c: Operand[IntegerType[32]] = operand(kw_only=True)

class ResultSpecifierOp(TestFieldSpecifiers.Operation, name="result_specifier"):
    a: Result[IntegerType[32]] = result()
    b: Result[IntegerType[16]] = result(infer_type=True)
    c: Result[IntegerType] = result(
        default_factory=lambda: IntegerType.get_signless(8)
    )
    d: Sequence[Result[IntegerType]] = result(default_factory=list)
    e: Result[IntegerType[32]] = result(kw_only=True)

class AttributeSpecifierOp(
    TestFieldSpecifiers.Operation, name="attribute_specifier"
):
    a: IntegerAttr = attribute()
    b: IntegerAttr = attribute(
        default_factory=lambda: IntegerAttr.get(IntegerType.get_signless(32), 42)
    )
    c: StringAttr["a"] | StringAttr["b"] = attribute(
        default_factory=lambda: StringAttr.get("a")
    )
    d: IntegerAttr = attribute(kw_only=True)
```

---------

Co-authored-by: Rolf Morel <rolfmorel@gmail.com>


…. (#188721) If a load and a store have different address spaces, we cannot create a runtime check. Instead, always copy the data to an alloca matching the store address space. Fixes llvm/llvm-project#185236. PR: llvm/llvm-project#188721


…e.delinearize_index (#188369) Allow `affine.delinearize_index` and `affine.linearize_index` to operate on `vector<...x index>` types in addition to scalar indices. --------- Signed-off-by: Keshav Vinayak Jha <keshavvinayakjha@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

This implements handling of cleanup scopes in cases where a flag is needed to indicate whether or not the cleanup is active. This happens in cases where a cleanup is no longer required, but it isn't at the top of the cleanup stack so it can't be popped. A temporary variable is used to set the cleanup to an inactive state when it is no longer needed. Assisted-by: Cursor / claude-4.6-opus-high (implementation) Assisted-by: Cursor / gpt-5.3-codex (tests)


After llvm-project commit cf92512 ("[DebugInfo] Add Verifier check for local imports in CU's imports field (#187118)", 2026-03-19), DebugInfo got lost for these tests. Ensure the metadata follows the expected format. Original commit: KhronosGroup/SPIRV-LLVM-Translator@9691713f67ce02c

The tests started to fail with "Unable to meet SPIR-V requirements for this target" after upstream commit llvm/llvm-project@85049fc357ac ("[HLSL][SPIRV] Add support for -g to generate NonSemantic Debug Info (#187051)", 2026-03-25). Original commit: KhronosGroup/SPIRV-LLVM-Translator@40ce6c71d8d5b56


Replace manual save/set/restore of `SPIRVUseTextFormat` with `llvm::SaveAndRestore` to guarantee restoration on all exit paths, including the early return on write error. Fixes Coverity CID 546125. Resolves KhronosGroup/SPIRV-LLVM-Translator#3414 Original commit: KhronosGroup/SPIRV-LLVM-Translator@01ee67ccc9a2c61
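The guarantee this change relies on (restore on every exit path, including an early return) is the same idea as a scope guard. Below is a rough Python analogue using a context manager. It does not use the `llvm::SaveAndRestore` API, and the `Options` class, attribute name, and `write_text` function are hypothetical stand-ins.

```python
from contextlib import contextmanager


@contextmanager
def save_and_restore(obj, attr, new_value):
    """Temporarily set obj.attr, restoring the old value on any exit path."""
    old_value = getattr(obj, attr)
    setattr(obj, attr, new_value)
    try:
        yield
    finally:
        # Runs on normal exit, early return, and exceptions alike.
        setattr(obj, attr, old_value)


class Options:
    """Hypothetical stand-in for a global flag such as SPIRVUseTextFormat."""
    use_text_format = False


def write_text(opts: Options, fail: bool) -> bool:
    with save_and_restore(opts, "use_text_format", True):
        if fail:
            return False  # early return on write error; flag is still restored
        return opts.use_text_format


opts = Options()
failed_result = write_text(opts, fail=True)
flag_after_failure = opts.use_text_format
```

A manual save/set/restore sequence would need a restore before each `return`; the guard removes that bookkeeping.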

Move annotation strings created from UserSemantic decorations to the constant address space. Even though these strings should disappear before instruction selection, we ought to avoid globals in the private addrspace. Also set the source file and auxiliary data arguments to `null` instead of poison/undef, which seems to be more common in LLVM. Original commit: KhronosGroup/SPIRV-LLVM-Translator@8f16307ff9dbe9e

A recent version of SPIRV-Tools found several issues with the test, such as `DebugTypeFunction` having the wrong return type operand and `DebugTypeBasic` missing the flags operand. Original commit: KhronosGroup/SPIRV-LLVM-Translator@bf469923a25d484

) A malformed SPIR-V binary can contain an instruction WordCount below the instruction's minimum, causing wraparound in `resize(WordCount - FixedWC)` and a ~17 GB allocation that can result in `std::bad_alloc` when VA space is limited (32-bit systems, ulimit) or process hang on memory access. Fix by rejecting the malformed input early. AI-assisted: Claude Sonnet 4.6 (commercial SaaS) Original commit: KhronosGroup/SPIRV-LLVM-Translator@5adf335eedd8ba0
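The "~17 GB" figure is consistent with an unsigned 32-bit subtraction wrapping around and the result being used as a count of 4-byte words. The Python sketch below reproduces that arithmetic. The 32-bit width and 4-byte word size are assumptions made for illustration, and the function names are hypothetical rather than taken from the translator.

```python
MASK32 = 0xFFFFFFFF  # assumed: the subtraction wraps at 32 bits
WORD_BYTES = 4       # assumed: each SPIR-V word occupies 4 bytes


def unsigned_sub32(a: int, b: int) -> int:
    """Model C++ unsigned 32-bit subtraction, which wraps instead of going negative."""
    return (a - b) & MASK32


def checked_operand_words(word_count: int, fixed_wc: int) -> int:
    """Reject malformed input early instead of letting the subtraction wrap."""
    if word_count < fixed_wc:
        raise ValueError("instruction WordCount is below the minimum")
    return word_count - fixed_wc


# A WordCount one below the minimum wraps to the largest 32-bit value.
wrapped_words = unsigned_sub32(2, 3)
wrapped_bytes = wrapped_words * WORD_BYTES
```

Under these assumptions `wrapped_bytes` is 17,179,869,180, roughly 17.18 GB, which matches the allocation size the commit message reports.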


github-advanced-security

zizmor found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

Replace deprecated BranchInst::Create calls with UncondBrInst::Create and CondBrInst::Create throughout SYCLNativeCPUUtils. This addresses the LLVM deprecation of the unified BranchInst API in favor of separate unconditional and conditional branch instruction classes. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Fix the test naming to use target_name instead of ARG_ARCH_SUFFIX, matching upstream LLVM commit 90e5a1e which fixed name conflicts when multiple libraries use the same target triple.

Changes:
- Remove unused REMANGLE parameter from cmake_parse_arguments
- Remove unnecessary ARG_ARCH_SUFFIX computation
- Use ${target_name} for unique test names (upstream approach)
- Use ${builtins_file} instead of undefined ${libclc_builtins_lib}
- Use ${LIBCLC_SOURCE_DIR} for WORKING_DIRECTORY (upstream approach)

This ensures native_cpu builds without CMake test naming conflicts while staying aligned with community conventions.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

jsji · 2026-04-13T00:35:54Z

[libclc] Fix native cpu build
[libclc] Align AddLibclc.cmake with upstream LLVM

@wenju-he Please review and follow up for libclc native cpu support after libclc refactoring. Thanks!


wenju-he · 2026-04-13T07:13:33Z

[libclc] Fix native cpu build

Reverted in 54aa435. The missing symbols are implemented in libdevice, e.g.

llvm/libdevice/nativecpu_utils.cpp

Line 46 in c62d1d4

__spirv_ControlBarrier(int32_t Execution, int32_t Memory,

I have skipped native_cpu check-libclc in d93a810. This aligns with sycl branch.

[libclc] Align AddLibclc.cmake with upstream LLVM (4465643)

LGTM. I just added a minor code formatting change in 6e622e5 to align with https://github.com/intel-restricted/applications.compilers.llvm-project/blob/ef83a191161833ae6a631d2a64630a88003e7ac0/libclc/CMakeLists.txt#L597-L601

arsenm and others added 30 commits March 28, 2026 00:16

libclc: Simplify fract implementation (#189080)

15bc5b0

[compiler-rt] Add interceptors for free_[aligned_]sized for asan+hwas…

a5fa4db

…an (#189109)

[Fuchsia] Set LIBCXX_ABI_UNSTABLE instead of LIBCXX_ABI_VERSION (#189…

c4847d2

…123) Use the generic switch rather than encoding the version number it currently corresponds to.

AMDGPU: Skip last corrections and scaling for afn llvm.sqrt.f64 (#183…

9be0cc1

…697) Device libs has a fast sqrt macro implemented this way.

Merge from 'main' to 'sycl-web' (18 commits)

0781c47

CONFLICT (content): Merge conflict in llvm/lib/IR/DiagnosticInfo.cpp

[XeVM] Fix the cache-control metadata string generation. (#187591)

8e59c3a

Previously, it generated extra single quote marks around the outer braces (i.e., `'{'` `6442:\220,1\22` `'}'`). The SPIR-V backend does not expect that; it expects `{6442:\220,1\22}`.

[clang][bytecode] Skip rvalue subobject adjustments for global tempor…

fb09449

…aries (#189044) We only did this for local variables but were missing it for globals.

[clang][bytecode] Add support for objc array- and dictionary literals…

cb8b65e

… (#189058)

[clang][bytecode] Handle strcmp() not pointing to primitive arrays (#…

097abb3

…188917)

[clang-tidy] Fix rvalue-reference-param-not-moved FP on implicit func…

ad91a2f

…tions (#189113) Fixes llvm/llvm-project#187716.

[mlir][vector] Reject alignment attribute on tensor-level gather/scat…

5ae2fe7

…ter (#188924)

[Clang][NFC] Add the list of C++26 papers approved in Kona and Croydon

64d2f70

[Clang][NFC] Trivial relocation is no longer a c++26 feature

16e0658

[libcxx] Update GPU cache files to use the proper loader

15940b1

Summary: These were renamed and the aliases removed, fix running the tests.

[libc][math][c23] Add log2p1f16 C23 math function (#186754)

f0ce26d

Signed-off-by: Shikhar Soni <shikharish05@gmail.com>

[clang-format] Fix breaking enum braces when combined with export (#1…

0ac35ec

…89128) This fixes #186684. Also fix (not) breaking variables declared on the same line as the closing brace, and adapt the Whitesmiths style to that change.

[clang-format] Fix annotation of references in function pointer typed…

3f42ec6

…efs (#188860) Fixes #188695

[DAG] SelectionDAG::isKnownToBeAPowerOfTwo - add ISD::TRUNCATE handli…

9d6b92e

…ng and tests (#184365) Closes #181654

[MLIR][XeGPU] Add distribution patterns for vector step, shape_cast &…

9f3a9ea

… broadcast from sg-to-wi (#185960) This PR adds distribution patterns for vector.step, vector.shape_cast & vector.broadcast in the new sg-to-wi pass

forking-google-bazel-bot bot and others added 18 commits March 30, 2026 13:56

[Bazel] Fixes b6e4d27 (#189473)

19caff4

This fixes b6e4d27. Co-authored-by: Google Bazel Bot <google-bazel-bot@google.com>

[MLIR] [XeGPU] Add distribution patterns for vector transpose, bitcas…

e50f08b

…t & mask ops in sg to wi pass (#187392) This PR adds patterns for the following vector ops in the new sg-to-wi pass:
1. Transpose
2. BitCast
3. CreateMask
4. ConstantMask

[AMDGPU] Drop A and B neg modifier from amdgcn_wmma_bf16_16x16x32_bf1…

5f99854

…6 (#189468) Fixes: LCOMPILER-1673

[clang-tidy] Add AllowLogicalOperatorConversion option to implicit-bo…

76f5c5d

…ol-conversion (#189149) Fixes llvm/llvm-project#176889.

[mlir][amdgpu] implement amdgpu.global_load_async_to_lds for gfx1250 …

ae835de

…(#189279) This patch introduces an amdgpu wrapper for `rocdl.global.load.async.to.lds.bN` intrinsics, which were introduced in gfx1250. Assisted-by: Claude --------- Signed-off-by: Eric Feng <Eric.Feng@amd.com>

Merge commit '7a3b7f142d8ffd7b3e2a9cf0a065e3ff7bf76241' into llvmspir…

78a867a

…v_pulldown

Add round-trip tests through SPIR-V backend for previously failing te…

8b096c3

…sts (#3660) The round trip for the corresponding CHECK-LLVM checks already works for some tests, so they can be enabled. Original commit: KhronosGroup/SPIRV-LLVM-Translator@3f5257681447f4c

Migrate away from BranchInst

dba69fd

Update after llvm-project commit 8e1e371 ("[IR][NFC] Mark BranchInst as deprecated (#187314)", 2026-03-19). Original commit: KhronosGroup/SPIRV-LLVM-Translator@6b5f17f12b4be00

Adjust tests where DCE is removing IR and enable round-trip tests (#3665

dd76f0d

) Original commit: KhronosGroup/SPIRV-LLVM-Translator@34fdf7fcf4e0fd7

Add sadd_with_overflow_i8 support (#3673)

ff06617

As in the title. The problem was exposed during `sanitize_overflow` enablement in the triton compiler: intel/intel-xpu-backend-for-triton#6533 Original commit: KhronosGroup/SPIRV-LLVM-Translator@b2410000b1ff3c9

iclsrc added the disable-lint label (Skip linter check step and proceed with build jobs) Apr 10, 2026

Merge remote-tracking branch 'origin/sycl' into llvmspirv_pulldown

4582a4c

Conflicts: clang/test/lit.site.cfg.py.in libclc/clc/lib/amdgpu/workitem/clc_get_local_id.cl libclc/libspirv/lib/amdgcn-amdhsa/SOURCES

github-advanced-security bot found potential problems Apr 10, 2026


jsji and others added 4 commits April 11, 2026 00:35

[libclc] Fix native cpu build

7a3eee9

Merge remote-tracking branch 'origin/sycl' into llvmspirv_pulldown

f54a54d

wenju-he added 3 commits April 13, 2026 08:03

[libclc][CMake][NFC] Multi-line formatting for libspirv compile flags

6e622e5

Revert "[libclc] Fix native cpu build"

54aa435

This reverts commit 7a3eee9. Missing symbols in native_cpu check-libclc are implemented in libdevice, not libclc: MemoryBarrier/ControlBarrier/BuiltInWorkgroupSize/BuiltInLocalInvocationId

[libspirv] Skip native_cpu in check-libclc-spirv

d93a810

native_cpu is not tested in sycl branch. Skip llvmspirv_pulldown branch as well.

iclsrc wants to merge 3225 commits into sycl from llvmspirv_pulldown


20 participants
