Isoard.upstream sync by isoard-amd · Pull Request #593 · Xilinx/llvm-aie

isoard-amd · 2025-08-04T20:25:40Z

Merge conflicts in:

lld/ELF/Target.h
lld/ELF/Target.cpp

Mainly:
TargetInfo *getAIETargetInfo(Ctx &) → void setAIETargetInfo(Ctx &)

fabs and fneg are similar nodes in that they can always be expanded to integer ops, but currently they diverge when widened. If the widened vector fabs is marked as expand (and the corresponding scalar type is too), LegalizeVectorTypes thinks that it may be turned into a libcall and so will unroll it to avoid the overhead on the undef elements. However unlike the other ops in that list like fsin, fround, flog etc., an fabs marked as expand will never be legalized into a libcall. Like fneg, it can always be expanded into an integer op. This moves it below unrollExpandedOp to bring it in line with fneg, which fixes an issue on RISC-V with f16 fabs being unexpectedly scalarized when there's no zfhmin.
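The claim that fabs and fneg can always be expanded to integer ops can be illustrated outside of LLVM. The following is a minimal sketch, assuming IEEE-754 binary32 `float`; the helper names (`floatBits`, `bitsToFloat`, `fabsViaInt`, `fnegViaInt`) are hypothetical and are not part of the patch.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a float as a 32-bit integer (assumes binary32).
static std::uint32_t floatBits(float F) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  return Bits;
}

// Reinterpret a 32-bit integer as a float.
static float bitsToFloat(std::uint32_t Bits) {
  float F;
  std::memcpy(&F, &Bits, sizeof(F));
  return F;
}

// fabs as an integer op: clear the sign bit with an AND.
float fabsViaInt(float F) { return bitsToFloat(floatBits(F) & 0x7FFFFFFFu); }

// fneg as an integer op: flip the sign bit with an XOR.
float fnegViaInt(float F) { return bitsToFloat(floatBits(F) ^ 0x80000000u); }
```

Both operations are a single bitwise instruction per element, so neither needs a libcall, which is why the two nodes can be treated alike when widened.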

…IMM constant splat."

> As we're after a constant splat value we can avoid all the complexities of trying to recreate the correct constant via getTargetConstantFromNode.

This caused builds to fail with an assertion:

X86ISelLowering.cpp:48569 Assertion `C.getZExtValue() != 0 && C.getZExtValue() != maxUIntN(VT.getScalarSizeInBits()) && "Both cases that could cause potential overflows should have " "already been handled."

See llvm/llvm-project#111325

This reverts commit 1bc87c9.

…#110988) Prior to this patch, the LLVMContext was shared across inputs to llvm-dis. Consequently, NamedStructTypes was shared across inputs, which affects StructType::setName: if a name was reused across inputs, it would get renamed during construction of the struct type, leading to confusion that is tricky to diagnose.
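The effect of sharing one name table across inputs can be sketched with a toy model. This is not the LLVM API: `ToyContext`, `registerStruct`, and the `.N` suffix format are assumptions made only to illustrate "rename on collision".

```cpp
#include <map>
#include <string>

// Toy stand-in for a context that owns a table of named struct types.
class ToyContext {
  std::map<std::string, int> NamedTypes; // name -> times registered so far
public:
  // Registers a type name and returns the name actually assigned.
  // A name that was already registered gets a numeric suffix.
  std::string registerStruct(const std::string &Name) {
    int &Count = NamedTypes[Name];
    std::string Assigned =
        Count == 0 ? Name : Name + "." + std::to_string(Count);
    ++Count;
    return Assigned;
  }
};

// Two inputs that share one context: the second "foo" is renamed.
std::string secondNameShared() {
  ToyContext Shared;
  Shared.registerStruct("foo");        // first input
  return Shared.registerStruct("foo"); // second input
}

// Two inputs with one context each: the second "foo" keeps its name.
std::string secondNameSeparate() {
  ToyContext First, Second;
  First.registerStruct("foo");
  return Second.registerStruct("foo");
}
```

In this model the output for the second input depends on what the first input contained, which is the kind of cross-input interference the patch describes.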

…10267) The main purpose of this patch is to centralize the logic for creating MLIR operation entry blocks and for binding them to the corresponding symbols. This minimizes the chances of mixing arguments up for operations having multiple entry block argument-generating clauses and prevents divergence while binding arguments.

Some changes implemented to this end are:
- Split into two functions the creation of the entry block, and the binding of its arguments and the corresponding Fortran symbol. This enabled a significant simplification of the lowering of composite constructs, where it's no longer necessary to manually ensure the lists of arguments and symbols refer to the same variables in the same order and also match the expected order by the `BlockArgOpenMPOpInterface`.
- Removed redundant and error-prone passing of types and locations from `ClauseProcessor` methods. Instead, these are obtained from the values in the appropriate clause operands structure. This also simplifies argument lists of several lowering functions.
- Access block arguments of already created MLIR operations through the `BlockArgOpenMPOpInterface` instead of directly indexing the argument list of the operation, which is not scalable as more entry block argument-generating clauses are added to an operation.
- Simplified the implementation of `genParallelOp` to no longer need to define different callbacks depending on whether delayed privatization is enabled.

…g avx512f feature (#111337) This test passes as-is on non-X86 hosts only because almost no target implements `isValidFeatureName` (the default implementation unconditionally returns true). RISC-V does implement it and, like X86, checks that the feature name is one supported by the architecture. This means the test produces an additional warning on RISC-V due to `__attribute__((target("avx512f")))`. The simple solution here is to explicitly target x86_64-linux-gnu.

Summary: Make a separate thread to run the server when we launch. This is required by CUDA, which you can force with `export CUDA_LAUNCH_BLOCKING=1`. I figured I might as well be consistent and do it for the AMD implementation as well even though I believe it's not necessary.

Reverts llvm/llvm-project#108939

When `AVX` is available but `-mprefer-vector-width=128` is set, some of the `mov` instructions turn into the x86 `rep;movsb` instruction, leading to poor performance on "old" architectures (sandybridge, haswell). Possible solutions are to get rid of the `-mprefer-vector-width` option or to use smaller static copy sizes in `inline_memcpy_x86_sse2_ge64_sw_prefetching`. Right now a copy size of 3 cache lines (192B) relying exclusively on xmm registers gets turned into `rep;movsb`.

…11001) This is an initial patch to enable constexpr support on the more basic SSE1 intrinsics, such as initialization, arithmetic, logic and fixed shuffles. The plan is to incrementally extend this for SSE2/AVX etc., initially for the equivalent basic intrinsics, but we can add support for some of the ia32 builtins as well as the need arises.

…sses (#87003)"

This caused assertion failures:

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7736: SDValue getMemsetValue(SDValue, EVT, SelectionDAG &, const SDLoc &): Assertion `C->getAPIntValue().getBitWidth() == 8' failed.

See comment on the PR for a reproducer.

> repstosb and repstosd are the same size, but stosd is only done for 0 because the process of multiplying the constant so that it is copied across the bytes of the 32-bit number adds extra instructions that cause the size to increase. For 0, repstosb and repstosd are the same size, but stosd is only done for 0 because the process of multiplying the constant so that it is copied across the bytes of the 32-bit number adds extra instructions that cause the size to increase. For 0, we do not need to do that at all.
>
> For memcpy, the same goes, and as a result the minsize check was moved ahead because a jmp to memcpy encoded takes more bytes than repmovsb.

This reverts commit 6de5305.

Previous llvm/llvm-project#110362 (reverted) caused breakage. Here is the PR with fix. My build cmdline:

```
cmake ../llvm \
  -G Ninja \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_INSTALL_PREFIX=install \
  -DCMAKE_C_COMPILER=gcc-9 \
  -DCMAKE_CXX_COMPILER=g++-9 \
  -DCMAKE_CUDA_COMPILER=$(which nvcc) \
  -DLLVM_ENABLE_LLD=OFF \
  -DLLVM_ENABLE_ASSERTIONS=ON \
  -DLLVM_BUILD_EXAMPLES=ON \
  -DCOMPILER_RT_BUILD_LIBFUZZER=OFF \
  -DLLVM_CCACHE_BUILD=ON \
  -DMLIR_ENABLE_BINDINGS_PYTHON=ON \
  -DBUILD_SHARED_LIBS=ON \
  -DLLVM_ENABLE_PROJECTS='llvm;mlir'
```

zmodem and others added 30 commits October 7, 2024 11:19

[lsan] Make ReportUnsuspendedThreads return bool also for Fuchsia

3137b6a

Follow-up to 9766ce4

update_test_checks: fix a simple regression (#111347)

ae6af37

Reported-by: Yingwei Zheng <dtcxzyw2333@gmail.com> Fixes: 02debce ("update_test_checks: improve IR value name stability (#110940)")

Reland "[lldb][test] TestDataFormatterLibcxxStringSimulator.py: add n…

d148548

…ew padding layout" (#111123) Relands llvm/llvm-project#108375 which had to be reverted because it was failing on the Windows buildbot. Trying to reland this with `msvc::no_unique_address` on Windows.

[x86] combineMul - use computeKnownBits directly to find MUL_IMM cons…

3d862c7

…tant splat. (REAPPLIED) As we're after a constant splat value we can avoid all the complexities of trying to recreate the correct constant via getTargetConstantFromNode.

[lldb][test] Add libcxx-simulators test for std::optional (#111133)

66713a0

Follow-up to the LLDB std::optional data-formatter test failure caused by llvm/llvm-project#110355. Two formats are supported:
1. `__val_` has type `value_type`
2. `__val_`'s type is wrapped in `std::remove_cv_t`

[lldb][test] TestDataFormatterLibcxxOptionalSimulator.py: change orde…

a1c0ba1

…r of ifdefs The current layout *does* have `remove_cv_t`. So change the ifdefs to reflect that.

[x86] combineMul - handle 0/-1 KnownBits cases before MUL_IMM logic (…

db13404

…REAPPLIED) Followup to 3d862c7 fix - always fold multiply to zero/negation
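The two special cases named here (multiply by 0 and multiply by -1) can be sketched on plain 32-bit integers. This is a hedged illustration, not the X86ISelLowering code; `foldTrivialMul` is a hypothetical name, and it requires C++17 for `std::optional`.

```cpp
#include <cstdint>
#include <optional>

// Returns the folded result when the constant multiplier C is 0 or
// all-ones (-1); otherwise returns std::nullopt so that a caller can
// fall through to more general constant-multiply logic.
std::optional<std::uint32_t> foldTrivialMul(std::uint32_t X, std::uint32_t C) {
  if (C == 0)
    return 0u; // X * 0 == 0
  if (C == UINT32_MAX)
    return 0u - X; // X * -1 == -X (mod 2^32)
  return std::nullopt;
}
```

Handling these two constants first means any later logic can assume the multiplier is neither 0 nor all-ones.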

[doc] Fix Kaleidoscope tutorial chapter 3 code snippet and full listi…

8df6637

…ng discrepancies (#111289) Fix two discrepancies between the cited snippets and the full code.

[Driver] Use empty multilib file in another test (#111352)

f1ec45a

This makes the test independent of the one provided by a toolchain clang is built into, which can cause the output of -print-multi-flags-experimental to change.

[X86] Add test coverage for #111323

9a222a1

[X86] getIntImmCostInst - pull out repeated Imm.getBitWidth() calls. …

c978d05

…NFC.

[X86] getIntImmCostInst - reduce i64 imm costs of AND(X,CMASK) case t…

8b6e1dc

…hat can fold to BEXT/BZHI With BEXT/BZHI the i64 imm mask will be replaced with a i16/i8 control mask Fixes #111323
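Why a wide AND mask can shrink to a small control value is easiest to see for the BZHI case. The sketch below is an assumption-laden software model: `lowMaskWidth` and `bzhiModel` are hypothetical helpers, and only masks made of contiguous low set bits are considered.

```cpp
#include <cstdint>
#include <optional>

// If Mask has the form 0...01...1 (N low bits set, N >= 1), return N.
// Any other mask returns std::nullopt.
std::optional<unsigned> lowMaskWidth(std::uint64_t Mask) {
  if (Mask == 0 || (Mask & (Mask + 1)) != 0)
    return std::nullopt;
  unsigned N = 0;
  while (Mask) {
    ++N;
    Mask >>= 1;
  }
  return N;
}

// Software model of BZHI: keep the low N bits of X and zero the rest.
// An index of 64 or more leaves X unchanged.
std::uint64_t bzhiModel(std::uint64_t X, unsigned N) {
  return N >= 64 ? X : X & ((std::uint64_t{1} << N) - 1);
}
```

For such masks, `X & Mask` equals `bzhiModel(X, N)`, and `N` is at most 64, so it fits in an 8-bit control value instead of a 64-bit immediate.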

[LLVM][CodeGen] Add lowering for scalable vector bfloat operations. (…

02dd6b1

…#109803) Specifically: fabs, fadd, fceil, fdiv, ffloor, fma, fmax, fmaxnm, fmin, fminnm, fmul, fnearbyint, fneg, frint, fround, froundeven, fsub, fsqrt & ftrunc

[gn build] Port fb0ef6b

1062007

Revert "[NFC][EarlyIfConverter] Remove unused member variables"

a018353

This reverts commit 3c83102.

[lldb][test] Provide proper path to LLVM utils in API tests (#110837)

e7174a8

In aea0668, API tests were supposed to use LLVM tools. However, the path to a utility is constructed incorrectly there: the utility name should be prefixed with `llvm-`. This patch fixes that.

[flang][debug] set DW_AT_main_subprogram for fortran main function (#…

91d6e77

…111350) Requested here llvm/llvm-project#111022 (comment)

[clang][x86] popcntintrin.h - merge the __DEFAULT_FN_ATTRS / __DEFAUL…

5dc7a5e

…T_FN_ATTRS_CONSTEXPR defines. NFC. We only need one define - so consistently use __DEFAULT_FN_ATTRS like we do in other headers.

[AST] Avoid repeated hash lookups (NFC) (#111327)

4c9c2d6

Here I'm splitting up the existing "if" statement into two. Mixing hasDefinition() and insert() in one "if" condition would be extremely confusing as hasDefinition() doesn't change anything while insert() does.
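The pattern described here can be sketched with standard containers. This is not the clang AST code: `ToyDecl` and `visitOnce` are hypothetical, and `std::unordered_set` stands in for whatever set type the real code uses.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical stand-in for a declaration; only the fields used below.
struct ToyDecl {
  std::string Name;
  bool HasDefinition;
};

// Returns true when D has a definition and has not been seen before.
// The read-only check and the mutating insert() sit in separate "if"
// statements, and insert() doubles as the membership test, so the set
// is hashed into only once per call.
bool visitOnce(const ToyDecl &D, std::unordered_set<std::string> &Seen) {
  if (!D.HasDefinition)
    return false; // no side effects on this path
  if (!Seen.insert(D.Name).second)
    return false; // already visited
  return true;
}
```

Keeping the side-effecting `insert()` in its own condition makes it obvious which branch mutates the set, which is the readability concern the commit message raises.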

[Linalg] Avoid repeated hash lookups (NFC) (#111328)

4bc0916

[GlobalISel] Fold bitcast(undef) to undef. (#111491)

3bf33ec

As with other operations such as trunc and fp converts, it should be valid to convert bitcast(undef) to undef.
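The fold can be pictured on a toy value model. This is a minimal sketch and not the GlobalISel API: `Kind`, `ToyValue`, and `foldsToUndef` are hypothetical names chosen for illustration.

```cpp
// Toy value kinds; a real IR has many more.
enum class Kind { Undef, Constant, Bitcast, Trunc };

struct ToyValue {
  Kind K;
  const ToyValue *Operand = nullptr; // source value for Bitcast/Trunc
};

// Returns true if V is undef once casts of undef are folded away:
// bitcast(undef) is treated like trunc(undef), and both become undef.
bool foldsToUndef(const ToyValue &V) {
  if (V.K == Kind::Undef)
    return true;
  if ((V.K == Kind::Bitcast || V.K == Kind::Trunc) && V.Operand)
    return foldsToUndef(*V.Operand);
  return false;
}
```

A bitcast only reinterprets bits, so reinterpreting an unspecified value yields another unspecified value; this is the same reasoning that already applies to trunc.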

isoard-amd requested review from F-Stuckmann, SagarMaheshwari99, abhinay-anubola, abnikant, andcarminati, katerynamuts, khallouh, konstantinschwarz, martien-de-jong, niwinanto and stephenneuendorffer as code owners August 4, 2025 20:25

konstantinschwarz reviewed Aug 4, 2025

Comment thread clang/test/CodeGen/aie/aie2/aie2-stream-intrinsics.cpp Outdated

Comment thread llvm/test/CodeGen/AIE/GlobalISel/simplify-concat-unmerge-phi.mir Outdated

isoard-amd force-pushed the isoard.upstream-sync branch 3 times, most recently from 9159cdf to 81bb3c4 Compare August 4, 2025 21:16

isoard-amd requested a review from philippjh as a code owner August 4, 2025 21:58

isoard-amd force-pushed the isoard.upstream-sync branch from bf68797 to 81bb3c4 Compare August 4, 2025 22:01

Merge commit '3bf33ecec8f0' into aie-public

8a8a279

Conflicts:
- lld/ELF/Target.cpp
- lld/ELF/Target.h
- llvm/test/Transforms/InferAddressSpaces/AMDGPU/flat_atomic.ll

isoard-amd force-pushed the isoard.upstream-sync branch from 81bb3c4 to 4be0dab Compare August 4, 2025 22:50

Alexandre Isoard added 7 commits August 4, 2025 17:24

[ELF][AIE] Rename TargetInfo *getAIETargetInfo to `void setAIETarge…

ef84dee

…tInfo` See e1a073c for details.

[AIE] Change backend callback to require const RecordKeeper (#111064)

c86c544

See b91f0de for more details.

[AIE] Timer code factored out into a TGTimer class (#111054)

e00e0b6

See d883ef1 for more details.

fixup! [AIE] initial commit

55e6574

fixup! [AIE] Add basic clang support, including intrinsics

08ec045

fixup! [AIEX] combiner to simplify Concat-UNMERGE-PHI

6fb94bc

[AMDGPU] Mark unit-test as XFAIL due to local change for aie

f56984e

isoard-amd force-pushed the isoard.upstream-sync branch from 4be0dab to f56984e Compare August 4, 2025 23:27

konstantinschwarz merged commit 229366d into aie-public Aug 4, 2025
6 of 7 checks passed

konstantinschwarz deleted the isoard.upstream-sync branch August 4, 2025 23:45

konstantinschwarz merged 757 commits into aie-public from isoard.upstream-sync

20 participants
