[CIR][LowerToLLVM] Lowered LLVM code for pointer arithmetic should have inbounds #1191

liusy58 · 2024-12-02T03:53:46Z

Fix issue in #952.

as title. Also add function buildCommonNeonBuiltinExpr just like OG's emitCommonNeonBuiltinExpr. This might help consolidate neon cases and share common code. Notice: - I pretty much keep the skeleton of OG's emitCommonNeonBuiltinExpr at the cost of that we didn't use a few variables they calculate. They might help in the future. - The purpose of having CommonNeonBuiltinExpr is to reduce implementation code duplication. So far, we only have one type implemented, and it's hard for CIR to be more generic. But we should see if in future we can have different types of intrinsics share more generic code path. --------- Co-authored-by: Guojin He <[email protected]>

…no override (llvm#893) As title. The test case used is abort(), but it comes from real code. Notice: since the CIR implementation for NoReturn calls is still pending, the generated LLVM code is `define dso_local void @test() #1 { call void @abort(), !dbg !8 ret void }`, which is not right. The correct code should be `define dso_local void @test() #1 { call void @abort(), !dbg !8 unreachable }`. This PR is still sent as is, because the NoReturn implementation is a separate issue.

as title. The test cases are from [clang codegen test case](https://github.com/llvm/clangir/blob/52323c17c6a3708b3eb72651465f7d4b82f057e7/clang/test/CodeGen/builtins.c#L37)

Before this patch, the CC lowering pass was applied only when explicitly requested by the user. This update changes the default behavior to always apply the CC lowering pass, with an option to disable it using the `-fno-clangir-call-conv-lowering` flag if necessary. The primary objective is to make this pass a mandatory step in the compilation pipeline. This ensures that future contributions correctly implement the CC lowering for both existing and new targets, resulting in more consistent and accurate code generation. From an implementation perspective, several `llvm_unreachable` statements have been substituted with a new `assert_or_abort` macro. This macro can be configured to either trigger a non-blocking assertion or a blocking unreachable statement. This facilitates test-by-test incremental development, as it does not require you to know which code path a test will trigger and simply causes a crash if it does. A few notable changes:
- Support multi-block functions in CC lowering
- Ignore pointer-related CC lowering
- Ignore no-proto functions CC lowering
- Handle missing type evaluation kinds
- Fix CC lowering for function declarations
- Unblock indirect function calls
- Disable CC lowering pass on several tests

…ntrinsicString (llvm#899) as title. In addition, this PR has 2 extra changes. 1. change return type of GetNeonType into mlir::cir::VectorType so we don't have to do cast all the time, this is consistent with [OG](https://github.com/llvm/clangir/blob/db6b7c07c076cb738d0acae248d7c3c199b2b952/clang/lib/CodeGen/CGBuiltin.cpp#L6234) as well. 2. add getAArch64SIMDIntrinsicString helper function so we have better debug info when hitting NYI in buildCommonNeonBuiltinExpr --------- Co-authored-by: Guojin He <[email protected]>

Fix llvm#895 and it's also missing some more throughout behavior for the pass, it also needs to be enabled by default when emitting object files. This reverts commit db6b7c0.

Then we can observe the time consumed in different part of CIR. This patch is not complete. But I think it is fine given we can always add them easily.

> To keep information about whether an OpenCL kernel has uniform work
> group size or not, clang generates 'uniform-work-group-size' function
> attribute for every kernel:
>
> "uniform-work-group-size"="true" for OpenCL 1.2 and lower,
> "uniform-work-group-size"="true" for OpenCL 2.0 and higher if '-cl-uniform-work-group-size' option was specified,
> "uniform-work-group-size"="false" for OpenCL 2.0 and higher if no '-cl-uniform-work-group-size' options was specified.
> If the function is not an OpenCL kernel, 'uniform-work-group-size'
> attribute isn't generated.
>
> *From [Differential 43570](https://reviews.llvm.org/D43570)*

This PR introduces the `OpenCLKernelUniformWorkGroupSizeAttr` attribute to the ClangIR pipeline, towards the completeness in attributes for OpenCL. While this attribute is represented as a unit attribute in MLIR, its absence signifies either non-kernel functions or a `false` value for kernel functions. To match the original LLVM IR behavior, we also consider whether a function is an OpenCL kernel during lowering:

* If the function is not a kernel, the attribute is ignored. No LLVM function attribute is set.
* If the function is a kernel:
  * and the `OpenCLKernelUniformWorkGroupSizeAttr` is present, we generate the LLVM function attribute `"uniform-work-group-size"="true"`.
  * If absent, we generate `"uniform-work-group-size"="false"`.
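The lowering rule described above can be sketched as a small decision function. This is only an illustration in plain C++ (C++17 for `std::optional`), not the code from the PR: `FuncInfo` and `uniformWorkGroupSizeValue` are hypothetical names, and the two booleans stand in for whatever the real lowering queries on the function op.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for the two facts the lowering needs per function.
struct FuncInfo {
  bool isOpenCLKernel;          // is the function an OpenCL kernel?
  bool hasUniformWorkGroupSize; // is the unit attribute present on it?
};

// Returns the value to use for the "uniform-work-group-size" LLVM function
// attribute, or std::nullopt when no attribute should be emitted at all.
std::optional<std::string> uniformWorkGroupSizeValue(const FuncInfo &f) {
  if (!f.isOpenCLKernel)
    return std::nullopt; // non-kernel: the attribute is ignored
  // Kernel: presence of the unit attribute means "true", absence "false".
  return std::string(f.hasUniformWorkGroupSize ? "true" : "false");
}
```

Under these assumptions, a non-kernel function yields no attribute even if the unit attribute happens to be present, while a kernel always gets either `"true"` or `"false"`.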

…#897) `CIRGenModule::buildGlobal` --[rename]--> `CIRGenModule::getOrCreateCIRGlobal` We already have `CIRGenModule::buildGlobal` that corresponds to `CodeGenModule::EmitGlobal`. But there is an overload of `buildGlobal` used by `getAddrOfGlobalVar`. Since this name is confusing, this PR renames it to `getOrCreateCIRGlobal`. Note that `getOrCreateCIRGlobal` already exists. Making the renamed function an overload of it is intentional. The reason is that the renamed function is basically a wrapper of the original `getOrCreateCIRGlobal` with more specific parameters: `getOrCreateCIRGlobal(decl, type, isDef)` --[call]--> `getOrCreateCIRGlobal(getMangledName(decl), type, decl->getType()->getAS(), decl, isDef)`

…m#901) just as title. --------- Co-authored-by: Guojin He <[email protected]>

…aller pieces (llvm#902) The missing feature flag for OpenCL has very few occurrences now. This PR rearranges them into proper pieces to better track them.

) Heterogeneous languages do not support exceptions, which corresponds to `nothrow` in ClangIR and `nounwind` in LLVM IR. This PR adds nothrow attributes for all functions for OpenCL languages in CIRGen. The Lowering for it is already supported previously.

Fix llvm#801 (the remaining `constant` part). Actually the missing stage is CIRGen. There are two places where `GV.setConstant` is called: * `buildGlobalVarDefinition` * `getOrCreateCIRGlobal` Therefore, the primary test `global-constant.c` contains a global definition and a global declaration with use, which should be enough to cover the two paths. A test for OpenCL `constant` qualified global is also added. Some existing testcases need tweaking to avoid failure of missing constant.

as title. --------- Co-authored-by: Guojin He <[email protected]>

Consider the following code snippet `tmp.c`:
```
#define N 3200
struct S {
  double a[N];
  double b[N];
} s;

double *b = s.b;

void foo() {
  double x = 0;
  for (int i = 0; i < N; i++)
    x += b[i];
}

int main() {
  foo();
  return 0;
}
```
Running `bin/clang tmp.c -fclangir -o tmp && ./tmp` causes a segmentation fault. I compared the LLVM IR with and without CIR and noticed a difference which causes this:
`@b = global ptr getelementptr inbounds (%struct.S, ptr @s, i32 0, i32 1)` // no CIR
`@b = global ptr getelementptr inbounds (%struct.S, ptr @s, i32 1)` // with CIR
It seems there is a missing index when creating global pointers from structs. I have updated `Lowering/DirectToLLVM/LowerToLLVM.cpp`, and added a few tests.
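To make the effect of the missing index concrete, here is a minimal C++ sketch (not part of the patch) that computes the byte offsets the two `getelementptr` forms would select relative to `@s`. The helper names `fieldOffset` and `strideOffset` are made up for illustration, and the sketch assumes the two `double` arrays are laid out back to back without padding.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t N = 3200;

struct S {
  double a[N];
  double b[N];
};

// `getelementptr %struct.S, ptr @s, i32 0, i32 1`:
// stay on the first S (index 0), then step to field 1, which is `b`.
constexpr std::size_t fieldOffset() { return offsetof(S, b); }

// `getelementptr %struct.S, ptr @s, i32 1`:
// step over one whole S, landing one past the end of `s`.
constexpr std::size_t strideOffset() { return 1 * sizeof(S); }
```

With the single-index form, `b` would start right after the end of `s` rather than at `s.b`, so the loop in `foo` reads entirely outside the object, which is consistent with the reported segmentation fault.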

as title. Notice this is not target specific nor neon intrinsics.

Entails several minor changes: - Duplicate resume blocks around. - Disable LP caching, we repeat them as often as necessary. - Update maps accordingly for tracking places to patch up. - Make changes to clean up block handling. - Fix an issue in flatten cfg.

as title. The current implementation in this PR uses cir::CastOp integral casting to implement vector type truncation. Thus, the LLVM lowering code has been changed to accommodate it. In addition, code was added to [CIRGenBuiltinAArch64.cpp](https://github.com/llvm/clangir/pull/909/files#diff-6f7700013aa60ed524eb6ddcbab90c4dd288c384f9434547b038357868334932) to make it more similar to OG. ``` mlir::Type ty = vTy; if (!ty) ``` A test case was added to neon.c, as the file already contains similar vector move test cases such as vmovl --------- Co-authored-by: Guojin He <[email protected]>

…m#935) as title. Also changed [neon-ldst.c](https://github.com/llvm/clangir/compare/main...ghehg:clangir-llvm-ghehg:macM3?expand=1#diff-ea4814b6503bff2b7bc4afc6400565e6e89e5785bfcda587dc8401d8de5d3a22) to make it have the same RUN options as OG [clang/test/CodeGen/aarch64-neon-intrinsics.c](https://github.com/llvm/clangir/blob/main/clang/test/CodeGen/aarch64-neon-intrinsics.c) Those options help us to avoid checking load/store pairs thus make the test less verbose and easier to compare against OG. Co-authored-by: Guojin He <[email protected]>

Implement derived-to-base address conversions for non-virtual base classes. The code gen for this situation was only implemented when the offset was zero, and it simply created a `cir.base_class_addr` op for which no lowering or other transformation existed. Conversion to a virtual base class is not yet implemented. Two new fields are added to the `cir.base_class_addr` operation: the byte offset of the necessary adjustment, and a boolean flag indicating whether the source operand may be null. The offset is easy to compute in the front end while the entire path of intermediate classes is still available. It would be difficult for the back end to recompute the offset. So it is best to store it in the operation. The null-pointer check is best done late in the lowering process. But whether or not the null-pointer check is needed is only known by the front end; the back end can't figure that out. So that flag needs to be stored in the operation. `CIRGenFunction::getAddressOfBaseClass` was largely rewritten. The code path no longer matches the equivalent function in the LLVM IR code gen, because the generated ClangIR is quite different from the generated LLVM IR. `cir.base_class_addr` is lowered to LLVM IR as a `getelementptr` operation. If a null-pointer check is needed, then that is wrapped in a `select` operation. When generating code for a constructor or destructor, an incorrect `cir.ptr_stride` op was used to convert the pointer to a base class. The code was assuming that the operand of `cir.ptr_stride` was measured in bytes; the operand is the number of elements, not the number of bytes. So the base class constructor was being called on the wrong chunk of memory. Fix this by using a `cir.base_class_addr` op instead of `cir.ptr_stride` in this scenario. The use of `cir.ptr_stride` in `ApplyNonVirtualAndVirtualOffset` had the same problem. Continue using `cir.ptr_stride` here, but temporarily convert the pointer to type `char*` so the pointer is adjusted correctly.
Adjust the expected results of three existing tests in response to these changes. Add two new tests, one code gen and one lowering, to cover the case where a base class is at a non-zero offset.
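The element-versus-byte confusion described above can be illustrated with a small, self-contained C++ sketch. It is not taken from the patch: `Base1`, `Base2`, `Derived`, `base2Offset`, and `elementStrideBytes` are all hypothetical names, and the sketch only models the arithmetic (a stride of `count` elements moves `count * sizeof(T)` bytes) rather than any ClangIR operation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Base1 { std::int64_t x; };
struct Base2 { std::int64_t y; };
struct Derived : Base1, Base2 { std::int64_t z; };

// Byte offset of the Base2 subobject inside a Derived object, measured
// on a real object so that no ABI-specific constant is hard-coded.
std::ptrdiff_t base2Offset() {
  Derived d{};
  return reinterpret_cast<char *>(static_cast<Base2 *>(&d)) -
         reinterpret_cast<char *>(&d);
}

// Number of bytes moved by a stride of `count` elements of type T.
// An element-based stride scales the count by the element size.
template <typename T>
constexpr std::ptrdiff_t elementStrideBytes(std::ptrdiff_t count) {
  return count * static_cast<std::ptrdiff_t>(sizeof(T));
}
```

Passing the byte offset as an element count on a `Derived`-typed pointer moves `offset * sizeof(Derived)` bytes, which is the wrong chunk of memory; on a `char`-typed pointer the same count moves exactly `offset` bytes, which is why the temporary conversion to `char*` gives the intended adjustment.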

Fix llvm#934 While here move scope op codegen outside the builder, so it's easier to dump blocks and operations while debugging.

After 5da4310, the LLVM dialect requires the variadic callee type to be present for variadic calls. The op builders take care of this automatically if you pass the function type, so change our lowering logic to do so. Add tests for this as well as a missing test for indirect function call lowering. Fixes llvm#913 Fixes llvm#933

Lancern · 2024-12-02T05:58:29Z

Thanks for your time working on this!

The current changes are not related to ClangIR and it's not an appropriate way to resolve #952 . Given the following input code:

void foo(int *iptr) { iptr + 2; }

The CIR generated for the above code would be something similar to:

// ...
%1 = cir.const 2 : i32
%2 = cir.ptr_stride(%0 : !cir.ptr<i32>, %1 : i32), !cir.ptr<i32>
// ...

The generated CIR is further lowered to the following LLVM dialect code in clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp:

%0 = llvm.getelementptr %1[%2] : (!llvm.ptr, i32) -> !llvm.ptr, i32

Apparently the root cause is that LowerToLLVM fails to add the inbounds attribute to the llvm.getelementptr operation. You should update code in LowerToLLVM.cpp accordingly to fix this problem.

liusy58 · 2024-12-02T06:46:15Z

Alright, I'll work on LowerToLLVM to address this issue. In fact, I'm not entirely sure when the inbounds attribute should be added. I examined the code in Value *emitPointerArithmetic from clang/lib/CodeGen/CGExprScalar.cpp, but I couldn't identify a clear pattern. Could you please provide some guidance?

Lancern · 2024-12-02T09:36:58Z

@liusy58 The getelementptr instruction is actually emitted in the CodeGenFunction::EmitCheckedInBoundsGEP function defined in the file you pointed out. It has two overloads, and you can find that both overloads set the inbounds flag without many prior conditions.

The C++ standard says that if the result of pointer arithmetic is out of bounds, the behavior is undefined. So I believe for cir.ptr_stride, you should always add the inbounds attribute to the lowered llvm.getelementptr operation.

liusy58 · 2024-12-02T11:06:13Z

Thank you. Let me check it.

Co-authored-by: Sirui Mu <[email protected]>

This PR adds `clang::CodeGenOptions` to the lowering context. Similar to `clang::LangOptions`, the code generation options are currently set to the default values when initializing the lowering context. Besides, this PR also adds a new attribute `#cir.opt_level`. The attribute is a module-level attribute and it holds the optimization level (e.g. -O1, -Oz, etc.). The attribute is consumed when initializing the lowering context to populate the `OptimizationLevel` and the `OptimizeSize` field in the code generation options. CIRGen is updated to attach this attribute to the module op.
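As a rough illustration of how an optimization flag could populate the two fields named above, here is a plain C++17 sketch. `OptSettings` and `fromFlag` are hypothetical names, and nothing here reflects how `#cir.opt_level` actually stores its data; the numeric values follow Clang's usual convention, where `-Os` and `-Oz` both use optimization level 2 and differ only in the size setting.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical mirror of the two code generation option fields.
struct OptSettings {
  unsigned OptimizationLevel; // 0..3
  unsigned OptimizeSize;      // 0 = none, 1 = -Os, 2 = -Oz
};

// Maps a command-line flag to the pair of settings, or std::nullopt
// for a flag this sketch does not know about.
std::optional<OptSettings> fromFlag(const std::string &flag) {
  if (flag == "-O0") return OptSettings{0, 0};
  if (flag == "-O1") return OptSettings{1, 0};
  if (flag == "-O2") return OptSettings{2, 0};
  if (flag == "-O3") return OptSettings{3, 0};
  if (flag == "-Os") return OptSettings{2, 1};
  if (flag == "-Oz") return OptSettings{2, 2};
  return std::nullopt;
}
```

The point of the sketch is that a single module-level value has to be split into two separate fields when the lowering context is initialized.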

Removes some NYIs, but leaves assert(false) in place due to missing tests. This looks better since it is not as scary as NYI.

This PR adds support for base-to-derived and derived-to-base casts on pointer-to-data-member values. Related to llvm#973.

bcardosolopes · 2024-12-02T23:40:33Z

Thanks @Lancern and @seven-mile for the great review and clarifications. @liusy58 welcome to the ClangIR project!

liusy58 · 2024-12-03T09:36:18Z

@Lancern Hi, I have updated the code. Could you please review it?

Lancern

Thanks for working on this! The CI shows you have 13 failed tests, please resolve them and it should be good to go!

seven-mile

Thanks for the update and for bearing with all the comments! Adding inbounds unconditionally might not be quite right.

I believe inbounds on GEP is about low-level pointer arithmetic rather than the language's memory model. The keyword controls the overflow behaviour concisely (ref), which leads to a common pattern in OG CodeGen:

clangir/clang/lib/CodeGen/CGExprScalar.cpp, lines 4090 to 4095 in eacaabb:

    if (CGF.getLangOpts().isSignedOverflowDefined())
      return CGF.Builder.CreateGEP(elemTy, pointer, index, "add.ptr");

    return CGF.EmitCheckedInBoundsGEP(
        elemTy, pointer, index, isSigned, isSubtraction, op.E->getExprLoc(),
        "add.ptr");

Additionally, we should be careful about language conformance: some options are designed to control conformance or provide extensions. The code above shows one instance: -fwrapv controlling the signed overflow behavior (SOB). We should take care of them to keep the frontend functional ; )

There might be other considerations for a specific case in OG CodeGen. Usually the reliability comes from the structural correspondence between the old and new code. Given that we have no choice but to migrate this logic to LowerToLLVM, we should be especially cautious. IMHO this fix does not necessarily have to be finished in one single patch.

For the next step, I think we can discuss which changes this patch should include. A good start is to consider just #952. If that works out, pack more changes into your following patches, and so on. It's your first contribution after all, no need to hurry 😉
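The pattern quoted from OG CodeGen in this comment boils down to one decision, which the following plain C++ sketch models. It deliberately avoids the real MLIR and LLVM APIs: `LangOptsSketch` and `gepOpcodeFor` are hypothetical names, and the single boolean stands in for `isSignedOverflowDefined()`, which according to the comment is what -fwrapv controls.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the one language option consulted above.
struct LangOptsSketch {
  bool signedOverflowDefined; // assumed to be true under -fwrapv
};

// Returns the textual GEP form a lowering would pick for pointer
// arithmetic: plain GEP when signed overflow is defined, inbounds otherwise.
std::string gepOpcodeFor(const LangOptsSketch &opts) {
  return opts.signedOverflowDefined ? "getelementptr"
                                    : "getelementptr inbounds";
}
```

A lowering that always picks the `inbounds` form would disagree with this model exactly in the -fwrapv case, which is the mismatch the comment warns about.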

liusy58 · 2024-12-04T05:38:58Z

Hi, @seven-mile , I have updated the commit, please review it. Thanks!

seven-mile · 2024-12-04T07:11:40Z

clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp

-      ptrStrideOp, resultTy, elementTy, adaptor.getBase(), index);
+  rewriter.replaceOpWithNewOp<mlir::LLVM::GEPOp>(ptrStrideOp, resultTy,
+                                                 elementTy, adaptor.getBase(),
+                                                 index, /*inbounds=*/true);


It's still unconditional. We cannot accept miscompilation. Please make sure we emit the same LLVM IR as e.g. this godbolt, or alternatively make that case hit an assertion failure.

liusy58 · 2024-12-09

OK, I will work on it later.

smeenai · 2024-12-05T00:38:06Z

#886 is a related potential area for follow-up work here if you're interested :)

ghehg and others added 30 commits November 22, 2024 18:11

[CIR][CIRGen][NFC] Split cir.scope creation on buildReturnStmt

3798bbb

[CIR][NFC] Add helpers for cir.try and do some refactoring

94da764

[CIR][CIRGen] Support __builtin_huge_val for float type (llvm#889)

ca05983

as title. The test cases are from [clang codegen test case](https://github.com/llvm/clangir/blob/52323c17c6a3708b3eb72651465f7d4b82f057e7/clang/test/CodeGen/builtins.c#L37)

[CIR][NFC] Rename test

8689e47

[CIR][NFC] Silence unused warning

47d527e

Revert "[CIR][ABI] Apply CC lowering pass by default (llvm#842)"

250d9fe

Fix llvm#895 and it's also missing some more throughout behavior for the pass, it also needs to be enabled by default when emitting object files. This reverts commit db6b7c0.

[CIR][CIRGen] Add time trace to several CIRGen pieces (llvm#898)

f9e4d9c

Then we can observe the time consumed in different part of CIR. This patch is not complete. But I think it is fine given we can always add them easily.

[CIR][CIRGen][Builtin][Neon] Lower neon vld1_lane and vld1q_lane (llv…

d3b4375

…m#901) just as title. --------- Co-authored-by: Guojin He <[email protected]>

[CIR][CodeGen][NFC] Break the missing feature flag for OpenCL into sm…

5c77fea

…aller pieces (llvm#902) The missing feature flag for OpenCL has very few occurrences now. This PR rearranges them into proper pieces to better track them.

[CIR][Test][NFC] Organize CIR CodeGen AArch64 neon tests (llvm#910)

afb76fa

as title. --------- Co-authored-by: Guojin He <[email protected]>

[CIR][CIRGen][Builtin] Implement builtin __sync_fetch_and_sub (llvm#932)

eb33b81

as title. Notice this is not target specific nor neon intrinsics.

[CIR][NFC] Updates against -Wswitch after rebase

e106e19

[CIR][CIRGen] Exceptions: fix agg store for temporaries

3eb30af

Fix llvm#934 While here move scope op codegen outside the builder, so it's easier to dump blocks and operations while debugging.

[CIR][CIRGen] Lower cir.throw in absence of dtors

796de49

[CIR][NFC] Update wrong comments from previous commit

7479ff4

[CIR][CIRGen] Exceptions: support free'ing allocated exception resources

7969186

PikachuHyA and others added 4 commits December 2, 2024 14:29

[CIR][CIRGen] Change SignBitOp result type to !cir.bool (llvm#1187)

aa7b5c6

Co-authored-by: Sirui Mu <[email protected]>

[CIR] [CodeGen] Handle arrangeCXXStructorDeclaration (llvm#1179)

67bbd1e

Removes some NYIs, but leaves assert(false) in place due to missing tests. This looks better since it is not as scary as NYI.

[CIR] Add support for casting pointer-to-data-member values (llvm#1188)

eacaabb

This PR adds support for base-to-derived and derived-to-base casts on pointer-to-data-member values. Related to llvm#973.

liusy58 requested review from lanza and bcardosolopes as code owners December 3, 2024 09:04

liusy58 force-pushed the missing_inbounds branch from 12458ba to 9c7e362 Compare December 3, 2024 09:18

liusy58 requested a review from seven-mile December 3, 2024 09:25

Lancern approved these changes Dec 3, 2024

liusy58 force-pushed the missing_inbounds branch 2 times, most recently from cec82cc to d9b7c41 Compare December 3, 2024 12:59

seven-mile requested changes Dec 3, 2024

liusy58 force-pushed the missing_inbounds branch from d9b7c41 to 4e05834 Compare December 4, 2024 01:21

[CIR][LowerToLLVM] fixup! CIR generated LLVM code for pointer arithme…

0362c64

…tic should have inbounds.

liusy58 force-pushed the missing_inbounds branch from 17d944f to 0362c64 Compare December 4, 2024 03:23

liusy58 changed the title ~~GEP with a constant offset should have inbounds attribute.~~ IR generated LLVM code for pointer arithmetic should have inbounds. Dec 4, 2024

Lancern changed the title ~~IR generated LLVM code for pointer arithmetic should have inbounds.~~ [CIR][LowerToLLVM] Lowered LLVM code for pointer arithmetic should have inbounds Dec 4, 2024

liusy58 requested a review from seven-mile December 4, 2024 05:25

seven-mile reviewed Dec 4, 2024

lanza force-pushed the main branch from 9ffbe92 to 3c3b096 Compare January 27, 2025 22:44

lanza force-pushed the main branch from 0364dd2 to 79d0d74 Compare March 18, 2025 02:22

lanza force-pushed the main branch from d3fe299 to 98e8811 Compare April 9, 2025 22:56
