[CIR][CIRGen][TBAA] Add support for TBAA #1076

Open
wants to merge 2,126 commits into
base: main

Conversation

PikachuHyA
Collaborator

This patch introduces support for TBAA, following the structure of clang/lib/CodeGen/CodeGenTBAA.h. The key function implemented is CIRGenModule::decorateOperationWithTBAA, which works similarly to CodeGenModule::DecorateInstructionWithTBAA.

Note: Support for vtable pointer and tbaa.struct is not yet included.
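
For readers new to TBAA, here is a minimal sketch of the kind of access pattern that type-based alias information describes. It is not taken from the patch; the function name `storeThenReload` is made up for illustration. Under strict aliasing rules, an `int` store and a `float` store are assumed not to overlap, so an optimizer that sees TBAA metadata on these accesses may reuse the stored `int` value instead of reloading it.

```cpp
// Hypothetical example, not part of the patch.
// The two pointers refer to different scalar types, so type-based alias
// analysis may assume the float store does not modify *i.
inline int storeThenReload(int *i, float *f) {
  *i = 1;
  *f = 2.0f;  // assumed not to alias *i under strict aliasing
  return *i;  // an optimizer may fold this load to the constant 1
}
```

The call is only well defined when the two pointers really do refer to distinct objects, which is exactly the guarantee TBAA metadata encodes.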

For further details, please refer to:

@PikachuHyA PikachuHyA changed the title [CIR][CIRGen] Add support for TBAA [CIR][CIRGen][TBAA] Add support for TBAA Nov 7, 2024

github-actions bot commented Nov 7, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Member

@bcardosolopes bcardosolopes left a comment

This is amazing @PikachuHyA, very happy to have TBAA support in ClangIR.

bcardosolopes pushed a commit that referenced this pull request Nov 19, 2024
This is the first patch to support TBAA, following the discussion at
#1076 (comment)

- add skeleton for CIRGen, utilizing `decorateOperationWithTBAA`
- add empty implementation in `CIRGenTBAA`
- introduce `CIR_TBAAAttr` with empty body
- attach `CIR_TBAAAttr` to `LoadOp` and `StoreOp`
- no handling of vtable pointer
- no LLVM lowering
bcardosolopes and others added 22 commits November 22, 2024 18:12
Before this patch, the CC lowering pass was applied only when explicitly
requested by the user. This update changes the default behavior to
always apply the CC lowering pass, with an option to disable it using
the `-fno-clangir-call-conv-lowering` flag if necessary.

The primary objective is to make this pass a mandatory step in the
compilation pipeline. This ensures that future contributions correctly
implement the CC lowering for both existing and new targets, resulting
in more consistent and accurate code generation.

From an implementation perspective, several `llvm_unreachable`
statements have been substituted with a new `assert_or_abort` macro.
This macro can be configured to either trigger a non-blocking assertion
or a blocking unreachable statement. This facilitates test-by-test
incremental development, as it does not require you to know which code
path a test will trigger, and it just causes a crash if the test does trigger it.
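
To make the idea concrete, here is a rough sketch of a macro that can be switched between a non-blocking and a blocking mode. This is an assumption about the general shape only, not the actual `assert_or_abort` definition; the names `sketch_assert_or_abort`, `SKETCH_BLOCKING_ABORT`, `sketchFailureCount`, and `lowerSomething` are all hypothetical.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical switch: compile with -DSKETCH_BLOCKING_ABORT=1 to make
// every failed check fatal. The default here is the non-blocking mode.
#ifndef SKETCH_BLOCKING_ABORT
#define SKETCH_BLOCKING_ABORT 0
#endif

// Counts failed checks in non-blocking mode (illustration only).
inline int &sketchFailureCount() {
  static int count = 0;
  return count;
}

#if SKETCH_BLOCKING_ABORT
// Blocking mode: report and stop immediately.
#define sketch_assert_or_abort(cond, msg)                 \
  do {                                                    \
    if (!(cond)) {                                        \
      std::fprintf(stderr, "fatal: %s\n", (msg));         \
      std::abort();                                       \
    }                                                     \
  } while (0)
#else
// Non-blocking mode: report, record the failure, and keep going.
#define sketch_assert_or_abort(cond, msg)                 \
  do {                                                    \
    if (!(cond)) {                                        \
      std::fprintf(stderr, "NYI (continuing): %s\n", (msg)); \
      ++sketchFailureCount();                             \
    }                                                     \
  } while (0)
#endif

// Hypothetical lowering step guarded by the macro.
inline int lowerSomething(bool supported) {
  sketch_assert_or_abort(supported, "unsupported lowering path");
  return supported ? 1 : 0;
}
```

With the non-blocking default, a test that reaches an unsupported path keeps running and the failure is only recorded; with the blocking variant the same test stops at the first such path.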

A few notable changes:

 - Support multi-block function in CC lowering
 - Ignore pointer-related CC lowering
 - Ignore no-proto functions CC lowering
 - Handle missing type evaluation kinds
 - Fix CC lowering for function declarations
 - Unblock indirect function calls
 - Disable CC lowering pass on several tests
…ntrinsicString (llvm#899)

as title.
In addition, this PR has 2 extra changes.

1. change return type of GetNeonType into mlir::cir::VectorType so we
don't have to cast all the time; this is consistent with
[OG](https://github.com/llvm/clangir/blob/db6b7c07c076cb738d0acae248d7c3c199b2b952/clang/lib/CodeGen/CGBuiltin.cpp#L6234)
as well.
2. add getAArch64SIMDIntrinsicString helper function so we have better
debug info when hitting NYI in buildCommonNeonBuiltinExpr

---------

Co-authored-by: Guojin He <[email protected]>
Fix llvm#895 and it's also missing some more
throughout behavior for the pass, it also needs to be enabled by default when
emitting object files.

This reverts commit db6b7c0.
Then we can observe the time consumed in different parts of CIR. This
patch is not complete, but I think that is fine given we can always add
them easily.
> To keep information about whether an OpenCL kernel has uniform work
> group size or not, clang generates 'uniform-work-group-size' function
> attribute for every kernel:
> 
> "uniform-work-group-size"="true" for OpenCL 1.2 and lower,
> "uniform-work-group-size"="true" for OpenCL 2.0 and higher if
> '-cl-uniform-work-group-size' option was specified,
> "uniform-work-group-size"="false" for OpenCL 2.0 and higher if no
> '-cl-uniform-work-group-size' option was specified.
> If the function is not an OpenCL kernel, 'uniform-work-group-size'
> attribute isn't generated.
> 
> *From [Differential 43570](https://reviews.llvm.org/D43570)*

This PR introduces the `OpenCLKernelUniformWorkGroupSizeAttr` attribute
to the ClangIR pipeline, towards the completeness in attributes for
OpenCL. While this attribute is represented as a unit attribute in MLIR,
its absence signifies either non-kernel functions or a `false` value for
kernel functions. To match the original LLVM IR behavior, we also
consider whether a function is an OpenCL kernel during lowering:

* If the function is not a kernel, the attribute is ignored. No LLVM
function attribute is set.
* If the function is a kernel:
    * If the `OpenCLKernelUniformWorkGroupSizeAttr` is present, we generate
the LLVM function attribute `"uniform-work-group-size"="true"`.
    * If absent, we generate `"uniform-work-group-size"="false"`.
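
The decision described in the list above can be sketched as a small standalone function. This is only a model of the logic, not the lowering code from the PR; `FuncInfo` and `uniformWorkGroupSizeValue` are hypothetical names.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for the two facts the lowering consults.
struct FuncInfo {
  bool isOpenCLKernel;               // is the function an OpenCL kernel?
  bool hasUniformWorkGroupSizeAttr;  // is the unit attribute present?
};

// Returns the value to emit for "uniform-work-group-size", or nullopt
// when no LLVM function attribute should be set at all.
inline std::optional<std::string>
uniformWorkGroupSizeValue(const FuncInfo &func) {
  if (!func.isOpenCLKernel)
    return std::nullopt;  // non-kernel: the attribute is ignored
  return std::string(func.hasUniformWorkGroupSizeAttr ? "true" : "false");
}
```

The three-way result (absent, `"true"`, `"false"`) is what lets a unit attribute in MLIR reproduce the string attribute that the original LLVM IR path emits.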
…#897)

`CIRGenModule::buildGlobal` --[rename]-->
`CIRGenModule::getOrCreateCIRGlobal`

We already have `CIRGenModule::buildGlobal` that corresponds to
`CodeGenModule::EmitGlobal`. But there is an overload of `buildGlobal`
used by `getAddrOfGlobalVar`. Since this name is confusing, this PR
renames it to `getOrCreateCIRGlobal`.

Note that `getOrCreateCIRGlobal` already exists. It is intentional to
make the renamed function an overload to it. The reason here is that the
renamed function is basically a wrapper of the original
`getOrCreateCIRGlobal` with more specific parameters:

`getOrCreateCIRGlobal(decl, type, isDef)` --[call]-->
`getOrCreateCIRGlobal(getMangledName(decl), type,
decl->getType()->getAS(), decl, isDef)`
…aller pieces (llvm#902)

The missing feature flag for OpenCL has very few occurrences now. This
PR rearranges them into proper pieces to better track them.
)

Heterogeneous languages do not support exceptions, which corresponds to
`nothrow` in ClangIR and `nounwind` in LLVM IR.

This PR adds nothrow attributes to all functions for OpenCL languages
in CIRGen. Lowering for it is already supported.
Fix llvm#801 (the remaining `constant` part). Actually the missing stage is
CIRGen.

There are two places where `GV.setConstant` is called:

* `buildGlobalVarDefinition`
* `getOrCreateCIRGlobal`

Therefore, the primary test `global-constant.c` contains a global
definition and a global declaration with use, which should be enough to
cover the two paths.

A test for OpenCL `constant` qualified global is also added. Some
existing testcases need tweaking to avoid failure of missing constant.
Consider the following code snippet `tmp.c`: 
```
#define N 3200

struct S {
  double a[N];
  double b[N];
} s;

double *b = s.b;

void foo() {
  double x = 0;
  for (int i = 0; i < N; i++)
    x += b[i];
}

int main() {
  foo();
  return 0;
}
```
Running `bin/clang tmp.c -fclangir -o tmp && ./tmp` causes a
segmentation fault.

I compared the LLVM IR with and without CIR and noticed a difference
which causes this:
`@b = global ptr getelementptr inbounds (%struct.S, ptr @s, i32 0, i32
1)` // no CIR
`@b = global ptr getelementptr inbounds (%struct.S, ptr @s, i32 1)` //
with CIR

It seems there is a missing index when creating global pointers from
structs. I have updated `Lowering/DirectToLLVM/LowerToLLVM.cpp`, and
added a few tests.
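
To see why the missing index matters, the following sketch computes the byte offset each of the two `getelementptr` forms selects, using the same struct as `tmp.c`. It is an illustration written for this note, not code from the fix; the helper names are hypothetical.

```cpp
#include <cstddef>

constexpr std::size_t N = 3200;

struct S {
  double a[N];
  double b[N];
};

// `getelementptr inbounds (%struct.S, ptr @s, i32 0, i32 1)`:
// stay on the object at @s (index 0), then select field 1, i.e. `b`.
constexpr std::size_t offsetWithBothIndices() { return offsetof(S, b); }

// `getelementptr inbounds (%struct.S, ptr @s, i32 1)`:
// step over one whole S, landing one past the end of @s.
constexpr std::size_t offsetWithSingleIndex() { return 1 * sizeof(S); }
```

With the single-index form, `b` points just past the end of `s` rather than at `s.b`, so the loop in `foo` reads memory outside the object, which is consistent with the reported segmentation fault.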
as title.
Notice this is not target specific nor neon intrinsics.
Entails several minor changes:
- Duplicate resume blocks around.
- Disable LP caching, we repeat them as often as necessary.
- Update maps accordingly for tracking places to patch up.
- Make changes to clean up block handling.
- Fix an issue in flatten cfg.
as title.
The current implementation in this PR uses cir::CastOP integral
casting to implement vector type truncation. Thus, the LLVM lowering code
has been changed to accommodate it.
In addition, code was added to
[CIRGenBuiltinAArch64.cpp](https://github.com/llvm/clangir/pull/909/files#diff-6f7700013aa60ed524eb6ddcbab90c4dd288c384f9434547b038357868334932)
to make it more similar to OG.
```
 mlir::Type ty = vTy;
  if (!ty)
```
Added test case into neon.c as the file already contains similar vector
move test cases such as vmovl

---------

Co-authored-by: Guojin He <[email protected]>
…m#935)

as title. 
Also changed
[neon-ldst.c](https://github.com/llvm/clangir/compare/main...ghehg:clangir-llvm-ghehg:macM3?expand=1#diff-ea4814b6503bff2b7bc4afc6400565e6e89e5785bfcda587dc8401d8de5d3a22)
to make it have the same RUN options as OG
[clang/test/CodeGen/aarch64-neon-intrinsics.c](https://github.com/llvm/clangir/blob/main/clang/test/CodeGen/aarch64-neon-intrinsics.c)
Those options help us avoid checking load/store pairs, which makes the
test less verbose and easier to compare against OG.

Co-authored-by: Guojin He <[email protected]>
Implement derived-to-base address conversions for non-virtual base
classes. The code gen for this situation was only implemented when the
offset was zero, and it simply created a `cir.base_class_addr` op for
which no lowering or other transformation existed.

Conversion to a virtual base class is not yet implemented.

Two new fields are added to the `cir.base_class_addr` operation: the
byte offset of the necessary adjustment, and a boolean flag indicating
whether the source operand may be null. The offset is easy to compute in
the front end while the entire path of intermediate classes is still
available. It would be difficult for the back end to recompute the
offset. So it is best to store it in the operation. The null-pointer
check is best done late in the lowering process. But whether or not the
null-pointer check is needed is only known by the front end; the back
end can't figure that out. So that flag needs to be stored in the
operation.

`CIRGenFunction::getAddressOfBaseClass` was largely rewritten. The code
path no longer matches the equivalent function in the LLVM IR code gen,
because the generated ClangIR is quite different from the generated LLVM
IR.

`cir.base_class_addr` is lowered to LLVM IR as a `getelementptr`
operation. If a null-pointer check is needed, then that is wrapped in a
`select` operation.
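
As a rough model of that lowering, the sketch below applies a byte offset to a pointer and optionally guards the adjustment with a null check. It is a plain C++ analogy, not the generated LLVM IR; the function name `baseClassAddr` and the parameter `mayBeNull` are hypothetical.

```cpp
#include <cstddef>

// Models the lowered form: the pointer adjustment plays the role of the
// `getelementptr`, and the conditional plays the role of the `select`.
inline char *baseClassAddr(char *derived, std::ptrdiff_t byteOffset,
                           bool mayBeNull) {
  if (!mayBeNull)
    return derived + byteOffset;  // no check needed: adjust unconditionally
  // A null derived pointer must convert to a null base pointer, not to
  // an address equal to the offset.
  return derived == nullptr ? nullptr : derived + byteOffset;
}
```

Keeping both the offset and the null flag on the operation means this step needs no knowledge of the class hierarchy, which matches the rationale given above for storing them.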

When generating code for a constructor or destructor, an incorrect
`cir.ptr_stride` op was used to convert the pointer to a base class. The
code was assuming that the operand of `cir.ptr_stride` was measured in
bytes; the operand is the number of elements, not the number of bytes. So
the base class constructor was being called on the wrong chunk of
memory. Fix this by using a `cir.base_class_addr` op instead of
`cir.ptr_stride` in this scenario.

The use of `cir.ptr_stride` in `ApplyNonVirtualAndVirtualOffset` had the
same problem. Continue using `cir.ptr_stride` here, but temporarily
convert the pointer to type `char*` so the pointer is adjusted
correctly.
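
The element-versus-byte distinction is the same one C++ pointer arithmetic has, which the following sketch illustrates. The type `Wide` and both helper functions are hypothetical and exist only for this example.

```cpp
#include <cstddef>

struct Wide {
  double payload[4];  // 32 bytes on typical targets
};

// Element-based stride: `p + count` moves by count * sizeof(Wide) bytes.
inline std::ptrdiff_t bytesMovedByElementStride(Wide *p, std::ptrdiff_t count) {
  return reinterpret_cast<char *>(p + count) - reinterpret_cast<char *>(p);
}

// Byte-based stride: view the pointer as char* first, so `raw + count`
// moves by exactly `count` bytes.
inline std::ptrdiff_t bytesMovedViaCharPtr(Wide *p, std::ptrdiff_t count) {
  char *raw = reinterpret_cast<char *>(p);
  return (raw + count) - raw;
}
```

Passing a byte offset to the element-based form overshoots by a factor of the element size, which is how the base class constructor ended up operating on the wrong chunk of memory.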

Adjust the expected results of three existing tests in response to these
changes.

Add two new tests, one code gen and one lowering, to cover the case
where a base class is at a non-zero offset.
Fix llvm#934

While here move scope op codegen outside the builder, so it's easier
to dump blocks and operations while debugging.
ghehg and others added 22 commits November 26, 2024 13:16
This PR also changed implementation of BI__builtin_neon_vshlq_v into
using CIR ShiftOp
After I rebased, I found these problems with Spec2017. I was surprised
that it did not have problems before. Maybe this is due to some updates in the LLVM part.
The requirement on the size of the then/else regions of the cir.ternary operation
seems to be too conservative. As the example shows, it is possible for the
regions to be expanded during the transformation.
…m#1169)

For example, the following reaches
["NYI"](https://github.com/llvm/clangir/blob/c8b626d49e7f306052b2e6d3ce60b1f689d37cb5/clang/lib/CIR/Dialect/Transforms/TargetLowering/LowerFunction.cpp#L348)
when lowering to AArch64:
```
typedef struct {
  union {
    struct {
      char a, b;
    };
    char c;
  };
} A;

void foo(A a) {}

void bar() {
  A a;
  foo(a);
}
```
Currently, the value of the struct becomes a bitcast operation, so we
can simply extend `findAlloca` to be able to trace the source alloca
properly, then use that for the
[coercion](https://github.com/llvm/clangir/blob/c8b626d49e7f306052b2e6d3ce60b1f689d37cb5/clang/lib/CIR/Dialect/Transforms/TargetLowering/LowerFunction.cpp#L341)
through memory. I have also added a test for this case.
Added a few FIXMEs. There are 2 types of FIXMEs:
1. Most of them are missing func call and parameter attributes. I didn't
add them at all missing sites of this type, as that would have been just
copy-paste.
2. FIXME in lambda __invoke(): OG simply returns but CIR generates a call
to llvm.trap. This is just temporary and we will fix it in the near future.
But I feel I should still list those IRs so that once we fix the problem with
codegen of invoke, we get a test failure on this one and fix it.
Actually, this way, this test file would be a natural test case for the
implementation of invoke.
There are scenarios where we are not emitting cleanups, this commit starts to
pave the way to be more complete in that area. Small addition of skeleton here
plus some fixes.

Both `clang/test/CIR/CodeGen/vla.c` and `clang/test/CIR/CodeGen/nrvo.cpp`
now pass in face of this code path.
…lvm#1166)

Close llvm#1131

This is another solution to llvm#1160

This patch reverts llvm#1007 but keeps
its test. The problem described in
llvm#1007 is worked around by skipping the
check for equivalence of element types in arrays.

We can't mock such checks simply by adding another attribute to
`ConstStructAttr` since the types are aggregated, e.g., we have to
handle cases like `struct { union { ... } }` and `struct { struct {
union { ... } } }` and so on. To do that, we would have to introduce what I
called "two type systems" in llvm#1160.

This is not ideal given that it removes a reasonable check. But it might
not be so problematic since the Sema part has already checked it. (Of
course, we still face the risk of introducing new bugs anyway.)
…of floating type (llvm#1174)

[PR1132](llvm#1132) implements missing
feature `fpUnaryOPsSupportVectorType`, so revisit this code.

Another change is that I stopped using
`cir::isAnyFloatingPointType` as it contains types like long double and
FP80 which are not supported by the [builtin's
signature](https://clang.llvm.org/docs/LanguageExtensions.html#vector-builtins).
[OG's implementation
](https://github.com/llvm/clangir/blob/aaf38b30d31251f3411790820c5e1bf914393ddc/clang/lib/CodeGen/CGBuiltin.cpp#L7527)
provides one common code to handle all neon SISD intrinsics. But IMHO,
it entangles different things together, which hurts readability.
Here, we start with a simple, easy-to-understand approach for a specific
case. In the future, as we handle more intrinsics, we may come up
with a few simple common patterns.
This PR adds `clang::CodeGenOptions` to the lowering context. Similar to
`clang::LangOptions`, the code generation options are currently set to
the default values when initializing the lowering context.

Besides, this PR also adds a new attribute `#cir.opt_level`. The
attribute is a module-level attribute and it holds the optimization
level (e.g. -O1, -Oz, etc.). The attribute is consumed when initializing
the lowering context to populate the `OptimizationLevel` and the
`OptimizeSize` field in the code generation options. CIRGen is updated
to attach this attribute to the module op.
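
As an illustration of how one optimization-level spelling can populate both fields, here is a small sketch. The mapping follows how Clang conventionally treats these flags (`-Os` and `-Oz` both use level 2 with a size setting of 1 and 2 respectively), but how `#cir.opt_level` actually encodes them is not shown in this message, so treat the struct `OptSettings` and the function `settingsForFlag` as hypothetical.

```cpp
#include <optional>
#include <string_view>

// Hypothetical mirror of the two code generation fields mentioned above.
struct OptSettings {
  unsigned optimizationLevel;  // 0..3
  unsigned optimizeSize;       // 0 = none, 1 = -Os, 2 = -Oz
};

// Maps a flag spelling to the pair of settings; unknown spellings yield
// nullopt rather than a guessed value.
inline std::optional<OptSettings> settingsForFlag(std::string_view flag) {
  if (flag == "-O0") return OptSettings{0, 0};
  if (flag == "-O1") return OptSettings{1, 0};
  if (flag == "-O2") return OptSettings{2, 0};
  if (flag == "-O3") return OptSettings{3, 0};
  if (flag == "-Os") return OptSettings{2, 1};
  if (flag == "-Oz") return OptSettings{2, 2};
  return std::nullopt;
}
```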
Removes some NYIs, but leaves assert(false) in place due to missing tests. This looks
better since it is not as scary as NYI.
This PR adds support for base-to-derived and derived-to-base casts on
pointer-to-data-member values.

Related to llvm#973.
llvm#1194)

Basically, for int type, the order of Ops is not the same as OG in the
emitted LLVM IR. OG has constant as the second op position. See [OG's
order ](https://godbolt.org/z/584jrWeYn).
Default assignment operator generation was failing because of memcpy
generation for fields being unsupported. Implement it following
CodeGen's example, as usual. Follow-ups will avoid emitting memcpys for
fields of trivial class types, and extend this to copy constructors as
well.

Fixes llvm#1128
)

Our previous logic here was matching CodeGen, which folds trivial
assignment operator calls into memcpys, but we want to avoid that. Note
that we still end up emitting memcpys for arrays of classes with trivial
assignment operators; llvm#1177 tracks
fixing that.
CodeGen does so for trivial record types as well as non-record types; we
only do it for non-record types.
This is a leftover from when ClangIR was initially focused on analysis
and could ignore default method generation. We now handle default
methods and should generate them in all cases. This fixes several bugs:
- Default methods weren't emitted when emitting LLVM, only CIR.
- Default methods only referenced by other default methods weren't
  emitted.
Member

@bcardosolopes bcardosolopes left a comment

Thanks for continuing this work! I'd prefer if we start small, perhaps just scalars or whatever is smaller, I'd like to help you trim off unnecessary information we might not need, but the scope is still a bit bigger than necessary. Also please update the title to convey the incremental work happening.

def CIR_TBAAMemberAttr : CIR_Attr<"TBAAMember", "tbaa_member", []> {
let summary = "Attribute representing a member of a TBAA structured type.";

let parameters = (ins "TBAAAttr":$typeDesc,
Member

typeDesc -> type_desc, please fix all camelcase on tablegen params and related

//===----------------------------------------------------------------------===//

def CIR_TBAAMemberAttr : CIR_Attr<"TBAAMember", "tbaa_member", []> {
let summary = "Attribute representing a member of a TBAA structured type.";
Member

Add a description and an example. Same for the others

}

def CIR_TBAAScalarTypeDescriptorAttr : CIR_TBAATypeDescriptorAttr<"TBAAScalarTypeDescriptor",
"tbaa_scalar_type_desc"> {
Member

indentation feels odd here, same for some other ones

Collaborator Author

Are there any tools available to format .td files? For example, just as we can use clang-format to format .cpp files:

clang-format -i /path/to/cpp_file

Member

Maybe? Last I followed was https://discourse.llvm.org/t/formating-mlir-tablegen-code/60767, but I'm not sure there has been any development.

}

class CIR_TBAATypeDescriptorAttr<string name, string attrMnemonic>: CIR_Attr<name, attrMnemonic, [], "TBAAAttr"> {
let summary = "Base class for TBAA type descriptor attributes.";
Member

Why do you need this base class and what's the intent to compose on top of it? Same for CIR_TBAAAttr, can you elaborate?

Member

Still waiting for an answer for this since it affects my review of #1220

Collaborator Author

I apologize for overlooking this question.

I make a distinction between scalar types and struct types. The use of class CIR_TBAATypeDescriptorAttr serves to constrain both CIR_TBAAScalarTypeDescriptorAttr and CIR_TBAAStructTypeDescriptorAttr as TBAA type descriptor attributes. If CIR_TBAATypeDescriptorAttr is redundant, we can remove it.

The purpose of using CIR_TBAAAttr as a base class in C++ is to provide a unified way to handle both CIR_TBAAScalarTypeDescriptorAttr and CIR_TBAAStructTypeDescriptorAttr.

Member

I haven't seen them get used in the PR though, hence the question. If you are actually using them then fine, otherwise it is best to wait until they are used to introduce them. Let's move any discussion over to the actual patches that might get landed, thanks for the reply!


let parameters = (ins "TBAAAttr":$typeDesc,
"int64_t":$offset,
"int64_t":$size);
Member

Seems like offset and size is something that can be obtained by looking at the type.

General question for this PR: why do we need to re-encode the same information CIR already has? The requirement is to lower to identical LLVM TBAA information, that shouldn't mean we have to replicate all of it in CIR - how can we reuse more of what we already have?
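
As a loose C++ analogy for the point about offset and size (not CIR or TableGen code), both values for a member follow from the record type's layout alone, so they do not need to be stored separately. The names `Pair`, `MemberLayout`, and `layoutOfSecond` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

struct Pair {
  std::int32_t first;
  double second;
};

struct MemberLayout {
  std::size_t offset;
  std::size_t size;
};

// Both values are derived from the type itself; nothing is re-encoded.
constexpr MemberLayout layoutOfSecond() {
  return {offsetof(Pair, second), sizeof(Pair::second)};
}
```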

@PikachuHyA
Collaborator Author

> Thanks for continuing this work! I'd prefer if we start small, perhaps just scalars or whatever is smaller, I'd like to help you trim off unnecessary information we might not need, but the scope is still a bit bigger than necessary. Also please update the title to convey the incremental work happening.

Thank you for your suggestion. I prefer to submit the code in a new PR while keeping the current PR open until it's fully completed.

sent #1220
