[CIR] introduce cir.unreachable operation #447

Merged
merged 4 commits into from
Feb 14, 2024

@Lancern Lancern commented Feb 4, 2024

In #426 we confirmed that CIR needs a cir.unreachable operation to mark unreachable program points (discussion). This PR adds it.

@bcardosolopes bcardosolopes left a comment

This is awesome. Since you're adding support for this, let's make it complete while you're here. See the other mentions in:

  • clang/lib/CIR/CodeGen/CIRGenExprCXX.cpp:760
  • clang/lib/CIR/CodeGen/CIRGenFunction.cpp:589
  • Add CIRGen for __builtin_unreachable, which should map into this operation.

Lancern commented Feb 8, 2024

> Add CIRGen for __builtin_unreachable, which should map into this operation.

Added. The codegen slightly diverges from upstream Clang. The problem to resolve when emitting calls to noreturn functions like __builtin_unreachable is how to deal with any code that follows such a call:

void do_some_thing_that_may_never_happen(); // declared so the snippet compiles

void test() {
  __builtin_unreachable();

  // How should we deal with the following code?
  do_some_thing_that_may_never_happen();
}

We cannot continue emitting code after cir.unreachable or llvm.unreachable because they are block terminators. The solution is, after emitting cir.unreachable or llvm.unreachable, we immediately start a new block that does not have any predecessors (let's call it "the dangling block") and any code following will be emitted there.
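The dangling-block idea can be sketched with a toy builder. This is only an illustration under stated assumptions: `ToyBlock`, `ToyBuilder`, and `emitTestBody` are hypothetical names, operations are plain strings, and none of this is the MLIR or CIRGen API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; these are not MLIR or CIR classes.
struct ToyBlock {
  std::vector<std::string> ops;
  bool terminated = false;  // true once a terminator has been emitted
};

struct ToyBuilder {
  std::vector<ToyBlock> blocks;
  std::size_t current = 0;  // insertion point: always a valid block index

  ToyBuilder() { blocks.emplace_back(); }  // entry block

  void emit(const std::string &op) {
    assert(!blocks[current].terminated && "cannot emit after a terminator");
    blocks[current].ops.push_back(op);
  }

  // Emit the terminator, then move the insertion point to a fresh block
  // that nothing branches to (the "dangling block").
  void emitUnreachable() {
    emit("unreachable");
    blocks[current].terminated = true;
    blocks.emplace_back();
    current = blocks.size() - 1;
  }
};

// Mirrors the test() function above: a call after __builtin_unreachable.
inline ToyBuilder emitTestBody() {
  ToyBuilder b;
  b.emitUnreachable();
  b.emit("call do_some_thing_that_may_never_happen");
  return b;
}
```

In this sketch, `emitTestBody()` leaves the entry block holding only the terminator, while the trailing call lands in the second block, which has no predecessors.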

In upstream Clang, the codegen of statements is allowed to erase the current insertion point. Thus, after emitting the full statement that contains a call to __builtin_unreachable, the dangling block will be erased immediately and the code builder no longer has an insertion point.

However, in CIRGen, there are already too many assumptions in the code that the insertion point is always present. So we can't erase the dangling block until the function CIRGen is fully complete. In this PR, the removal of the dangling block is not implemented. I believe we should invent a DCE pass in the future to remove these dangling blocks (I guess there is already one in MLIR).
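A cleanup of this kind could look roughly like the following reachability sweep. It is a minimal sketch, assuming a toy CFG where block 0 is the entry block and successors are stored as indices; `CfgBlock` and `reachableBlocks` are hypothetical names, not an existing MLIR or ClangIR pass.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy CFG node for illustration only; not an MLIR class.
struct CfgBlock {
  std::string name;
  std::vector<std::size_t> succs;  // indices of successor blocks
};

// Returns the names of the blocks reachable from block 0 (the entry block),
// in their original order. Everything else is dead and could be erased.
inline std::vector<std::string>
reachableBlocks(const std::vector<CfgBlock> &cfg) {
  std::vector<bool> seen(cfg.size(), false);
  std::vector<std::size_t> worklist;
  if (!cfg.empty()) {
    seen[0] = true;
    worklist.push_back(0);
  }
  while (!worklist.empty()) {
    std::size_t b = worklist.back();
    worklist.pop_back();
    for (std::size_t s : cfg[b].succs) {
      assert(s < cfg.size() && "successor index out of range");
      if (!seen[s]) {
        seen[s] = true;
        worklist.push_back(s);
      }
    }
  }
  std::vector<std::string> kept;
  for (std::size_t i = 0; i < cfg.size(); ++i)
    if (seen[i])
      kept.push_back(cfg[i].name);
  return kept;
}
```

Walking from the entry block, rather than only checking for blocks with zero predecessors, also drops blocks that are reachable only from a dangling block.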

> clang/lib/CIR/CodeGen/CIRGenExprCXX.cpp:760

Updated.

> clang/lib/CIR/CodeGen/CIRGenFunction.cpp:589

This place handles control flow that implicitly returns from a function with a non-void return type. In such a circumstance we want to insert a cir.unreachable operation to terminate the implicitly-returning block. However, when the LexicalScope object that represents the whole function body is destroyed, it already inserts cir.return operations at every returning path. So I believe a better place for emitting cir.unreachable here is the LexicalScope::cleanup function. This change may get a little large and I'm still working on it.

@bcardosolopes

> We cannot continue emitting code after cir.unreachable or llvm.unreachable because they are block terminators. The solution is, after emitting cir.unreachable or llvm.unreachable, we immediately start a new block that does not have any predecessors (let's call it "the dangling block") and any code following will be emitted there.

That's right!

> In upstream Clang, the codegen of statements is allowed to erase the current insertion point. Thus, after emitting the full statement that contains a call to __builtin_unreachable, the dangling block will be erased immediately and the code builder no longer has an insertion point. However, in CIRGen, there are already too many assumptions in the code that the insertion point is always present.

We take a different approach here. Upstream codegen relies too much on insertion point availability to make decisions. Although it works, it feels hacky, because at any given point you don't really know why an insertion point isn't available: is it because of a bug? Is it trying to signal something? If so, signal what? Overall I find it very hard to rely on. It also becomes a bit cumbersome when dealing with structured control flow in CIRGen. So I took the opposite direction: insert guards as much as possible, so that we can assume at any point that we have a BB to insert operations into.

> So we can't erase the dangling block until the function CIRGen is fully complete.

This is expected: we want dangling blocks to stay around because in the future we want to emit "unreachable" warnings in Clang using CIR. Erasing them is an early optimization that upstream does, which we don't want in ClangIR. Some later cleanup pass should remove them before LLVM lowering, but CIRGen should emit them normally.

> In this PR, the removal of the dangling block is not implemented. I believe we should invent a DCE pass in the future to remove these dangling blocks (I guess there is already one in MLIR).

Yes, or alternatively apply it in MergeCleanups. Whatever that pass ends up being in the future, it should run after LoweringPrepare.

> > clang/lib/CIR/CodeGen/CIRGenExprCXX.cpp:760
>
> Updated.

Very nice, thanks for updating the testcase too.

> This place handles control flow that implicitly returns from a function with a non-void return type. In such a circumstance we want to insert a cir.unreachable operation to terminate the implicitly-returning block. However, when the LexicalScope object that represents the whole function body is destroyed, it already inserts cir.return operations at every returning path. So I believe a better place for emitting cir.unreachable here should be in the LexicalScope::cleanup function. This change may get a little large and I'm still working on it.

Fair, totally "follow-up PR" material. Let me track this in a new issue.

As a last request for this PR, can you then change the comment in clang/lib/CIR/CodeGen/CIRGenFunction.cpp:589 and add an assert(!UnimplementedFeature::unreachableOp()); there? This means that for now we should keep unreachableOp() around!
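For readers unfamiliar with this kind of guard, here is a minimal sketch of how such an assert can act as a searchable TODO marker. It assumes the guard is a static function that returns false while the feature is missing at the guarded site. The struct body, `ToyTerminator`, and `closeFunctionBody` are hypothetical and not copied from the ClangIR tree.

```cpp
#include <cassert>

// Hypothetical mirror of the guard for illustration; not copied from ClangIR.
struct UnimplementedFeature {
  // Assumed to return false while some sites still lack the feature.
  static bool unreachableOp() { return false; }
};

enum class ToyTerminator { Return, Unreachable };

// Toy version of the "implicit return" site. The non-void case should
// eventually emit an unreachable, but in this sketch it still emits a return.
inline ToyTerminator closeFunctionBody(bool returnsVoid) {
  if (returnsVoid)
    return ToyTerminator::Return;
  // Searchable marker: the assert passes today and flags this site as TODO.
  assert(!UnimplementedFeature::unreachableOp());
  return ToyTerminator::Return;
}
```

Under that assumption the assert never fires. Its value is that searching for `unreachableOp()` finds every site that still needs the follow-up work, which is why the guard has to stay until those sites are updated.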

Lancern commented Feb 14, 2024

Updated. Also rebased onto the latest main.

@bcardosolopes bcardosolopes left a comment

Awesome, LGTM

@bcardosolopes bcardosolopes merged commit 76bb766 into llvm:main Feb 14, 2024
6 checks passed
@Lancern Lancern deleted the unreachable branch February 18, 2024 14:59
lanza pushed a commit that referenced this pull request Mar 23, 2024
In #426 we confirmed that CIR needs a `cir.unreachable` operation to
mark unreachable program points
[(discussion)](#426 (comment)).
This PR adds it.