Skip to content

Revert "[SimplifyCFG] Deduce paths unreachable if they cause div/rem UB" #137741

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

bwendling
Copy link
Collaborator

@bwendling bwendling commented Apr 29, 2025

This reverts commit 3793264.

This silently generates incorrect code in the Linux kernel.

drivers/gpu/drm/amd/amdgpu/../display/dc/basics/fixpt31_32.o:
     warning: objtool: dc_fixpt_recip() falls through to next function dc_fixpt_sinc()

drivers/gpu/drm/amd/amdgpu/../display/dc/sspl/spl_fixpt31_32.o:
     warning: objtool: spl_fixpt_recip() falls through to next function spl_fixpt_sinc()

While UB isn't great, it's far better to allow the program to fail during a
divide or modulo operation rather than produce incorrect code due to the
cascading effect of removing the UB code.

For example:

  • Allow the code to signal because of a divide by zero, or
  • Fall through into code that launches the nukes.

Fixes: #137739

This reverts commit 3793264.

This silently generates incorrect code in the Linux kernel. While UB
isn't great, it's far better to allow the program to fail during a
divide or modulo operation rather than produce incorrect code due to the
cascading effect of removing the UB code.

For example:

  - Allow the code to signal because of a divide by zero, or
  - Fall through into code that launches the nukes.

Closes llvm#137739
@bwendling
Copy link
Collaborator Author

This references PR #109008

@llvmbot
Copy link
Member

llvmbot commented Apr 29, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Bill Wendling (bwendling)

Changes

This reverts commit 3793264.

This silently generates incorrect code in the Linux kernel. While UB isn't great, it's far better to allow the program to fail during a divide or modulo operation rather than produce incorrect code due to the cascading effect of removing the UB code.

For example:

  • Allow the code to signal because of a divide by zero, or
  • Fall through into code that launches the nukes.

Closes #137739


Full diff: https://github.com/llvm/llvm-project/pull/137741.diff

3 Files Affected:

  • (modified) llvm/lib/Transforms/Utils/SimplifyCFG.cpp (-10)
  • (modified) llvm/test/Transforms/SimplifyCFG/UnreachableEliminate.ll (+17-8)
  • (added) llvm/test/Transforms/SimplifyCFG/pr137739.ll (+67)
diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index 8094697cdd13a..2e9566e9b04fa 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -8123,13 +8123,6 @@ static bool passingValueIsAlwaysUndefined(Value *V, Instruction *I, bool PtrValu
       case Instruction::Call:
       case Instruction::CallBr:
       case Instruction::Invoke:
-      case Instruction::UDiv:
-      case Instruction::URem:
-        // Note: signed div/rem of INT_MIN / -1 is also immediate UB, not
-        // implemented to avoid code complexity as it is unclear how useful such
-        // logic is.
-      case Instruction::SDiv:
-      case Instruction::SRem:
         return true;
       }
     });
@@ -8222,9 +8215,6 @@ static bool passingValueIsAlwaysUndefined(Value *V, Instruction *I, bool PtrValu
           return true;
       }
     }
-    // Div/Rem by zero is immediate UB
-    if (match(User, m_BinOp(m_Value(), m_Specific(I))) && User->isIntDivRem())
-      return true;
   }
   return false;
 }
diff --git a/llvm/test/Transforms/SimplifyCFG/UnreachableEliminate.ll b/llvm/test/Transforms/SimplifyCFG/UnreachableEliminate.ll
index aae1ab032f36e..8d3b35bfb740a 100644
--- a/llvm/test/Transforms/SimplifyCFG/UnreachableEliminate.ll
+++ b/llvm/test/Transforms/SimplifyCFG/UnreachableEliminate.ll
@@ -925,15 +925,18 @@ define i8 @udiv_by_zero(i8 %x, i8 %i, i8 %v) {
 ; CHECK-LABEL: @udiv_by_zero(
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    switch i8 [[I:%.*]], label [[SW_DEFAULT:%.*]] [
+; CHECK-NEXT:      i8 0, label [[RETURN:%.*]]
+; CHECK-NEXT:      i8 2, label [[SW_BB1:%.*]]
 ; CHECK-NEXT:      i8 9, label [[SW_BB2:%.*]]
-; CHECK-NEXT:      i8 2, label [[RETURN:%.*]]
 ; CHECK-NEXT:    ]
+; CHECK:       sw.bb1:
+; CHECK-NEXT:    br label [[RETURN]]
 ; CHECK:       sw.bb2:
 ; CHECK-NEXT:    br label [[RETURN]]
 ; CHECK:       sw.default:
 ; CHECK-NEXT:    br label [[RETURN]]
 ; CHECK:       return:
-; CHECK-NEXT:    [[Y:%.*]] = phi i8 [ 9, [[SW_BB2]] ], [ [[V:%.*]], [[SW_DEFAULT]] ], [ 2, [[ENTRY:%.*]] ]
+; CHECK-NEXT:    [[Y:%.*]] = phi i8 [ 2, [[SW_BB1]] ], [ 9, [[SW_BB2]] ], [ [[V:%.*]], [[SW_DEFAULT]] ], [ 0, [[ENTRY:%.*]] ]
 ; CHECK-NEXT:    [[R:%.*]] = udiv i8 [[X:%.*]], [[Y]]
 ; CHECK-NEXT:    ret i8 [[R]]
 ;
@@ -973,9 +976,9 @@ define i8 @urem_by_zero(i8 %x, i8 %i, i8 %v) {
 ; CHECK:       sw.bb2:
 ; CHECK-NEXT:    br label [[RETURN]]
 ; CHECK:       sw.default:
-; CHECK-NEXT:    unreachable
+; CHECK-NEXT:    br label [[RETURN]]
 ; CHECK:       return:
-; CHECK-NEXT:    [[Y:%.*]] = phi i8 [ 2, [[SW_BB1]] ], [ 9, [[SW_BB2]] ], [ [[V:%.*]], [[ENTRY:%.*]] ]
+; CHECK-NEXT:    [[Y:%.*]] = phi i8 [ 2, [[SW_BB1]] ], [ 9, [[SW_BB2]] ], [ 0, [[SW_DEFAULT]] ], [ [[V:%.*]], [[ENTRY:%.*]] ]
 ; CHECK-NEXT:    [[R:%.*]] = urem i8 [[X:%.*]], [[Y]]
 ; CHECK-NEXT:    ret i8 [[R]]
 ;
@@ -1051,10 +1054,13 @@ define i8 @srem_by_zero(i8 %x, i8 %i) {
 ; CHECK-NEXT:    br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
 ; CHECK:       if.then:
 ; CHECK-NEXT:    call void @side.effect()
-; CHECK-NEXT:    unreachable
+; CHECK-NEXT:    br label [[IF_END:%.*]]
 ; CHECK:       if.else:
 ; CHECK-NEXT:    [[V:%.*]] = call i8 @get.i8()
-; CHECK-NEXT:    [[R:%.*]] = srem i8 [[X:%.*]], [[V]]
+; CHECK-NEXT:    br label [[IF_END]]
+; CHECK:       if.end:
+; CHECK-NEXT:    [[Y:%.*]] = phi i8 [ 0, [[IF_THEN]] ], [ [[V]], [[IF_ELSE]] ]
+; CHECK-NEXT:    [[R:%.*]] = srem i8 [[X:%.*]], [[Y]]
 ; CHECK-NEXT:    ret i8 [[R]]
 ;
 entry:
@@ -1156,16 +1162,19 @@ define i8 @sdiv_overflow_ub_2x(i8 %i) {
 ; CHECK-LABEL: @sdiv_overflow_ub_2x(
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    switch i8 [[I:%.*]], label [[SW_DEFAULT:%.*]] [
-; CHECK-NEXT:      i8 9, label [[RETURN:%.*]]
+; CHECK-NEXT:      i8 0, label [[RETURN:%.*]]
 ; CHECK-NEXT:      i8 2, label [[SW_BB1:%.*]]
+; CHECK-NEXT:      i8 9, label [[SW_BB2:%.*]]
 ; CHECK-NEXT:    ]
 ; CHECK:       sw.bb1:
 ; CHECK-NEXT:    [[V:%.*]] = call i8 @get.i8()
 ; CHECK-NEXT:    br label [[RETURN]]
+; CHECK:       sw.bb2:
+; CHECK-NEXT:    br label [[RETURN]]
 ; CHECK:       sw.default:
 ; CHECK-NEXT:    unreachable
 ; CHECK:       return:
-; CHECK-NEXT:    [[Y:%.*]] = phi i8 [ [[V]], [[SW_BB1]] ], [ -1, [[ENTRY:%.*]] ]
+; CHECK-NEXT:    [[Y:%.*]] = phi i8 [ [[V]], [[SW_BB1]] ], [ -1, [[SW_BB2]] ], [ 0, [[ENTRY:%.*]] ]
 ; CHECK-NEXT:    [[R:%.*]] = sdiv i8 -128, [[Y]]
 ; CHECK-NEXT:    ret i8 [[R]]
 ;
diff --git a/llvm/test/Transforms/SimplifyCFG/pr137739.ll b/llvm/test/Transforms/SimplifyCFG/pr137739.ll
new file mode 100644
index 0000000000000..d675656124233
--- /dev/null
+++ b/llvm/test/Transforms/SimplifyCFG/pr137739.ll
@@ -0,0 +1,67 @@
+; RUN: opt < %s -passes=simplifycfg -S | FileCheck %s
+
+; CHECK-LABEL: @test(
+; CHECK:       call void asm sideeffect
+; CHECK-NEXT:  call void asm sideeffect
+; CHECK-NOT:   unreachable
+; CHECK-NEXT:  br label
+define dso_local i64 @test(i64 %arg.coerce) local_unnamed_addr #0 align 16 {
+entry:
+  %tobool.not = icmp eq i64 %arg.coerce, 0
+  br i1 %tobool.not, label %if.then.i.i, label %if.end
+
+if.end:                                           ; preds = %entry
+  %cond9.i = call i64 @llvm.abs.i64(i64 %arg.coerce, i1 false)
+  br label %bar.1.exit.i
+
+if.then.i.i:                                      ; preds = %entry
+  call void asm sideeffect "# test", "i,r,i,~{dirflag},~{fpsr},~{flags}"(i32 199, i32 2305, i64 12)
+  call void asm sideeffect "# bar.1", "i,r,i,~{dirflag},~{fpsr},~{flags}"(i32 32, i32 2305, i64 12)
+  br label %bar.1.exit.i
+
+bar.1.exit.i:         ; preds = %if.then.i.i, %if.end
+  %cond9.i4 = phi i64 [ 0, %if.then.i.i ], [ %cond9.i, %if.end ]
+  %rem.i.i.i.i = urem i64 4294967296, %cond9.i4
+  %div.i.i.i.i = udiv i64 4294967296, %cond9.i4
+  br label %do.body.i
+
+do.body.i:                                        ; preds = %do.body.i, %bar.1.exit.i
+  %remainder.0.i = phi i64 [ %rem.i.i.i.i, %bar.1.exit.i ], [ %storemerge.i, %do.body.i ]
+  %i.0.i = phi i32 [ 32, %bar.1.exit.i ], [ %dec.i, %do.body.i ]
+  %res_value.0.i = phi i64 [ %div.i.i.i.i, %bar.1.exit.i ], [ %res_value.1.i, %do.body.i ]
+  %shl.i = shl i64 %remainder.0.i, 1
+  %shl25.i = shl i64 %res_value.0.i, 1
+  %cmp26.not.i = icmp uge i64 %shl.i, %cond9.i4
+  %sub29.i = select i1 %cmp26.not.i, i64 %cond9.i4, i64 0
+  %storemerge.i = sub nuw i64 %shl.i, %sub29.i
+  %or.i = zext i1 %cmp26.not.i to i64
+  %res_value.1.i = or disjoint i64 %shl25.i, %or.i
+  %dec.i = add nsw i32 %i.0.i, -1
+  %cmp31.not.i = icmp eq i32 %dec.i, 0
+  br i1 %cmp31.not.i, label %do.end.i, label %do.body.i
+
+do.end.i:                                         ; preds = %do.body.i
+  %shl33.i = shl i64 %storemerge.i, 1
+  %cmp34.i = icmp uge i64 %shl33.i, %cond9.i4
+  %sub38.i = select i1 %cmp34.i, i64 9223372036854775806, i64 9223372036854775807
+  %cmp39.not.i = icmp ugt i64 %res_value.1.i, %sub38.i
+  br i1 %cmp39.not.i, label %if.then55.i, label %bar.2.exit
+
+if.then55.i:                                      ; preds = %do.end.i
+  call void asm sideeffect "# bar.2", "i,r,i,~{dirflag},~{fpsr},~{flags}"(i32 88, i32 2305, i64 12)
+  br label %bar.2.exit
+
+bar.2.exit:                     ; preds = %if.then55.i, %do.end.i
+  %conv36.i = zext i1 %cmp34.i to i64
+  %add.i = add i64 %res_value.1.i, %conv36.i
+  %numerator.lobit20.i = xor i64 %arg.coerce, 4294967296
+  %sub75.i = sub i64 0, %add.i
+  %tobool72.not22.i = icmp slt i64 %numerator.lobit20.i, 0
+  %retval.sroa.0.0.i = select i1 %tobool72.not22.i, i64 %sub75.i, i64 %add.i
+  ret i64 %retval.sroa.0.0.i
+}
+
+; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
+declare i64 @llvm.abs.i64(i64, i1 immarg)
+
+attributes #0 = { fn_ret_thunk_extern noredzone nounwind null_pointer_is_valid sanitize_address sspstrong "min-legal-vector-width"="0" "no-builtin-wcslen" "no-jump-tables"="true" "no-trapping-math"="true" "patchable-function-entry"="0" "patchable-function-prefix"="16" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+retpoline-external-thunk,+retpoline-indirect-branches,+retpoline-indirect-calls,-aes,-avx,-avx10.1-256,-avx10.1-512,-avx2,-avx512bf16,-avx512bitalg,-avx512bw,-avx512cd,-avx512dq,-avx512f,-avx512fp16,-avx512ifma,-avx512vbmi,-avx512vbmi2,-avx512vl,-avx512vnni,-avx512vp2intersect,-avx512vpopcntdq,-avxifma,-avxneconvert,-avxvnni,-avxvnniint16,-avxvnniint8,-f16c,-fma,-fma4,-gfni,-kl,-mmx,-pclmul,-sha,-sha512,-sm3,-sm4,-sse,-sse2,-sse3,-sse4.1,-sse4.2,-sse4a,-ssse3,-vaes,-vpclmulqdq,-widekl,-x87,-xop" "tune-cpu"="generic" }

Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While UB isn't great, it's far better to allow the program to fail during a
divide or modulo operation rather than produce incorrect code due to the
cascading effect of removing the UB code.

Division by zero is not guaranteed to raise an exception on some targets (e.g., RISC-V).
If you want to trap on this case, please insert a check before the div/rem instruction.

As this assumption is widely used by the LLVM middle-end (e.g., ValueTracking/InstCombine), only reverting SimplifyCFG changes doesn't help :( It may break again in the future.

@pinskia
Copy link

pinskia commented Apr 29, 2025

I should note GCC's path isolation pass only ever inserts __builtin_trap and never __builtin_unreachable.
I was looking to improve GCC's path isolation for shifts with out of bound shifters last year and asked for some thoughts on that. https://inbox.sourceware.org/gcc/CA+=Sn1m+2inAtFXMMz1M2VsJ178hDJiwsCJdVrp8wRHf7jYRuw@mail.gmail.com/
is the thread about it.

@pinskia
Copy link

pinskia commented Apr 29, 2025

Another note since we were just talking about __builtin_unreachable today on the GCC list, We were talking about adding traps to places where fall through to another function.
From https://gcc.gnu.org/pipermail/gcc-patches/2025-April/681973.html:

From a QOI perspective we'd really want to turn any unreachable() that
falls off the function
into a trap (also consider hot/cold partitioning here).  ISTR we
talked about all this in detail
somewhen back, but I don't remember the issues with doing this after BB reorder.

@bwendling
Copy link
Collaborator Author

The issue comes about exactly because there's a check for zero before the calculation. LLVM uses the llvm.expect intrinsic to determine if the calculation is UB. The matter is complicated by the fact that the calculation results from a whole bunch of inlining. Whether we generate a __builtin_trap or simply allow the calculation to happen is fine by me, but what we shouldn't do is generate broken code simply because it's UB. That's just a cop-out for the compiler.

As for other places, yes there may be other places where this occurs, but this patch is the one causing the immediate issue.

Perhaps an even better solution is to generate a warning, but no code change, when this situation occurs.

@bwendling
Copy link
Collaborator Author

Another note since we were just talking about __builtin_unreachable today on the GCC list, We were talking about adding traps to places where fall through to another function. From https://gcc.gnu.org/pipermail/gcc-patches/2025-April/681973.html:

From a QOI perspective we'd really want to turn any unreachable() that
falls off the function
into a trap (also consider hot/cold partitioning here).  ISTR we
talked about all this in detail
somewhen back, but I don't remember the issues with doing this after BB reorder.

It would definitely help from a security standpoint. Are there any legitimate reasons to fall through a function into another function? (I suspect not, but ... well ... C is C.)

@dtcxzyw
Copy link
Member

dtcxzyw commented Apr 29, 2025

Whether we generate a __builtin_trap or simply allow the calculation to happen is fine by me, but what we shouldn't do is generate broken code simply because it's UB.

Is --trap-unreachable helpful for your case? TBH it is still unreliable since the optimizer may remove the whole BB and turn a conditional branch into an unconditional one.

@bwendling
Copy link
Collaborator Author

Whether we generate a __builtin_trap or simply allow the calculation to happen is fine by me, but what we shouldn't do is generate broken code simply because it's UB.

Is --trap-unreachable helpful for your case? TBH it is still unreliable since the optimizer may remove the whole BB and turn a conditional branch into an unconditional one.

It helps some, but causes other issues. From the Linux objtool:

panic() missing __noreturn in .c/.h or NORETURN() in noreturns.h

@nikic
Copy link
Contributor

nikic commented Apr 29, 2025

Not sure what exactly that error is supposed to tell us, but maybe the -trap-unreachable -no-trap-after-noreturn variant would work? That's the configuration that Rust uses by default. Enabling this for Clang as well comes up regularly as well (basically whenever a new person discovers function fallthrough), but nobody has bothered actually implementing it yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

UB optimization leads to really bad code gen
5 participants