[InstCombine] Combine or-disjoint (and->mul), (and->mul) to and->mul #136013

jrbyrnes · 2025-04-16T19:38:32Z

The canonical pattern for bitmasked mul is currently

%val = and %x, %bitMask // where %bitMask is some constant
%cmp = icmp eq %val, 0
%sel = select %cmp, 0, %C // where %C is some constant = C' * %bitMask

In certain cases, where we are combining multiple of these bitmasked muls with common factors, we are able to optimize into and->mul (see #135274).

This optimization lends itself to further optimizations. This PR addresses one such optimization.

In cases where we have

or-disjoint ( mul(and (X, C1), D) , mul (and (X, C2), D))

we can combine into

mul( and (X, (C1 + C2)), D)

provided C1 and C2 are disjoint.

Generalized proof: https://alive2.llvm.org/ce/z/MQYMui
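
As a quick sanity check of the identity (separate from the Alive2 proof), here is a minimal C++ sketch that brute-forces it over 8-bit values. The helper names mergedMask and foldHolds are hypothetical, and 8-bit wrapping arithmetic stands in for LLVM's integer types. Inputs where the two products share a bit are skipped, because or disjoint would be poison there.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the mask the fold would produce. For disjoint masks,
// C1 + C2 cannot carry, so it equals C1 | C2.
constexpr uint8_t mergedMask(uint8_t C1, uint8_t C2) {
  assert((C1 & C2) == 0 && "masks must be disjoint");
  return static_cast<uint8_t>(C1 + C2);
}

// True if ((X & C1) * D) | ((X & C2) * D) == (X & (C1 + C2)) * D for every
// 8-bit X on which the two operands of the or are actually disjoint.
constexpr bool foldHolds(uint8_t C1, uint8_t C2, uint8_t D) {
  for (unsigned X = 0; X < 256; ++X) {
    const uint8_t L = static_cast<uint8_t>((X & C1) * D);
    const uint8_t R = static_cast<uint8_t>((X & C2) * D);
    if (L & R)
      continue; // `or disjoint` would be poison here, so nothing to check.
    const uint8_t Before = static_cast<uint8_t>(L | R);
    const uint8_t After = static_cast<uint8_t>((X & (C1 + C2)) * D);
    if (Before != After)
      return false;
  }
  return true;
}

static_assert(mergedMask(2, 4) == 6, "masks 2 and 4 merge into 6");
static_assert(foldHolds(2, 4, 72), "disjoint masks, common factor");
static_assert(foldHolds(2, 8, 72), "disjoint masks, common factor");
static_assert(!foldHolds(6, 2, 1), "overlapping masks break the fold");
```

The last static_assert shows why the disjointness requirement on C1 and C2 matters: with masks 6 and 2, the input X = 4 already produces different results before and after.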

llvmbot · 2025-04-16T19:39:06Z

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Jeffrey Byrnes (jrbyrnes)

Changes

The canonical pattern for bitmasked mul is currently

%val = and %x, %bitMask // where %bitMask is some constant
%cmp = icmp eq %val, 0
%sel = select %cmp, 0, %C // where %C is some constant

In certain cases, where we are combining multiple of these bitmasked muls with common factors, we are able to optimize into and->mul (see #135274).

This optimization lends itself to further optimizations. This PR addresses one such optimization.

In cases where we have

or-disjoint ( mul(and (X, C1), D) , mul (and (X, C2), D))

we can combine into

mul( and (X, (C1 + C2)), D)

provided C1 and C2 are disjoint.

Generalized proof: https://alive2.llvm.org/ce/z/MQYMui

Full diff: https://github.com/llvm/llvm-project/pull/136013.diff

2 Files Affected:

(modified) llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp (+21)
(modified) llvm/test/Transforms/InstCombine/or.ll (+83-4)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
index 6cc241781d112..206131ab4a6a7 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
@@ -3643,6 +3643,27 @@ Instruction *InstCombinerImpl::visitOr(BinaryOperator &I) {
             foldAddLikeCommutative(I.getOperand(1), I.getOperand(0),
                                    /*NSW=*/true, /*NUW=*/true))
       return R;
+
+    Value *LHSOp = nullptr, *RHSOp = nullptr;
+    const APInt *LHSConst = nullptr, *RHSConst = nullptr;
+
+    // ((X & C1) * D) + ((X & C2) * D) -> ((X & (C1 + C2)) * D)
+    if (match(I.getOperand(0), m_Mul(m_Value(LHSOp), m_APInt(LHSConst))) &&
+        match(I.getOperand(1), m_Mul(m_Value(RHSOp), m_APInt(RHSConst))) &&
+        LHSConst == RHSConst) {
+      Value *LHSBase = nullptr, *RHSBase = nullptr;
+      const APInt *LHSMask = nullptr, *RHSMask = nullptr;
+      if (match(LHSOp, m_And(m_Value(LHSBase), m_APInt(LHSMask))) &&
+          match(RHSOp, m_And(m_Value(RHSBase), m_APInt(RHSMask))) &&
+          LHSBase == RHSBase &&
+          ((*LHSMask & *RHSMask) == APInt::getZero(LHSMask->getBitWidth()))) {
+        auto NewAnd = Builder.CreateAnd(
+            LHSBase, ConstantInt::get(LHSOp->getType(), (*LHSMask + *RHSMask)));
+
+        return BinaryOperator::CreateMul(
+            NewAnd, ConstantInt::get(NewAnd->getType(), *LHSConst));
+      }
+    }
   }
 
   Value *X, *Y;
diff --git a/llvm/test/Transforms/InstCombine/or.ll b/llvm/test/Transforms/InstCombine/or.ll
index 95f89e4ce11cd..777387cc662d6 100644
--- a/llvm/test/Transforms/InstCombine/or.ll
+++ b/llvm/test/Transforms/InstCombine/or.ll
@@ -1281,10 +1281,10 @@ define <16 x i1> @test51(<16 x i1> %arg, <16 x i1> %arg1) {
 ; CHECK-NEXT:    [[TMP3:%.*]] = shufflevector <16 x i1> [[ARG:%.*]], <16 x i1> [[ARG1:%.*]], <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 20, i32 5, i32 6, i32 23, i32 24, i32 9, i32 10, i32 27, i32 28, i32 29, i32 30, i32 31>
 ; CHECK-NEXT:    ret <16 x i1> [[TMP3]]
 ;
-  %tmp = and <16 x i1> %arg, <i1 true, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 false, i1 false, i1 true, i1 true, i1 false, i1 false, i1 false, i1 false, i1 false>
-  %tmp2 = and <16 x i1> %arg1, <i1 false, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 true, i1 true, i1 false, i1 false, i1 true, i1 true, i1 true, i1 true, i1 true>
-  %tmp3 = or <16 x i1> %tmp, %tmp2
-  ret <16 x i1> %tmp3
+  %out = and <16 x i1> %arg, <i1 true, i1 true, i1 true, i1 true, i1 false, i1 true, i1 true, i1 false, i1 false, i1 true, i1 true, i1 false, i1 false, i1 false, i1 false, i1 false>
+  %out2 = and <16 x i1> %arg1, <i1 false, i1 false, i1 false, i1 false, i1 true, i1 false, i1 false, i1 true, i1 true, i1 false, i1 false, i1 true, i1 true, i1 true, i1 true, i1 true>
+  %out3 = or <16 x i1> %out, %out2
+  ret <16 x i1> %out3
 }
 
 ; This would infinite loop because it reaches a transform
@@ -2035,3 +2035,82 @@ define i32 @or_xor_and_commuted3(i32 %x, i32 %y, i32 %z) {
   %or1 = or i32 %xor, %yy
   ret i32 %or1
 }
+
+define i32 @or_combine_mul_and1(i32 %in) {
+; CHECK-LABEL: @or_combine_mul_and1(
+; CHECK-NEXT:    [[TMP1:%.*]] = and i32 [[IN:%.*]], 6
+; CHECK-NEXT:    [[OUT:%.*]] = mul nuw nsw i32 [[TMP1]], 72
+; CHECK-NEXT:    ret i32 [[OUT]]
+;
+  %bitop0 = and i32 %in, 2
+  %out0 = mul i32 %bitop0, 72
+  %bitop1 = and i32 %in, 4
+  %out1 = mul i32 %bitop1, 72
+  %out = or disjoint i32 %out0, %out1
+  ret i32 %out
+}
+
+define i32 @or_combine_mul_and2(i32 %in) {
+; CHECK-LABEL: @or_combine_mul_and2(
+; CHECK-NEXT:    [[TMP1:%.*]] = and i32 [[IN:%.*]], 10
+; CHECK-NEXT:    [[OUT:%.*]] = mul nuw nsw i32 [[TMP1]], 72
+; CHECK-NEXT:    ret i32 [[OUT]]
+;
+  %bitop0 = and i32 %in, 2
+  %out0 = mul i32 %bitop0, 72
+  %bitop1 = and i32 %in, 8
+  %out1 = mul i32 %bitop1, 72
+  %out = or disjoint i32 %out0, %out1
+  ret i32 %out
+}
+
+define i32 @or_combine_mul_and_diff_factor(i32 %in) {
+; CHECK-LABEL: @or_combine_mul_and_diff_factor(
+; CHECK-NEXT:    [[BITOP0:%.*]] = and i32 [[IN:%.*]], 2
+; CHECK-NEXT:    [[TMP0:%.*]] = mul nuw nsw i32 [[BITOP0]], 36
+; CHECK-NEXT:    [[BITOP1:%.*]] = and i32 [[IN]], 4
+; CHECK-NEXT:    [[TMP1:%.*]] = mul nuw nsw i32 [[BITOP1]], 72
+; CHECK-NEXT:    [[OUT:%.*]] = or disjoint i32 [[TMP0]], [[TMP1]]
+; CHECK-NEXT:    ret i32 [[OUT]]
+;
+  %bitop0 = and i32 %in, 2
+  %out0 = mul i32 %bitop0, 36
+  %bitop1 = and i32 %in, 4
+  %out1 = mul i32 %bitop1, 72
+  %out = or disjoint i32 %out0, %out1
+  ret i32 %out
+}
+
+define i32 @or_combine_mul_and_diff_base(i32 %in, i32 %in1) {
+; CHECK-LABEL: @or_combine_mul_and_diff_base(
+; CHECK-NEXT:    [[BITOP0:%.*]] = and i32 [[IN:%.*]], 2
+; CHECK-NEXT:    [[TMP0:%.*]] = mul nuw nsw i32 [[BITOP0]], 72
+; CHECK-NEXT:    [[BITOP1:%.*]] = and i32 [[IN1:%.*]], 4
+; CHECK-NEXT:    [[TMP1:%.*]] = mul nuw nsw i32 [[BITOP1]], 72
+; CHECK-NEXT:    [[OUT:%.*]] = or disjoint i32 [[TMP0]], [[TMP1]]
+; CHECK-NEXT:    ret i32 [[OUT]]
+;
+  %bitop0 = and i32 %in, 2
+  %out0 = mul i32 %bitop0, 72
+  %bitop1 = and i32 %in1, 4
+  %out1 = mul i32 %bitop1, 72
+  %out = or disjoint i32 %out0, %out1
+  ret i32 %out
+}
+
+define i32 @or_combine_mul_and_decomposed(i32 %in) {
+; CHECK-LABEL: @or_combine_mul_and_decomposed(
+; CHECK-NEXT:    [[TMP2:%.*]] = trunc i32 [[IN:%.*]] to i1
+; CHECK-NEXT:    [[OUT0:%.*]] = select i1 [[TMP2]], i32 72, i32 0
+; CHECK-NEXT:    [[TMP1:%.*]] = and i32 [[IN]], 4
+; CHECK-NEXT:    [[OUT:%.*]] = mul nuw nsw i32 [[TMP1]], 72
+; CHECK-NEXT:    [[OUT1:%.*]] = or disjoint i32 [[OUT0]], [[OUT]]
+; CHECK-NEXT:    ret i32 [[OUT1]]
+;
+  %bitop0 = and i32 %in, 1
+  %out0 = mul i32 %bitop0, 72
+  %bitop1 = and i32 %in, 4
+  %out1 = mul i32 %bitop1, 72
+  %out = or disjoint i32 %out0, %out1
+  ret i32 %out
+}

jrbyrnes · 2025-04-16T19:39:33Z

I plan to have 3 or so extensions on #135274 -- the plan is to resolve merge conflicts by extracting this code into a separate function.

jrbyrnes · 2025-04-16T22:38:08Z

Converted to draft / planned changes: on second thought, it makes more sense to integrate this with #135274

jrbyrnes · 2025-04-17T20:29:18Z

Extend #135274 to match the new and->mul sequences, make the merging of PRs explicit.

This PR is meant for commits starting at c74714c

jrbyrnes · 2025-04-28T17:18:06Z

force-push to rebase for #136367

jrbyrnes · 2025-05-15T20:08:01Z

Ping

jrbyrnes · 2025-05-27T18:07:34Z

We may have cases where the select APInts have different bitwidths than the mask. These cases were unhandled and caused assertion failures.

We can likely improve matchBitmaskMul to handle this type of case; however, since this isn't the base case, I will leave it as a possible extension.

dtcxzyw · 2025-05-30T13:30:30Z

@jrbyrnes Does this pattern exist in real-world applications? Isn't your motivating case (#133139 (comment)) solved by #135274?

jrbyrnes · 2025-05-30T14:32:56Z

@dtcxzyw The simplified example in #133139 (comment) is solved by #135274. However, it is very simplified and does not cover all the cases in my real-world application. I need to implement 2 or 3 extensions to that PR (including this one) in order to resolve my issue.

jrbyrnes · 2025-06-02T18:22:18Z

Force-push to rebase for 2112379

jrbyrnes · 2025-06-04T17:19:39Z

Ping -- this is the second in a chain of 4 PRs that are needed to solve an urgent issue. The solution has already been stalled for some time.

nikic · 2025-06-05T13:34:31Z

Could you please remind me what the larger context for these patches was?

jrbyrnes · 2025-06-05T14:15:01Z

> Could you please remind me what the larger context for these patches was?

AI workloads are bringing in a new feature called linear layout: https://arxiv.org/html/2505.23819v1

The effect of this feature is to rework address calculations so that they use xor and and operations in places where add and mul would be more typical. Important clients of this feature (e.g. Triton) are MLIR-based and produce IR directly instead of going through clang.

The problem is that separateConstOffsetFromGEP doesn't extract constant offsets from xor. Moreover, since many of these xors are equivalent to or disjoint, it would be awesome to convert them to or disjoint. Doing so provides a very significant performance uplift, as it significantly reduces register pressure (RP) and avoids spilling in some cases. However, most of these address computation chains are longer than the knownBits recursion depth limit.

The test in #137721 has a reduced example of this, and I've also included some IR in https://discourse.llvm.org/t/rfc-computeknownbits-recursion-depth/85962 . The problem is more general, and there are different common variants of address formulations that aren't included in these examples, but this gives a basic idea.

I think that from a solution perspective, it would be best to change the way the recursion depth works so that we always do these conversions. However, I realize there may be some concerns with that approach, so I'm also working on an approach that optimizes the address calculation chains so that they are compatible with the recursion depth.

That is where this stack of instcombine patches comes in: they clean up the intermediate code in the address calculations so we can convert these xor to or-disjoint within the depth limit. This solution is a bit less stable, but it will resolve the current performance issues that occur when adopting this technology.
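
The equivalence this comment relies on, that xor matches or disjoint (and add) whenever the operands share no set bits, can be sketched in C++. The address layout here (a 64-element row stride and the names addrXor and addrAdd) is a hypothetical illustration, not taken from Triton or from the IR in the linked threads.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: rows are 64 elements apart, so Row << 6 has its low
// six bits clear and can never overlap a column index below 64.
constexpr uint32_t addrXor(uint32_t Row, uint32_t Col) {
  assert(Col < 64 && "column must fit in the low six bits");
  return (Row << 6) ^ Col;
}

// The same address written with add instead of xor.
constexpr uint32_t addrAdd(uint32_t Row, uint32_t Col) {
  assert(Col < 64 && "column must fit in the low six bits");
  return (Row << 6) + Col;
}

// With no common bits, xor, or, and add all produce the same value.
static_assert(addrXor(5, 17) == addrAdd(5, 17), "xor acts as add");
static_assert(addrXor(5, 17) == ((5u << 6) | 17u), "xor acts as or");
// Once the operands overlap, xor and add diverge.
static_assert((0x3u ^ 0x1u) != (0x3u + 0x1u), "overlap breaks the equivalence");
```

Proving that the operands share no bits is the part that needs known-bits analysis, which is why the recursion depth limit matters for long address chains.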

nikic

LGTM, with the note that I find it odd that this transform is rooted at or disjoint in particular. It seems like really the more functional pattern here is the one with add, and or disjoint is only relevant insofar as it is equivalent to an add. So I'd expect this handling to be part of foldAddLike(), which is shared by both. (This can be a followup though.)

nikic · 2025-06-11T07:57:14Z

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp

@@ -3593,6 +3593,73 @@ static Value *foldOrOfInversions(BinaryOperator &I,
  return nullptr;
 }

+// A decomposition of ((X & Mask) ? 0 : Mask * Factor). The NUW / NSW bools
+// track these properties for preservation. Note that we can decompose
+// equivalent forms of this expression (e.g. ((X & Mask) * Factor))


FWIW, I think it would make more sense to mention (X & Mask) * Factor first, because this is actually the basic form of the pattern -- the select form is the special case that's equivalent but not canonicalized due to unclear individual profitability.
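
To make the relationship between the two forms concrete, here is a small C++ sketch comparing the select form from the PR description with the (X & Mask) * Factor form. It assumes a single-bit mask, which the thread does not state explicitly; isSingleBit, selectForm, mulForm, formsAgree, and rewriteAsMul are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// True when exactly one bit of Mask is set.
constexpr bool isSingleBit(uint32_t Mask) {
  return Mask != 0 && (Mask & (Mask - 1)) == 0;
}

// Select form from the PR description: (X & Mask) == 0 ? 0 : Mask * Factor.
constexpr uint32_t selectForm(uint32_t X, uint32_t Mask, uint32_t Factor) {
  return (X & Mask) == 0 ? 0u : Mask * Factor;
}

// Basic form: (X & Mask) * Factor.
constexpr uint32_t mulForm(uint32_t X, uint32_t Mask, uint32_t Factor) {
  return (X & Mask) * Factor;
}

// Compares both forms for every X in [0, 256).
constexpr bool formsAgree(uint32_t Mask, uint32_t Factor) {
  for (uint32_t X = 0; X < 256; ++X)
    if (selectForm(X, Mask, Factor) != mulForm(X, Mask, Factor))
      return false;
  return true;
}

// Rewrites the select form as a mul; the assert records this sketch's
// single-bit assumption.
constexpr uint32_t rewriteAsMul(uint32_t X, uint32_t Mask, uint32_t Factor) {
  assert(isSingleBit(Mask) && "sketch assumes a single-bit mask");
  return mulForm(X, Mask, Factor);
}

static_assert(formsAgree(4, 72), "single-bit mask: the forms are equivalent");
static_assert(!formsAgree(6, 72), "multi-bit mask: X & Mask can differ from Mask");
static_assert(rewriteAsMul(5, 4, 72) == selectForm(5, 4, 72), "");
```

With a multi-bit mask such as 6, an input like X = 2 selects Mask * Factor in the select form but multiplies only the surviving bit in the mul form, so the two diverge.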

llvm-ci · 2025-06-12T01:12:20Z

LLVM Buildbot has detected a new failure on builder lldb-x86_64-debian running on lldb-x86_64-debian while building llvm at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/162/builds/24355

Here is the relevant piece of the build log for the reference

Step 6 (test) failure: build (failure)
...
UNSUPPORTED: lldb-shell :: ScriptInterpreter/Lua/nested_sessions.test (2996 of 3007)
UNSUPPORTED: lldb-shell :: Unwind/windows-unaligned-x86_64.test (2997 of 3007)
UNSUPPORTED: lldb-shell :: Process/Windows/process_load.cpp (2998 of 3007)
UNSUPPORTED: lldb-shell :: ScriptInterpreter/Python/Crashlog/parser_text.test (2999 of 3007)
UNSUPPORTED: lldb-shell :: ScriptInterpreter/Python/Crashlog/interactive_crashlog_legacy.test (3000 of 3007)
UNSUPPORTED: lldb-shell :: ScriptInterpreter/Lua/breakpoint_callback.test (3001 of 3007)
UNSUPPORTED: lldb-shell :: ScriptInterpreter/Lua/lua-python.test (3002 of 3007)
UNSUPPORTED: lldb-shell :: SymbolFile/PDB/variables-locations.test (3003 of 3007)
PASS: lldb-api :: terminal/TestEditlineCompletions.py (3004 of 3007)
UNRESOLVED: lldb-api :: tools/lldb-dap/launch/TestDAP_launch.py (3005 of 3007)
******************** TEST 'lldb-api :: tools/lldb-dap/launch/TestDAP_launch.py' FAILED ********************
Script:
--
/usr/bin/python3 /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/./lib --env LLVM_INCLUDE_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/include --env LLVM_TOOLS_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/./bin --arch x86_64 --build-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex --lldb-module-cache-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/lldb --compiler /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/clang --dsymutil /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/dsymutil --make /usr/bin/gmake --llvm-tools-dir /home/worker/2.0.1/lldb-x86_64-debian/build/./bin --lldb-obj-root /home/worker/2.0.1/lldb-x86_64-debian/build/tools/lldb --lldb-libs-dir /home/worker/2.0.1/lldb-x86_64-debian/build/./lib --cmake-build-type Release -t /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/tools/lldb-dap/launch -p TestDAP_launch.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 21.0.0git (https://github.com/llvm/llvm-project.git revision 7034014d08249a1e159a668a71e96a0b78636a39)
  clang revision 7034014d08249a1e159a668a71e96a0b78636a39
  llvm revision 7034014d08249a1e159a668a71e96a0b78636a39
Skipping the following test categories: ['libc++', 'dsym', 'gmodules', 'debugserver', 'objc']

--
Command Output (stderr):
--
Change dir to: /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/tools/lldb-dap/launch
runCmd: settings clear --all

output: 

runCmd: settings set symbols.enable-external-lookup false

output: 

runCmd: settings set target.inherit-tcc true

output: 

runCmd: settings set target.disable-aslr false

output: 

runCmd: settings set target.detach-on-error false

output: 

runCmd: settings set target.auto-apply-fixits false

llvm-ci · 2025-06-12T05:42:15Z

LLVM Buildbot has detected a new failure on builder ppc64le-flang-rhel-clang running on ppc64le-flang-rhel-test while building llvm at step 6 "test-build-unified-tree-check-flang".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/157/builds/30557

Here is the relevant piece of the build log for the reference

Step 6 (test-build-unified-tree-check-flang) failure: test (failure)
******************** TEST 'Flang :: Semantics/modfile75.F90' FAILED ********************
Exit Code: 2

Command Output (stderr):
--
/home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/flang -c -fhermetic-module-files -DWHICH=1 /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Semantics/modfile75.F90 && /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/flang -c -fhermetic-module-files -DWHICH=2 /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Semantics/modfile75.F90 && /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/flang -fc1 -fdebug-unparse /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Semantics/modfile75.F90 | /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/FileCheck /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Semantics/modfile75.F90 # RUN: at line 1
+ /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/flang -c -fhermetic-module-files -DWHICH=1 /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Semantics/modfile75.F90
+ /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/flang -c -fhermetic-module-files -DWHICH=2 /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Semantics/modfile75.F90
+ /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/flang -fc1 -fdebug-unparse /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Semantics/modfile75.F90
+ /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/FileCheck /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Semantics/modfile75.F90
error: Semantic errors in /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Semantics/modfile75.F90
/home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Semantics/modfile75.F90:15:11: error: Must be a constant value
    integer(c_int) n
            ^^^^^
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/FileCheck /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Semantics/modfile75.F90

--

********************

…ds (#142503)

This extends #136013 to capture cases where the combinable bitmask muls are nested under multiple or-disjoints.

This PR is meant for commits starting at 8c403c9

op1 = or-disjoint mul(and (X, C1), D) , reg1
op2 = or-disjoint mul(and (X, C2), D) , reg2
out = or-disjoint op1, op2

->

temp1 = or-disjoint reg1, reg2
out = or-disjoint mul(and (X, (C1 + C2)), D), temp1

Case1: https://alive2.llvm.org/ce/z/dHApyV
Case2: https://alive2.llvm.org/ce/z/Jz-Nag
Case3: https://alive2.llvm.org/ce/z/3xBnEV
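
The reassociation described in this commit message can be sketched in C++ by modeling each or-disjoint as an add, which is what it computes when its operands share no bits. nestedBefore and nestedAfter are hypothetical names, and the Alive2 links in the message remain the actual proofs.

```cpp
#include <cassert>
#include <cstdint>

// Before: op1 = mul(and(X, C1), D) | reg1; op2 = mul(and(X, C2), D) | reg2;
//         out = op1 | op2, where every `|` is an or-disjoint (modeled as +).
constexpr uint32_t nestedBefore(uint32_t X, uint32_t C1, uint32_t C2,
                                uint32_t D, uint32_t Reg1, uint32_t Reg2) {
  const uint32_t Op1 = (X & C1) * D + Reg1;
  const uint32_t Op2 = (X & C2) * D + Reg2;
  return Op1 + Op2;
}

// After: temp1 = reg1 | reg2; out = mul(and(X, C1 + C2), D) | temp1.
constexpr uint32_t nestedAfter(uint32_t X, uint32_t C1, uint32_t C2,
                               uint32_t D, uint32_t Reg1, uint32_t Reg2) {
  assert((C1 & C2) == 0 && "masks must be disjoint");
  const uint32_t Temp1 = Reg1 + Reg2;
  return (X & (C1 + C2)) * D + Temp1;
}

static_assert(nestedBefore(0xFF, 2, 4, 72, 0x10000, 0x20000) ==
                  nestedAfter(0xFF, 2, 4, 72, 0x10000, 0x20000),
              "both shapes compute the same value");
```

The rewrite groups the two bitmasked muls together so the single-or fold from #136013 can apply, and leaves the independent operands in their own or-disjoint.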

jrbyrnes requested review from arsenm, andjo403 and dtcxzyw April 16, 2025 19:38

jrbyrnes requested a review from nikic as a code owner April 16, 2025 19:38

llvmbot added the llvm:instcombine and llvm:transforms labels Apr 16, 2025

jrbyrnes marked this pull request as draft April 16, 2025 22:37

jrbyrnes force-pushed the ICAndSelExtendCase0 branch from d040ee7 to c74714c Compare April 17, 2025 20:27

jrbyrnes marked this pull request as ready for review April 17, 2025 20:27

llvmbot added the llvm:analysis label Apr 17, 2025

jrbyrnes mentioned this pull request Apr 28, 2025

[InstCombine] Combine and->cmp->sel->or-disjoint into and->mul #135274

Merged

jrbyrnes force-pushed the ICAndSelExtendCase0 branch from c74714c to e2f011e Compare April 28, 2025 17:17

arsenm reviewed May 21, 2025

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp (outdated)

This was referenced May 23, 2025

Fuzz PR136013 dtcxzyw/llvm-mutation-based-fuzz-service#48

Closed

Task submission dtcxzyw/llvm-opt-benchmark#1312

Open

pre-commit: PR136013 dtcxzyw/llvm-opt-benchmark#2369

Closed

dtcxzyw requested changes May 24, 2025

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp

dtcxzyw mentioned this pull request May 28, 2025

pre-commit: PR136013 dtcxzyw/llvm-opt-benchmark#2373

Closed

jrbyrnes added 2 commits June 2, 2025 11:05

[InstCombine] Extend bitmask->select combine to match and->mul

0019711

Change-Id: I1cc2acd3804dde50636518f3ef2c9581848ae9f6

Review comments + fix some conditions

7b63d9b

Change-Id: I4b71adfd8bffdda4d2b0d1cba85a3fd73a105a28

Fix crash due to mismatch APInt bitwidth

5fa229b

Change-Id: I12f77aedbf1a2edfe63e4d03cd1e5c1c601365a7

jrbyrnes force-pushed the ICAndSelExtendCase0 branch from 083eb16 to 5fa229b Compare June 2, 2025 18:14

jrbyrnes mentioned this pull request Jun 2, 2025

[InstCombine] Extend bitmask mul combine to handle independent operands #142503

Merged

nikic reviewed Jun 5, 2025

Review comments

9ccf1fa

Change-Id: I56a280990a9bae36e59f784a7f48bdbc9f7ca539

nikic approved these changes Jun 11, 2025

Review comments 1

acd7e8b

Change-Id: I04ff0637b85922561dda9e7e827ba3fe9d9c0cbc

jrbyrnes merged commit 7034014 into llvm:main Jun 12, 2025
7 checks passed
