[TRI] Remove reserved registers in getRegPressureSetLimit #118787

wangpc-pp · 2024-12-05T10:59:43Z

There are two getRegPressureSetLimit functions:

RegisterClassInfo::getRegPressureSetLimit.
TargetRegisterInfo::getRegPressureSetLimit.

RegisterClassInfo::getRegPressureSetLimit is a wrapper of
TargetRegisterInfo::getRegPressureSetLimit with some logic to
adjust the limit by removing reserved registers.

It seems that we shouldn't use TargetRegisterInfo::getRegPressureSetLimit
directly, as the comment "This limit must be adjusted
dynamically for reserved registers" says.

Having two getRegPressureSetLimit functions is messy and easily confuses
users. So here we move the logic that adjusts these limits for
reserved registers from RegisterClassInfo::getRegPressureSetLimit
to TargetRegisterInfo::getRegPressureSetLimit. This makes the former
a thin cached wrapper of the latter.

Created using spr 1.3.6-beta.1

llvmbot · 2024-12-05T11:00:17Z

@llvm/pr-subscribers-llvm-globalisel
@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-backend-arm
@llvm/pr-subscribers-backend-nvptx
@llvm/pr-subscribers-backend-loongarch
@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-llvm-regalloc

@llvm/pr-subscribers-backend-powerpc

Author: Pengcheng Wang (wangpc-pp)

Changes

There are two getRegPressureSetLimit functions:

RegisterClassInfo::getRegPressureSetLimit.
TargetRegisterInfo::getRegPressureSetLimit.

RegisterClassInfo::getRegPressureSetLimit is a wrapper of
TargetRegisterInfo::getRegPressureSetLimit with some logic to
adjust the limit by removing reserved registers.

It seems that we shouldn't use TargetRegisterInfo::getRegPressureSetLimit
directly, as the comment "This limit must be adjusted
dynamically for reserved registers" says.

However, some passes use it directly, for example
MachineLICM, MachineSink, MachinePipeliner, etc. In these
passes, the register pressure set limits are not adjusted for reserved
registers, which means that the limits are larger than the actual limits.
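
To make the size of that gap concrete, here is a minimal sketch of the adjustment, written as a toy model rather than LLVM code. The function name `adjustedLimit` and its parameters are hypothetical; the arithmetic follows the final subtraction in the patch (raw limit minus register weight times the number of reserved registers), including the fallback to the raw limit when every register is reserved.

```cpp
#include <cassert>

// Toy model of the reserved-register adjustment; not LLVM code.
// rawLimit:    unadjusted pressure set limit, in register units.
// regWeight:   units contributed by each register of the chosen class.
// numRegs:     number of registers in that class.
// numReserved: how many of those registers are reserved.
unsigned adjustedLimit(unsigned rawLimit, unsigned regWeight,
                       unsigned numRegs, unsigned numReserved) {
  assert(numReserved <= numRegs && "cannot reserve more registers than exist");
  // If everything is reserved, fall back to the raw limit so that callers
  // which treat 0 as "not computed yet" never see a zero.
  if (numReserved == numRegs)
    return rawLimit;
  assert(rawLimit >= regWeight * numReserved);
  return rawLimit - regWeight * numReserved;
}
```

With made-up numbers, a raw limit of 32, weight 1 and 5 reserved registers gives 27, so a pass reading the raw value would assume 5 more units than are actually available.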

Having two getRegPressureSetLimit functions is messy and easily confuses
users. So here we move the logic that adjusts these limits for
reserved registers from RegisterClassInfo::getRegPressureSetLimit
to TargetRegisterInfo::getRegPressureSetLimit. This makes the former
a thin cached wrapper of the latter.
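
The "thin cached wrapper" shape can be sketched as follows. This is a simplified stand-in, not the actual RegisterClassInfo: the class name `CachedLimits` and the `std::function` callback are assumptions made for illustration. It shows why the underlying computation must never return 0, since 0 doubles as the "not computed yet" marker.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a caching wrapper around a limit computation.
class CachedLimits {
public:
  CachedLimits(unsigned numSets, std::function<unsigned(unsigned)> fn)
      : limits(numSets, 0), compute(std::move(fn)) {}

  // Returns the cached limit, computing it on first use.
  // 0 means "not computed yet", so compute() must never return 0.
  unsigned get(unsigned idx) const {
    assert(idx < limits.size());
    if (!limits[idx])
      limits[idx] = compute(idx);
    return limits[idx];
  }

private:
  mutable std::vector<unsigned> limits;      // one slot per pressure set
  std::function<unsigned(unsigned)> compute; // the expensive computation
};
```

Repeated calls to `get` for the same index invoke the callback only once; all of the adjustment logic lives in the callback, which is the role TargetRegisterInfo::getRegPressureSetLimit takes after this change.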

This change helps to reduce the number of spills/reloads as well.

Here are the RISC-V statistics of spills/reloads on llvm-test-suite
with -O3 -march=rva23u64:

Metric: regalloc.NumSpills,regalloc.NumReloads

Program                                       regalloc.NumSpills                  regalloc.NumReloads
                                              baseline           after    diff    baseline            after    diff
External/S...T2017speed/602.gcc_s/602.gcc_s   11811.00           11349.00 -462.00 26812.00            25793.00 -1019.00
External/S...NT2017rate/502.gcc_r/502.gcc_r   11811.00           11349.00 -462.00 26812.00            25793.00 -1019.00
External/S...te/526.blender_r/526.blender_r   13513.00           13251.00 -262.00 27462.00            27195.00  -267.00
SingleSour...nchmarks/Adobe-C++/loop_unroll    1533.00            1413.00 -120.00  2943.00             2633.00  -310.00
External/S...00.perlbench_s/600.perlbench_s    4398.00            4280.00 -118.00  9745.00             9466.00  -279.00
External/S...00.perlbench_r/500.perlbench_r    4398.00            4280.00 -118.00  9745.00             9466.00  -279.00
External/S...rate/510.parest_r/510.parest_r   43985.00           43888.00  -97.00 87407.00            87330.00   -77.00
MultiSourc...sumer-typeset/consumer-typeset    1222.00            1129.00  -93.00  3048.00             2887.00  -161.00
External/S...ed/638.imagick_s/638.imagick_s    4155.00            4064.00  -91.00 10556.00            10463.00   -93.00
External/S...te/538.imagick_r/538.imagick_r    4155.00            4064.00  -91.00 10556.00            10463.00   -93.00
External/S...rate/511.povray_r/511.povray_r    1734.00            1657.00  -77.00  3410.00             3290.00  -120.00
MultiSourc...e/Applications/ClamAV/clamscan    2120.00            2049.00  -71.00  5041.00             4994.00   -47.00
External/S...23.xalancbmk_s/623.xalancbmk_s    1664.00            1608.00  -56.00  2758.00             2663.00   -95.00
External/S...23.xalancbmk_r/523.xalancbmk_r    1664.00            1608.00  -56.00  2758.00             2663.00   -95.00
MultiSource/Applications/SPASS/SPASS           1442.00            1388.00  -54.00  2954.00             2849.00  -105.00
      regalloc.NumSpills                        regalloc.NumReloads
run             baseline       after       diff            baseline        after       diff
mean           86.864054   85.415094  -1.448960          173.354136   170.657475   -2.69666
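
For readers who prefer relative numbers, the rows above can be turned into percentages with a small helper. This is only an arithmetic sketch; the function name `relativeChangePercent` is made up here.

```cpp
#include <cassert>
#include <cmath>

// Relative change in percent; a negative result means fewer spills/reloads.
double relativeChangePercent(double baseline, double after) {
  assert(baseline > 0.0);
  return (after - baseline) / baseline * 100.0;
}
```

Applied to the mean spill count (86.864054 to 85.415094) this gives roughly -1.67%, and applied to the 602.gcc_s spill count (11811 to 11349) roughly -3.9%.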

Patch is 163.45 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/118787.diff

26 Files Affected:

(modified) llvm/include/llvm/CodeGen/RegisterClassInfo.h (+1-6)
(modified) llvm/include/llvm/CodeGen/TargetRegisterInfo.h (+7-2)
(modified) llvm/lib/CodeGen/MachinePipeliner.cpp (-41)
(modified) llvm/lib/CodeGen/RegisterClassInfo.cpp (-37)
(modified) llvm/lib/CodeGen/TargetRegisterInfo.cpp (+44)
(modified) llvm/test/CodeGen/LoongArch/jr-without-ra.ll (+56-56)
(modified) llvm/test/CodeGen/NVPTX/misched_func_call.ll (+3-4)
(modified) llvm/test/CodeGen/PowerPC/aix-csr-alloc.mir (-1)
(modified) llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir (-1)
(modified) llvm/test/CodeGen/PowerPC/compute-regpressure.ll (+2-2)
(modified) llvm/test/CodeGen/RISCV/rvv/vxrm-insert-out-of-loop.ll (+3-2)
(modified) llvm/test/CodeGen/Thumb2/mve-blockplacement.ll (+61-63)
(modified) llvm/test/CodeGen/Thumb2/mve-gather-increment.ll (+383-405)
(modified) llvm/test/CodeGen/Thumb2/mve-gather-scatter-optimisation.ll (+70-70)
(modified) llvm/test/CodeGen/Thumb2/mve-pipelineloops.ll (+32-43)
(modified) llvm/test/CodeGen/X86/avx512-regcall-Mask.ll (+2-2)
(modified) llvm/test/CodeGen/X86/avx512-regcall-NoMask.ll (+4-4)
(modified) llvm/test/CodeGen/X86/sse-regcall.ll (+4-4)
(modified) llvm/test/CodeGen/X86/sse-regcall4.ll (+4-4)
(modified) llvm/test/CodeGen/X86/subvectorwise-store-of-vector-splat.ll (+169-166)
(modified) llvm/test/CodeGen/X86/unfold-masked-merge-vector-variablemask.ll (+294-262)
(modified) llvm/test/CodeGen/X86/x86-64-flags-intrinsics.ll (+8-8)
(modified) llvm/test/TableGen/bare-minimum-psets.td (+1-1)
(modified) llvm/test/TableGen/inhibit-pset.td (+1-1)
(modified) llvm/unittests/CodeGen/MFCommon.inc (+2-2)
(modified) llvm/utils/TableGen/RegisterInfoEmitter.cpp (+4-3)

diff --git a/llvm/include/llvm/CodeGen/RegisterClassInfo.h b/llvm/include/llvm/CodeGen/RegisterClassInfo.h
index 800bebea0dddb0..417a1e40d02b95 100644
--- a/llvm/include/llvm/CodeGen/RegisterClassInfo.h
+++ b/llvm/include/llvm/CodeGen/RegisterClassInfo.h
@@ -141,16 +141,11 @@ class RegisterClassInfo {
   }
 
   /// Get the register unit limit for the given pressure set index.
-  ///
-  /// RegisterClassInfo adjusts this limit for reserved registers.
   unsigned getRegPressureSetLimit(unsigned Idx) const {
     if (!PSetLimits[Idx])
-      PSetLimits[Idx] = computePSetLimit(Idx);
+      PSetLimits[Idx] = TRI->getRegPressureSetLimit(*MF, Idx);
     return PSetLimits[Idx];
   }
-
-protected:
-  unsigned computePSetLimit(unsigned Idx) const;
 };
 
 } // end namespace llvm
diff --git a/llvm/include/llvm/CodeGen/TargetRegisterInfo.h b/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
index 292fa3c94969be..f7cd7cfe1aa15b 100644
--- a/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
@@ -913,9 +913,14 @@ class TargetRegisterInfo : public MCRegisterInfo {
   virtual const char *getRegPressureSetName(unsigned Idx) const = 0;
 
   /// Get the register unit pressure limit for this dimension.
-  /// This limit must be adjusted dynamically for reserved registers.
+  /// TargetRegisterInfo adjusts this limit for reserved registers.
   virtual unsigned getRegPressureSetLimit(const MachineFunction &MF,
-                                          unsigned Idx) const = 0;
+                                          unsigned Idx) const;
+
+  /// Get the raw register unit pressure limit for this dimension.
+  /// This limit must be adjusted dynamically for reserved registers.
+  virtual unsigned getRawRegPressureSetLimit(const MachineFunction &MF,
+                                             unsigned Idx) const = 0;
 
   /// Get the dimensions of register pressure impacted by this register class.
   /// Returns a -1 terminated array of pressure set IDs.
diff --git a/llvm/lib/CodeGen/MachinePipeliner.cpp b/llvm/lib/CodeGen/MachinePipeliner.cpp
index 7a10bd39e2695d..3ee0ba1fea5079 100644
--- a/llvm/lib/CodeGen/MachinePipeliner.cpp
+++ b/llvm/lib/CodeGen/MachinePipeliner.cpp
@@ -1327,47 +1327,6 @@ class HighRegisterPressureDetector {
   void computePressureSetLimit(const RegisterClassInfo &RCI) {
     for (unsigned PSet = 0; PSet < PSetNum; PSet++)
       PressureSetLimit[PSet] = TRI->getRegPressureSetLimit(MF, PSet);
-
-    // We assume fixed registers, such as stack pointer, are already in use.
-    // Therefore subtracting the weight of the fixed registers from the limit of
-    // each pressure set in advance.
-    SmallDenseSet<Register, 8> FixedRegs;
-    for (const TargetRegisterClass *TRC : TRI->regclasses()) {
-      for (const MCPhysReg Reg : *TRC)
-        if (isFixedRegister(Reg))
-          FixedRegs.insert(Reg);
-    }
-
-    LLVM_DEBUG({
-      for (auto Reg : FixedRegs) {
-        dbgs() << printReg(Reg, TRI, 0, &MRI) << ": [";
-        for (MCRegUnit Unit : TRI->regunits(Reg)) {
-          const int *Sets = TRI->getRegUnitPressureSets(Unit);
-          for (; *Sets != -1; Sets++) {
-            dbgs() << TRI->getRegPressureSetName(*Sets) << ", ";
-          }
-        }
-        dbgs() << "]\n";
-      }
-    });
-
-    for (auto Reg : FixedRegs) {
-      LLVM_DEBUG(dbgs() << "fixed register: " << printReg(Reg, TRI, 0, &MRI)
-                        << "\n");
-      for (MCRegUnit Unit : TRI->regunits(Reg)) {
-        auto PSetIter = MRI.getPressureSets(Unit);
-        unsigned Weight = PSetIter.getWeight();
-        for (; PSetIter.isValid(); ++PSetIter) {
-          unsigned &Limit = PressureSetLimit[*PSetIter];
-          assert(
-              Limit >= Weight &&
-              "register pressure limit must be greater than or equal weight");
-          Limit -= Weight;
-          LLVM_DEBUG(dbgs() << "PSet=" << *PSetIter << " Limit=" << Limit
-                            << " (decreased by " << Weight << ")\n");
-        }
-      }
-    }
   }
 
   // There are two patterns of last-use.
diff --git a/llvm/lib/CodeGen/RegisterClassInfo.cpp b/llvm/lib/CodeGen/RegisterClassInfo.cpp
index 9312bc03bc522a..976d41a54da56f 100644
--- a/llvm/lib/CodeGen/RegisterClassInfo.cpp
+++ b/llvm/lib/CodeGen/RegisterClassInfo.cpp
@@ -195,40 +195,3 @@ void RegisterClassInfo::compute(const TargetRegisterClass *RC) const {
   // RCI is now up-to-date.
   RCI.Tag = Tag;
 }
-
-/// This is not accurate because two overlapping register sets may have some
-/// nonoverlapping reserved registers. However, computing the allocation order
-/// for all register classes would be too expensive.
-unsigned RegisterClassInfo::computePSetLimit(unsigned Idx) const {
-  const TargetRegisterClass *RC = nullptr;
-  unsigned NumRCUnits = 0;
-  for (const TargetRegisterClass *C : TRI->regclasses()) {
-    const int *PSetID = TRI->getRegClassPressureSets(C);
-    for (; *PSetID != -1; ++PSetID) {
-      if ((unsigned)*PSetID == Idx)
-        break;
-    }
-    if (*PSetID == -1)
-      continue;
-
-    // Found a register class that counts against this pressure set.
-    // For efficiency, only compute the set order for the largest set.
-    unsigned NUnits = TRI->getRegClassWeight(C).WeightLimit;
-    if (!RC || NUnits > NumRCUnits) {
-      RC = C;
-      NumRCUnits = NUnits;
-    }
-  }
-  assert(RC && "Failed to find register class");
-  compute(RC);
-  unsigned NAllocatableRegs = getNumAllocatableRegs(RC);
-  unsigned RegPressureSetLimit = TRI->getRegPressureSetLimit(*MF, Idx);
-  // If all the regs are reserved, return raw RegPressureSetLimit.
-  // One example is VRSAVERC in PowerPC.
-  // Avoid returning zero, getRegPressureSetLimit(Idx) assumes computePSetLimit
-  // return non-zero value.
-  if (NAllocatableRegs == 0)
-    return RegPressureSetLimit;
-  unsigned NReserved = RC->getNumRegs() - NAllocatableRegs;
-  return RegPressureSetLimit - TRI->getRegClassWeight(RC).RegWeight * NReserved;
-}
diff --git a/llvm/lib/CodeGen/TargetRegisterInfo.cpp b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
index 032f1a33e75c43..4cede283a7232c 100644
--- a/llvm/lib/CodeGen/TargetRegisterInfo.cpp
+++ b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
@@ -674,6 +674,50 @@ TargetRegisterInfo::prependOffsetExpression(const DIExpression *Expr,
                                       PrependFlags & DIExpression::EntryValue);
 }
 
+unsigned TargetRegisterInfo::getRegPressureSetLimit(const MachineFunction &MF,
+                                                    unsigned Idx) const {
+  const TargetRegisterClass *RC = nullptr;
+  unsigned NumRCUnits = 0;
+  for (const TargetRegisterClass *C : regclasses()) {
+    const int *PSetID = getRegClassPressureSets(C);
+    for (; *PSetID != -1; ++PSetID) {
+      if ((unsigned)*PSetID == Idx)
+        break;
+    }
+    if (*PSetID == -1)
+      continue;
+
+    // Found a register class that counts against this pressure set.
+    // For efficiency, only compute the set order for the largest set.
+    unsigned NUnits = getRegClassWeight(C).WeightLimit;
+    if (!RC || NUnits > NumRCUnits) {
+      RC = C;
+      NumRCUnits = NUnits;
+    }
+  }
+  assert(RC && "Failed to find register class");
+
+  unsigned NReserved = 0;
+  const BitVector Reserved = MF.getRegInfo().getReservedRegs();
+  for (unsigned PhysReg : RC->getRawAllocationOrder(MF))
+    if (Reserved.test(PhysReg))
+      NReserved++;
+
+  unsigned NAllocatableRegs = RC->getNumRegs() - NReserved;
+  unsigned RegPressureSetLimit = getRawRegPressureSetLimit(MF, Idx);
+  // If all the regs are reserved, return raw RegPressureSetLimit.
+  // One example is VRSAVERC in PowerPC.
+  // Avoid returning zero, RegisterClassInfo::getRegPressureSetLimit(Idx)
+  // assumes this returns non-zero value.
+  if (NAllocatableRegs == 0) {
+    LLVM_DEBUG({
+      dbgs() << "All registers of " << getRegClassName(RC) << " are reserved!";
+    });
+    return RegPressureSetLimit;
+  }
+  return RegPressureSetLimit - getRegClassWeight(RC).RegWeight * NReserved;
+}
+
 #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
 LLVM_DUMP_METHOD
 void TargetRegisterInfo::dumpReg(Register Reg, unsigned SubRegIndex,
diff --git a/llvm/test/CodeGen/LoongArch/jr-without-ra.ll b/llvm/test/CodeGen/LoongArch/jr-without-ra.ll
index d1c4459aaa6ee0..2bd89dacb2b37a 100644
--- a/llvm/test/CodeGen/LoongArch/jr-without-ra.ll
+++ b/llvm/test/CodeGen/LoongArch/jr-without-ra.ll
@@ -20,101 +20,101 @@ define void @jr_without_ra(ptr %rtwdev, ptr %chan, ptr %h2c, i8 %.pre, i1 %cmp.i
 ; CHECK-NEXT:    st.d $s6, $sp, 24 # 8-byte Folded Spill
 ; CHECK-NEXT:    st.d $s7, $sp, 16 # 8-byte Folded Spill
 ; CHECK-NEXT:    st.d $s8, $sp, 8 # 8-byte Folded Spill
-; CHECK-NEXT:    move $s7, $zero
-; CHECK-NEXT:    move $s0, $zero
+; CHECK-NEXT:    move $s6, $zero
+; CHECK-NEXT:    move $s1, $zero
 ; CHECK-NEXT:    ld.d $t0, $sp, 184
-; CHECK-NEXT:    ld.d $s2, $sp, 176
-; CHECK-NEXT:    ld.d $s1, $sp, 168
-; CHECK-NEXT:    ld.d $t1, $sp, 160
-; CHECK-NEXT:    ld.d $t2, $sp, 152
-; CHECK-NEXT:    ld.d $t3, $sp, 144
-; CHECK-NEXT:    ld.d $t4, $sp, 136
-; CHECK-NEXT:    ld.d $t5, $sp, 128
-; CHECK-NEXT:    ld.d $t6, $sp, 120
-; CHECK-NEXT:    ld.d $t7, $sp, 112
-; CHECK-NEXT:    ld.d $t8, $sp, 104
-; CHECK-NEXT:    ld.d $fp, $sp, 96
+; CHECK-NEXT:    ld.d $t1, $sp, 176
+; CHECK-NEXT:    ld.d $s2, $sp, 168
+; CHECK-NEXT:    ld.d $t2, $sp, 160
+; CHECK-NEXT:    ld.d $t3, $sp, 152
+; CHECK-NEXT:    ld.d $t4, $sp, 144
+; CHECK-NEXT:    ld.d $t5, $sp, 136
+; CHECK-NEXT:    ld.d $t6, $sp, 128
+; CHECK-NEXT:    ld.d $t7, $sp, 120
+; CHECK-NEXT:    ld.d $t8, $sp, 112
+; CHECK-NEXT:    ld.d $fp, $sp, 104
+; CHECK-NEXT:    ld.d $s0, $sp, 96
 ; CHECK-NEXT:    andi $a4, $a4, 1
-; CHECK-NEXT:    alsl.d $a6, $a6, $s1, 4
-; CHECK-NEXT:    pcalau12i $s1, %pc_hi20(.LJTI0_0)
-; CHECK-NEXT:    addi.d $s1, $s1, %pc_lo12(.LJTI0_0)
-; CHECK-NEXT:    slli.d $s3, $s2, 2
-; CHECK-NEXT:    alsl.d $s2, $s2, $s3, 1
-; CHECK-NEXT:    add.d $s2, $t5, $s2
-; CHECK-NEXT:    addi.w $s4, $zero, -41
+; CHECK-NEXT:    alsl.d $a6, $a6, $s2, 4
+; CHECK-NEXT:    pcalau12i $s2, %pc_hi20(.LJTI0_0)
+; CHECK-NEXT:    addi.d $s2, $s2, %pc_lo12(.LJTI0_0)
 ; CHECK-NEXT:    ori $s3, $zero, 1
-; CHECK-NEXT:    slli.d $s4, $s4, 3
-; CHECK-NEXT:    ori $s6, $zero, 3
-; CHECK-NEXT:    lu32i.d $s6, 262144
+; CHECK-NEXT:    ori $s4, $zero, 50
+; CHECK-NEXT:    ori $s5, $zero, 3
+; CHECK-NEXT:    lu32i.d $s5, 262144
 ; CHECK-NEXT:    b .LBB0_4
 ; CHECK-NEXT:    .p2align 4, , 16
 ; CHECK-NEXT:  .LBB0_1: # %sw.bb27.i.i
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    ori $s8, $zero, 1
+; CHECK-NEXT:    ori $s7, $zero, 1
 ; CHECK-NEXT:  .LBB0_2: # %if.else.i106
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    alsl.d $s5, $s0, $s0, 3
-; CHECK-NEXT:    alsl.d $s0, $s5, $s0, 1
-; CHECK-NEXT:    add.d $s0, $t0, $s0
-; CHECK-NEXT:    ldx.bu $s8, $s0, $s8
+; CHECK-NEXT:    alsl.d $s8, $s1, $s1, 3
+; CHECK-NEXT:    alsl.d $s1, $s8, $s1, 1
+; CHECK-NEXT:    add.d $s1, $t0, $s1
+; CHECK-NEXT:    ldx.bu $s7, $s1, $s7
 ; CHECK-NEXT:  .LBB0_3: # %phy_tssi_get_ofdm_de.exit
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    st.b $zero, $t5, 0
-; CHECK-NEXT:    st.b $s7, $t3, 0
-; CHECK-NEXT:    st.b $zero, $t8, 0
-; CHECK-NEXT:    st.b $zero, $t1, 0
-; CHECK-NEXT:    st.b $zero, $a1, 0
+; CHECK-NEXT:    st.b $zero, $t6, 0
+; CHECK-NEXT:    st.b $s6, $t4, 0
+; CHECK-NEXT:    st.b $zero, $fp, 0
 ; CHECK-NEXT:    st.b $zero, $t2, 0
-; CHECK-NEXT:    st.b $s8, $a5, 0
-; CHECK-NEXT:    ori $s0, $zero, 1
-; CHECK-NEXT:    move $s7, $a3
+; CHECK-NEXT:    st.b $zero, $a1, 0
+; CHECK-NEXT:    st.b $zero, $t3, 0
+; CHECK-NEXT:    st.b $s7, $a5, 0
+; CHECK-NEXT:    ori $s1, $zero, 1
+; CHECK-NEXT:    move $s6, $a3
 ; CHECK-NEXT:  .LBB0_4: # %for.body
 ; CHECK-NEXT:    # =>This Inner Loop Header: Depth=1
 ; CHECK-NEXT:    beqz $a4, .LBB0_9
 ; CHECK-NEXT:  # %bb.5: # %calc_6g.i
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    move $s7, $zero
+; CHECK-NEXT:    move $s6, $zero
 ; CHECK-NEXT:    bnez $zero, .LBB0_8
 ; CHECK-NEXT:  # %bb.6: # %calc_6g.i
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    slli.d $s8, $zero, 3
-; CHECK-NEXT:    ldx.d $s8, $s8, $s1
-; CHECK-NEXT:    jr $s8
+; CHECK-NEXT:    slli.d $s7, $zero, 3
+; CHECK-NEXT:    ldx.d $s7, $s7, $s2
+; CHECK-NEXT:    jr $s7
 ; CHECK-NEXT:  .LBB0_7: # %sw.bb12.i.i
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    ori $s7, $zero, 1
+; CHECK-NEXT:    ori $s6, $zero, 1
 ; CHECK-NEXT:  .LBB0_8: # %if.else58.i
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    ldx.bu $s7, $a6, $s7
+; CHECK-NEXT:    ldx.bu $s6, $a6, $s6
 ; CHECK-NEXT:    b .LBB0_11
 ; CHECK-NEXT:    .p2align 4, , 16
 ; CHECK-NEXT:  .LBB0_9: # %if.end.i
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    andi $s7, $s7, 255
-; CHECK-NEXT:    ori $s5, $zero, 50
-; CHECK-NEXT:    bltu $s5, $s7, .LBB0_15
+; CHECK-NEXT:    andi $s6, $s6, 255
+; CHECK-NEXT:    bltu $s4, $s6, .LBB0_15
 ; CHECK-NEXT:  # %bb.10: # %if.end.i
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    sll.d $s7, $s3, $s7
-; CHECK-NEXT:    and $s8, $s7, $s6
-; CHECK-NEXT:    move $s7, $fp
-; CHECK-NEXT:    beqz $s8, .LBB0_15
+; CHECK-NEXT:    sll.d $s6, $s3, $s6
+; CHECK-NEXT:    and $s7, $s6, $s5
+; CHECK-NEXT:    move $s6, $s0
+; CHECK-NEXT:    beqz $s7, .LBB0_15
 ; CHECK-NEXT:  .LBB0_11: # %phy_tssi_get_ofdm_trim_de.exit
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    move $s8, $zero
-; CHECK-NEXT:    st.b $zero, $t7, 0
-; CHECK-NEXT:    ldx.b $ra, $s2, $t4
+; CHECK-NEXT:    move $s7, $zero
+; CHECK-NEXT:    st.b $zero, $t8, 0
+; CHECK-NEXT:    slli.d $s8, $t1, 2
+; CHECK-NEXT:    alsl.d $s8, $t1, $s8, 1
+; CHECK-NEXT:    add.d $s8, $t6, $s8
+; CHECK-NEXT:    ldx.b $s8, $s8, $t5
 ; CHECK-NEXT:    st.b $zero, $a2, 0
 ; CHECK-NEXT:    st.b $zero, $a7, 0
-; CHECK-NEXT:    st.b $zero, $t6, 0
-; CHECK-NEXT:    st.b $ra, $a0, 0
+; CHECK-NEXT:    st.b $zero, $t7, 0
+; CHECK-NEXT:    st.b $s8, $a0, 0
 ; CHECK-NEXT:    bnez $s3, .LBB0_13
 ; CHECK-NEXT:  # %bb.12: # %phy_tssi_get_ofdm_trim_de.exit
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
+; CHECK-NEXT:    addi.w $s8, $zero, -41
+; CHECK-NEXT:    slli.d $s8, $s8, 3
 ; CHECK-NEXT:    pcalau12i $ra, %pc_hi20(.LJTI0_1)
 ; CHECK-NEXT:    addi.d $ra, $ra, %pc_lo12(.LJTI0_1)
-; CHECK-NEXT:    ldx.d $s5, $s4, $ra
-; CHECK-NEXT:    jr $s5
+; CHECK-NEXT:    ldx.d $s8, $s8, $ra
+; CHECK-NEXT:    jr $s8
 ; CHECK-NEXT:  .LBB0_13: # %phy_tssi_get_ofdm_trim_de.exit
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
 ; CHECK-NEXT:    bnez $s3, .LBB0_1
diff --git a/llvm/test/CodeGen/NVPTX/misched_func_call.ll b/llvm/test/CodeGen/NVPTX/misched_func_call.ll
index e036753ce90306..ee6b5869111c6f 100644
--- a/llvm/test/CodeGen/NVPTX/misched_func_call.ll
+++ b/llvm/test/CodeGen/NVPTX/misched_func_call.ll
@@ -17,7 +17,6 @@ define ptx_kernel void @wombat(i32 %arg, i32 %arg1, i32 %arg2) {
 ; CHECK-NEXT:    ld.param.u32 %r2, [wombat_param_0];
 ; CHECK-NEXT:    mov.b32 %r10, 0;
 ; CHECK-NEXT:    mov.u64 %rd1, 0;
-; CHECK-NEXT:    mov.b32 %r6, 1;
 ; CHECK-NEXT:  $L__BB0_1: // %bb3
 ; CHECK-NEXT:    // =>This Inner Loop Header: Depth=1
 ; CHECK-NEXT:    { // callseq 0, 0
@@ -29,16 +28,16 @@ define ptx_kernel void @wombat(i32 %arg, i32 %arg1, i32 %arg2) {
 ; CHECK-NEXT:    (
 ; CHECK-NEXT:    param0
 ; CHECK-NEXT:    );
+; CHECK-NEXT:    ld.param.f64 %fd1, [retval0];
+; CHECK-NEXT:    } // callseq 0
 ; CHECK-NEXT:    mul.lo.s32 %r7, %r10, %r3;
 ; CHECK-NEXT:    or.b32 %r8, %r4, %r7;
 ; CHECK-NEXT:    mul.lo.s32 %r9, %r2, %r8;
 ; CHECK-NEXT:    cvt.rn.f64.s32 %fd3, %r9;
-; CHECK-NEXT:    ld.param.f64 %fd1, [retval0];
-; CHECK-NEXT:    } // callseq 0
 ; CHECK-NEXT:    cvt.rn.f64.u32 %fd4, %r10;
 ; CHECK-NEXT:    add.rn.f64 %fd5, %fd4, %fd3;
 ; CHECK-NEXT:    st.global.f64 [%rd1], %fd5;
-; CHECK-NEXT:    mov.u32 %r10, %r6;
+; CHECK-NEXT:    mov.b32 %r10, 1;
 ; CHECK-NEXT:    bra.uni $L__BB0_1;
 bb:
   br label %bb3
diff --git a/llvm/test/CodeGen/PowerPC/aix-csr-alloc.mir b/llvm/test/CodeGen/PowerPC/aix-csr-alloc.mir
index fba410dc0dafce..7c8a5848b402f4 100644
--- a/llvm/test/CodeGen/PowerPC/aix-csr-alloc.mir
+++ b/llvm/test/CodeGen/PowerPC/aix-csr-alloc.mir
@@ -17,5 +17,4 @@ body: |
 ...
 
 # CHECK-DAG: AllocationOrder(GPRC) = [ $r3 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $r12 $r0 $r31 $r30 $r29 $r28 $r27 $r26 $r25 $r24 $r23 $r22 $r21 $r20 $r19 $r18 $r17 $r16 $r15 $r14 $r13 ]
-# CHECK-DAG: AllocationOrder(F4RC) = [ $f0 $f1 $f2 $f3 $f4 $f5 $f6 $f7 $f8 $f9 $f10 $f11 $f12 $f13 $f31 $f30 $f29 $f28 $f27 $f26 $f25 $f24 $f23 $f22 $f21 $f20 $f19 $f18 $f17 $f16 $f15 $f14 ]
 # CHECK-DAG: AllocationOrder(GPRC_and_GPRC_NOR0) = [ $r3 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $r12 $r31 $r30 $r29 $r28 $r27 $r26 $r25 $r24 $r23 $r22 $r21 $r20 $r19 $r18 $r17 $r16 $r15 $r14 $r13 ]
diff --git a/llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir b/llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir
index 584b6b0ad46dd9..3617b95b2a6af7 100644
--- a/llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir
+++ b/llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir
@@ -16,6 +16,5 @@ body: |
     $f1 = COPY %2
     BLR8 implicit $lr8, implicit undef $rm, implicit $x3, implicit $f1
 ...
-# CHECK-DAG: AllocationOrder(VFRC) = [ $vf2 $vf3 $vf4 $vf5 $vf0 $vf1 $vf6 $vf7 $vf8 $vf9 $vf10 $vf11 $vf12 $vf13 $vf14 $vf15 $vf16 $vf17 $vf18 $vf19 $vf31 $vf30 $vf29 $vf28 $vf27 $vf26 $vf25 $vf24 $vf23 $vf22 $vf21 $vf20 ]
 # CHECK-DAG: AllocationOrder(G8RC_and_G8RC_NOX0) = [ $x3 $x4 $x5 $x6 $x7 $x8 $x9 $x10 $x11 $x12 $x2 $x31 $x30 $x29 $x28 $x27 $x26 $x25 $x24 $x23 $x22 $x21 $x20 $x19 $x18 $x17 $x16 $x15 $x14 ]
 # CHECK-DAG: AllocationOrder(F8RC) = [ $f0 $f1 $f2 $f3 $f4 $f5 $f6 $f7 $f8 $f9 $f10 $f11 $f12 $f13 $f31 $f30 $f29 $f28 $f27 $f26 $f25 $f24 $f23 $f22 $f21 $f20 $f19 $f18 $f17 $f16 $f15 $f14 ]
diff --git a/llvm/test/CodeGen/PowerPC/compute-regpressure.ll b/llvm/test/CodeGen/PowerPC/compute-regpressure.ll
index 9a1b057c2e38d4..9d893b8dbebee2 100644
--- a/llvm/test/CodeGen/PowerPC/compute-regpressure.ll
+++ b/llvm/test/CodeGen/PowerPC/compute-regpressure.ll
@@ -1,7 +1,7 @@
 ; REQUIRES: asserts
-; RUN: llc -debug-only=regalloc < %s 2>&1 |FileCheck %s --check-prefix=DEBUG
+; RUN: llc -debug-only=target-reg-info < %s 2>&1 |FileCheck %s --check-prefix=DEBUG
 
-; DEBUG-COUNT-1:         AllocationOrder(VRSAVERC) = [ ]
+; DEBUG-COUNT-1: All registers of VRSAVERC are reserved!
 
 target triple = "powerpc64le-unknown-linux-gnu"
 
diff --git a/llvm/test/CodeGen/RISCV/rvv/vxrm-insert-out-of-loop.ll b/llvm/test/CodeGen/RISCV/rvv/vxrm-insert-out-of-loop.ll
index c35f05be304cce..ec2448cb3965f3 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vxrm-insert-out-of-loop.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vxrm-insert-out-of-loop.ll
@@ -489,8 +489,9 @@ define void @test1(ptr nocapture noundef writeonly %dst, i32 noundef signext %i_
 ; RV64-NEXT:    j .LBB0_11
 ; RV64-NEXT:  .LBB0_8: # %vector.ph
 ; RV64-NEXT:    # in Loop: Header=BB0_6 Depth=1
-; RV64-NEXT:    slli t6, t0, 28
-; RV64-NEXT:    sub t6, t6, t1
+; RV64-NEXT:    slli t6, t0, 1
+; RV64-NEXT:    slli s0, t0, 28
+; RV64-NEXT:    sub t6, s0, t6
 ; RV64-NEXT:    and t6, t6, a6
 ; RV64-NEXT:    csrwi vxrm, 0
 ; RV64-NEXT:    mv s0, a2
diff --git a/llvm/test/CodeGen/Thumb2/mve-blockplacement.ll b/llvm/test/CodeGen/Thumb2/mve-blockplacement.ll
index 7087041e8dace6..6d082802f9cd75 100644
--- a/llvm/test/CodeGen/Thumb2/mve-blockplacement.ll
+++ b/llvm/test/CodeGen/Thumb2/mve-blockplacement.ll
@@ -353,8 +353,8 @@ define i32 @d(i64 %e, i32 %f, i64 %g, i32 %h) {
 ; CHECK-NEXT:    push.w {r4, r5, r6, r7, r8, r9, r10, r11, lr}
 ; CHECK-NEXT:    .pad #4
 ; CHECK-NEXT:    sub sp, #4
-; CHECK-NEXT:    .vsave {d8, d9, d10, d11, d12, d13, d14, d15}
-; CHECK-NEXT:    vpush {d8, d9, d10, d11, d12, d13, d14, d15}
+; CHECK-NEXT:    .vsave {d8, d9, d10, d11, d12, d13}
+; CHECK-NEXT:    vpush {d8, d9, d10, d11, d12, d13}
 ; CHECK-NEXT:    .pad #16
 ; CHECK-NEXT:    sub sp, #16
 ; CHECK-NEXT:    mov lr, r0
@@ -364,50 +364,48 @@ define i32 @d(i64 %e, i32 %f, i64 %g, i32 %h) {
 ; CHECK-NEXT:  @ %bb.1: @ %for.cond2.preheader.lr.ph
 ; CHECK-NEXT:    movs r0, #1
 ; CHECK-NEXT:    cmp r2, #1
-; CHECK-NEXT:    csel r7, r2, r0, lt
+; CHECK-NEXT:    csel r3, r2, r0, lt
 ; CHECK-NEXT:    mov r12, r1
-; CHECK-NEXT:    mov r1, r7
-; CHECK-NEXT:    cmp r7, #3
+; CHECK-NEXT:    m...
[truncated]

lenary

This looks right. Two small changes below

Do you know why this doesn't have as many changes to the RISC-V tests as the RISC-V specific patch?

llvm/lib/CodeGen/TargetRegisterInfo.cpp

wangpc-pp · 2024-12-05T13:51:42Z

Do you know why this doesn't have as many changes to the RISC-V tests as the RISC-V specific patch?

I did a rough investigation. I think it is because we only changed the limit of GPRAll in that RISC-V specific patch, but there are some overlapping register pressure sets like GPRAll, SP, GPRTC, etc. We didn't remove reserved registers for these register pressure sets.
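
The difference between adjusting one pressure set and adjusting all of them can be sketched with a toy model. The set names come from the comment above, but the limits and reserved counts used with it are invented for illustration, and `PSet`, `adjustOne` and `adjustAll` are hypothetical helpers, not LLVM APIs.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy description of one pressure set; any numbers used with it are made up.
struct PSet {
  unsigned rawLimit;      // unadjusted limit, in register units
  unsigned reservedUnits; // units occupied by reserved registers in this set
};

using PSets = std::map<std::string, PSet>;
using Limits = std::map<std::string, unsigned>;

// Adjust only the set called `only`; every other set keeps its raw limit.
Limits adjustOne(const PSets &sets, const std::string &only) {
  Limits out;
  for (const auto &entry : sets) {
    const PSet &set = entry.second;
    assert(set.rawLimit >= set.reservedUnits);
    out[entry.first] = entry.first == only ? set.rawLimit - set.reservedUnits
                                           : set.rawLimit;
  }
  return out;
}

// Adjust every set for the reserved registers it contains.
Limits adjustAll(const PSets &sets) {
  Limits out;
  for (const auto &entry : sets) {
    const PSet &set = entry.second;
    assert(set.rawLimit >= set.reservedUnits);
    out[entry.first] = set.rawLimit - set.reservedUnits;
  }
  return out;
}
```

In this model, `adjustOne(sets, "GPRAll")` lowers only the GPRAll entry, while `adjustAll(sets)` also lowers the overlapping sets, which is one way to see why the two patches can touch different numbers of tests.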


arsenm · 2024-12-06T15:17:39Z

llvm/lib/CodeGen/TargetRegisterInfo.cpp

+
+  unsigned NReserved = 0;
+  const BitVector Reserved = MF.getRegInfo().getReservedRegs();
+  for (MCPhysReg PhysReg : RC->getRawAllocationOrder(MF))


There's still the pre-existing bug where the pressure number is wrong for overlapping registers in the allocation order, and should probably be reporting number of distinct allocatable registers

I already have a WIP patch to generate the mapping from pressure set to RegisterClass; I will post it when it's ready.

michaelmaitland · 2024-12-06T15:38:01Z

llvm/lib/CodeGen/TargetRegisterInfo.cpp

+    }
+  }
+  assert(RC && "Failed to find register class");
+


What is the reason we drop the call to compute from computePSetLimit here? It looks like we're doing this instead:

unsigned NReserved = 0;
const BitVector Reserved = MF.getRegInfo().getReservedRegs();
for (MCPhysReg PhysReg : RC->getRawAllocationOrder(MF))
  if (Reserved.test(PhysReg))
    NReserved++;

It looks like this is related to what is in compute, but we seem to drop some costing logic.

We need compute because we have to calculate the number of allocatable registers and then derive the number of reserved registers from it (although I think we don't need to call compute explicitly: getNumAllocatableRegs is cached and calls compute through get(RC)):

// ...
compute(RC);
unsigned NAllocatableRegs = getNumAllocatableRegs(RC);
// ...
unsigned NReserved = RC->getNumRegs() - NAllocatableRegs;

unsigned getNumAllocatableRegs(const TargetRegisterClass *RC) const {
  return get(RC).NumRegs;
}

The logic of calculating NumRegs in compute is:

// FIXME: Once targets reserve registers instead of removing them from the
// allocation order, we can simply use begin/end here.
ArrayRef<MCPhysReg> RawOrder = RC->getRawAllocationOrder(*MF);
for (unsigned PhysReg : RawOrder) {
  // Remove reserved registers from the allocation order.
  if (Reserved.test(PhysReg))
    continue;
  uint8_t Cost = RegCosts[PhysReg];
  MinCost = std::min(MinCost, Cost);

  if (getLastCalleeSavedAlias(PhysReg) &&
      !STI.ignoreCSRForAllocationOrder(*MF, PhysReg))
    // PhysReg aliases a CSR, save it for later.
    CSRAlias.push_back(PhysReg);
  else {
    if (Cost != LastCost)
      LastCostChange = N;
    RCI.Order[N++] = PhysReg;
    LastCost = Cost;
  }
}
RCI.NumRegs = N + CSRAlias.size();

RCI.NumRegs is the total number of registers minus the number of reserved registers.
In this patch, we instead count the reserved registers first and derive the number of allocatable registers from that. I think the effect should be the same.
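
The claimed equivalence can be checked on a toy model. The sketch below is not LLVM code: registers are plain integers, the reserved set is a `std::set`, and both function names are hypothetical. It assumes the raw allocation order lists every register of the class exactly once; under that assumption both ways of obtaining the reserved count agree.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Old approach (modelled on compute()): count the non-reserved entries of the
// raw allocation order, then derive the reserved count from the class size.
unsigned reservedViaAllocatable(unsigned numRegs,
                                const std::vector<unsigned> &rawOrder,
                                const std::set<unsigned> &reserved) {
  unsigned numAllocatable = 0;
  for (unsigned reg : rawOrder)
    if (!reserved.count(reg))
      ++numAllocatable;
  assert(numAllocatable <= numRegs);
  return numRegs - numAllocatable;
}

// New approach (modelled on the patch): count reserved entries directly.
unsigned reservedDirect(const std::vector<unsigned> &rawOrder,
                        const std::set<unsigned> &reserved) {
  unsigned numReserved = 0;
  for (unsigned reg : rawOrder)
    if (reserved.count(reg))
      ++numReserved;
  return numReserved;
}
```

For a class of 4 registers with raw order {0, 1, 2, 3} and reserved set {1, 3}, both return 2. If the raw order omitted a register (say order {0, 1, 2} for a class of 4 with {1} reserved), the old computation would yield 2 and the new one 1; whether any real register class hits that case is not something this sketch establishes.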

wangpc-pp · 2024-12-09T04:09:24Z

llvm/lib/CodeGen/MachinePipeliner.cpp

@@ -1327,47 +1327,6 @@ class HighRegisterPressureDetector {
  void computePressureSetLimit(const RegisterClassInfo &RCI) {
    for (unsigned PSet = 0; PSet < PSetNum; PSet++)
      PressureSetLimit[PSet] = TRI->getRegPressureSetLimit(MF, PSet);
-


Hi @kasuga-fj! What do you think about this part? I just removed the code below as it seems to be unnecessary now. Related patch: #74807

And I think just removing fixed registers is not enough here.

Thanks for caring. I dared to replace it from RegisterClassInfo::getRegPressureSetLimit before (in #87312). This is because there were duplicate registers between what RegisterClassInfo::getRegPressureSetLimit removes and the fixed registers mentioned here. If RegisterClassInfo::getRegPressureSetLimit now takes care of both, I don't think it's a problem to remove this part as you did.
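
The duplicate-register concern can be illustrated with a small sketch. It is a toy model under stated assumptions: every register weighs one unit, registers are plain integers, and the names `limitSubtractingTwice` and `limitSubtractingOnce` are invented for this example. It shows how subtracting reserved registers and fixed registers in two separate steps counts any register that is in both sets twice.

```cpp
#include <cassert>
#include <set>

using RegSet = std::set<unsigned>;

// Naive: subtract the reserved registers, then subtract the fixed registers.
// A register that appears in both sets is subtracted twice.
unsigned limitSubtractingTwice(unsigned rawLimit, const RegSet &reserved,
                               const RegSet &fixed) {
  unsigned total = static_cast<unsigned>(reserved.size() + fixed.size());
  assert(rawLimit >= total);
  return rawLimit - total;
}

// Subtract each register once by taking the union of both sets first.
unsigned limitSubtractingOnce(unsigned rawLimit, RegSet reserved,
                              const RegSet &fixed) {
  reserved.insert(fixed.begin(), fixed.end());
  unsigned total = static_cast<unsigned>(reserved.size());
  assert(rawLimit >= total);
  return rawLimit - total;
}
```

With a raw limit of 32, reserved registers {0, 1, 2} and fixed registers {2, 3}, the two-step version yields 27 while the union version yields 28, because register 2 is counted once instead of twice.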

wangpc-pp · 2024-12-09T04:15:31Z

I need more eyes on the test diffs of targets other than RISC-V! Please help identify any regressions!

topperc · 2024-12-09T07:07:47Z

Is there a compile time impact for this patch?

wangpc-pp · 2024-12-09T07:50:58Z

Is there a compile time impact for this patch?

This should increase compile time somewhat, but I don't know whether it is significant. The limits are cached in RegisterClassInfo::getRegPressureSetLimit, so nothing changes for users of that API; the direct users of TargetRegisterInfo::getRegPressureSetLimit calculate the limits only once, so this is not a performance-critical code path.
cc @nikic @dtcxzyw Can you help me measure the compile-time impact?

topperc · 2024-12-09T09:13:59Z

The limits aren’t cached for passes like MachineLICM, right? And they will be recomputed for each function? My understanding of RegisterClassInfo is that it maintains the cache across functions as long as they have the same subtarget, or something like that?

wangpc-pp · 2024-12-09T09:51:10Z

Yes, you are right! I will make these passes use RegisterClassInfo in follow-ups (Edit: done in #119194).
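To illustrate the caching point discussed above, here is a minimal standalone sketch of a thin cached wrapper. It is not LLVM code: `CachedLimits`, `getLimit`, and `computations` are hypothetical names, and the `std::function` callback is only a stand-in for the underlying TargetRegisterInfo query.

```cpp
#include <cassert>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical sketch: caches a per-pressure-set limit so the (possibly
// expensive) computation runs at most once per set index.
class CachedLimits {
  std::function<unsigned(unsigned)> Compute; // stand-in for the TRI query
  std::unordered_map<unsigned, unsigned> Cache;
  unsigned NumComputations = 0;

public:
  explicit CachedLimits(std::function<unsigned(unsigned)> Fn)
      : Compute(std::move(Fn)) {}

  // Return the cached limit, computing it on first use only.
  unsigned getLimit(unsigned PSet) {
    auto It = Cache.find(PSet);
    if (It != Cache.end())
      return It->second;
    ++NumComputations;
    unsigned Limit = Compute(PSet);
    Cache.emplace(PSet, Limit);
    return Limit;
  }

  // Number of times the underlying computation actually ran.
  unsigned computations() const { return NumComputations; }
};
```

A pass that calls the uncached query directly pays the computation on every call, whereas a wrapper along these lines pays it once per pressure set for as long as the cache stays valid.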

topperc · 2024-12-18T07:28:48Z

I am going to make RegClassInfo a pass so that we can avoid duplicated calculation in RegClassInfo.

Should we just rename the TRI function to discourage use and encourage everyone to use RegClassInfo?

It may be not that easy. There are still some cases that need direct use of TRI::getRegPressureSetLimit like some hooks in ARM/PPC:

llvm-project/llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
Lines 645 to 647 in d6e8ab1

  // For now we only care about float and double type fma.
  unsigned VSSRCLimit = TRI->getRegPressureSetLimit(
      *MBB->getParent(), PPC::RegisterPressureSets::VSSRC);

llvm-project/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
Lines 6980 to 6985 in d6e8ab1

  auto &P = RPTracker.getPressure().MaxSetPressure;
  for (unsigned I = 0, E = P.size(); I < E; ++I)
    if (P[I] > TRI->getRegPressureSetLimit(*MF, I)) {
      return true;
    }
  return false;

In which, we can't get RegClassInfo if it is a pass (for now, we can re-calculate RegClassInfo).

Can't the pass just be a wrapper around RegClassInfo? Why do we need to remove the ability to use RegClassInfo as a utility?

wangpc-pp · 2024-12-18T07:32:52Z

Aha! I never thought about that! Good idea!
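The "pass as a wrapper around the utility" idea can be sketched roughly as follows. This is a standalone illustration, not the LLVM pass API: `LimitInfo`, `LimitInfoWrapperPass`, and their members are hypothetical, and a function is identified by a plain string for simplicity.

```cpp
#include <cassert>
#include <string>

// Hypothetical utility: usable directly from any hook that knows which
// function it is working on. It recomputes only when the function changes.
class LimitInfo {
  std::string CurrentFn;
  unsigned Recomputations = 0;

public:
  void runOnFunction(const std::string &FnName) {
    if (FnName == CurrentFn)
      return; // same function as last time: keep the cached state
    CurrentFn = FnName;
    ++Recomputations;
  }
  unsigned recomputations() const { return Recomputations; }
};

// Hypothetical pass-like wrapper: owns one LimitInfo and hands it out, so
// clients that can query the pass share a single instance.
class LimitInfoWrapperPass {
  LimitInfo Info;

public:
  bool runOnFunction(const std::string &FnName) {
    Info.runOnFunction(FnName);
    return false; // analysis only, nothing is modified
  }
  LimitInfo &getInfo() { return Info; }
};
```

With this shape, code that cannot reach the pass (such as the target hooks quoted above) can still construct the utility on its own, at the cost of recomputing.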

`RegisterClassInfo::getRegPressureSetLimit` is a wrapper of `TargetRegisterInfo::getRegPressureSetLimit` with some logics to adjust the limit by removing reserved registers. It seems that we shouldn't use `TargetRegisterInfo::getRegPressureSetLimit` directly, just like the comment "This limit must be adjusted dynamically for reserved registers" said. Separate from llvm#118787

wangpc-pp · 2025-01-09T13:27:33Z

Ping.

llvm/include/llvm/CodeGen/TargetRegisterInfo.h

topperc · 2025-01-10T06:02:07Z

Description needs to be updated if MachineLICM, MachineSink, MachinePipeliner have been migrated to RegisterClassInfo.

llvm/test/CodeGen/X86/unfold-masked-merge-vector-variablemask.ll

topperc · 2025-01-10T06:08:52Z

Do you know what caused the X86 changes? I don't see any uses of getRegPressureSetLimit in the X86 directory.

wangpc-pp · 2025-01-10T07:11:49Z

Just checked line by line, I have no idea why X86 has some changes...

wangpc-pp · 2025-01-10T08:59:32Z

The reason is subtle (and cost me a lot of debugging time): for some register classes, getRawAllocationOrder may return a different order chosen by OrderFunc (which is set by AltOrderSelect in TableGen). We calculated the number of reserved registers first, and then derived the number of allocatable registers from it. This overestimates the allocatable registers, because the alternative allocation orders may contain fewer registers.
If we instead calculate the number of allocatable registers directly and derive the number of reserved registers from it, the problem is solved.
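A small standalone sketch of why the two counting orders can disagree. Nothing here is LLVM code: `PhysReg`, `allocatableViaReserved`, and `allocatableDirect` are hypothetical names, and registers are plain integers.

```cpp
#include <cassert>
#include <set>
#include <vector>

using PhysReg = unsigned; // hypothetical stand-in for MCPhysReg

// "Reserved first": count the reserved members of the whole class, then
// subtract that from the class size.
unsigned allocatableViaReserved(const std::vector<PhysReg> &ClassRegs,
                                const std::set<PhysReg> &Reserved) {
  unsigned NumReserved = 0;
  for (PhysReg R : ClassRegs)
    if (Reserved.count(R))
      ++NumReserved;
  return static_cast<unsigned>(ClassRegs.size()) - NumReserved;
}

// "Allocatable first": walk the order actually selected for the function
// and count the registers that are not reserved.
unsigned allocatableDirect(const std::vector<PhysReg> &RawOrder,
                           const std::set<PhysReg> &Reserved) {
  unsigned N = 0;
  for (PhysReg R : RawOrder)
    if (!Reserved.count(R))
      ++N;
  return N;
}
```

For a made-up class {0, 1, 2, 3, 4, 5} with register 5 reserved and an alternative order {0, 1, 2, 3}, the first function returns 5 while the second returns 4. The two agree only when the selected order covers every member of the class.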

wangpc-pp · 2025-01-10T09:03:43Z

llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir

 # CHECK-DAG: AllocationOrder(G8RC_and_G8RC_NOX0) = [ $x3 $x4 $x5 $x6 $x7 $x8 $x9 $x10 $x11 $x12 $x2 $x31 $x30 $x29 $x28 $x27 $x26 $x25 $x24 $x23 $x22 $x21 $x20 $x19 $x18 $x17 $x16 $x15 $x14 ]
+# CHECK-DAG: AllocationOrder(G8RC) = [ $x3 $x4 $x5 $x6 $x7 $x8 $x9 $x10 $x11 $x12 $x0 $x2 $x31 $x30 $x29 $x28 $x27 $x26 $x25 $x24 $x23 $x22 $x21 $x20 $x19 $x18 $x17 $x16 $x15 $x14 ]


These PPC changes appear only because we now take a different code path, so the dumps differ.

…it` (#119830) `RegisterClassInfo::getRegPressureSetLimit` is a wrapper of `TargetRegisterInfo::getRegPressureSetLimit` with some logics to adjust the limit by removing reserved registers. It seems that we shouldn't use `TargetRegisterInfo::getRegPressureSetLimit` directly, just like the comment "This limit must be adjusted dynamically for reserved registers" said. Separate from llvm/llvm-project#118787

…etLimit` (#119827) `RegisterClassInfo::getRegPressureSetLimit` is a wrapper of `TargetRegisterInfo::getRegPressureSetLimit` with some logics to adjust the limit by removing reserved registers. It seems that we shouldn't use `TargetRegisterInfo::getRegPressureSetLimit` directly, just like the comment "This limit must be adjusted dynamically for reserved registers" said. Thus we should use `RegisterClassInfo::getRegPressureSetLimit` and remove replicated code. Separate from llvm/llvm-project#118787

…0377) `RegisterClassInfo::getRegPressureSetLimit` is a wrapper of `TargetRegisterInfo::getRegPressureSetLimit` with some logics to adjust the limit by removing reserved registers. It seems that we shouldn't use `TargetRegisterInfo::getRegPressureSetLimit` directly, just like the comment "This limit must be adjusted dynamically for reserved registers" said. Separate from llvm/llvm-project#118787

…(#120383) `RegisterClassInfo::getRegPressureSetLimit` is a wrapper of `TargetRegisterInfo::getRegPressureSetLimit` with some logics to adjust the limit by removing reserved registers. It seems that we shouldn't use `TargetRegisterInfo::getRegPressureSetLimit` directly, just like the comment "This limit must be adjusted dynamically for reserved registers" said. Separate from llvm/llvm-project#118787

…it` (#119826) `RegisterClassInfo::getRegPressureSetLimit` is a wrapper of `TargetRegisterInfo::getRegPressureSetLimit` with some logics to adjust the limit by removing reserved registers. It seems that we shouldn't use `TargetRegisterInfo::getRegPressureSetLimit` directly, just like the comment "This limit must be adjusted dynamically for reserved registers" said. Separate from llvm/llvm-project#118787

[𝘀𝗽𝗿] initial version (2ec06c7)

llvmbot added backend:PowerPC backend:X86 tablegen llvm:regalloc backend:NVPTX backend:loongarch labels Dec 5, 2024

wangpc-pp requested a review from rnk December 5, 2024 11:00

wangpc-pp requested review from topperc, wangleiat, lenary, arsenm and nikic December 5, 2024 11:00

wangpc-pp mentioned this pull request Dec 5, 2024: [RISCV] Correct the limit of RegPresureSet GPRAll #118473 (Closed)

lenary reviewed Dec 5, 2024

Use MCPhysReg (fae615d)

topperc reviewed Dec 5, 2024

arsenm reviewed Dec 5, 2024

wangpc-pp added 2 commits December 6, 2024 11:15:
Add new line (723597a)
Add a comment to suggest the user to not use TargetRegisterInfo::getRegPressureSetLimit directly (5adec35)

arsenm approved these changes Dec 6, 2024

michaelmaitland reviewed Dec 6, 2024

wangpc-pp commented Dec 9, 2024

wangpc-pp mentioned this pull request Dec 18, 2024: [ARM] Use RegisterClassInfo::getRegPressureSetLimit #120377 (Merged)

wangpc-pp mentioned this pull request Dec 18, 2024: [PowerPC] Use RegisterClassInfo::getRegPressureSetLimit #120383 (Merged)

Rebase (f5de3e9)

arsenm reviewed Jan 9, 2025

arsenm approved these changes Jan 10, 2025

topperc reviewed Jan 10, 2025

Fix X86 regrssions (47c3275)

wangpc-pp commented Jan 10, 2025

wangpc-pp mentioned this pull request Jan 10, 2025: [CodeGen] Add MachineRegisterClassInfo analysis pass #120690 (Open)
