[TRI] Remove reserved registers in getRegPressureSetLimit #118787

Open
wants to merge 8 commits into
base: users/wangpc-pp/spr/main.tri-remove-reserved-registers-in-getregpressuresetlimit
from

wangpc-pp
Contributor

@wangpc-pp wangpc-pp commented Dec 5, 2024

There are two getRegPressureSetLimit functions:

  1. RegisterClassInfo::getRegPressureSetLimit.
  2. TargetRegisterInfo::getRegPressureSetLimit.

RegisterClassInfo::getRegPressureSetLimit is a wrapper around
TargetRegisterInfo::getRegPressureSetLimit with some logic to
adjust the limit by removing reserved registers.

It seems that we shouldn't use TargetRegisterInfo::getRegPressureSetLimit
directly, as the comment "This limit must be adjusted
dynamically for reserved registers" says.

Having two getRegPressureSetLimit functions is messy and easily
confuses users. So here we move the logic that adjusts these limits for
reserved registers from RegisterClassInfo::getRegPressureSetLimit
into TargetRegisterInfo::getRegPressureSetLimit. This makes the former
a thin cached wrapper around the latter.
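
As a rough illustration of the adjustment being moved, the sketch below models it on toy data. ToyRegClass and adjustedLimit are hypothetical names used only for this example, not LLVM APIs. The arithmetic follows the patch: subtract the register weight once per reserved register, and fall back to the raw limit when every register in the class is reserved.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a register class: the registers it contains
// and how many units of pressure each register is worth.
struct ToyRegClass {
  std::vector<unsigned> Regs; // physical register ids in the class
  unsigned RegWeight;         // pressure units per register
};

// Adjust a raw pressure-set limit for reserved registers.
// Reserved[R] is true when register R may not be allocated.
unsigned adjustedLimit(const ToyRegClass &RC, const std::vector<bool> &Reserved,
                       unsigned RawLimit) {
  unsigned NReserved = 0;
  for (unsigned Reg : RC.Regs) {
    assert(Reg < Reserved.size() && "register id out of range");
    if (Reserved[Reg])
      ++NReserved;
  }
  // If every register is reserved, keep the raw limit so the result
  // stays non-zero.
  if (NReserved == RC.Regs.size())
    return RawLimit;
  assert(RawLimit >= RC.RegWeight * NReserved && "limit would underflow");
  return RawLimit - RC.RegWeight * NReserved;
}
```

With four registers of weight 1, a raw limit of 4, and two registers reserved, this toy function returns 2; with all four reserved it returns the raw limit of 4.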

Created using spr 1.3.6-beta.1
@llvmbot
Member

llvmbot commented Dec 5, 2024

@llvm/pr-subscribers-llvm-globalisel
@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-backend-arm
@llvm/pr-subscribers-backend-nvptx
@llvm/pr-subscribers-backend-loongarch
@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-llvm-regalloc

@llvm/pr-subscribers-backend-powerpc

Author: Pengcheng Wang (wangpc-pp)

Changes

There are two getRegPressureSetLimit functions:

  1. RegisterClassInfo::getRegPressureSetLimit.
  2. TargetRegisterInfo::getRegPressureSetLimit.

RegisterClassInfo::getRegPressureSetLimit is a wrapper around
TargetRegisterInfo::getRegPressureSetLimit with some logic to
adjust the limit by removing reserved registers.

It seems that we shouldn't use TargetRegisterInfo::getRegPressureSetLimit
directly, as the comment "This limit must be adjusted
dynamically for reserved registers" says.

However, some passes use it directly, for example MachineLICM,
MachineSink, and MachinePipeliner. In these passes, the register
pressure set limits are not adjusted for reserved registers, which
means that the limits are larger than the actual limits.

Having two getRegPressureSetLimit functions is messy and easily
confuses users. So here we move the logic that adjusts these limits for
reserved registers from RegisterClassInfo::getRegPressureSetLimit
into TargetRegisterInfo::getRegPressureSetLimit. This makes the former
a thin cached wrapper around the latter.

This change also helps reduce the number of spills/reloads.

Here are the RISC-V spill/reload statistics on llvm-test-suite
with -O3 -march=rva23u64:

Metric: regalloc.NumSpills,regalloc.NumReloads

Program                                       regalloc.NumSpills                  regalloc.NumReloads
                                              baseline           after    diff    baseline            after    diff
External/S...T2017speed/602.gcc_s/602.gcc_s   11811.00           11349.00 -462.00 26812.00            25793.00 -1019.00
External/S...NT2017rate/502.gcc_r/502.gcc_r   11811.00           11349.00 -462.00 26812.00            25793.00 -1019.00
External/S...te/526.blender_r/526.blender_r   13513.00           13251.00 -262.00 27462.00            27195.00  -267.00
SingleSour...nchmarks/Adobe-C++/loop_unroll    1533.00            1413.00 -120.00  2943.00             2633.00  -310.00
External/S...00.perlbench_s/600.perlbench_s    4398.00            4280.00 -118.00  9745.00             9466.00  -279.00
External/S...00.perlbench_r/500.perlbench_r    4398.00            4280.00 -118.00  9745.00             9466.00  -279.00
External/S...rate/510.parest_r/510.parest_r   43985.00           43888.00  -97.00 87407.00            87330.00   -77.00
MultiSourc...sumer-typeset/consumer-typeset    1222.00            1129.00  -93.00  3048.00             2887.00  -161.00
External/S...ed/638.imagick_s/638.imagick_s    4155.00            4064.00  -91.00 10556.00            10463.00   -93.00
External/S...te/538.imagick_r/538.imagick_r    4155.00            4064.00  -91.00 10556.00            10463.00   -93.00
External/S...rate/511.povray_r/511.povray_r    1734.00            1657.00  -77.00  3410.00             3290.00  -120.00
MultiSourc...e/Applications/ClamAV/clamscan    2120.00            2049.00  -71.00  5041.00             4994.00   -47.00
External/S...23.xalancbmk_s/623.xalancbmk_s    1664.00            1608.00  -56.00  2758.00             2663.00   -95.00
External/S...23.xalancbmk_r/523.xalancbmk_r    1664.00            1608.00  -56.00  2758.00             2663.00   -95.00
MultiSource/Applications/SPASS/SPASS           1442.00            1388.00  -54.00  2954.00             2849.00  -105.00
      regalloc.NumSpills                    regalloc.NumReloads
run       baseline       after        diff      baseline       after        diff
mean     86.864054   85.415094   -1.448960    173.354136  170.657475    -2.69666

Patch is 163.45 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/118787.diff

26 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/RegisterClassInfo.h (+1-6)
  • (modified) llvm/include/llvm/CodeGen/TargetRegisterInfo.h (+7-2)
  • (modified) llvm/lib/CodeGen/MachinePipeliner.cpp (-41)
  • (modified) llvm/lib/CodeGen/RegisterClassInfo.cpp (-37)
  • (modified) llvm/lib/CodeGen/TargetRegisterInfo.cpp (+44)
  • (modified) llvm/test/CodeGen/LoongArch/jr-without-ra.ll (+56-56)
  • (modified) llvm/test/CodeGen/NVPTX/misched_func_call.ll (+3-4)
  • (modified) llvm/test/CodeGen/PowerPC/aix-csr-alloc.mir (-1)
  • (modified) llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir (-1)
  • (modified) llvm/test/CodeGen/PowerPC/compute-regpressure.ll (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vxrm-insert-out-of-loop.ll (+3-2)
  • (modified) llvm/test/CodeGen/Thumb2/mve-blockplacement.ll (+61-63)
  • (modified) llvm/test/CodeGen/Thumb2/mve-gather-increment.ll (+383-405)
  • (modified) llvm/test/CodeGen/Thumb2/mve-gather-scatter-optimisation.ll (+70-70)
  • (modified) llvm/test/CodeGen/Thumb2/mve-pipelineloops.ll (+32-43)
  • (modified) llvm/test/CodeGen/X86/avx512-regcall-Mask.ll (+2-2)
  • (modified) llvm/test/CodeGen/X86/avx512-regcall-NoMask.ll (+4-4)
  • (modified) llvm/test/CodeGen/X86/sse-regcall.ll (+4-4)
  • (modified) llvm/test/CodeGen/X86/sse-regcall4.ll (+4-4)
  • (modified) llvm/test/CodeGen/X86/subvectorwise-store-of-vector-splat.ll (+169-166)
  • (modified) llvm/test/CodeGen/X86/unfold-masked-merge-vector-variablemask.ll (+294-262)
  • (modified) llvm/test/CodeGen/X86/x86-64-flags-intrinsics.ll (+8-8)
  • (modified) llvm/test/TableGen/bare-minimum-psets.td (+1-1)
  • (modified) llvm/test/TableGen/inhibit-pset.td (+1-1)
  • (modified) llvm/unittests/CodeGen/MFCommon.inc (+2-2)
  • (modified) llvm/utils/TableGen/RegisterInfoEmitter.cpp (+4-3)
diff --git a/llvm/include/llvm/CodeGen/RegisterClassInfo.h b/llvm/include/llvm/CodeGen/RegisterClassInfo.h
index 800bebea0dddb0..417a1e40d02b95 100644
--- a/llvm/include/llvm/CodeGen/RegisterClassInfo.h
+++ b/llvm/include/llvm/CodeGen/RegisterClassInfo.h
@@ -141,16 +141,11 @@ class RegisterClassInfo {
   }
 
   /// Get the register unit limit for the given pressure set index.
-  ///
-  /// RegisterClassInfo adjusts this limit for reserved registers.
   unsigned getRegPressureSetLimit(unsigned Idx) const {
     if (!PSetLimits[Idx])
-      PSetLimits[Idx] = computePSetLimit(Idx);
+      PSetLimits[Idx] = TRI->getRegPressureSetLimit(*MF, Idx);
     return PSetLimits[Idx];
   }
-
-protected:
-  unsigned computePSetLimit(unsigned Idx) const;
 };
 
 } // end namespace llvm
diff --git a/llvm/include/llvm/CodeGen/TargetRegisterInfo.h b/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
index 292fa3c94969be..f7cd7cfe1aa15b 100644
--- a/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
@@ -913,9 +913,14 @@ class TargetRegisterInfo : public MCRegisterInfo {
   virtual const char *getRegPressureSetName(unsigned Idx) const = 0;
 
   /// Get the register unit pressure limit for this dimension.
-  /// This limit must be adjusted dynamically for reserved registers.
+  /// TargetRegisterInfo adjusts this limit for reserved registers.
   virtual unsigned getRegPressureSetLimit(const MachineFunction &MF,
-                                          unsigned Idx) const = 0;
+                                          unsigned Idx) const;
+
+  /// Get the raw register unit pressure limit for this dimension.
+  /// This limit must be adjusted dynamically for reserved registers.
+  virtual unsigned getRawRegPressureSetLimit(const MachineFunction &MF,
+                                             unsigned Idx) const = 0;
 
   /// Get the dimensions of register pressure impacted by this register class.
   /// Returns a -1 terminated array of pressure set IDs.
diff --git a/llvm/lib/CodeGen/MachinePipeliner.cpp b/llvm/lib/CodeGen/MachinePipeliner.cpp
index 7a10bd39e2695d..3ee0ba1fea5079 100644
--- a/llvm/lib/CodeGen/MachinePipeliner.cpp
+++ b/llvm/lib/CodeGen/MachinePipeliner.cpp
@@ -1327,47 +1327,6 @@ class HighRegisterPressureDetector {
   void computePressureSetLimit(const RegisterClassInfo &RCI) {
     for (unsigned PSet = 0; PSet < PSetNum; PSet++)
       PressureSetLimit[PSet] = TRI->getRegPressureSetLimit(MF, PSet);
-
-    // We assume fixed registers, such as stack pointer, are already in use.
-    // Therefore subtracting the weight of the fixed registers from the limit of
-    // each pressure set in advance.
-    SmallDenseSet<Register, 8> FixedRegs;
-    for (const TargetRegisterClass *TRC : TRI->regclasses()) {
-      for (const MCPhysReg Reg : *TRC)
-        if (isFixedRegister(Reg))
-          FixedRegs.insert(Reg);
-    }
-
-    LLVM_DEBUG({
-      for (auto Reg : FixedRegs) {
-        dbgs() << printReg(Reg, TRI, 0, &MRI) << ": [";
-        for (MCRegUnit Unit : TRI->regunits(Reg)) {
-          const int *Sets = TRI->getRegUnitPressureSets(Unit);
-          for (; *Sets != -1; Sets++) {
-            dbgs() << TRI->getRegPressureSetName(*Sets) << ", ";
-          }
-        }
-        dbgs() << "]\n";
-      }
-    });
-
-    for (auto Reg : FixedRegs) {
-      LLVM_DEBUG(dbgs() << "fixed register: " << printReg(Reg, TRI, 0, &MRI)
-                        << "\n");
-      for (MCRegUnit Unit : TRI->regunits(Reg)) {
-        auto PSetIter = MRI.getPressureSets(Unit);
-        unsigned Weight = PSetIter.getWeight();
-        for (; PSetIter.isValid(); ++PSetIter) {
-          unsigned &Limit = PressureSetLimit[*PSetIter];
-          assert(
-              Limit >= Weight &&
-              "register pressure limit must be greater than or equal weight");
-          Limit -= Weight;
-          LLVM_DEBUG(dbgs() << "PSet=" << *PSetIter << " Limit=" << Limit
-                            << " (decreased by " << Weight << ")\n");
-        }
-      }
-    }
   }
 
   // There are two patterns of last-use.
diff --git a/llvm/lib/CodeGen/RegisterClassInfo.cpp b/llvm/lib/CodeGen/RegisterClassInfo.cpp
index 9312bc03bc522a..976d41a54da56f 100644
--- a/llvm/lib/CodeGen/RegisterClassInfo.cpp
+++ b/llvm/lib/CodeGen/RegisterClassInfo.cpp
@@ -195,40 +195,3 @@ void RegisterClassInfo::compute(const TargetRegisterClass *RC) const {
   // RCI is now up-to-date.
   RCI.Tag = Tag;
 }
-
-/// This is not accurate because two overlapping register sets may have some
-/// nonoverlapping reserved registers. However, computing the allocation order
-/// for all register classes would be too expensive.
-unsigned RegisterClassInfo::computePSetLimit(unsigned Idx) const {
-  const TargetRegisterClass *RC = nullptr;
-  unsigned NumRCUnits = 0;
-  for (const TargetRegisterClass *C : TRI->regclasses()) {
-    const int *PSetID = TRI->getRegClassPressureSets(C);
-    for (; *PSetID != -1; ++PSetID) {
-      if ((unsigned)*PSetID == Idx)
-        break;
-    }
-    if (*PSetID == -1)
-      continue;
-
-    // Found a register class that counts against this pressure set.
-    // For efficiency, only compute the set order for the largest set.
-    unsigned NUnits = TRI->getRegClassWeight(C).WeightLimit;
-    if (!RC || NUnits > NumRCUnits) {
-      RC = C;
-      NumRCUnits = NUnits;
-    }
-  }
-  assert(RC && "Failed to find register class");
-  compute(RC);
-  unsigned NAllocatableRegs = getNumAllocatableRegs(RC);
-  unsigned RegPressureSetLimit = TRI->getRegPressureSetLimit(*MF, Idx);
-  // If all the regs are reserved, return raw RegPressureSetLimit.
-  // One example is VRSAVERC in PowerPC.
-  // Avoid returning zero, getRegPressureSetLimit(Idx) assumes computePSetLimit
-  // return non-zero value.
-  if (NAllocatableRegs == 0)
-    return RegPressureSetLimit;
-  unsigned NReserved = RC->getNumRegs() - NAllocatableRegs;
-  return RegPressureSetLimit - TRI->getRegClassWeight(RC).RegWeight * NReserved;
-}
diff --git a/llvm/lib/CodeGen/TargetRegisterInfo.cpp b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
index 032f1a33e75c43..4cede283a7232c 100644
--- a/llvm/lib/CodeGen/TargetRegisterInfo.cpp
+++ b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
@@ -674,6 +674,50 @@ TargetRegisterInfo::prependOffsetExpression(const DIExpression *Expr,
                                       PrependFlags & DIExpression::EntryValue);
 }
 
+unsigned TargetRegisterInfo::getRegPressureSetLimit(const MachineFunction &MF,
+                                                    unsigned Idx) const {
+  const TargetRegisterClass *RC = nullptr;
+  unsigned NumRCUnits = 0;
+  for (const TargetRegisterClass *C : regclasses()) {
+    const int *PSetID = getRegClassPressureSets(C);
+    for (; *PSetID != -1; ++PSetID) {
+      if ((unsigned)*PSetID == Idx)
+        break;
+    }
+    if (*PSetID == -1)
+      continue;
+
+    // Found a register class that counts against this pressure set.
+    // For efficiency, only compute the set order for the largest set.
+    unsigned NUnits = getRegClassWeight(C).WeightLimit;
+    if (!RC || NUnits > NumRCUnits) {
+      RC = C;
+      NumRCUnits = NUnits;
+    }
+  }
+  assert(RC && "Failed to find register class");
+
+  unsigned NReserved = 0;
+  const BitVector Reserved = MF.getRegInfo().getReservedRegs();
+  for (unsigned PhysReg : RC->getRawAllocationOrder(MF))
+    if (Reserved.test(PhysReg))
+      NReserved++;
+
+  unsigned NAllocatableRegs = RC->getNumRegs() - NReserved;
+  unsigned RegPressureSetLimit = getRawRegPressureSetLimit(MF, Idx);
+  // If all the regs are reserved, return raw RegPressureSetLimit.
+  // One example is VRSAVERC in PowerPC.
+  // Avoid returning zero, RegisterClassInfo::getRegPressureSetLimit(Idx)
+  // assumes this returns non-zero value.
+  if (NAllocatableRegs == 0) {
+    LLVM_DEBUG({
+      dbgs() << "All registers of " << getRegClassName(RC) << " are reserved!";
+    });
+    return RegPressureSetLimit;
+  }
+  return RegPressureSetLimit - getRegClassWeight(RC).RegWeight * NReserved;
+}
+
 #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
 LLVM_DUMP_METHOD
 void TargetRegisterInfo::dumpReg(Register Reg, unsigned SubRegIndex,
diff --git a/llvm/test/CodeGen/LoongArch/jr-without-ra.ll b/llvm/test/CodeGen/LoongArch/jr-without-ra.ll
index d1c4459aaa6ee0..2bd89dacb2b37a 100644
--- a/llvm/test/CodeGen/LoongArch/jr-without-ra.ll
+++ b/llvm/test/CodeGen/LoongArch/jr-without-ra.ll
@@ -20,101 +20,101 @@ define void @jr_without_ra(ptr %rtwdev, ptr %chan, ptr %h2c, i8 %.pre, i1 %cmp.i
 ; CHECK-NEXT:    st.d $s6, $sp, 24 # 8-byte Folded Spill
 ; CHECK-NEXT:    st.d $s7, $sp, 16 # 8-byte Folded Spill
 ; CHECK-NEXT:    st.d $s8, $sp, 8 # 8-byte Folded Spill
-; CHECK-NEXT:    move $s7, $zero
-; CHECK-NEXT:    move $s0, $zero
+; CHECK-NEXT:    move $s6, $zero
+; CHECK-NEXT:    move $s1, $zero
 ; CHECK-NEXT:    ld.d $t0, $sp, 184
-; CHECK-NEXT:    ld.d $s2, $sp, 176
-; CHECK-NEXT:    ld.d $s1, $sp, 168
-; CHECK-NEXT:    ld.d $t1, $sp, 160
-; CHECK-NEXT:    ld.d $t2, $sp, 152
-; CHECK-NEXT:    ld.d $t3, $sp, 144
-; CHECK-NEXT:    ld.d $t4, $sp, 136
-; CHECK-NEXT:    ld.d $t5, $sp, 128
-; CHECK-NEXT:    ld.d $t6, $sp, 120
-; CHECK-NEXT:    ld.d $t7, $sp, 112
-; CHECK-NEXT:    ld.d $t8, $sp, 104
-; CHECK-NEXT:    ld.d $fp, $sp, 96
+; CHECK-NEXT:    ld.d $t1, $sp, 176
+; CHECK-NEXT:    ld.d $s2, $sp, 168
+; CHECK-NEXT:    ld.d $t2, $sp, 160
+; CHECK-NEXT:    ld.d $t3, $sp, 152
+; CHECK-NEXT:    ld.d $t4, $sp, 144
+; CHECK-NEXT:    ld.d $t5, $sp, 136
+; CHECK-NEXT:    ld.d $t6, $sp, 128
+; CHECK-NEXT:    ld.d $t7, $sp, 120
+; CHECK-NEXT:    ld.d $t8, $sp, 112
+; CHECK-NEXT:    ld.d $fp, $sp, 104
+; CHECK-NEXT:    ld.d $s0, $sp, 96
 ; CHECK-NEXT:    andi $a4, $a4, 1
-; CHECK-NEXT:    alsl.d $a6, $a6, $s1, 4
-; CHECK-NEXT:    pcalau12i $s1, %pc_hi20(.LJTI0_0)
-; CHECK-NEXT:    addi.d $s1, $s1, %pc_lo12(.LJTI0_0)
-; CHECK-NEXT:    slli.d $s3, $s2, 2
-; CHECK-NEXT:    alsl.d $s2, $s2, $s3, 1
-; CHECK-NEXT:    add.d $s2, $t5, $s2
-; CHECK-NEXT:    addi.w $s4, $zero, -41
+; CHECK-NEXT:    alsl.d $a6, $a6, $s2, 4
+; CHECK-NEXT:    pcalau12i $s2, %pc_hi20(.LJTI0_0)
+; CHECK-NEXT:    addi.d $s2, $s2, %pc_lo12(.LJTI0_0)
 ; CHECK-NEXT:    ori $s3, $zero, 1
-; CHECK-NEXT:    slli.d $s4, $s4, 3
-; CHECK-NEXT:    ori $s6, $zero, 3
-; CHECK-NEXT:    lu32i.d $s6, 262144
+; CHECK-NEXT:    ori $s4, $zero, 50
+; CHECK-NEXT:    ori $s5, $zero, 3
+; CHECK-NEXT:    lu32i.d $s5, 262144
 ; CHECK-NEXT:    b .LBB0_4
 ; CHECK-NEXT:    .p2align 4, , 16
 ; CHECK-NEXT:  .LBB0_1: # %sw.bb27.i.i
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    ori $s8, $zero, 1
+; CHECK-NEXT:    ori $s7, $zero, 1
 ; CHECK-NEXT:  .LBB0_2: # %if.else.i106
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    alsl.d $s5, $s0, $s0, 3
-; CHECK-NEXT:    alsl.d $s0, $s5, $s0, 1
-; CHECK-NEXT:    add.d $s0, $t0, $s0
-; CHECK-NEXT:    ldx.bu $s8, $s0, $s8
+; CHECK-NEXT:    alsl.d $s8, $s1, $s1, 3
+; CHECK-NEXT:    alsl.d $s1, $s8, $s1, 1
+; CHECK-NEXT:    add.d $s1, $t0, $s1
+; CHECK-NEXT:    ldx.bu $s7, $s1, $s7
 ; CHECK-NEXT:  .LBB0_3: # %phy_tssi_get_ofdm_de.exit
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    st.b $zero, $t5, 0
-; CHECK-NEXT:    st.b $s7, $t3, 0
-; CHECK-NEXT:    st.b $zero, $t8, 0
-; CHECK-NEXT:    st.b $zero, $t1, 0
-; CHECK-NEXT:    st.b $zero, $a1, 0
+; CHECK-NEXT:    st.b $zero, $t6, 0
+; CHECK-NEXT:    st.b $s6, $t4, 0
+; CHECK-NEXT:    st.b $zero, $fp, 0
 ; CHECK-NEXT:    st.b $zero, $t2, 0
-; CHECK-NEXT:    st.b $s8, $a5, 0
-; CHECK-NEXT:    ori $s0, $zero, 1
-; CHECK-NEXT:    move $s7, $a3
+; CHECK-NEXT:    st.b $zero, $a1, 0
+; CHECK-NEXT:    st.b $zero, $t3, 0
+; CHECK-NEXT:    st.b $s7, $a5, 0
+; CHECK-NEXT:    ori $s1, $zero, 1
+; CHECK-NEXT:    move $s6, $a3
 ; CHECK-NEXT:  .LBB0_4: # %for.body
 ; CHECK-NEXT:    # =>This Inner Loop Header: Depth=1
 ; CHECK-NEXT:    beqz $a4, .LBB0_9
 ; CHECK-NEXT:  # %bb.5: # %calc_6g.i
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    move $s7, $zero
+; CHECK-NEXT:    move $s6, $zero
 ; CHECK-NEXT:    bnez $zero, .LBB0_8
 ; CHECK-NEXT:  # %bb.6: # %calc_6g.i
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    slli.d $s8, $zero, 3
-; CHECK-NEXT:    ldx.d $s8, $s8, $s1
-; CHECK-NEXT:    jr $s8
+; CHECK-NEXT:    slli.d $s7, $zero, 3
+; CHECK-NEXT:    ldx.d $s7, $s7, $s2
+; CHECK-NEXT:    jr $s7
 ; CHECK-NEXT:  .LBB0_7: # %sw.bb12.i.i
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    ori $s7, $zero, 1
+; CHECK-NEXT:    ori $s6, $zero, 1
 ; CHECK-NEXT:  .LBB0_8: # %if.else58.i
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    ldx.bu $s7, $a6, $s7
+; CHECK-NEXT:    ldx.bu $s6, $a6, $s6
 ; CHECK-NEXT:    b .LBB0_11
 ; CHECK-NEXT:    .p2align 4, , 16
 ; CHECK-NEXT:  .LBB0_9: # %if.end.i
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    andi $s7, $s7, 255
-; CHECK-NEXT:    ori $s5, $zero, 50
-; CHECK-NEXT:    bltu $s5, $s7, .LBB0_15
+; CHECK-NEXT:    andi $s6, $s6, 255
+; CHECK-NEXT:    bltu $s4, $s6, .LBB0_15
 ; CHECK-NEXT:  # %bb.10: # %if.end.i
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    sll.d $s7, $s3, $s7
-; CHECK-NEXT:    and $s8, $s7, $s6
-; CHECK-NEXT:    move $s7, $fp
-; CHECK-NEXT:    beqz $s8, .LBB0_15
+; CHECK-NEXT:    sll.d $s6, $s3, $s6
+; CHECK-NEXT:    and $s7, $s6, $s5
+; CHECK-NEXT:    move $s6, $s0
+; CHECK-NEXT:    beqz $s7, .LBB0_15
 ; CHECK-NEXT:  .LBB0_11: # %phy_tssi_get_ofdm_trim_de.exit
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT:    move $s8, $zero
-; CHECK-NEXT:    st.b $zero, $t7, 0
-; CHECK-NEXT:    ldx.b $ra, $s2, $t4
+; CHECK-NEXT:    move $s7, $zero
+; CHECK-NEXT:    st.b $zero, $t8, 0
+; CHECK-NEXT:    slli.d $s8, $t1, 2
+; CHECK-NEXT:    alsl.d $s8, $t1, $s8, 1
+; CHECK-NEXT:    add.d $s8, $t6, $s8
+; CHECK-NEXT:    ldx.b $s8, $s8, $t5
 ; CHECK-NEXT:    st.b $zero, $a2, 0
 ; CHECK-NEXT:    st.b $zero, $a7, 0
-; CHECK-NEXT:    st.b $zero, $t6, 0
-; CHECK-NEXT:    st.b $ra, $a0, 0
+; CHECK-NEXT:    st.b $zero, $t7, 0
+; CHECK-NEXT:    st.b $s8, $a0, 0
 ; CHECK-NEXT:    bnez $s3, .LBB0_13
 ; CHECK-NEXT:  # %bb.12: # %phy_tssi_get_ofdm_trim_de.exit
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
+; CHECK-NEXT:    addi.w $s8, $zero, -41
+; CHECK-NEXT:    slli.d $s8, $s8, 3
 ; CHECK-NEXT:    pcalau12i $ra, %pc_hi20(.LJTI0_1)
 ; CHECK-NEXT:    addi.d $ra, $ra, %pc_lo12(.LJTI0_1)
-; CHECK-NEXT:    ldx.d $s5, $s4, $ra
-; CHECK-NEXT:    jr $s5
+; CHECK-NEXT:    ldx.d $s8, $s8, $ra
+; CHECK-NEXT:    jr $s8
 ; CHECK-NEXT:  .LBB0_13: # %phy_tssi_get_ofdm_trim_de.exit
 ; CHECK-NEXT:    # in Loop: Header=BB0_4 Depth=1
 ; CHECK-NEXT:    bnez $s3, .LBB0_1
diff --git a/llvm/test/CodeGen/NVPTX/misched_func_call.ll b/llvm/test/CodeGen/NVPTX/misched_func_call.ll
index e036753ce90306..ee6b5869111c6f 100644
--- a/llvm/test/CodeGen/NVPTX/misched_func_call.ll
+++ b/llvm/test/CodeGen/NVPTX/misched_func_call.ll
@@ -17,7 +17,6 @@ define ptx_kernel void @wombat(i32 %arg, i32 %arg1, i32 %arg2) {
 ; CHECK-NEXT:    ld.param.u32 %r2, [wombat_param_0];
 ; CHECK-NEXT:    mov.b32 %r10, 0;
 ; CHECK-NEXT:    mov.u64 %rd1, 0;
-; CHECK-NEXT:    mov.b32 %r6, 1;
 ; CHECK-NEXT:  $L__BB0_1: // %bb3
 ; CHECK-NEXT:    // =>This Inner Loop Header: Depth=1
 ; CHECK-NEXT:    { // callseq 0, 0
@@ -29,16 +28,16 @@ define ptx_kernel void @wombat(i32 %arg, i32 %arg1, i32 %arg2) {
 ; CHECK-NEXT:    (
 ; CHECK-NEXT:    param0
 ; CHECK-NEXT:    );
+; CHECK-NEXT:    ld.param.f64 %fd1, [retval0];
+; CHECK-NEXT:    } // callseq 0
 ; CHECK-NEXT:    mul.lo.s32 %r7, %r10, %r3;
 ; CHECK-NEXT:    or.b32 %r8, %r4, %r7;
 ; CHECK-NEXT:    mul.lo.s32 %r9, %r2, %r8;
 ; CHECK-NEXT:    cvt.rn.f64.s32 %fd3, %r9;
-; CHECK-NEXT:    ld.param.f64 %fd1, [retval0];
-; CHECK-NEXT:    } // callseq 0
 ; CHECK-NEXT:    cvt.rn.f64.u32 %fd4, %r10;
 ; CHECK-NEXT:    add.rn.f64 %fd5, %fd4, %fd3;
 ; CHECK-NEXT:    st.global.f64 [%rd1], %fd5;
-; CHECK-NEXT:    mov.u32 %r10, %r6;
+; CHECK-NEXT:    mov.b32 %r10, 1;
 ; CHECK-NEXT:    bra.uni $L__BB0_1;
 bb:
   br label %bb3
diff --git a/llvm/test/CodeGen/PowerPC/aix-csr-alloc.mir b/llvm/test/CodeGen/PowerPC/aix-csr-alloc.mir
index fba410dc0dafce..7c8a5848b402f4 100644
--- a/llvm/test/CodeGen/PowerPC/aix-csr-alloc.mir
+++ b/llvm/test/CodeGen/PowerPC/aix-csr-alloc.mir
@@ -17,5 +17,4 @@ body: |
 ...
 
 # CHECK-DAG: AllocationOrder(GPRC) = [ $r3 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $r12 $r0 $r31 $r30 $r29 $r28 $r27 $r26 $r25 $r24 $r23 $r22 $r21 $r20 $r19 $r18 $r17 $r16 $r15 $r14 $r13 ]
-# CHECK-DAG: AllocationOrder(F4RC) = [ $f0 $f1 $f2 $f3 $f4 $f5 $f6 $f7 $f8 $f9 $f10 $f11 $f12 $f13 $f31 $f30 $f29 $f28 $f27 $f26 $f25 $f24 $f23 $f22 $f21 $f20 $f19 $f18 $f17 $f16 $f15 $f14 ]
 # CHECK-DAG: AllocationOrder(GPRC_and_GPRC_NOR0) = [ $r3 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $r12 $r31 $r30 $r29 $r28 $r27 $r26 $r25 $r24 $r23 $r22 $r21 $r20 $r19 $r18 $r17 $r16 $r15 $r14 $r13 ]
diff --git a/llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir b/llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir
index 584b6b0ad46dd9..3617b95b2a6af7 100644
--- a/llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir
+++ b/llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir
@@ -16,6 +16,5 @@ body: |
     $f1 = COPY %2
     BLR8 implicit $lr8, implicit undef $rm, implicit $x3, implicit $f1
 ...
-# CHECK-DAG: AllocationOrder(VFRC) = [ $vf2 $vf3 $vf4 $vf5 $vf0 $vf1 $vf6 $vf7 $vf8 $vf9 $vf10 $vf11 $vf12 $vf13 $vf14 $vf15 $vf16 $vf17 $vf18 $vf19 $vf31 $vf30 $vf29 $vf28 $vf27 $vf26 $vf25 $vf24 $vf23 $vf22 $vf21 $vf20 ]
 # CHECK-DAG: AllocationOrder(G8RC_and_G8RC_NOX0) = [ $x3 $x4 $x5 $x6 $x7 $x8 $x9 $x10 $x11 $x12 $x2 $x31 $x30 $x29 $x28 $x27 $x26 $x25 $x24 $x23 $x22 $x21 $x20 $x19 $x18 $x17 $x16 $x15 $x14 ]
 # CHECK-DAG: AllocationOrder(F8RC) = [ $f0 $f1 $f2 $f3 $f4 $f5 $f6 $f7 $f8 $f9 $f10 $f11 $f12 $f13 $f31 $f30 $f29 $f28 $f27 $f26 $f25 $f24 $f23 $f22 $f21 $f20 $f19 $f18 $f17 $f16 $f15 $f14 ]
diff --git a/llvm/test/CodeGen/PowerPC/compute-regpressure.ll b/llvm/test/CodeGen/PowerPC/compute-regpressure.ll
index 9a1b057c2e38d4..9d893b8dbebee2 100644
--- a/llvm/test/CodeGen/PowerPC/compute-regpressure.ll
+++ b/llvm/test/CodeGen/PowerPC/compute-regpressure.ll
@@ -1,7 +1,7 @@
 ; REQUIRES: asserts
-; RUN: llc -debug-only=regalloc < %s 2>&1 |FileCheck %s --check-prefix=DEBUG
+; RUN: llc -debug-only=target-reg-info < %s 2>&1 |FileCheck %s --check-prefix=DEBUG
 
-; DEBUG-COUNT-1:         AllocationOrder(VRSAVERC) = [ ]
+; DEBUG-COUNT-1: All registers of VRSAVERC are reserved!
 
 target triple = "powerpc64le-unknown-linux-gnu"
 
diff --git a/llvm/test/CodeGen/RISCV/rvv/vxrm-insert-out-of-loop.ll b/llvm/test/CodeGen/RISCV/rvv/vxrm-insert-out-of-loop.ll
index c35f05be304cce..ec2448cb3965f3 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vxrm-insert-out-of-loop.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vxrm-insert-out-of-loop.ll
@@ -489,8 +489,9 @@ define void @test1(ptr nocapture noundef writeonly %dst, i32 noundef signext %i_
 ; RV64-NEXT:    j .LBB0_11
 ; RV64-NEXT:  .LBB0_8: # %vector.ph
 ; RV64-NEXT:    # in Loop: Header=BB0_6 Depth=1
-; RV64-NEXT:    slli t6, t0, 28
-; RV64-NEXT:    sub t6, t6, t1
+; RV64-NEXT:    slli t6, t0, 1
+; RV64-NEXT:    slli s0, t0, 28
+; RV64-NEXT:    sub t6, s0, t6
 ; RV64-NEXT:    and t6, t6, a6
 ; RV64-NEXT:    csrwi vxrm, 0
 ; RV64-NEXT:    mv s0, a2
diff --git a/llvm/test/CodeGen/Thumb2/mve-blockplacement.ll b/llvm/test/CodeGen/Thumb2/mve-blockplacement.ll
index 7087041e8dace6..6d082802f9cd75 100644
--- a/llvm/test/CodeGen/Thumb2/mve-blockplacement.ll
+++ b/llvm/test/CodeGen/Thumb2/mve-blockplacement.ll
@@ -353,8 +353,8 @@ define i32 @d(i64 %e, i32 %f, i64 %g, i32 %h) {
 ; CHECK-NEXT:    push.w {r4, r5, r6, r7, r8, r9, r10, r11, lr}
 ; CHECK-NEXT:    .pad #4
 ; CHECK-NEXT:    sub sp, #4
-; CHECK-NEXT:    .vsave {d8, d9, d10, d11, d12, d13, d14, d15}
-; CHECK-NEXT:    vpush {d8, d9, d10, d11, d12, d13, d14, d15}
+; CHECK-NEXT:    .vsave {d8, d9, d10, d11, d12, d13}
+; CHECK-NEXT:    vpush {d8, d9, d10, d11, d12, d13}
 ; CHECK-NEXT:    .pad #16
 ; CHECK-NEXT:    sub sp, #16
 ; CHECK-NEXT:    mov lr, r0
@@ -364,50 +364,48 @@ define i32 @d(i64 %e, i32 %f, i64 %g, i32 %h) {
 ; CHECK-NEXT:  @ %bb.1: @ %for.cond2.preheader.lr.ph
 ; CHECK-NEXT:    movs r0, #1
 ; CHECK-NEXT:    cmp r2, #1
-; CHECK-NEXT:    csel r7, r2, r0, lt
+; CHECK-NEXT:    csel r3, r2, r0, lt
 ; CHECK-NEXT:    mov r12, r1
-; CHECK-NEXT:    mov r1, r7
-; CHECK-NEXT:    cmp r7, #3
+; CHECK-NEXT:    m...
[truncated]

Member

@lenary lenary left a comment

This looks right. Two small changes below.

Do you know why this doesn't have as many changes to the RISC-V tests as the RISC-V-specific patch?

@wangpc-pp
Contributor Author

wangpc-pp commented Dec 5, 2024

Do you know why this doesn't have as many changes to the RISC-V tests as the RISC-V-specific patch?

I did a rough investigation. I think it is because we only changed the limit of GPRAll in that RISC-V-specific patch, but there are several overlapping register pressure sets (GPRAll, SP, GPRTC, etc.), and we didn't remove reserved registers for those other sets.
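
To make the point about overlapping pressure sets concrete, here is a toy model (the type alias, function name, and set indices are all hypothetical, and this is not how the patch itself computes limits). It assumes each reserved register counts one unit against every pressure set it belongs to, so adjusting only one set leaves the others too high.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical model: each reserved register id maps to the indices of the
// pressure sets it counts against.
using PSetMembership = std::map<unsigned, std::vector<unsigned>>;

// Subtract one unit from every pressure set that a reserved register
// belongs to. Limits is taken by value so the caller's copy is untouched.
std::vector<unsigned> adjustAllSets(std::vector<unsigned> Limits,
                                    const PSetMembership &ReservedRegs) {
  for (const auto &Entry : ReservedRegs) {
    for (unsigned PSet : Entry.second) {
      assert(PSet < Limits.size() && "pressure set index out of range");
      assert(Limits[PSet] > 0 && "limit would underflow");
      --Limits[PSet];
    }
  }
  return Limits;
}
```

In this model, a reserved register that belongs to sets 0 and 1 lowers both limits, whereas a target-specific fix that only touches set 0 would leave set 1 unadjusted.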

…egPressureSetLimit directly

Created using spr 1.3.6-beta.1

unsigned NReserved = 0;
const BitVector Reserved = MF.getRegInfo().getReservedRegs();
for (MCPhysReg PhysReg : RC->getRawAllocationOrder(MF))
Contributor

There's still the pre-existing bug where the pressure number is wrong for overlapping registers in the allocation order; it should probably report the number of distinct allocatable registers.

Contributor Author

I already have a WIP patch that generates the mapping from pressure set to RegisterClass; I will post it when it's ready.

}
}
assert(RC && "Failed to find register class");

Contributor

What is the reason we drop the call to compute from computePSetLimit here? It looks like we're doing this instead:

unsigned NReserved = 0;
const BitVector Reserved = MF.getRegInfo().getReservedRegs();
for (MCPhysReg PhysReg : RC->getRawAllocationOrder(MF))
    if (Reserved.test(PhysReg))
      NReserved++;

It looks like this is related to what is in compute, but we seem to drop some costing logic.

Contributor Author

@wangpc-pp wangpc-pp Dec 9, 2024

The old code called compute because it needs the number of allocatable registers, from which it derives the number of reserved registers. (I actually think the explicit call to compute is unnecessary: getNumAllocatableRegs is cached and calls compute through get(RC).) The relevant code is:

  // ...
  compute(RC);
  unsigned NAllocatableRegs = getNumAllocatableRegs(RC);
  // ...
  unsigned NReserved = RC->getNumRegs() - NAllocatableRegs;

  unsigned getNumAllocatableRegs(const TargetRegisterClass *RC) const {
    return get(RC).NumRegs;
  }

The logic of calculating NumRegs in compute is:

  // FIXME: Once targets reserve registers instead of removing them from the
  // allocation order, we can simply use begin/end here.
  ArrayRef<MCPhysReg> RawOrder = RC->getRawAllocationOrder(*MF);
  for (unsigned PhysReg : RawOrder) {
    // Remove reserved registers from the allocation order.
    if (Reserved.test(PhysReg))
      continue;
    uint8_t Cost = RegCosts[PhysReg];
    MinCost = std::min(MinCost, Cost);

    if (getLastCalleeSavedAlias(PhysReg) &&
        !STI.ignoreCSRForAllocationOrder(*MF, PhysReg))
      // PhysReg aliases a CSR, save it for later.
      CSRAlias.push_back(PhysReg);
    else {
      if (Cost != LastCost)
        LastCostChange = N;
      RCI.Order[N++] = PhysReg;
      LastCost = Cost;
    }
  }
  RCI.NumRegs = N + CSRAlias.size();

RCI.NumRegs is the total number of registers minus the number of reserved registers.
This patch instead counts the reserved registers first and derives the number of allocatable registers from that. I think the effect should be the same.
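
The equivalence claimed here can be checked on toy data. The two functions below are hypothetical illustrations, not LLVM code, and they assume the raw allocation order lists every register of the class exactly once. If a target's raw order omitted or repeated registers, the two counts could differ.

```cpp
#include <cassert>
#include <vector>

// Old route: count the registers that survive the reserved-register filter,
// then subtract that count from the class size.
unsigned reservedViaAllocatable(const std::vector<unsigned> &RawOrder,
                                const std::vector<bool> &Reserved) {
  unsigned NAllocatable = 0;
  for (unsigned Reg : RawOrder) {
    assert(Reg < Reserved.size() && "register id out of range");
    if (!Reserved[Reg])
      ++NAllocatable;
  }
  return static_cast<unsigned>(RawOrder.size()) - NAllocatable;
}

// New route: count the reserved registers directly.
unsigned reservedDirect(const std::vector<unsigned> &RawOrder,
                        const std::vector<bool> &Reserved) {
  unsigned NReserved = 0;
  for (unsigned Reg : RawOrder) {
    assert(Reg < Reserved.size() && "register id out of range");
    if (Reserved[Reg])
      ++NReserved;
  }
  return NReserved;
}
```

Under the stated assumption both functions return the same count for any input, which is the sense in which the effect is the same.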

@@ -1327,47 +1327,6 @@ class HighRegisterPressureDetector {
  void computePressureSetLimit(const RegisterClassInfo &RCI) {
    for (unsigned PSet = 0; PSet < PSetNum; PSet++)
      PressureSetLimit[PSet] = TRI->getRegPressureSetLimit(MF, PSet);

Contributor Author

Hi @kasuga-fj! What do you think about this part? I just removed the code below as it seems to be unnecessary now. Related patch: #74807

Contributor Author

And I think just removing fixed registers is not enough here.

Contributor

@kasuga-fj kasuga-fj Dec 10, 2024

Thanks for checking. I deliberately switched away from RegisterClassInfo::getRegPressureSetLimit earlier (in #87312), because the registers that RegisterClassInfo::getRegPressureSetLimit removes overlapped with the fixed registers mentioned here. If RegisterClassInfo::getRegPressureSetLimit now takes care of both, I don't think it's a problem to remove this part as you did.

@wangpc-pp
Contributor Author

I need more eyes on the test diffs for targets other than RISC-V! Please help identify any regressions!

@topperc
Collaborator

topperc commented Dec 9, 2024

Is there a compile time impact for this patch?

@wangpc-pp
Contributor Author

wangpc-pp commented Dec 9, 2024

Is there a compile time impact for this patch?

This should increase compile time somewhat, but I don't know whether it is significant. The limits are cached in RegisterClassInfo::getRegPressureSetLimit, so nothing changes for users of that API. The direct users of TargetRegisterInfo::getRegPressureSetLimit compute the limits only once, so this is not a performance-critical code path.
cc @nikic @dtcxzyw Can you help me measure the compile-time impact?

@topperc
Collaborator

topperc commented Dec 9, 2024

Is there a compile time impact for this patch?

This should increase compile time somewhat, but I don't know whether it is significant. The limits are cached in RegisterClassInfo::getRegPressureSetLimit, so nothing changes for users of that API. The direct users of TargetRegisterInfo::getRegPressureSetLimit compute the limits only once, so this is not a performance-critical code path. cc @nikic @dtcxzyw Can you help me measure the compile-time impact?

The limits aren’t cached for passes like MachineLICM, right? And they will be recomputed for each function? My understanding of RegisterClassInfo is that it maintains the cache across functions as long as they have the same subtarget, or something like that?

@wangpc-pp
Contributor Author

wangpc-pp commented Dec 9, 2024

Is there a compile time impact for this patch?

This should increase compile time somewhat, but I don't know whether it is significant. The limits are cached in RegisterClassInfo::getRegPressureSetLimit, so nothing changes for users of that API. The direct users of TargetRegisterInfo::getRegPressureSetLimit compute the limits only once, so this is not a performance-critical code path. cc @nikic @dtcxzyw Can you help me measure the compile-time impact?

The limits aren’t cached for passes like MachineLICM, right? And they will be recomputed for each function? My understanding of RegisterClassInfo is that it maintains the cache across functions as long as they have the same subtarget, or something like that?

Yes, you are right! I will make these passes use RegisterClassInfo in follow-ups (Edit: done in #119194).
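
For readers following the caching argument, here is a rough standalone sketch of what "a thin cached wrapper" means. It is plain C++, not the LLVM implementation: `CachedPSetLimits` and its callback are hypothetical, and the choice of `0` as the "not computed yet" marker is an assumption of this sketch.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical cache: computes each pressure-set limit at most once and
// serves later queries from the stored value.
class CachedPSetLimits {
  std::function<unsigned(unsigned)> Compute; // the expensive, uncached query
  std::vector<unsigned> Limits;              // 0 means "not computed yet"

public:
  CachedPSetLimits(unsigned NumPSets, std::function<unsigned(unsigned)> Fn)
      : Compute(std::move(Fn)), Limits(NumPSets, 0) {}

  unsigned get(unsigned Idx) {
    assert(Idx < Limits.size() && "pressure set index out of range");
    if (Limits[Idx] == 0)
      Limits[Idx] = Compute(Idx); // first query pays the cost
    return Limits[Idx];           // later queries are a plain lookup
  }
};
```

A pass that calls the uncached query directly pays the computation on every call, whereas a pass that goes through a wrapper like this pays it once per pressure set for as long as the cache stays valid. A limit that is legitimately zero would be recomputed each time in this sketch.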

@topperc
Collaborator

topperc commented Dec 18, 2024

I am going to make RegClassInfo a pass so that we can avoid duplicated calculation in RegClassInfo.

Should we just rename the TRI function to discourage use and encourage everyone to use RegClassInfo?

It may not be that easy. Some cases still need to use TRI::getRegPressureSetLimit directly, such as some hooks in ARM/PPC:

  // For now we only care about float and double type fma.
  unsigned VSSRCLimit = TRI->getRegPressureSetLimit(
      *MBB->getParent(), PPC::RegisterPressureSets::VSSRC);

  auto &P = RPTracker.getPressure().MaxSetPressure;
  for (unsigned I = 0, E = P.size(); I < E; ++I)
    if (P[I] > TRI->getRegPressureSetLimit(*MF, I)) {
      return true;
    }
  return false;

In these hooks, we can't get RegClassInfo if it is a pass (for now, we can recalculate RegClassInfo).

Can't the pass just be a wrapper around RegClassInfo? Why do we need to remove the ability to use RegClassInfo as a utility?

@wangpc-pp
Contributor Author

I am going to make RegClassInfo a pass so that we can avoid duplicated calculation in RegClassInfo.

Should we just rename the TRI function to discourage use and encourage everyone to use RegClassInfo?

It may not be that easy. Some cases still need to use TRI::getRegPressureSetLimit directly, such as some hooks in ARM/PPC:

  // For now we only care about float and double type fma.
  unsigned VSSRCLimit = TRI->getRegPressureSetLimit(
      *MBB->getParent(), PPC::RegisterPressureSets::VSSRC);

  auto &P = RPTracker.getPressure().MaxSetPressure;
  for (unsigned I = 0, E = P.size(); I < E; ++I)
    if (P[I] > TRI->getRegPressureSetLimit(*MF, I)) {
      return true;
    }
  return false;

In these hooks, we can't get RegClassInfo if it is a pass (for now, we can recalculate RegClassInfo).

Can't the pass just be a wrapper around RegClassInfo? Why do we need to remove the ability to use RegClassInfo as a utility?

Aha! I never thought about that! Good idea!
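
A minimal sketch of that idea, in plain C++ rather than LLVM's pass infrastructure. `LimitInfo` and `LimitInfoWrapperPass` are hypothetical names, and the arithmetic is deliberately simplified (one unit per reserved register). The only point is the shape: the pass owns a utility object, and the utility stays usable on its own.

```cpp
#include <cassert>

// Hypothetical utility: can be constructed and used directly by any code
// that has the required inputs, for example a target hook.
class LimitInfo {
  unsigned NumReserved = 0;

public:
  void recompute(unsigned Reserved) { NumReserved = Reserved; }

  unsigned getLimit(unsigned RawLimit) const {
    assert(NumReserved <= RawLimit && "limit would underflow");
    return RawLimit - NumReserved;
  }
};

// Hypothetical pass-style wrapper: owns one utility object, refreshes it once
// per function, and hands it out so that clients share a single computation.
class LimitInfoWrapperPass {
  LimitInfo Info;

public:
  bool runOnFunction(unsigned Reserved) {
    Info.recompute(Reserved);
    return false; // analysis only, nothing is modified
  }

  LimitInfo &getInfo() { return Info; }
};
```

Code that cannot reach the pass can still build a `LimitInfo` locally and recompute it, at the cost of duplicated work.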

wangpc-pp added a commit to wangpc-pp/llvm-project that referenced this pull request Dec 18, 2024
`RegisterClassInfo::getRegPressureSetLimit` is a wrapper of
`TargetRegisterInfo::getRegPressureSetLimit` with some logics to
adjust the limit by removing reserved registers.

It seems that we shouldn't use `TargetRegisterInfo::getRegPressureSetLimit`
directly, just like the comment "This limit must be adjusted
dynamically for reserved registers" said.

Separate from llvm#118787
wangpc-pp added a commit to wangpc-pp/llvm-project that referenced this pull request Dec 18, 2024
wangpc-pp added a commit that referenced this pull request Dec 20, 2024
wangpc-pp added a commit that referenced this pull request Jan 3, 2025
wangpc-pp added a commit to wangpc-pp/llvm-project that referenced this pull request Jan 9, 2025
wangpc-pp added a commit that referenced this pull request Jan 9, 2025
@wangpc-pp
Contributor Author

Ping.

@topperc
Collaborator

topperc commented Jan 10, 2025

The description needs to be updated if MachineLICM, MachineSink, and MachinePipeliner have been migrated to RegisterClassInfo.

@topperc
Collaborator

topperc commented Jan 10, 2025

Do you know what caused the X86 changes? I don't see any uses of getRegPressureSetLimit in the X86 directory.

@wangpc-pp
Contributor Author

Do you know what caused the X86 changes? I don't see any uses of getRegPressureSetLimit in the X86 directory.

Just checked line by line, I have no idea why X86 has some changes...

@wangpc-pp
Contributor Author

Do you know what caused the X86 changes? I don't see any uses of getRegPressureSetLimit in the X86 directory.

Just checked line by line, I have no idea why X86 has some changes...

The reason is subtle (and cost me a lot of debugging time...): for some register classes, getRawAllocationOrder may return different orders via OrderFunc (which is set by AltOrderSelect in TableGen). We calculated the number of reserved registers first and then derived the number of allocatable registers from it. This yields a higher number of allocatable registers, because the alternative allocation orders may contain fewer registers.
If we instead calculate the number of allocatable registers directly and derive the number of reserved registers from it, the problem is solved.
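
Here is a small standalone sketch of the discrepancy, in plain C++ with made-up register numbers. `RegClassWithAltOrder`, `allocatableFromOrder`, and `allocatableFromReservedCount` are hypothetical names; the sketch assumes an alternative order that simply omits some registers of the class and ignores everything else that compute does.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical register class whose selected allocation order may be a strict
// subset of the class, as an alternative order can be.
struct RegClassWithAltOrder {
  std::vector<unsigned> AllRegs;  // every register in the class
  std::vector<unsigned> RawOrder; // order selected for the current function
};

// Counts allocatable registers by walking the selected order.
unsigned allocatableFromOrder(const RegClassWithAltOrder &RC,
                              const std::set<unsigned> &Reserved) {
  unsigned N = 0;
  for (unsigned Reg : RC.RawOrder)
    if (!Reserved.count(Reg))
      ++N;
  return N;
}

// Counts reserved registers over the whole class first, then subtracts.
// Registers missing from the selected order are wrongly treated as allocatable.
unsigned allocatableFromReservedCount(const RegClassWithAltOrder &RC,
                                      const std::set<unsigned> &Reserved) {
  unsigned NReserved = 0;
  for (unsigned Reg : RC.AllRegs)
    if (Reserved.count(Reg))
      ++NReserved;
  assert(NReserved <= RC.AllRegs.size() && "cannot reserve more than exist");
  return static_cast<unsigned>(RC.AllRegs.size()) - NReserved;
}
```

With eight registers in the class, an order that lists only six of them, and one reserved register outside that order, the first function reports 6 allocatable registers and the second reports 7. That gap is the kind of difference that could shift a pressure limit.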

# CHECK-DAG: AllocationOrder(G8RC_and_G8RC_NOX0) = [ $x3 $x4 $x5 $x6 $x7 $x8 $x9 $x10 $x11 $x12 $x2 $x31 $x30 $x29 $x28 $x27 $x26 $x25 $x24 $x23 $x22 $x21 $x20 $x19 $x18 $x17 $x16 $x15 $x14 ]
# CHECK-DAG: AllocationOrder(G8RC) = [ $x3 $x4 $x5 $x6 $x7 $x8 $x9 $x10 $x11 $x12 $x0 $x2 $x31 $x30 $x29 $x28 $x27 $x26 $x25 $x24 $x23 $x22 $x21 $x20 $x19 $x18 $x17 $x16 $x15 $x14 ]
Contributor Author

These PPC changes appear only because we now take a different code path, so the dumps differ.

github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Jan 10, 2025
…it` (#119830)

`RegisterClassInfo::getRegPressureSetLimit` is a wrapper of
`TargetRegisterInfo::getRegPressureSetLimit` with some logics to
adjust the limit by removing reserved registers.

It seems that we shouldn't use
`TargetRegisterInfo::getRegPressureSetLimit`
directly, just like the comment "This limit must be adjusted
dynamically for reserved registers" said.

Separate from llvm/llvm-project#118787
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Jan 10, 2025
…etLimit` (#119827)

`RegisterClassInfo::getRegPressureSetLimit` is a wrapper of
`TargetRegisterInfo::getRegPressureSetLimit` with some logics to
adjust the limit by removing reserved registers.

It seems that we shouldn't use
`TargetRegisterInfo::getRegPressureSetLimit`
directly, just like the comment "This limit must be adjusted
dynamically for reserved registers" said.

Thus we should use `RegisterClassInfo::getRegPressureSetLimit` and
remove replicated code.

Separate from llvm/llvm-project#118787
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Jan 10, 2025
…0377)

github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Jan 10, 2025
…(#120383)

github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Jan 10, 2025
…it` (#119826)
