[SelectionDAG] Improve value type selection for inline asm within selected register class #135097

dlee992 · 2025-04-09T22:49:59Z

Summary

This patch improves value type selection for inline assembly when a specific register class is involved. It avoids generating imprecise load/store instructions and unnecessary type extensions or truncations. Previously, the first legal value type was always selected, but this patch chooses a more accurate type when possible.

Example (NVPTX)

In the NVPTX target, a new bf16 addition test previously generated ld.param.u16 instead of the expected ld.param.b16, due to always selecting the first value type in the list:

def Int16Regs : NVPTXRegClass<[i16, f16, bf16], 16, (add (sequence "RS%u", 0, 4))>;

Impact

This is a target-independent improvement and may benefit other targets in similar scenarios.

github-actions · 2025-04-09T22:50:21Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

llvmbot · 2025-04-09T22:50:53Z

@llvm/pr-subscribers-backend-nvptx

@llvm/pr-subscribers-llvm-selectiondag

Author: Da Li (李达) (dlee992)

Changes

Summary

This patch improves value type selection for inline assembly when a specific register class is involved. It avoids generating imprecise load/store instructions and unnecessary type extensions or truncations. Previously, the first legal value type was always selected, but this patch chooses a more accurate type when possible.

Example (NVPTX)

In the NVPTX target, a new bf16 addition test previously generated ld.param.u16 instead of the expected ld.param.b16, due to always selecting the first value type in the list:

def Int16Regs : NVPTXRegClass&lt;[i16, f16, bf16], 16, (add (sequence "RS%u", 0, 4))&gt;;

Impact

This is a target-independent improvement and may benefit other targets in similar scenarios.

Full diff: https://github.com/llvm/llvm-project/pull/135097.diff

2 Files Affected:

(modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+13-2)
(modified) llvm/test/CodeGen/NVPTX/bf16-instructions.ll (+17)

diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 89793c30f3710..528c8bd648332 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -9738,8 +9738,9 @@ getRegistersForValue(SelectionDAG &DAG, const SDLoc &DL,
   // register class, find it.
   unsigned AssignedReg;
   const TargetRegisterClass *RC;
+  const MVT RefValueVT = RefOpInfo.ConstraintVT;
   std::tie(AssignedReg, RC) = TLI.getRegForInlineAsmConstraint(
-      &TRI, RefOpInfo.ConstraintCode, RefOpInfo.ConstraintVT);
+      &TRI, RefOpInfo.ConstraintCode, RefValueVT);
   // RC is unset only on failure. Return immediately.
   if (!RC)
     return std::nullopt;
@@ -9747,7 +9748,17 @@ getRegistersForValue(SelectionDAG &DAG, const SDLoc &DL,
   // Get the actual register value type.  This is important, because the user
   // may have asked for (e.g.) the AX register in i32 type.  We need to
   // remember that AX is actually i16 to get the right extension.
-  const MVT RegVT = *TRI.legalclasstypes_begin(*RC);
+  MVT RegVT = *TRI.legalclasstypes_begin(*RC);
+
+  // If the reference value type is legal and belongs to the register class,
+  // use it instead of the first legal value type. This avoids generating
+  // inaccurate load/store instructions or unnecessary type extensions and
+  // truncations.
+  if (TLI.isTypeLegal(RefValueVT) &&
+      llvm::is_contained(llvm::make_range(TRI.legalclasstypes_begin(*RC),
+                                          TRI.legalclasstypes_end(*RC)),
+                         RefValueVT.SimpleTy))
+    RegVT = RefValueVT.SimpleTy;
 
   if (OpInfo.ConstraintVT != MVT::Other && RegVT != MVT::Untyped) {
     // If this is an FP operand in an integer register (or visa versa), or more
diff --git a/llvm/test/CodeGen/NVPTX/bf16-instructions.ll b/llvm/test/CodeGen/NVPTX/bf16-instructions.ll
index 9be54a746cacd..4db68f28b3c0a 100644
--- a/llvm/test/CodeGen/NVPTX/bf16-instructions.ll
+++ b/llvm/test/CodeGen/NVPTX/bf16-instructions.ll
@@ -81,6 +81,23 @@ define bfloat @test_fadd(bfloat %0, bfloat %1) {
   ret bfloat %3
 }
 
+define bfloat @test_fadd_inlineasm(bfloat %0, bfloat %1) nounwind {
+; SM90-LABEL: test_fadd_inlineasm(
+; SM90:       {
+; SM90-NEXT:    .reg .b16 %rs<4>;
+; SM90-EMPTY:
+; SM90-NEXT:  // %bb.0:
+; SM90-NEXT:    ld.param.b16 %rs2, [test_fadd_inlineasm_param_0];
+; SM90-NEXT:    ld.param.b16 %rs3, [test_fadd_inlineasm_param_1];
+; SM90-NEXT:    // begin inline asm
+; SM90-NEXT:    add.rn.bf16 %rs1, %rs2, %rs3
+; SM90-NEXT:    // end inline asm
+; SM90-NEXT:    st.param.b16 [func_retval0], %rs1;
+; SM90-NEXT:    ret;
+  %3 = tail call bfloat asm "add.rn.bf16 $0, $1, $2", "=h,h,h"(bfloat %0, bfloat %1) nounwind
+  ret bfloat %3
+}
+
 define bfloat @test_fsub(bfloat %0, bfloat %1) {
 ; SM70-LABEL: test_fsub(
 ; SM70:       {

arsenm · 2025-04-10T09:26:23Z

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

+      llvm::is_contained(llvm::make_range(TRI.legalclasstypes_begin(*RC),
+                                          TRI.legalclasstypes_end(*RC)),
+                         RefValueVT.SimpleTy))


This is reinventing RC->isTypeLegalForClass. It should also be redundant to check isTypeLegal. I'd assume this looks something like

if (!isTypeLegalForClass()) RegVT = *TRI.legalclasstypes_begin(*RC)

arsenm · 2025-04-10T09:27:18Z

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

@@ -9738,16 +9738,27 @@ getRegistersForValue(SelectionDAG &DAG, const SDLoc &DL,
  // register class, find it.
  unsigned AssignedReg;
  const TargetRegisterClass *RC;
+  const MVT RefValueVT = RefOpInfo.ConstraintVT;


Keep using ConstraintVT directly?

dlee992 · 2025-04-10T16:52:27Z

Unfortunately, my PR breaks dozens of lit tests across 5 targets:
https://buildkite.com/llvm-project/github-pull-requests/builds/167049

I looked into one of the failures: LLVM.CodeGen/RISCV/inline-asm-d-constraint-f.ll, specifically the constraint_gpr function. My PR causes it to emit unnecessary fmv (bit-wise float-int move) instructions:

# %bb.0:
    fmv.d.x fa5, a0
    fmv.x.d a0, fa5
    #APP
    mv      a0, a0
    #NO_APP
    fmv.d.x fa5, a0
    fmv.x.d a0, fa5
    ret

This happens because of RISCV64's handling of ZDINX on RV64 — it intentionally allows f64 in GPRs and uses the first value type in the VTList for the register class.

Other cases, like some in AMDGPU, require some instruction reordering. And there are even more issues I haven't fully investigated yet.

My current thinking: this original behavior has been around for a long time, and public target maintainers either:

explicitly worked around it by tuning their RegisterInfo definitions and backend logic, or
simply haven’t encountered it yet due to a lack of inline asm test coverage.

So, I think this work might need to be put on hold for now. What do you think? @arsenm . BTW, thanks for your valueable comments!

arsenm · 2025-04-10T18:47:09Z

So, I think this work might need to be put on hold for now. What do you think? @arsenm . BTW, thanks for your valueable comments!

Most codegen changes require yak shaving in other targets. There's no "hold for now", you push through them or this change will never happen.

dlee992 added 2 commits April 9, 2025 17:30

patch

42a0aff

update tests for NVPTX

72f7612

llvmbot added backend:NVPTX llvm:SelectionDAG SelectionDAGISel as well labels Apr 9, 2025

arsenm reviewed Apr 10, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[SelectionDAG] Improve value type selection for inline asm within selected register class #135097

[SelectionDAG] Improve value type selection for inline asm within selected register class #135097

dlee992 commented Apr 9, 2025

Uh oh!

github-actions bot commented Apr 9, 2025

Uh oh!

llvmbot commented Apr 9, 2025 •

edited

Loading

Summary

Example (NVPTX)

Impact

Uh oh!

arsenm Apr 10, 2025

Uh oh!

arsenm Apr 10, 2025

Uh oh!

dlee992 commented Apr 10, 2025

Uh oh!

arsenm commented Apr 10, 2025

Uh oh!

Uh oh!

[SelectionDAG] Improve value type selection for inline asm within selected register class #135097

Are you sure you want to change the base?

[SelectionDAG] Improve value type selection for inline asm within selected register class #135097

Conversation

dlee992 commented Apr 9, 2025

Summary

Example (NVPTX)

Impact

Uh oh!

github-actions bot commented Apr 9, 2025

Uh oh!

llvmbot commented Apr 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Example (NVPTX)

Impact

Uh oh!

arsenm Apr 10, 2025

Choose a reason for hiding this comment

Uh oh!

arsenm Apr 10, 2025

Choose a reason for hiding this comment

Uh oh!

dlee992 commented Apr 10, 2025

Uh oh!

arsenm commented Apr 10, 2025

Uh oh!

Uh oh!

llvmbot commented Apr 9, 2025 •

edited

Loading