[WIP][AMDGPU] Fix emitting illegal COPY #131752

PankajDwivedi-25 · 2025-03-18T08:09:25Z

Shader kernel calling convention expects the return value to be in SGPR, instruction selection introduced COPY which is illegal/wrong here. Instead of a copy, it should insert readfirstlane?

bb.0 (%ir-block.0):
liveins: $sgpr0, $sgpr1
%1:sgpr_32 = COPY $sgpr1
%0:sgpr_32 = COPY $sgpr0
%3:sreg_32 = S_MOV_B32 2147483647
%5:vgpr_32 = COPY %0:sgpr_32
%6:vgpr_32 = COPY %1:sgpr_32
%4:vgpr_32 = V_BFI_B32_e64 killed %3:sreg_32, %5:vgpr_32, %6:vgpr_32, implicit $exec
$sgpr0 = COPY %4:vgpr_32
SI_RETURN_TO_EPILOG $sgpr0

llvmbot · 2025-03-18T08:09:58Z

@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-backend-amdgpu

Author: Pankaj Dwivedi (PankajDwivedi-25)

Changes

Shader kernel calling convention expects the return value to be in SGPR, instruction selection introduced COPY which is illegal/wrong here. Instead of a copy, it should insert readfirstlane?

> bb.0 (%ir-block.0):
> liveins: $sgpr0, $sgpr1
> %1:sgpr_32 = COPY $sgpr1
> %0:sgpr_32 = COPY $sgpr0
> %3:sreg_32 = S_MOV_B32 2147483647
> %5:vgpr_32 = COPY %0:sgpr_32
> %6:vgpr_32 = COPY %1:sgpr_32
> %4:vgpr_32 = V_BFI_B32_e64 killed %3:sreg_32, %5:vgpr_32, %6:vgpr_32, implicit $exec
> $sgpr0 = COPY %4:vgpr_32
> SI_RETURN_TO_EPILOG $sgpr0

Full diff: https://github.com/llvm/llvm-project/pull/131752.diff

1 Files Affected:

(added) llvm/test/CodeGen/AMDGPU/fix-illegal-copy.ll (+25)

diff --git a/llvm/test/CodeGen/AMDGPU/fix-illegal-copy.ll b/llvm/test/CodeGen/AMDGPU/fix-illegal-copy.ll
new file mode 100644
index 0000000000000..6b14a660f580d
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/fix-illegal-copy.ll
@@ -0,0 +1,25 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; XFAIL: *
+; RUN: llc < %s -mtriple=amdgcn -mcpu=gfx1100 -verify-machineinstrs | FileCheck %s -check-prefixes=GCN
+
+define amdgpu_ps i32 @s_copysign_f32_bf16(float inreg %mag, bfloat inreg %sign.bf16) {
+  %sign = fpext bfloat %sign.bf16 to float
+  %op = call float @llvm.copysign.f32(float %mag, float %sign)
+  %cast = bitcast float %op to i32
+  ret i32 %cast
+}
+
+; define i32 @s_copysign_f32_bf16(float %mag, bfloat %sign.bf16) {
+;   %sign = fpext bfloat %sign.bf16 to float
+;   %op = call float @llvm.copysign.f32(float %mag, float %sign)
+;   %cast = bitcast float %op to i32
+;   ret i32 %cast
+; }
+
+; define i32 @s_copysign_f32_bf16(float inreg %mag, bfloat inreg %sign.bf16) {
+;   %sign = fpext bfloat %sign.bf16 to float
+;   %op = call float @llvm.copysign.f32(float %mag, float %sign)
+;   %cast = bitcast float %op to i32
+;   %readlane = call i32 @llvm.amdgcn.readfirstlane(i32 %cast)
+;   ret i32 %readlane
+; }
\ No newline at end of file

shiltian · 2025-03-24T18:14:54Z

llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp

@@ -899,6 +899,55 @@ bool SIFixSGPRCopies::lowerSpecialCase(MachineInstr &MI,
                                      MI, MI.getDebugLoc())) {
      I = std::next(I);
      MI.eraseFromParent();
+    } else {
+      // At this point, if we still have a VGPR → SGPR copy, it is completely


Not a good idea to use → in the comment. -> is preferred.

shiltian · 2025-03-24T18:17:51Z

llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp

@@ -899,6 +899,55 @@ bool SIFixSGPRCopies::lowerSpecialCase(MachineInstr &MI,
                                      MI, MI.getDebugLoc())) {
      I = std::next(I);
      MI.eraseFromParent();
+    } else {
+      // At this point, if we still have a VGPR → SGPR copy, it is completely
+      // illegal. We assume that it was intentionally introduced and should be


Is it safe to assume? For example, #130443 can also introduce VGPR to SGPR copy but that doesn't seem to be the case here.

This fix can only fix one case IIUC so the comment needs to be adapted.

I'd go with no. I don't want to make semantic guesses in this pass, and require inserting these up front where required. We're already suffering from the assumed usage of SCC in this pass

I'd go with no. I don't want to make semantic guesses in this pass, and require inserting these up front where required. We're already suffering from the assumed usage of SCC in this pass

Agree with you.
This COPY itself is completely wrong at this point, Typically, if a value needs to be moved from a VGPR to an SGPR, instruction selection (ISel) should emit readfirstlane instead of a direct COPY.

It arises at many places after enabling the AMDGPUUniformIntrinsicCombine pass into llc pipeline PR: #128687, which optimizes certain intrinsic by propagating uniform values directly to their uses.

Is it safe to assume? For example, #130443 can also introduce VGPR to SGPR copy but that doesn't seem to be the case here

I think it is similar case here as llvm/test/CodeGen/AMDGPU/fix-illegal-copy.ll?
Is it wrong to replace those COPY emitted with readfirstlane here in llvm/test/CodeGen/AMDGPU/issue130443.ll?

Fix emitting illegal COPY

52e2176

PankajDwivedi-25 requested review from arsenm and cdevadas March 18, 2025 08:09

llvmbot added the backend:AMDGPU label Mar 18, 2025

PankajDwivedi-25 added the llvm:SelectionDAG SelectionDAGISel as well label Mar 18, 2025

replace VGPR to SGPR COPY with readfirstlane

487fdd1

shiltian reviewed Mar 24, 2025

View reviewed changes

add issue 130443 test

fc9de33

PankajDwivedi-25 mentioned this pull request Apr 11, 2025

[AMDGPU] Insert readfirstlane in the function returns in sgpr. #135326

Merged

PankajDwivedi-25 marked this pull request as draft April 11, 2025 10:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[WIP][AMDGPU] Fix emitting illegal COPY #131752

[WIP][AMDGPU] Fix emitting illegal COPY #131752

PankajDwivedi-25 commented Mar 18, 2025

Uh oh!

llvmbot commented Mar 18, 2025 •

edited

Loading

Uh oh!

shiltian Mar 24, 2025

Uh oh!

shiltian Mar 24, 2025

Uh oh!

arsenm Mar 25, 2025

Uh oh!

PankajDwivedi-25 Mar 25, 2025 •

edited

Loading

Uh oh!

PankajDwivedi-25 Mar 26, 2025

Uh oh!

Uh oh!

[WIP][AMDGPU] Fix emitting illegal COPY #131752

Are you sure you want to change the base?

[WIP][AMDGPU] Fix emitting illegal COPY #131752

Conversation

PankajDwivedi-25 commented Mar 18, 2025

Uh oh!

llvmbot commented Mar 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

shiltian Mar 24, 2025

Choose a reason for hiding this comment

Uh oh!

shiltian Mar 24, 2025

Choose a reason for hiding this comment

Uh oh!

arsenm Mar 25, 2025

Choose a reason for hiding this comment

Uh oh!

PankajDwivedi-25 Mar 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

PankajDwivedi-25 Mar 26, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

llvmbot commented Mar 18, 2025 •

edited

Loading

PankajDwivedi-25 Mar 25, 2025 •

edited

Loading