
[X86] Merge insertsubvector(load(p0),load_subv(p0),hi) -> subvbroadcast(p0) if either load is oneuse #128857


Merged

3 commits merged into llvm:main on Feb 27, 2025

Conversation

@RKSimon (Collaborator) commented Feb 26, 2025

This fold is currently limited to cases where the load_subv(p0) has one use, but it's beneficial whenever either load has one use and will be replaced.

Yet another yak shave for #122671
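
To make the relaxed condition concrete, here is a minimal standalone sketch of the guard (illustrative only; the Node struct and function names are mine, not LLVM's API). The fold now fires when the insert index targets the upper half, the full vector is exactly twice the subvector's width, and at least one of the two loads is oneuse and will therefore be removed:

#include <cassert>

// Toy stand-in for the operands' properties; in the real combine these come
// from SDValue::hasOneUse() and SDValue::getValueSizeInBits().
struct Node { unsigned Uses; unsigned Bits; };

static bool shouldFold(unsigned IdxVal, unsigned NumElts, Node Vec, Node Sub) {
  return IdxVal == NumElts / 2 && Vec.Bits == 2 * Sub.Bits &&
         (Vec.Uses == 1 || Sub.Uses == 1); // was: Sub.Uses == 1 only
}

int main() {
  // New behaviour: a oneuse full-width load is now enough, even if the
  // subvector load has other uses.
  assert(shouldFold(8, 16, /*Vec=*/{1, 512}, /*Sub=*/{3, 256}));
  // Still rejected: neither load would go away.
  assert(!shouldFold(8, 16, /*Vec=*/{2, 512}, /*Sub=*/{3, 256}));
  return 0;
}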

…st(p0) if either load is oneuse

This fold is currently limited to cases where the load_subv(p0) has one use, but it's beneficial whenever either load has one use and will be replaced.

Yet another yak shave for llvm#122671
@llvmbot (Member) commented Feb 26, 2025

@llvm/pr-subscribers-backend-x86

Author: Simon Pilgrim (RKSimon)

Changes

This fold is currently limited to cases where the load_subv(p0) has one use, but it's beneficial whenever either load has one use and will be replaced.

Yet another yak shave for #122671


Full diff: https://github.com/llvm/llvm-project/pull/128857.diff

2 Files Affected:

  • (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+3-2)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-store-i8-stride-7.ll (+4-8)
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 84aaf86550842..2791021ee0f10 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -58562,8 +58562,9 @@ static SDValue combineINSERT_SUBVECTOR(SDNode *N, SelectionDAG &DAG,
 
   // If we're splatting the lower half subvector of a full vector load into the
   // upper half, attempt to create a subvector broadcast.
-  if (IdxVal == (OpVT.getVectorNumElements() / 2) && SubVec.hasOneUse() &&
-      Vec.getValueSizeInBits() == (2 * SubVec.getValueSizeInBits())) {
+  if (IdxVal == (OpVT.getVectorNumElements() / 2) &&
+      Vec.getValueSizeInBits() == (2 * SubVec.getValueSizeInBits()) &&
+      (Vec.hasOneUse() || SubVec.hasOneUse())) {
     auto *VecLd = dyn_cast<LoadSDNode>(Vec);
     auto *SubLd = dyn_cast<LoadSDNode>(SubVec);
     if (VecLd && SubLd &&
diff --git a/llvm/test/CodeGen/X86/vector-interleaved-store-i8-stride-7.ll b/llvm/test/CodeGen/X86/vector-interleaved-store-i8-stride-7.ll
index c1dba071b4353..b6cbddd667d8e 100644
--- a/llvm/test/CodeGen/X86/vector-interleaved-store-i8-stride-7.ll
+++ b/llvm/test/CodeGen/X86/vector-interleaved-store-i8-stride-7.ll
@@ -10210,11 +10210,9 @@ define void @store_i8_stride7_vf64(ptr %in.vecptr0, ptr %in.vecptr1, ptr %in.vec
 ; AVX512BW-NEXT:    vpshufb %ymm28, %ymm18, %ymm23
 ; AVX512BW-NEXT:    vporq %ymm2, %ymm23, %ymm2
 ; AVX512BW-NEXT:    vinserti64x4 $1, %ymm2, %zmm1, %zmm1
-; AVX512BW-NEXT:    vmovdqa64 (%rdx), %zmm2
-; AVX512BW-NEXT:    vinserti64x4 $1, %ymm14, %zmm2, %zmm2
+; AVX512BW-NEXT:    vbroadcasti64x4 {{.*#+}} zmm2 = mem[0,1,2,3,0,1,2,3]
 ; AVX512BW-NEXT:    vpshufb %zmm20, %zmm2, %zmm2
-; AVX512BW-NEXT:    vmovdqa64 (%rcx), %zmm20
-; AVX512BW-NEXT:    vinserti64x4 $1, %ymm15, %zmm20, %zmm20
+; AVX512BW-NEXT:    vbroadcasti64x4 {{.*#+}} zmm20 = mem[0,1,2,3,0,1,2,3]
 ; AVX512BW-NEXT:    vpshufb %zmm22, %zmm20, %zmm20
 ; AVX512BW-NEXT:    vporq %zmm2, %zmm20, %zmm2
 ; AVX512BW-NEXT:    vpermq {{.*#+}} zmm1 = zmm1[2,3,2,3,6,7,6,7]
@@ -10816,11 +10814,9 @@ define void @store_i8_stride7_vf64(ptr %in.vecptr0, ptr %in.vecptr1, ptr %in.vec
 ; AVX512DQ-BW-NEXT:    vpshufb %ymm28, %ymm18, %ymm23
 ; AVX512DQ-BW-NEXT:    vporq %ymm2, %ymm23, %ymm2
 ; AVX512DQ-BW-NEXT:    vinserti64x4 $1, %ymm2, %zmm1, %zmm1
-; AVX512DQ-BW-NEXT:    vmovdqa64 (%rdx), %zmm2
-; AVX512DQ-BW-NEXT:    vinserti64x4 $1, %ymm14, %zmm2, %zmm2
+; AVX512DQ-BW-NEXT:    vbroadcasti64x4 {{.*#+}} zmm2 = mem[0,1,2,3,0,1,2,3]
 ; AVX512DQ-BW-NEXT:    vpshufb %zmm20, %zmm2, %zmm2
-; AVX512DQ-BW-NEXT:    vmovdqa64 (%rcx), %zmm20
-; AVX512DQ-BW-NEXT:    vinserti64x4 $1, %ymm15, %zmm20, %zmm20
+; AVX512DQ-BW-NEXT:    vbroadcasti64x4 {{.*#+}} zmm20 = mem[0,1,2,3,0,1,2,3]
 ; AVX512DQ-BW-NEXT:    vpshufb %zmm22, %zmm20, %zmm20
 ; AVX512DQ-BW-NEXT:    vporq %zmm2, %zmm20, %zmm2
 ; AVX512DQ-BW-NEXT:    vpermq {{.*#+}} zmm1 = zmm1[2,3,2,3,6,7,6,7]
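
To see at source level what the test delta means, here is an intrinsics sketch of the equivalence the combine exploits (illustrative, not taken from the patch; function names are mine). Re-inserting a full 512-bit load's low 256-bit half into its own upper half yields the same value as broadcasting a 256-bit load of the same address, which is why the vmovdqa64 + vinserti64x4 pairs above collapse into single vbroadcasti64x4 instructions:

#include <immintrin.h>

// Before the fold: 64-byte load, then splat its low half into the upper half
// (compare the vmovdqa64 + vinserti64x4 pair in the old CHECK lines).
__m512i splat_lower_before(const void *p) {
  __m512i v = _mm512_loadu_si512(p);
  __m256i lo = _mm512_castsi512_si256(v);
  return _mm512_inserti64x4(v, lo, 1);
}

// After the fold: a single 32-byte subvector broadcast, i.e.
// zmm = mem[0,1,2,3,0,1,2,3] in the new CHECK lines.
__m512i splat_lower_after(const void *p) {
  return _mm512_broadcast_i64x4(_mm256_loadu_si256((const __m256i *)p));
}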

Vec.getValueSizeInBits() == (2 * SubVec.getValueSizeInBits())) {
if (IdxVal == (OpVT.getVectorNumElements() / 2) &&
Vec.getValueSizeInBits() == (2 * SubVec.getValueSizeInBits()) &&
(Vec.hasOneUse() || SubVec.hasOneUse())) {
Contributor

Does it matter whether Vec has one use? Can't the loaded value be used more than once?

Collaborator Author

Let me see if I can get this to always replace uses of SubVec; then we can get rid of the oneuse checks entirely.

@RKSimon (Collaborator, Author) Feb 26, 2025

The oneuse checks are still preventing some regressions due to shouldReduceLoadWidth being so permissive, but I've at least ensured that any other uses of SubVec are replaced now.

Comment on lines +58581 to +58583
SDValue NewSubVec = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, SubVecVT,
BcastLd, DAG.getVectorIdxConstant(0, dl));
DCI.CombineTo(SubLd, NewSubVec, BcastLd.getValue(1));
Contributor

This is only required when SubVec has multiple users. Do we need to add a condition check?

Collaborator Author

We don't need the check (oneuse cases will be handled fine), but adding it might slightly improve performance.

Collaborator Author

It's easier if we always call it, just in case there are weird cases where only SubVec's chain is still in use.
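
As background for the chain remark (an illustrative aside, not from the patch): a SelectionDAG load produces two results, the loaded value (result 0) and a chain token (result 1) that orders memory operations. DCI.CombineTo(SubLd, NewSubVec, BcastLd.getValue(1)) remaps both results, so even a use that only holds SubLd's chain remains valid. A toy model of that two-result remapping (types and names are mine):

#include <cassert>
#include <string>

enum ResNo { Value = 0, Chain = 1 };
struct UseRef { std::string Node; ResNo Res; };

// CombineTo-style remap: value uses of the old load go to the new value,
// chain uses go to the new chain.
UseRef remap(const UseRef &U, const std::string &OldNode,
             const UseRef &NewValue, const UseRef &NewChain) {
  if (U.Node != OldNode)
    return U;
  return U.Res == Value ? NewValue : NewChain;
}

int main() {
  // A use that only references SubLd's chain still gets rewired correctly.
  UseRef ChainOnly{"SubLd", Chain};
  UseRef R = remap(ChainOnly, "SubLd", {"BcastLd", Value}, {"BcastLd", Chain});
  assert(R.Node == "BcastLd" && R.Res == Chain);
  return 0;
}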

@phoebewang (Contributor) left a comment

LGTM.

@KanRobert (Contributor) left a comment

LGTM

@RKSimon merged commit 0e3ba99 into llvm:main on Feb 27, 2025
11 checks passed
@RKSimon deleted the x86-insertvector-subvbroadcast-oneuse branch on February 27, 2025 at 09:49
joaosaffran pushed a commit to joaosaffran/llvm-project that referenced this pull request Mar 3, 2025
…st(p0) if either load is oneuse (llvm#128857)

This fold is currently limited to cases where the load_subv(p0) has one use, but it's beneficial whenever either load has one use and will be replaced.

Yet another yak shave for llvm#122671