Commit 84ec0ee

stevesuzuki-arm authored and alexreinking committed
Improve performance of vector broadcast in SVE2
Modify codegen of vector broadcast in SVE2 to emit the Arm TBL intrinsic instead of llvm.vector.insert. This fixes the performance test failure in nested_vectorization_gemm.
1 parent 85ef5b5 commit 84ec0ee

1 file changed: 0 additions, 13 deletions


src/CodeGen_ARM.cpp

Lines changed: 0 additions & 13 deletions
@@ -1991,19 +1991,6 @@ void CodeGen_ARM::visit(const Shuffle *op) {
             value = insert_scalable_vector(padding, val_0, 0);
             return;
         }
-    } else if (op->is_broadcast()) {
-        // Undo simplification to avoid arbitrary-indexed shuffle
-        Expr equiv;
-        for (int f = 0; f < op->broadcast_factor(); ++f) {
-            if (equiv.defined()) {
-                equiv = Shuffle::make_concat({equiv, op->vectors[0]});
-            } else {
-                equiv = op->vectors[0];
-            }
-        }
-        equiv = common_subexpression_elimination(equiv);
-        value = codegen(equiv);
-        return;
     }
 
     CodeGen_Posix::visit(op);
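
For readers unfamiliar with the two strategies, here is a minimal standalone sketch of why they produce the same lanes. It is not Halide code. `broadcast_by_concat` and `broadcast_by_lookup` are hypothetical helpers over `std::vector<int>`, and both assume `factor >= 1`. The first mirrors the removed branch, which rebuilt the broadcast by concatenating `op->vectors[0]` with itself `broadcast_factor()` times. The second models a table lookup in the spirit of TBL, where each output lane reads the source lane selected by an index vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Halide): repeat `src` `factor` times by
// appending it to the output, as the removed branch did with make_concat.
std::vector<int> broadcast_by_concat(const std::vector<int> &src, int factor) {
    std::vector<int> out;
    for (int f = 0; f < factor; ++f) {
        out.insert(out.end(), src.begin(), src.end());
    }
    return out;
}

// Hypothetical helper (not part of Halide): table-lookup formulation.
// Output lane i reads source lane (i % src.size()), which is the index
// pattern a single lookup-style shuffle would use for this broadcast.
std::vector<int> broadcast_by_lookup(const std::vector<int> &src, int factor) {
    std::vector<int> out(src.size() * static_cast<std::size_t>(factor));
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = src[i % src.size()];
    }
    return out;
}
```

Both helpers return identical lanes for the same inputs. The sketch only illustrates that equivalence and says nothing about the instructions the real backend emits.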
