Commit c66ffcf

gh-129987: Selectively re-enable SLP autovectorization of _PyEval_EvalFrameDefault (#132530)
Only disable SLP autovectorization of `_PyEval_EvalFrameDefault` on newer GCCs, as the optimization bug seems to exist only on GCC 12 and later, and before GCC 9 disabling the optimization has a dramatic performance impact.
1 parent 0879ebc commit c66ffcf

1 file changed: +8 -4 lines changed

Diff for: Python/ceval.c

@@ -948,11 +948,15 @@ _PyObjectArray_Free(PyObject **array, PyObject **scratch)
 #include "generated_cases.c.h"
 #endif
 
-#if (defined(__GNUC__) && !defined(__clang__)) && defined(__x86_64__)
+#if (defined(__GNUC__) && __GNUC__ >= 10 && !defined(__clang__)) && defined(__x86_64__)
 /*
- * gh-129987: The SLP autovectorizer can cause poor code generation for opcode
- * dispatch, negating any benefit we get from vectorization elsewhere in the
- * interpreter loop.
+ * gh-129987: The SLP autovectorizer can cause poor code generation for
+ * opcode dispatch in some GCC versions (observed in GCCs 12 through 15,
+ * probably caused by https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115777),
+ * negating any benefit we get from vectorization elsewhere in the
+ * interpreter loop. Disabling it significantly affected older GCC versions
+ * (prior to GCC 9, 40% performance drop), so we have to selectively disable
+ * it.
  */
 #define DONT_SLP_VECTORIZE __attribute__((optimize ("no-tree-slp-vectorize")))
 #else
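
For readers who want to see the version gate in isolation, here is a minimal sketch of how such a macro might be applied to a single function. The hunk above is cut off at `#else`, so the empty fallback definition is an assumption. `SLP_DISABLED` and `run_ops` are hypothetical names introduced only for illustration; they are not part of CPython.

```c
/* Same condition as the patched line in Python/ceval.c:
 * GCC 10 or newer, not clang, targeting x86-64. */
#if (defined(__GNUC__) && __GNUC__ >= 10 && !defined(__clang__)) && defined(__x86_64__)
#  define DONT_SLP_VECTORIZE __attribute__((optimize ("no-tree-slp-vectorize")))
#  define SLP_DISABLED 1   /* hypothetical flag, for inspection only */
#else
/* Assumption: the branch not shown in the diff expands to nothing. */
#  define DONT_SLP_VECTORIZE
#  define SLP_DISABLED 0
#endif

/* Hypothetical stand-in for a dispatch-heavy interpreter loop.
 * The attribute applies to this function only; the rest of the
 * translation unit keeps the compiler's default optimization flags. */
static int DONT_SLP_VECTORIZE
run_ops(const unsigned char *ops, int n)
{
    int acc = 0;
    for (int i = 0; i < n; i++) {
        switch (ops[i]) {
        case 0:  acc += 1; break;   /* "increment" op */
        case 1:  acc *= 2; break;   /* "double" op */
        default: acc -= 1; break;   /* any other op decrements */
        }
    }
    return acc;
}
```

Because the attribute is attached per function, other compilers and older GCCs see an empty macro and compile the function as usual. The function's result is the same either way; only the generated machine code may differ.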
