Description
There is a desire to specialize the remaining binary operations (including binary subscript).
However, adding more and more specialized instructions is likely to make performance worse.
The idea is to have a lookup table of type pairs and function pointers. This is less efficient than inlining the code, but more extensible.
A single instruction can then support up to 256 specializations.
This will only work for immutable classes.
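struct table_entry {
    PyTypeObject *left;   /* expected type of the left operand */
    PyTypeObject *right;  /* expected type of the right operand */
    binaryfunc func;      /* binaryfunc is already a function-pointer type */
};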
TARGET(BINARY_OP_TABLE) {
    PyObject *lhs = SECOND();
    PyObject *rhs = TOP();
    Cache *cache = GET_CACHE();
    struct table_entry *entry = &THE_TABLE[cache->table_index];
    /* Guards: both operand types must match the cached table entry. */
    DEOPT_IF(Py_TYPE(lhs) != entry->left);
    DEOPT_IF(Py_TYPE(rhs) != entry->right);
    PyObject *res = entry->func(lhs, rhs);
    if (res == NULL) {
        goto error;
    }
    /* Pop both operands and push the result. */
    STACK_SHRINK(1);
    Py_DECREF(lhs);
    Py_DECREF(rhs);
    SET_TOP(res);
    DISPATCH();
}
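The guard-then-call pattern above can also be tried outside the interpreter. The following is only a minimal standalone sketch: `mini_type`, `mini_obj`, `mini_table`, `binary_op_table` and the `OP_*` result codes are hypothetical stand-ins invented for illustration, not CPython names, and a returned `OP_DEOPT` plays the role of `DEOPT_IF`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for PyTypeObject and PyObject. */
typedef struct mini_type { const char *name; } mini_type;
typedef struct mini_obj { const mini_type *type; double value; } mini_obj;

/* Returns 0 on success, nonzero on error; the result is written to out. */
typedef int (*mini_binaryfunc)(const mini_obj *, const mini_obj *, mini_obj *);

struct mini_entry {
    const mini_type *left;
    const mini_type *right;
    mini_binaryfunc func;
};

static const mini_type Int_Type = {"int"};
static const mini_type Float_Type = {"float"};

/* Example specialization: int + float -> float. */
static int
int_add_float(const mini_obj *l, const mini_obj *r, mini_obj *out)
{
    out->type = &Float_Type;
    out->value = l->value + r->value;
    return 0;
}

static const struct mini_entry mini_table[] = {
    {&Int_Type, &Float_Type, int_add_float},
};

enum { OP_OK = 0, OP_DEOPT = 1, OP_ERROR = 2 };

/* Check both type guards, then call through the table entry. */
static int
binary_op_table(unsigned char table_index,
                const mini_obj *lhs, const mini_obj *rhs, mini_obj *res)
{
    assert(table_index < sizeof(mini_table) / sizeof(mini_table[0]));
    const struct mini_entry *entry = &mini_table[table_index];
    if (lhs->type != entry->left || rhs->type != entry->right) {
        return OP_DEOPT;   /* fall back to the generic instruction */
    }
    if (entry->func(lhs, rhs, res) != 0) {
        return OP_ERROR;
    }
    return OP_OK;
}
```

Swapping the operands makes the guards fail, so the call reports `OP_DEOPT` instead of invoking the wrong function.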
An ancillary mapping of (left, right) -> index will be needed for efficient specialization.
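One possible shape for that mapping is a small open-addressing hash table keyed on the pair of type pointers. This is a sketch under assumptions, not a proposed implementation: `type_stub` stands in for `PyTypeObject`, and `pair_map`, `pair_map_lookup`, `pair_map_insert`, `MAX_SPECIALIZATIONS` and `MAP_SLOTS` are hypothetical names. The map must start zero-initialized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PyTypeObject so the sketch needs no Python.h. */
typedef struct type_stub { const char *name; } type_stub;

#define MAX_SPECIALIZATIONS 256  /* limit taken from the proposal */
#define MAP_SLOTS 512            /* power of two; stays at most half full */

typedef struct {
    const type_stub *left;       /* NULL marks an empty slot */
    const type_stub *right;
    int index;                   /* index into the specialization table */
} pair_slot;

typedef struct {
    pair_slot slots[MAP_SLOTS];
    int used;
} pair_map;

static size_t
hash_pair(const type_stub *left, const type_stub *right)
{
    uintptr_t h = (uintptr_t)left * 31u + (uintptr_t)right;
    h ^= h >> 9;                 /* fold higher bits into the low bits */
    return (size_t)(h & (MAP_SLOTS - 1));
}

/* Return the table index for (left, right), or -1 if it is not registered. */
static int
pair_map_lookup(const pair_map *map,
                const type_stub *left, const type_stub *right)
{
    size_t i = hash_pair(left, right);
    while (map->slots[i].left != NULL) {
        if (map->slots[i].left == left && map->slots[i].right == right) {
            return map->slots[i].index;
        }
        i = (i + 1) & (MAP_SLOTS - 1);   /* linear probing */
    }
    return -1;
}

/* Register a pair and return its index (the existing one if already
   present), or -1 once all MAX_SPECIALIZATIONS indices are taken. */
static int
pair_map_insert(pair_map *map,
                const type_stub *left, const type_stub *right)
{
    assert(left != NULL && right != NULL);
    int existing = pair_map_lookup(map, left, right);
    if (existing >= 0) {
        return existing;
    }
    if (map->used >= MAX_SPECIALIZATIONS) {
        return -1;
    }
    size_t i = hash_pair(left, right);
    while (map->slots[i].left != NULL) {
        i = (i + 1) & (MAP_SLOTS - 1);
    }
    map->slots[i].left = left;
    map->slots[i].right = right;
    map->slots[i].index = map->used;
    return map->used++;
}
```

Because the map never holds more than 256 entries in 512 slots, every probe sequence reaches an empty slot, so both lookup and insert terminate. The specializer would call the lookup once when rewriting an instruction and store the returned index in the cache entry.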
It is probably worth keeping the most common operations (int + int, float + float, etc.) inline.
We can replace BINARY_SUBSCR with BINARY_OP ([]) to allow effective specialization of BINARY_SUBSCR.
E.g. subscripting array.array[int] can be handled with the registration mechanism described below.
Registering binary functions at runtime
Linked PRs
- gh-100239: specialize long tail of binary operations #128722
- gh-100239: specialize bitwise logical binary ops on ints #128927
- gh-100239: Specialize concatenation of lists and tuples #128956
- gh-100239: Handle NaN and zero division in guards #128963
- gh-100239: replace BINARY_SUBSCR & family by BINARY_OP with oparg NB_SUBSCR #129379
- gh-100239: specialize left and right shift ops on ints #129431
- gh-100239: replace BINARY_SUBSCR & family by BINARY_OP with oparg NB_SUBSCR #129700