Describe the enhancement requested
After the following PR was merged, some concerns were raised about its performance:
I benchmarked both the previous and the new implementation, with the following results:
Before the PR, using NumPy:
In [2]: a = pa.array(np.arange(0, 2_000_000))
In [3]: %timeit a[slice(1, 1_000_000, 2)]
763 μs ± 4.28 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
After the PR, using a Python object:
In [2]: a = pa.array(np.arange(0, 2_000_000))
In [3]: %timeit a[slice(1, 1_000_000, 2)]
25.7 ms ± 703 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)
That is roughly a 34x slowdown (763 μs vs. 25.7 ms) for the same slice. We should probably use np.arange if NumPy is available, and otherwise fall back to using a Python object.
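A minimal sketch of that fallback, for illustration only. The helper name `strided_indices` is hypothetical and is not part of pyarrow; the sketch only shows how the index positions for a stepped slice could be built with `np.arange` when NumPy can be imported, and with a plain Python `range` otherwise.

```python
try:
    import numpy as np
except ImportError:  # assumption: NumPy is an optional dependency
    np = None


def strided_indices(key, length):
    """Return the positions selected by the slice `key` on `length` items.

    Hypothetical helper, not pyarrow API.
    """
    # slice.indices() normalizes None and negative bounds against `length`.
    start, stop, step = key.indices(length)
    if np is not None:
        # Fast path: build the indices as a NumPy array.
        return np.arange(start, stop, step)
    # Fallback: build the indices as a plain Python list.
    return list(range(start, stop, step))
```

The resulting indices could then be handed to something like `Array.take`. Whether that matches what the merged PR actually does internally is not confirmed here.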
We could also add a step attribute to Array::Slice, which would let us remove some of the custom code and just use step internally.
Component(s)
Python