Currently `series.values`, and especially `series.to_cupy()`, are substantially slower than `cupy.asarray(series)`:
```python
In [2]: s = cudf.Series(range(10000))

In [3]: %timeit s.values
81.4 µs ± 1.68 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [4]: %timeit cp.asarray(s)
19.1 µs ± 168 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [5]: %timeit s.to_cupy()
349 µs ± 75.2 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
```
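A likely reason `cp.asarray` is so much faster is that it can consume the Series' `__cuda_array_interface__` directly, without the extra work done in `to_cupy`. As a host-side illustration of that protocol pattern (using NumPy's analogous `__array_interface__`; this is a sketch, not cudf code):

```python
import numpy as np


class Wrapped:
    """Toy object exposing NumPy's __array_interface__ protocol.

    cudf Series analogously expose __cuda_array_interface__, which is
    what lets cp.asarray wrap the device buffer without a copy.
    """

    def __init__(self, arr):
        self._arr = arr

    @property
    def __array_interface__(self):
        return self._arr.__array_interface__


backing = np.arange(10_000)
out = np.asarray(Wrapped(backing))  # wraps the exposed buffer, no copy
assert np.shares_memory(out, backing)
```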
There are at least two obvious potential culprits in `Frame._to_array` (the underlying method for `to_cupy`):

- It always performs an extra allocation, even when `copy=False`.
- It performs dtype inference using `find_common_type`, which is slow (and slower for `DataFrame`s with many columns):
```python
In [11]: df = cudf.DataFrame({'a': [1], 'b': [3.], 'c': ['a']})

In [12]: %timeit cudf.utils.dtypes.find_common_type([col.dtype for col in df._data.values()])
53.6 µs ± 530 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [13]: df = cudf.DataFrame({'a': [1], 'b': [3.]})

In [14]: %timeit cudf.utils.dtypes.find_common_type([col.dtype for col in df._data.values()])
39.8 µs ± 1.01 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
```
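One possible mitigation for the dtype-inference cost is to short-circuit when all columns already share a dtype, falling back to full promotion only when needed. A sketch using NumPy's `np.result_type` as a stand-in for the promotion logic (`common_dtype` here is hypothetical, not cudf's API):

```python
import numpy as np


def common_dtype(dtypes):
    # Hypothetical fast path: skip promotion entirely when all dtypes
    # already match, which is the common case for to_cupy().
    first = dtypes[0]
    if all(dt == first for dt in dtypes):
        return first
    # Fall back to full promotion (stand-in for find_common_type).
    return np.result_type(*dtypes)


assert common_dtype([np.dtype("int64")] * 3) == np.dtype("int64")
assert common_dtype([np.dtype("int64"), np.dtype("float64")]) == np.dtype("float64")
```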
The implementation of `values` drops down to `ColumnBase.values` and requires some deeper consideration. However, since we use `.values` frequently internally (and we occasionally use `to_cupy`), we are likely giving up a lot of performance. We should profile these functions to determine the bottlenecks, and if there are valid reasons for them, we should establish some policies on how to select the right function to use when performing these conversions to arrays internally. While this exact analogy does not hold for `DataFrame` (because that doesn't support the conversion to an array), any optimization that we make for `Series` will likely also help speed up `DataFrame` operations.
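A generic way to start the profiling suggested above is `cProfile` over repeated calls; the `convert` target below is a placeholder, and in practice one would point this at `s.to_cupy`, `s.values`, and `cp.asarray(s)`:

```python
import cProfile
import io
import pstats


def profile_calls(fn, *args, repeat=1000):
    """Run fn repeatedly under cProfile and return the top entries
    sorted by cumulative time."""
    prof = cProfile.Profile()
    prof.enable()
    for _ in range(repeat):
        fn(*args)
    prof.disable()
    buf = io.StringIO()
    pstats.Stats(prof, stream=buf).sort_stats("cumulative").print_stats(10)
    return buf.getvalue()


def convert(values):
    # Placeholder conversion; substitute the cudf call being profiled.
    return list(values)


report = profile_calls(convert, range(100))
assert "convert" in report
```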