In my answer to a recent question on StackOverflow I noted that `recode(a, pairs...)` and `recode!(a, pairs...)`, when `a` is a `Vector`, are an order of magnitude slower than `unwrap.(recode(a, pairs...))`. This problem is even worse on master (perhaps because of #345). The following benchmark shows a slowdown of 3 orders of magnitude:
using BenchmarkTools, CategoricalArrays, Random
Random.seed!(596551)
a = CategoricalArray(rand(string.('X':'Z'), 100000))
@btime unwrap.(recode($a, "X"=>1, "Y"=>2, "Z"=>3));
@btime recode!($(similar(a, Int)), $a, "X"=>1, "Y"=>2, "Z"=>3);
@btime recode!($(similar(a, Int)), $(unwrap.(a)), "X"=>1, "Y"=>2, "Z"=>3);
@btime recode(unwrap.($a), "X"=>1, "Y"=>2, "Z"=>3);
Result:
178.212 μs (47 allocations: 1.15 MiB)
130.184 ms (1098374 allocations: 32.00 MiB)
155.115 ms (998374 allocations: 28.95 MiB)
154.162 ms (998378 allocations: 30.48 MiB)
Since `recode!` also allocates a lot more than `unwrap.(recode(a))` when `dest` is a `Vector`, it seems reasonable to replace it with:
dest .= unwrap.(recode(src, default, pairs...))
for this case.
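As a rough illustration of the replacement above, here is a minimal sketch of how it could be wrapped for a plain `Vector` destination. The name `recode_unwrapped!` is hypothetical and not part of CategoricalArrays, and the sketch leaves out the `default` argument for brevity:

```julia
using CategoricalArrays

# Hypothetical helper (not part of CategoricalArrays): recode the categorical
# source first, which only has to map the levels, then unwrap the recoded
# values into the plain destination vector.
function recode_unwrapped!(dest::AbstractVector, src::CategoricalArray, pairs::Pair...)
    dest .= unwrap.(recode(src, pairs...))
    return dest
end

a = categorical(["X", "Y", "Z", "X"])
dest = recode_unwrapped!(Vector{Int}(undef, length(a)), a, "X"=>1, "Y"=>2, "Z"=>3)
```

This assumes every value in `src` is covered by `pairs`. Otherwise the unrecoded values would have to be convertible to `eltype(dest)`.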
There is likely more room for optimization.