Conversation
|
We should think about improving inferability so it stops making unionings over wether or not it is going to Still this is a small union and doesn't really matter |
|
Yota failure is unrelated |
|
Without #718 |
|
Added some performance enhancements Before julia> x = rand(300,300);
julia> dx = OneElement(3.14, (231, 21), axes(x));
julia> @btime $x + $(collect(dx)); # Baseline
31.978 μs (2 allocations: 703.17 KiB)
julia> @btime $x + $dx;
61.043 μs (2 allocations: 703.17 KiB)
After: The orig version of this PR (and thus more or less what is in Zygote` was 2x slower than baseline, for accumulation, For allocation no change vs what in Zygote, it remains basically ∞x faster than baseline -- because it just constant folds out julia> @btime collect(OneElement(3.14, (231, 21), axes($x))); # approximate baseline
54.806 μs (2 allocations: 703.17 KiB)
julia> @btime OneElement(3.14, (231, 21), axes($x));
1.748 ns (0 allocations: 0 bytes)I think this is worth a the type instability of going to a small union |
|
bumping @mcabbott or @ToucheSir for a review |
|
One of the things I don't like about OneElement is that it's tied to Array. Somewhere I had a prototype which stored BroadcastStyle too, so that it remembers that you indexed e.g. a CuArray. Maybe worth mentioning that JuliaArrays/FillArrays.jl#235 added essentially the same struct that Zygote has to FillArrays. CR could consider getting it from there, although it's so simple that owning it is fine. I think FillArrays may overload some Sorry not really a review, a bit swamped. |
|
I have nothing intelligent to add on top of what Michael said, but it's cool that FillArrays has |
|
FillArrays is still pending in #46 I am happy merging this as is, with other stuff to come in follow-ups. |
|
Happy to get this landed first. Before I approve, does anything need to be done to make CI happy? |
|
ok, now everything we expect to be passing is passing |
I think this is maybe is a precondition for FluxML/Zygote.jl#1328
so that we know that deleting the Zygote rules will not cause performance regressions.
Still WIP as i chase down failing tests.