edit: see #509 — this is due to the relu, not really the BatchNorm.
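For context, a minimal sketch of why the relu, not the BatchNorm, is the likely culprit. It assumes NNlib's definition `relu(x) = max(x, zero(x))`; the GPU side is my assumption, based on IEEE 754 `fmax` semantics:

```julia
# relu as defined in NNlib (assumption)
relu(x) = max(x, zero(x))

# On the CPU, Julia's max propagates NaN:
relu(NaN32)  # NaN

# On the GPU, max presumably lowers to fmaxf, and IEEE 754 fmax
# returns the non-NaN operand, so fmaxf(NaN32, 0f0) == 0f0 and the
# NaN is silently replaced by 0 (assumption, matching the transcript below).
```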
```julia
julia> using Flux

julia> layer = BatchNorm(32, relu)
BatchNorm(32, relu)  # 64 parameters, plus 64 non-trainable

julia> layer(NaN32*zeros(Float32, (32,1)))
32×1 Matrix{Float32}:
 NaN
 NaN
 NaN
 NaN
 NaN
 NaN
 NaN
   ⋮
 NaN
 NaN
 NaN
 NaN
 NaN
 NaN

julia> gpu(layer)(gpu(NaN32*zeros(Float32, (32,1))))
32×1 CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}:
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 ⋮
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
```
edit: I just noticed I'm on Flux v0.12.10, so this may be outdated.