Open
Description
Motivation and description
Currently, gelu_tanh is implemented with sigmoid rather than tanh, which prevents us from pattern-matching it and fusing the GELU into GEMM calls for dense layers. See EnzymeAD/Reactant.jl#1420 for details. cc @wsmoses
Possible Implementation
Rename the current gelu_tanh to gelu_sigmoid. Re-implement gelu_tanh to follow the tanh-based formulation from the original paper.
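For reference, here is a minimal sketch of the two formulations in plain Python. It is not the library's actual code: the function names simply mirror the proposed naming, and `_inner` is a hypothetical helper. It assumes the usual tanh approximation from the GELU paper, 0.5·x·(1 + tanh(√(2/π)·(x + 0.044715·x³))), and relies on the identity 0.5·(1 + tanh(z)) = sigmoid(2z), so both forms should agree up to floating-point error.

```python
import math

# Constants from the tanh approximation of GELU.
SQRT_2_OVER_PI = math.sqrt(2.0 / math.pi)
COEFF = 0.044715


def _inner(x):
    # Hypothetical helper: the shared cubic argument of both forms.
    return SQRT_2_OVER_PI * (x + COEFF * x**3)


def gelu_tanh(x):
    # Tanh-based form, written the way a pattern matcher would expect it.
    return 0.5 * x * (1.0 + math.tanh(_inner(x)))


def gelu_sigmoid(x):
    # Sigmoid-based form: x * sigmoid(2 * inner), since
    # 0.5 * (1 + tanh(z)) == sigmoid(2 * z).
    return x / (1.0 + math.exp(-2.0 * _inner(x)))


if __name__ == "__main__":
    for x in (-3.0, -1.0, 0.0, 0.5, 2.0):
        print(x, gelu_tanh(x), gelu_sigmoid(x))
```

The two functions compute the same values; the difference that matters for this issue is only the operation that appears in the traced graph (tanh versus sigmoid/exp), which is what a fusion pattern would key on.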