gelu_tanh should actually use tanh #640

@avik-pal

Motivation and description

Currently, gelu_tanh is implemented with sigmoid rather than tanh, which prevents us from pattern-matching the gelu and fusing it into the gemm calls for dense layers. See EnzymeAD/Reactant.jl#1420 for details. cc @wsmoses

Possible Implementation

Rename the current gelu_tanh to gelu_sigmoid, and re-implement gelu_tanh to follow the tanh formulation from the original paper.
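
For illustration only, here is a minimal Python sketch of the two formulations, assuming the current sigmoid-based version is the algebraic rewrite of the paper's tanh approximation via the identity (1 + tanh(z)) / 2 = sigmoid(2z). This is not the library's code: the helper names `_inner` and `_sigmoid` are hypothetical, and the constant 0.044715 is the one from the tanh approximation in the GELU paper.

```python
import math

SQRT_2_OVER_PI = math.sqrt(2.0 / math.pi)
COEFF = 0.044715  # cubic coefficient of the tanh approximation


def _inner(x):
    # Hypothetical helper: the argument passed to tanh in the approximation.
    return SQRT_2_OVER_PI * (x + COEFF * x ** 3)


def _sigmoid(z):
    # Numerically stable logistic function (avoids overflow for large |z|).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gelu_tanh(x):
    # Explicit tanh form: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    return 0.5 * x * (1.0 + math.tanh(_inner(x)))


def gelu_sigmoid(x):
    # Equivalent sigmoid form, using (1 + tanh(z)) / 2 == sigmoid(2 * z)
    return x * _sigmoid(2.0 * _inner(x))
```

The two functions agree up to floating-point rounding, so the rename would not change numerical results. The point of the explicit tanh form is only that a compiler pass looking for the tanh pattern can recognize it.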
