One-layer MLP Possibly Missing #3

@ni9elf

Description

The attention layer works directly on the GRU embeddings (denoted h_it in the HAN paper) in the call function of the AttentionLayer. In the paper, h_it should first be fed through a one-layer MLP with a tanh activation to obtain u_it, i.e. u_it = tanh(W · h_it + b), and the attention weights are then computed on u_it. Is this happening in the code and I have missed it, or has it been (intentionally) left out? Please clarify.
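
For reference, the paper's formulation would look roughly like the sketch below in a Keras-style custom layer. The layer name, the weight names (W, b, u), and the attention_dim argument are illustrative only, not taken from this repository's code.

```python
import tensorflow as tf

class PaperStyleAttention(tf.keras.layers.Layer):
    """Sketch of the HAN attention step as described in the paper."""

    def __init__(self, attention_dim=100, **kwargs):
        super().__init__(**kwargs)
        self.attention_dim = attention_dim

    def build(self, input_shape):
        # W, b: the one-layer MLP that maps h_it to the hidden representation u_it
        self.W = self.add_weight(name="W",
                                 shape=(input_shape[-1], self.attention_dim),
                                 initializer="glorot_uniform")
        self.b = self.add_weight(name="b",
                                 shape=(self.attention_dim,),
                                 initializer="zeros")
        # u: the context vector used to score u_it
        self.u = self.add_weight(name="u",
                                 shape=(self.attention_dim,),
                                 initializer="glorot_uniform")
        super().build(input_shape)

    def call(self, h):
        # h: (batch, timesteps, encoder_dim), i.e. the GRU outputs h_it
        # u_it = tanh(W . h_it + b)  -- the one-layer MLP from the paper
        u_it = tf.tanh(tf.tensordot(h, self.W, axes=1) + self.b)
        # alpha_it = softmax(u_it^T u)  -- weights computed on u_it, not on h_it
        scores = tf.tensordot(u_it, self.u, axes=1)          # (batch, timesteps)
        alpha = tf.nn.softmax(scores, axis=-1)
        # weighted sum of the original h_it vectors
        return tf.reduce_sum(h * tf.expand_dims(alpha, -1), axis=1)
```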
