Masking operation erroneous #22

@zaxliu

Description

Hi @JonathanRaiman, first off, thanks for setting up this repo. I learned a lot from your implementation.

I have a question regarding the masking operation for variable-length inputs. You mention in the docs that:

Elementwise multiply this mask with your output, and then apply your objective function to this masked output. The error will be obtained everywhere, but will be zero in areas that were masked, yielding the correct error function.

But doing so only cuts off the back-propagation (BP) paths from the masked outputs, whereas the BP paths from unmasked outputs via masked hidden units remain. The resulting loss function is still erroneous.
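To make the concern concrete, here is a minimal NumPy sketch, not code from this repo. It assumes a plain tanh RNN in place of the LSTM, a left-padded sequence (so a masked step precedes the unmasked ones), and a sum-of-squares objective; the names `rnn_states` and `masked_loss` are hypothetical. It checks whether the output-masked loss still changes when only the padded input changes.

```python
import numpy as np

def rnn_states(x, W_x, W_h):
    """Plain tanh RNN with no masking of the hidden state."""
    h = np.zeros(W_h.shape[0])
    states = []
    for x_t in x:
        h = np.tanh(W_x @ x_t + W_h @ h)
        states.append(h)
    return np.stack(states)

def masked_loss(x, mask, W_x, W_h):
    """Mask the outputs elementwise, then apply a sum-of-squares objective."""
    outputs = rnn_states(x, W_x, W_h)
    return float(np.sum((outputs * mask[:, None]) ** 2))

rng = np.random.default_rng(0)
W_x = rng.normal(size=(3, 2))
W_h = rng.normal(size=(3, 3))

# Assumed layout: step 0 is padding, steps 1 and 2 are real inputs.
x = rng.normal(size=(3, 2))
mask = np.array([0.0, 1.0, 1.0])

# Change only the padded input and recompute the masked loss.
x_perturbed = x.copy()
x_perturbed[0] += 1.0

loss_a = masked_loss(x, mask, W_x, W_h)
loss_b = masked_loss(x_perturbed, mask, W_x, W_h)

# If output masking were sufficient, the two losses would be equal.
loss_differs = not np.isclose(loss_a, loss_b)
print("loss depends on padded input:", loss_differs)
```

In this sketch the hidden state computed at the padded step feeds the later, unmasked steps, so the loss (and hence the weight gradients) still depends on it. With right-padding and a per-step objective the effect would not show up this way, since no unmasked output comes after a masked step.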

I notice from other LSTM implementations (e.g. Lasagne) that the usual approach is to use the mask to switch between the previous hidden state (if the current input is masked) and the newly computed hidden state (otherwise). In this way, both types of unwanted BP paths are cut off: masked outputs -> masked hidden -> weights, and unmasked outputs -> masked hidden -> weights.
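The switching approach can be sketched as follows. This is an illustrative NumPy version under stated assumptions (a plain tanh RNN rather than an LSTM, a single sequence, and a hypothetical helper name `rnn_hold_state`), not the actual code of Lasagne or of this repo.

```python
import numpy as np

def rnn_hold_state(x, mask, W_x, W_h):
    """tanh RNN that carries the previous hidden state over masked steps."""
    h = np.zeros(W_h.shape[0])
    states = []
    for x_t, m_t in zip(x, mask):
        h_new = np.tanh(W_x @ x_t + W_h @ h)
        # m_t == 1: take the computed state; m_t == 0: keep the previous one.
        h = m_t * h_new + (1.0 - m_t) * h
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
W_x = rng.normal(size=(3, 2))
W_h = rng.normal(size=(3, 3))

# Two real steps followed by two padded steps with arbitrary content.
real = rng.normal(size=(2, 2))
padding = rng.normal(size=(2, 2))
padded = np.concatenate([real, padding])
mask = np.array([1.0, 1.0, 0.0, 0.0])

h_padded = rnn_hold_state(padded, mask, W_x, W_h)
h_real = rnn_hold_state(real, np.ones(2), W_x, W_h)

# The final state of the padded run matches the unpadded run.
final_state_matches = np.allclose(h_padded[-1], h_real[-1])
print("final state unaffected by padding:", final_state_matches)
```

Because `h_new` is multiplied by zero at masked steps, nothing computed from a padded input reaches the carried state, so no gradient flows back through those steps to the weights.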

Please correct me if I'm wrong. Am I missing something?
