Discrepancy with implementation and the paper  #60

Open
@AetherPrior


Hi, I hope you are doing well. While going through your implementation of the pointer-generator network, I noticed that the p_gen calculation differs from the formula given in the paper.
Could you clarify why it was implemented this way, and whether doing so has any advantage?

        y_t_1_embd = self.embedding(y_t_1)
        x = self.x_context(torch.cat((c_t_1, y_t_1_embd), 1))
        lstm_out, s_t = self.lstm(x.unsqueeze(1), s_t_1)

        h_decoder, c_decoder = s_t
        s_t_hat = torch.cat((h_decoder.view(-1, config.hidden_dim),
                             c_decoder.view(-1, config.hidden_dim)), 1)  # B x 2*hidden_dim
        c_t, attn_dist, coverage_next = self.attention_network(s_t_hat, encoder_outputs, encoder_feature,
                                                          enc_padding_mask, coverage)

        if self.training or step > 0:
            coverage = coverage_next

        p_gen = None
        if config.pointer_gen:
            p_gen_input = torch.cat((c_t, s_t_hat, x), 1)  # B x (2*2*hidden_dim + emb_dim)
            p_gen = self.p_gen_linear(p_gen_input)
            p_gen = F.sigmoid(p_gen)

From what I understand, p_gen takes the context vector c_t, the decoder state s_t_hat, and the decoder input y_t_1 as separate inputs, but you pass the concatenated input x instead.
I am attaching a screenshot from the original paper for reference.
[Screenshot: pointer_gen, the p_gen formula from the paper]
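
For readers who cannot see the screenshot, here are the two computations side by side. The first line is the formula as I recall it from the paper, assuming the paper is See et al. (2017), Eq. 8. The other two lines are my reading of the code above, assuming `x_context` and `p_gen_linear` are plain linear layers. The symbols $W_c$, $b_c$, $w$, $b$ and $e(\cdot)$ (the embedding lookup) are my own notation, not the paper's.

```latex
% Paper (assumed: See et al., 2017, Eq. 8)
p_{\text{gen}} = \sigma\left( w_{h^*}^{\top} h_t^{*} + w_s^{\top} s_t + w_x^{\top} x_t + b_{\text{ptr}} \right)

% Implementation, as read from the snippet
x = W_c \, [\, c_{t-1} ;\, e(y_{t-1}) \,] + b_c
p_{\text{gen}} = \sigma\left( w^{\top} [\, c_t ;\, \hat{s}_t ;\, x \,] + b \right)
```

Here $h_t^{*}$ corresponds to `c_t`, $s_t$ to `s_t_hat`, and the open question is whether $x_t$ should be `y_t_1_embd` or `x`.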
From what I can see here, the decoder input x_t is passed directly into the p_gen computation, without first being concatenated with the context vector.
In this line, however,

x = self.x_context(torch.cat((c_t_1, y_t_1_embd), 1))

the previous context vector c_t_1 is concatenated with the embedded input, and the result x is what gets fed into the sigmoid:

            p_gen_input = torch.cat((c_t, s_t_hat, x), 1)  # B x (2*2*hidden_dim + emb_dim)
            p_gen = self.p_gen_linear(p_gen_input)
            p_gen = F.sigmoid(p_gen)
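
To make the difference concrete, here is a minimal, framework-free sketch that uses plain Python lists instead of torch tensors so it runs without dependencies. The toy sizes, the random weights, and the helper names `linear`, `rand_vec`, `rand_mat`, `p_gen_from` and `x_from` are all assumptions for illustration, as is treating `x_context` and `p_gen_linear` as plain linear layers. None of this is the repository's code.

```python
import math
import random


def linear(weights, bias, vec):
    """Dense layer on plain lists: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rand_vec(n, rng, scale=1.0):
    return [rng.uniform(-scale, scale) for _ in range(n)]


def rand_mat(rows, cols, rng, scale=0.5):
    return [rand_vec(cols, rng, scale) for _ in range(rows)]


rng = random.Random(0)
hidden_dim, emb_dim = 2, 3  # toy sizes, for illustration only

# Toy stand-ins for one batch element of the tensors in the snippet.
c_t = rand_vec(2 * hidden_dim, rng)      # current context vector
s_t_hat = rand_vec(2 * hidden_dim, rng)  # concatenated decoder (h, c) state
y_t_1_embd = rand_vec(emb_dim, rng)      # embedding of the previous token

# Assumed: x_context is a linear layer (2*hidden_dim + emb_dim) -> emb_dim.
x_context_w = rand_mat(emb_dim, 2 * hidden_dim + emb_dim, rng)
x_context_b = rand_vec(emb_dim, rng, 0.5)

# Assumed: p_gen_linear is a linear layer (2*2*hidden_dim + emb_dim) -> 1.
p_gen_w = rand_mat(1, 4 * hidden_dim + emb_dim, rng)
p_gen_b = [0.0]


def p_gen_from(decoder_input):
    """sigmoid(p_gen_linear(cat(c_t, s_t_hat, decoder_input)))."""
    # `+` on lists is concatenation, mirroring torch.cat along dim 1.
    return sigmoid(linear(p_gen_w, p_gen_b, c_t + s_t_hat + decoder_input)[0])


def x_from(c_t_1):
    """x = x_context(cat(c_t_1, y_t_1_embd)), as in the snippet."""
    return linear(x_context_w, x_context_b, c_t_1 + y_t_1_embd)


c_prev_a = rand_vec(2 * hidden_dim, rng)  # two different previous contexts
c_prev_b = rand_vec(2 * hidden_dim, rng)

# Snippet's variant: p_gen shifts when only the previous context changes.
print("with x:         ", p_gen_from(x_from(c_prev_a)), p_gen_from(x_from(c_prev_b)))
# My reading of the paper: the raw embedding is used, so c_t_1 plays no role.
print("with y_t_1_embd:", p_gen_from(y_t_1_embd))
```

In this sketch both variants reuse the same `p_gen_linear` weights, because `x` and `y_t_1_embd` have the same width (`emb_dim`). The only difference is whether the previous context vector `c_t_1` can influence p_gen through the decoder input.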

Thank you!
