Description
Currently the Transformer is not really implemented as it should be. We should revisit it and implement it like in the original Transformer paper; this includes always training it to predict the next sample (like language models), and calling the encoder+decoder auto-regressively when producing forecasts. See: Attention Is All You Need
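For illustration, here is a rough sketch of what next-sample (teacher-forced) training could look like with `torch.nn.Transformer`. The dimensions, the use of the last encoder input as a "start token", and the omission of input/output projections are all simplifying assumptions for the sketch, not the current implementation:

```python
import torch
import torch.nn as nn

# Hypothetical dimensions for illustration only
d_model, src_len, tgt_len, batch = 64, 24, 12, 32

model = nn.Transformer(d_model=d_model, nhead=4, batch_first=True)
src = torch.randn(batch, src_len, d_model)             # past values (already projected to d_model)
target_series = torch.randn(batch, tgt_len, d_model)   # ground-truth future values

# Teacher forcing: the decoder sees the target shifted right by one step
# and is trained to predict the next sample at every position.
start_token = src[:, -1:, :]                            # last encoder input acts as the "start" value
tgt_in = torch.cat([start_token, target_series[:, :-1, :]], dim=1)

# Causal mask so position i can only attend to positions <= i
causal_mask = model.generate_square_subsequent_mask(tgt_len)

out = model(src, tgt_in, tgt_mask=causal_mask)          # (batch, tgt_len, d_model)
loss = nn.functional.mse_loss(out, target_series)
loss.backward()
```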
Note from @pennfranc:
The current implementation is fully functional and can already produce some good predictions. However, it is still limited in how it uses the Transformer architecture, because the `tgt` input of `torch.nn.Transformer` is not utilized to its full extent. Currently, we simply pass the last value of the `src` input as `tgt`. To get closer to the way the Transformer is usually used in language models, we should allow the model to consume its own output as part of the `tgt` argument, such that when predicting sequences of values, the input to the `tgt` argument grows as the model's outputs are appended to it. Of course, the training of the model would have to be adapted accordingly.
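A rough sketch of the corresponding auto-regressive inference loop, where `tgt` grows by one predicted step per iteration (again with hypothetical dimensions and no input/output projections):

```python
import torch
import torch.nn as nn

# Hypothetical setup mirroring the training sketch above
d_model, src_len, batch, n_steps = 64, 24, 32, 12
model = nn.Transformer(d_model=d_model, nhead=4, batch_first=True)
src = torch.randn(batch, src_len, d_model)

model.eval()
with torch.no_grad():
    tgt = src[:, -1:, :]                          # start with the last encoder input
    for _ in range(n_steps):
        mask = model.generate_square_subsequent_mask(tgt.size(1))
        out = model(src, tgt, tgt_mask=mask)      # (batch, current_len, d_model)
        next_step = out[:, -1:, :]                # keep only the newest prediction
        tgt = torch.cat([tgt, next_step], dim=1)  # feed it back into the decoder input
    forecast = tgt[:, 1:, :]                      # drop the start value -> n_steps predictions
```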