
Add Transformer as embedding net #1413

@manuelgloeckler

Description


🚀 Feature Request

Transformers can serve as a flexible embedding network for general data modalities. We currently have permutation-invariant networks, whereas plain transformers are permutation equivariant (allowing support for exchangeable, but not independent, data). With suitable positional embeddings, a transformer can also serve as a general embedding network.
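For illustration, here is a minimal sketch (not part of sbi) of the point above: a plain `nn.TransformerEncoder` without positional embeddings is permutation equivariant, i.e. permuting the input sequence permutes the output in the same way.

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.TransformerEncoderLayer(d_model=16, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=2)
encoder.eval()  # disable dropout so outputs are deterministic

x = torch.randn(1, 5, 16)  # (batch, sequence, features)
perm = torch.randperm(5)

with torch.no_grad():
    out = encoder(x)
    out_perm = encoder(x[:, perm])

# The output of the permuted input equals the permuted output of the original.
print(torch.allclose(out[:, perm], out_perm, atol=1e-6))  # True
```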

Describe the solution you'd like

To do so, the following steps have to be completed:

  • Add a PyTorch transformer class here
  • Currently, all flows require a statically sized input, so the output sequence of the transformer needs to be pooled into a single vector of fixed dimension. There are multiple ways to do this, and it needs some testing/literature research to decide on a default (but multiple methods can be implemented); see the sketch after this list.
  • Add tests
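A minimal sketch of what such an embedding net could look like, assuming inputs of shape `(batch, sequence, features)` and a fixed-size output vector as in sbi's existing embedding nets. The class name `TransformerEmbedding` and the `pooling` argument are illustrative, not an existing sbi API.

```python
import torch
from torch import nn


class TransformerEmbedding(nn.Module):
    """Transformer encoder that pools a sequence into a fixed-size embedding."""

    def __init__(
        self,
        input_dim: int,
        d_model: int = 64,
        nhead: int = 4,
        num_layers: int = 2,
        output_dim: int = 32,
        pooling: str = "mean",  # "mean", "max", or "cls"
    ):
        super().__init__()
        self.pooling = pooling
        self.input_proj = nn.Linear(input_dim, d_model)
        # Learnable [CLS] token, only used when pooling="cls".
        self.cls_token = nn.Parameter(torch.zeros(1, 1, d_model))
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.output_proj = nn.Linear(d_model, output_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, sequence, input_dim)
        h = self.input_proj(x)
        if self.pooling == "cls":
            cls = self.cls_token.expand(h.shape[0], -1, -1)
            h = torch.cat([cls, h], dim=1)
        h = self.encoder(h)
        # Pool the output sequence into a single fixed-size vector.
        if self.pooling == "mean":
            h = h.mean(dim=1)
        elif self.pooling == "max":
            h = h.max(dim=1).values
        else:  # "cls"
            h = h[:, 0]
        return self.output_proj(h)


# Usage sketch: a batch of 8 sequences of length 10 with 3 features each
# is mapped to an 8 x 32 embedding.
embedding = TransformerEmbedding(input_dim=3)
y = embedding(torch.randn(8, 10, 3))  # shape (8, 32)
```

Mean/max pooling keeps the embedding permutation invariant when no positional embeddings are used, while a CLS token is the common choice once positions matter; which one to use as the default is exactly the open question above.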

📌 Additional Context

Currently, other "sequence" models such as the permutation-invariant networks support learning on sequences of different sizes in parallel via "nan"-padding. This support could be added here as well (if not, please open a separate issue); a sketch of how the padding could be handled is below.
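A sketch of how nan-padding could be handled, assuming the same convention as the permutation-invariant nets (padded positions are all-NaN) and the hypothetical `TransformerEmbedding` above with mean pooling. The mask is passed as `src_key_padding_mask` so padded positions are ignored by attention and excluded from the pooled mean.

```python
import torch


def forward_with_nan_padding(embedding_net, x):
    # x: (batch, sequence, features), padded positions are all-NaN.
    padding_mask = torch.isnan(x).all(dim=-1)  # (batch, sequence), True = padded
    x = torch.nan_to_num(x, nan=0.0)           # replace NaNs before the linear layers
    h = embedding_net.input_proj(x)
    h = embedding_net.encoder(h, src_key_padding_mask=padding_mask)
    # Masked mean pooling: average only over non-padded positions.
    keep = (~padding_mask).unsqueeze(-1).float()
    h = (h * keep).sum(dim=1) / keep.sum(dim=1).clamp(min=1.0)
    return embedding_net.output_proj(h)
```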

Issues #1324 and #218 currently soft-block variable sequence lengths, but this should not affect this feature request.
