🚀 Feature Request
Transformers can serve as a flexible embedding network for general data modalities. We currently have permutation-invariant networks, whereas plain transformers are permutation-equivariant (which allows supporting data that are exchangeable but not independent). With suitable positional embeddings, a transformer can also serve as a general-purpose embedding network.
Describe the solution you'd like
To do so, the following steps have to be completed:
- Add a PyTorch transformer class here (a rough sketch is given below this list)
- Currently, all flows need a statically sized input, so the output sequence of the transformer needs to be pooled into a single vector of fixed dimension. There are multiple ways to do this; some testing/literature research is needed to decide on a sensible default (but multiple methods can be implemented).
- Add tests
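
For orientation, here is a minimal sketch of what such an embedding network could look like. The class name `TransformerEmbedding`, its arguments, and mean pooling as the default aggregation are assumptions for illustration, not existing API:

```python
import torch
import torch.nn as nn


class TransformerEmbedding(nn.Module):
    """Transformer encoder mapping a sequence of trials to a fixed-size vector.

    Hypothetical sketch: class name, argument names, and mean pooling as the
    default aggregation are assumptions, not existing library API.
    """

    def __init__(
        self,
        input_dim: int,
        embed_dim: int = 64,
        num_heads: int = 4,
        num_layers: int = 2,
        output_dim: int = 32,
        pooling: str = "mean",  # alternatives: "cls", "max"
    ):
        super().__init__()
        self.input_proj = nn.Linear(input_dim, embed_dim)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        self.pooling = pooling
        # Learnable [CLS]-style token, only used if pooling == "cls".
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.output_proj = nn.Linear(embed_dim, output_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, seq_len, input_dim).
        h = self.input_proj(x)
        if self.pooling == "cls":
            cls = self.cls_token.expand(h.shape[0], -1, -1)
            h = torch.cat([cls, h], dim=1)
        h = self.encoder(h)
        # Pool the (batch, seq_len, embed_dim) output into (batch, embed_dim).
        if self.pooling == "mean":
            pooled = h.mean(dim=1)
        elif self.pooling == "cls":
            pooled = h[:, 0]
        elif self.pooling == "max":
            pooled = h.max(dim=1).values
        else:
            raise ValueError(f"Unknown pooling: {self.pooling}")
        return self.output_proj(pooled)
```

For example, `TransformerEmbedding(input_dim=3)(torch.randn(10, 20, 3))` would return a tensor of shape `(10, 32)`, independent of the sequence length 20.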
📌 Additional Context
Currently, other "sequence" models such as the permutation-invariant networks support learning on sequences of different sizes in parallel via "nan"-padding. This support could be added here as well (if not, please open a separate issue).
Issues #1324 / #218 currently soft-block variable sequence lengths, but they should not affect this feature request.
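
If nan-padding support is added, one possible approach (a sketch, assuming that fully-nan trials mark padding; not necessarily how the existing permutation-invariant nets handle it) is to derive a key-padding mask from the nan entries and pass it to the encoder:

```python
import torch


def nan_padding_to_mask(x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """Hypothetical helper: split nan-padded input into clean data and a mask.

    x: (batch, seq_len, input_dim), where fully-nan rows mark padding.
    Returns the input with nans replaced by zeros and a boolean mask of shape
    (batch, seq_len) that is True at padded positions, as expected by the
    `src_key_padding_mask` argument of `nn.TransformerEncoder.forward`.
    """
    padding_mask = torch.isnan(x).all(dim=-1)
    x_clean = torch.nan_to_num(x, nan=0.0)
    return x_clean, padding_mask
```

The same mask would also need to be respected during pooling, e.g. by excluding padded positions when taking the mean over the sequence.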