It would be great to add a few transformer architectures. This would also help us prioritize which op and module implementations to add next.
It probably makes sense to port models from Hugging Face, so we can load/convert weights from their hub.
It might also make sense to create a separate sub-project for transformer models, where we can also put shared modules and helpers.