**Is your feature request related to a problem? Please describe.**
It would be good to remove the Megatron tensor parallelism code from NeoX; OSLO currently has support for this, with a slightly nicer interface.

**Describe the solution you'd like**
Steps:
- Rewrite all current modules as plain PyTorch implementations, removing the `mpu` dependency from any internal code as much as possible. So, anything that's currently an `mpu.[Column|Row]ParallelLinear` or `mpu.VocabParallelEmbedding` should be replaced with its plain PyTorch equivalent (`nn.Linear` / `nn.Embedding` respectively); see the sketch after this list.
- Write a mapping for NeoX modules, which OSLO uses to handle parallelization (an illustrative sketch also follows below).
- Ensure backwards compatibility (e.g. loading existing tensor-parallel checkpoints; a rough sketch for that is included below as well).
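
For the module rewrite, here is a minimal sketch of the kind of substitution meant above, using an MLP block as an example. The layer names and constructor arguments are illustrative, based on the usual Megatron-style signatures, not a verbatim copy of NeoX's code:

```python
import torch.nn as nn

# Before (Megatron-style tensor-parallel layers), roughly:
#   self.dense_h_to_4h = mpu.ColumnParallelLinear(hidden_size, 4 * hidden_size, gather_output=False)
#   self.dense_4h_to_h = mpu.RowParallelLinear(4 * hidden_size, hidden_size, input_is_parallel=True)
#   self.word_embeddings = mpu.VocabParallelEmbedding(vocab_size, hidden_size)

class MLP(nn.Module):
    """Plain-PyTorch equivalent of a Megatron-style MLP block (illustrative)."""

    def __init__(self, hidden_size: int):
        super().__init__()
        # ColumnParallelLinear -> nn.Linear over the full (unsharded) output dim
        self.dense_h_to_4h = nn.Linear(hidden_size, 4 * hidden_size)
        self.activation = nn.GELU()
        # RowParallelLinear -> nn.Linear over the full (unsharded) input dim
        self.dense_4h_to_h = nn.Linear(4 * hidden_size, hidden_size)

    def forward(self, hidden_states):
        return self.dense_4h_to_h(self.activation(self.dense_h_to_4h(hidden_states)))

# VocabParallelEmbedding -> nn.Embedding over the full vocabulary:
# word_embeddings = nn.Embedding(vocab_size, hidden_size)
```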
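
For the mapping step, the exact format OSLO expects isn't spelled out here, so the following is only a conceptual sketch of the information such a mapping needs to carry: for each NeoX submodule, which tensor-parallel style it previously had under `mpu`. The module names follow the usual NeoX/Megatron layer names, but treat both the names and the dict format as assumptions rather than OSLO's real API:

```python
from typing import Optional

# Hypothetical mapping from NeoX submodule names to the tensor-parallel style
# they previously had under mpu. OSLO's actual mapping interface may differ;
# this only illustrates the information the mapping has to encode.
NEOX_TENSOR_PARALLEL_MAPPING = {
    # Column-parallel: weight split along the output dimension.
    "attention.query_key_value": "column",
    "mlp.dense_h_to_4h": "column",
    # Row-parallel: weight split along the input dimension.
    "attention.dense": "row",
    "mlp.dense_4h_to_h": "row",
    # Vocab-parallel embedding: split along the vocabulary dimension.
    "word_embeddings": "vocab",
}


def parallel_style(param_name: str) -> Optional[str]:
    """Look up the tensor-parallel style for a (possibly nested) parameter name."""
    for suffix, style in NEOX_TENSOR_PARALLEL_MAPPING.items():
        if suffix in param_name:
            return style
    return None  # replicated / not tensor-parallel
```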
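
For backwards compatibility, one likely requirement is being able to load checkpoints saved with the old `mpu` sharding into the new plain modules. A rough sketch, assuming Megatron's usual partitioning conventions (column-parallel parameters sharded on the output dim, row-parallel weights on the input dim, vocab-parallel embeddings on the vocab dim); the file paths and key names in the usage comment are hypothetical:

```python
import torch

def merge_tp_shards(shard_state_dicts, mapping):
    """Concatenate per-rank mpu weight shards back into full, unsharded tensors.

    `mapping` is a dict like NEOX_TENSOR_PARALLEL_MAPPING above:
    submodule name fragment -> "column" | "row" | "vocab".
    """
    merged = {}
    for key in shard_state_dicts[0]:
        style = next((s for name, s in mapping.items() if name in key), None)
        shards = [sd[key] for sd in shard_state_dicts]
        if style == "column":
            # Weight ([out, in]) and bias ([out]) are both sharded on the output dim.
            merged[key] = torch.cat(shards, dim=0)
        elif style == "row" and key.endswith(".weight"):
            # Weight is sharded on the input dim; the row-parallel bias is not sharded.
            merged[key] = torch.cat(shards, dim=1)
        elif style == "vocab":
            merged[key] = torch.cat(shards, dim=0)
        else:
            merged[key] = shards[0]  # replicated parameters are identical across ranks
    return merged

# Usage (hypothetical checkpoint layout):
# shards = [torch.load(f"mp_rank_{i:02d}/model_states.pt")["module"] for i in range(tp_size)]
# model.load_state_dict(merge_tp_shards(shards, NEOX_TENSOR_PARALLEL_MAPPING))
```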