
GPU Middle Class? #2161

Closed
@EugenHotaj

Description

Does torchtune have any plans to support "GPU middle class" users?

We're evaluating torchtune for post-training, especially since it already implements many useful features (RLHF, LoRA, etc.). However, one big sticking point is that the system seems heavily geared towards single-node training. Are there plans to support multi-node training (e.g. 16-64 nodes) and things like model parallelism, 128k-context training, etc.?
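To be concrete, the kind of workload we have in mind looks like a plain torchrun launch across many nodes with FSDP doing the sharding. Here's a minimal sketch (generic PyTorch, not torchtune's actual recipe code; the node count, `train.py`, and the rendezvous endpoint are placeholders):

```python
# Minimal multi-node FSDP sketch (generic PyTorch, not torchtune's recipes).
# Launched on each node with torchrun, e.g.:
#   torchrun --nnodes=16 --nproc-per-node=8 \
#            --rdzv-backend=c10d --rdzv-endpoint=$HEAD_NODE:29500 train.py
import os

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


def main():
    # torchrun sets RANK / WORLD_SIZE / LOCAL_RANK in the environment.
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))

    # Stand-in model; in practice this would be the LLM being post-trained.
    model = torch.nn.Linear(4096, 4096).cuda()
    model = FSDP(model)  # shards params, grads, and optimizer state across all ranks
    optim = torch.optim.AdamW(model.parameters(), lr=1e-5)

    # ... training loop elided ...

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

FSDP alone covers memory sharding across nodes, but 128k-context training presumably also needs some form of tensor/context parallelism on top, which is why we're asking about model parallelism plans.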

If not, is torchtitan the recommended system to use?

Thanks!

Metadata

Labels

discussion: Start a discussion
distributed: Anything related to distributed env (multi-GPU, multi-node)
triaged: This issue has been assigned an owner and appropriate label
