
[Feature] Include Support for Trainium and Inferentia #2106

@gbladislau

Description


Feature request

As of today, Oumi doesn't seem to support AWS's AI-specific chips, namely Trainium and Inferentia.

They don't show up as available GPUs, and training falls back to the CPU in these cases.
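For context, a minimal sketch of what detection could look like, assuming the instance has the AWS Neuron SDK with torch-neuronx/torch-xla installed; the helper name below is hypothetical and not an existing Oumi function:

```python
import torch


def resolve_training_device() -> torch.device:
    """Hypothetical helper: prefer CUDA, then a Neuron (XLA) device, then CPU.

    Trainium/Inferentia chips are not visible through torch.cuda; they are
    exposed to PyTorch as XLA devices via torch-neuronx/torch-xla.
    """
    if torch.cuda.is_available():
        return torch.device("cuda")
    try:
        # On Trn1/Inf2 instances with the Neuron SDK installed, torch-neuronx
        # registers the chips as XLA devices.
        import torch_xla.core.xla_model as xm

        return xm.xla_device()
    except ImportError:
        # No Neuron/XLA stack available: fall back to CPU.
        return torch.device("cpu")
```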

Motivation / references

Enabling support for these chips would make Oumi more open to different cloud training accelerator options.

Relevant documentation for this is available here (and probably here as well), and also at Hugging Face.

Your contribution

I can try to help with this, but I'm not an expert in this area, and it would likely require someone familiar with the transformers training code, which seems to be the main training path (at least for SFT).
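As a rough illustration of what that path might look like, Hugging Face's optimum-neuron library provides NeuronTrainer and NeuronTrainingArguments as drop-in replacements for the standard transformers Trainer/TrainingArguments. A minimal sketch follows; the model and dataset names are placeholders, and whether Oumi would actually route SFT through this class is an open question:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, DataCollatorForLanguageModeling
from optimum.neuron import NeuronTrainer, NeuronTrainingArguments

# Placeholder model and dataset, for illustration only.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

dataset = load_dataset("yelp_review_full", split="train[:1%]")
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=128),
    batched=True,
)

# Causal-LM collator copies input_ids into labels so the model returns a loss.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

# NeuronTrainingArguments mirrors transformers.TrainingArguments but targets
# Trainium through torch-neuronx/XLA under the hood.
args = NeuronTrainingArguments(output_dir="out", per_device_train_batch_size=1)

trainer = NeuronTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    data_collator=collator,
)
trainer.train()
```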


Labels

Feature, enhancement (New feature or request), triage (This issue needs review by the core team)
