
Add support for TPU devices #173

Open
@vfdev-5

Description

Clear and concise description of the problem

It would be good to provide an option to select the accelerator as TPU instead of GPU.
We can also auto-select the TPU accelerator when a template is opened in Colab, and add the torch_xla installation steps.
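A minimal sketch of the auto-selection idea, assuming a hypothetical has_tpu() helper in the exported notebook; the actual torch_xla installation cell would come from the linked Colab notebook:

```python
# Hypothetical helper (not part of the templates) for the exported Colab
# notebook: default to "xla-tpu" only when torch_xla is installed and an
# XLA device can actually be acquired.
def has_tpu() -> bool:
    try:
        import torch_xla.core.xla_model as xm  # requires the torch_xla install step
    except ImportError:
        return False
    try:
        xm.xla_device()  # raises if no TPU runtime is available
        return True
    except Exception:
        return False

default_backend = "xla-tpu" if has_tpu() else "nccl"
```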

What to do:
0) Try a template with TPUs. Choose the distributed training option with 8 processes and the spawning option. "Open in Colab" one template (for example, the vision classification template), install torch_xla manually (see https://colab.research.google.com/drive/1E9zJrptnLJ_PKhmaP5Vhb6DTVRvyrKHx) and run the code with the xla-tpu backend: python main.py --nproc_per_node 8 --backend xla-tpu. If everything is set up correctly, training should run (see the sketch after this list).

  1. Update UI
  • Add a drop-down menu for backend selection ("nccl" and "xla-tpu") in "Training Options"
  • When the user selects "xla-tpu", training should only be distributed with 8 processes and "Run the training with torch.multiprocessing.spawn".
  2. Update content: README.md and other impacted files
  3. If exported to Colab, we need to make sure that the accelerator is set to "TPU"
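For reference, a minimal sketch of how the spawned training entry point could be driven with the xla-tpu backend through ignite.distributed; training() and config here are illustrative placeholders, not the templates' actual code:

```python
import ignite.distributed as idist

def training(local_rank, config):
    # With backend "xla-tpu", idist.device() returns the XLA device of the
    # current process (one of the 8 TPU cores).
    device = idist.device()
    print(f"rank {idist.get_rank()} / {idist.get_world_size()} on {device}")
    # ... build model, data loaders and trainer as in the template ...

if __name__ == "__main__":
    config = {}  # placeholder for the template's parsed CLI arguments
    # nproc_per_node=8 spawns one process per TPU core; idist.Parallel
    # takes care of the spawning for the xla-tpu backend.
    with idist.Parallel(backend="xla-tpu", nproc_per_node=8) as parallel:
        parallel.run(training, config)
```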

Suggested solution

Alternative

Additional context
