Description
It would be good to provide an option to select the accelerator as TPU instead of GPU.
We could also auto-select the TPU accelerator when a template is opened in Colab and add torch_xla installation steps.
What to do:
0) Try a template with TPUs. Choose the distributed training option with 8 processes and the spawning option. "Open in Colab" one template, for example the vision classification template, manually install torch_xla (see https://colab.research.google.com/drive/1E9zJrptnLJ_PKhmaP5Vhb6DTVRvyrKHx) and run the code with the xla-tpu backend: `python main.py --nproc_per_node 8 --backend xla-tpu`. If everything is done correctly, the training should run.
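For reference, a minimal sketch of what the spawned entry point looks like with the "xla-tpu" backend, assuming the template is built on `ignite.distributed` as the PyTorch-Ignite templates are (the real templates expose this through `main.py`'s CLI options; names below are illustrative):

```python
# Minimal sketch: run training over 8 TPU cores with the "xla-tpu" backend via
# ignite.distributed (assumes torch_xla is already installed in the Colab runtime).
import ignite.distributed as idist


def training(local_rank, config):
    # Each of the 8 spawned processes is bound to one TPU core.
    device = idist.device()  # an XLA device, e.g. xla:1
    print(f"rank={idist.get_rank()} world_size={idist.get_world_size()} device={device}")
    # ... build model/dataloaders (e.g. idist.auto_model / idist.auto_dataloader) and train ...


if __name__ == "__main__":
    config = {}
    # Roughly equivalent to: python main.py --nproc_per_node 8 --backend xla-tpu
    with idist.Parallel(backend="xla-tpu", nproc_per_node=8) as parallel:
        parallel.run(training, config)
```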
- Update UI
  - Add a drop-down menu for backend selection ("nccl" and "xla-tpu") in "Training Options"
  - When the user selects "xla-tpu", training should only be distributed with 8 processes and "Run the training with torch.multiprocessing.spawn" (see the first sketch after this list)
- Update content: README.md and other impacted files
  - If exported to Colab, we need to make sure that the accelerator is "TPU" (see the notebook metadata sketch after this list)
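To illustrate the intended behavior of the "xla-tpu" option, here is a hypothetical sketch of the constraints it should impose on the other training options (function and key names are illustrative, not the actual code-generator implementation):

```python
# Hypothetical sketch of backend-dependent constraints in the generator UI logic.
def apply_backend_constraints(config: dict) -> dict:
    if config.get("backend") == "xla-tpu":
        # TPU training must be distributed over the 8 TPU cores
        # and launched via torch.multiprocessing.spawn.
        config["distributed"] = True
        config["nproc_per_node"] = 8
        config["launch"] = "spawn"
    return config
```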
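For the Colab export, the accelerator is read from the notebook metadata, so the generated notebook should carry `"accelerator": "TPU"`. A minimal sketch, assuming the notebook is post-processed with nbformat (the actual export mechanism in code-generator may differ, and the file name is only an example):

```python
# Set the Colab hardware accelerator to TPU in the generated notebook's metadata.
import nbformat

nb = nbformat.read("vision_classification.ipynb", as_version=4)
nb.metadata["accelerator"] = "TPU"  # Colab reads the accelerator type from here
nbformat.write(nb, "vision_classification.ipynb")
```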