PyTorch tensor load/store with GDS
This code can be run in a container whose PyTorch build includes GPUDirect Storage (GDS) support. For example:
FROM nvidia/cuda:12.8.0-devel-ubuntu22.04
# Refresh the package index and install the build tools in a single layer
RUN apt-get update && apt-get install -y \
    cmake \
    build-essential \
    git \
    python3-dev \
    python3-pip
# The PyTorch nightly wheels now ship with prebuilt GDS support
RUN pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
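
Once the image is built, a quick check inside the container can confirm that the installed wheel exposes the pieces GDS needs. The following is a minimal sketch, not part of the original setup: `gds_status` is a hypothetical helper name, and it only reports whether `torch` imports, whether CUDA is visible, and whether the `torch.cuda.gds` module can be located. Finding the module does not by itself prove that cuFile works at runtime, since that also depends on the host driver and filesystem.

```python
import importlib.util


def gds_status():
    """Report whether torch, CUDA, and the torch.cuda.gds module are present.

    Hypothetical helper for illustration; it performs no GDS I/O.
    """
    status = {"torch": False, "cuda": False, "gds_module": False}

    # Bail out early if torch is not installed in this environment.
    if importlib.util.find_spec("torch") is None:
        return status

    import torch

    status["torch"] = True
    status["cuda"] = bool(torch.cuda.is_available())

    # Locating the module is only a necessary condition: it does not
    # guarantee that cuFile is usable on this host.
    try:
        status["gds_module"] = importlib.util.find_spec("torch.cuda.gds") is not None
    except ModuleNotFoundError:
        status["gds_module"] = False

    return status


if __name__ == "__main__":
    print(gds_status())
```

If all three values are `True`, the container is probably ready for the load/store code. A `False` for `cuda` usually means the container was started without GPU access.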