Home
Thank you for your interest in Grape 🍇.
Please note that because part of Grape 🍇 involves communicating with the NVIDIA GPU kernel module via the /proc filesystem (which, to the best of my knowledge, cannot easily be handled inside Docker), all of the following steps are performed natively (i.e., outside the Docker environment). We also assume that the OS is either Ubuntu 20.04 or 22.04.
-
Checkout Grape's 🍇 source code:
git clone https://github.com/UofT-EcoSystem/Grape-MICRO56-Artifact
-
Make sure that common software dependencies are installed properly:
./scripts/Installation/0-install_build_essentials.sh
-
Install our customized NVIDIA GPU driver and then reboot the machine:
./scripts/Installation/1-install_NVIDIA_GPU_driver.sh
sudo reboot
When the machine is rebooted, make sure that the message
NVRM: loading customized kernel module from Grape
appears when running the command sudo dmesg. If it does not, reinstall the GPU driver and then reboot again:
# Note the `--reinstall` option.
./scripts/Installation/1-install_NVIDIA_GPU_driver.sh --reinstall
sudo reboot
-
Install CUDA:
./scripts/Installation/2-install_CUDA.sh
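After the installation finishes, it may be worth confirming which CUDA release is visible on the PATH. A minimal sketch of extracting the release number from nvcc's output (the sample line below only stands in for the real output of nvcc --version on a GPU machine):

```shell
# Sanity check (sketch): `nvcc --version` reports the installed CUDA release;
# the line below is a sample of that output, used here so the sed command runs
# without a GPU machine.
sample="Cuda compilation tools, release 11.8, V11.8.89"
echo "${sample}" | sed -n 's/.*release \([0-9.]*\),.*/\1/p'   # → 11.8
```

On a real machine, replace the sample variable with the output of nvcc --version.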
-
Build PyTorch:
./scripts/Installation/3-build_PyTorch.sh
-
Checkout the HuggingFace Transformers submodule (no building or installation is required):
git submodule update --init submodules/transformers
-
Finally, use the activate script to modify the environment variables accordingly:
source scripts/Installation/activate
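For reference, an activate-style script of this kind typically just prepends the artifact's install prefixes to the relevant search paths. The sketch below is illustrative only; GRAPE_HOME is a hypothetical variable, not the actual contents of scripts/Installation/activate:

```shell
# Sketch (assumption): what an `activate`-style script usually does.
# GRAPE_HOME is hypothetical and not defined by the artifact itself.
export GRAPE_HOME="${GRAPE_HOME:-$PWD}"
export PATH="${GRAPE_HOME}/bin:${PATH}"
export LD_LIBRARY_PATH="${GRAPE_HOME}/lib:${LD_LIBRARY_PATH:-}"
```

Because the script is sourced rather than executed, these exports take effect in the current shell session and must be repeated in every new terminal.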
The script
./scripts/Experiment_Workflow/1-test_metadata_compression.sh
runs the experiments that compress CUDA graphs' memory regions and calculate the compression ratios for different models. At the end of the experiments, the results are dumped into a CSV file named "metadata_compression.csv" and visualized as follows:
Model Original Size Compressed Size
GPT-2 ___ ___
GPT-J ___ ___
Wav2Vec2 ___ ___
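If a quick numeric summary is preferred over the rendered table, a compression ratio can be computed directly from the CSV. The column layout below is an assumption (the actual header of metadata_compression.csv may differ), and the sizes are sample data, not measured results:

```shell
# Sketch: compute per-model compression ratios from metadata_compression.csv.
# Assumed columns: Model,Original Size,Compressed Size -- sample data only.
cat > /tmp/metadata_compression.csv <<'EOF'
Model,Original Size,Compressed Size
GPT-2,1024,256
EOF
awk -F, 'NR > 1 { printf "%s: %.1fx\n", $1, $2 / $3 }' \
    /tmp/metadata_compression.csv   # → GPT-2: 4.0x
```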
The script
./scripts/Experiment_Workflow/2-test_runtime_performance.sh --model=gpt2
./scripts/Experiment_Workflow/2-test_runtime_performance.sh --model=gptj
./scripts/Experiment_Workflow/2-test_runtime_performance.sh --model=wav2vec2
(invoked once per model) runs the experiments that measure the runtime performance of each model under 3 different settings, namely Baseline, PtGraph, and Grape, as described in the paper. At the end of the experiments, the results are dumped into a CSV file named "speedometer.csv" and visualized as follows:
Name Attrs Avg Std Min Median Max
Baseline {"Model": "GPT-2"} ___ ___ ___ ___ ___
PtGraph {"Model": "GPT-2"} ___ ___ ___ ___ ___
Grape {"Model": "GPT-2"} ___ ___ ___ ___ ___
...
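Similarly, a speedup figure can be derived from speedometer.csv by dividing the Baseline's average by Grape's. The column layout and all numbers below are assumptions for illustration only, not measured results:

```shell
# Sketch: derive Grape's speedup over the Baseline from speedometer.csv.
# Assumed columns: Name,Attrs,Avg -- the averages here are sample data.
cat > /tmp/speedometer.csv <<'EOF'
Name,Attrs,Avg
Baseline,{"Model": "GPT-2"},20.0
Grape,{"Model": "GPT-2"},10.0
EOF
awk -F, '$1 == "Baseline" { b = $3 }
         $1 == "Grape"    { g = $3 }
         END { printf "Speedup: %.1fx\n", b / g }' \
    /tmp/speedometer.csv   # → Speedup: 2.0x
```

Note that a naive comma split like this only works while the Attrs JSON contains no commas; a real speedometer.csv with multi-key Attrs would need a proper CSV parser.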
Last edited on 2023/8/7.