Commit 7f171fe

adding exercise
1 parent 68edecb commit 7f171fe

1 file changed

+34
-14
lines changed

docs/DL_exercises.md

Lines changed: 34 additions & 14 deletions
@@ -47,10 +47,10 @@
4747

4848
???+ question "Submit a Slurm job"
4949

50-
- Close the [cifar10 resnet repository](https://github.com/akamaster/pytorch_resnet_cifar10?tab=readme-ov-file) and edit the run.sh by adding appropriate slurm sbatch commands.
50+
- Submit `sentiment_analysis.py` as a Slurm job with appropriate `sbatch` flags.
5151

5252
??? tip "Answer"
53-
- edit a file using you prefered editor, named `my_bio_worksflow.sh`, for example, with the content
53+
- Using your preferred editor, create a file named `sentiment_analysis_batch.sh`, for example, with the following content:
5454
5555
```bash
5656
#!/bin/bash -l
@@ -59,33 +59,53 @@
5959
#SBATCH -p node
6060
#SBATCH -N 1
6161
#SBATCH -t 01:00:00
62-
#SBATCH -J cifar_demo
62+
#SBATCH -J sentiment_analysis
6363
#SBATCH -M snowy
6464
#SBATCH --gres=gpu:1
6565

66-
module load python_ML_packages/3.9.5-gpu
66+
source .....
6767

6868
python -c "import torch; print(torch.__version__); print(torch.version.cuda); print(torch.cuda.get_device_properties(0)); print(torch.randn(1).cuda())"
6969

70-
#for model in resnet20 resnet32 resnet44 resnet56 resnet110 resnet1202
71-
for model in resnet20 resnet110
72-
do
73-
echo "python -u trainer.py --arch=$model --save-dir=save_$model |& tee -a log_$model"
74-
python -u trainer.py --arch=$model --save-dir=save_$model |& tee -a log_$model
75-
done
70+
echo "running sentiment_analysis.py"
7671

72+
python .....sentiment_analysis.py
7773
```
7874

79-
- make the job script executable
75+
- make the job script executable, if it is not already
8076
```bash
81-
$ chmod a+x run.sh
77+
$ chmod a+x sentiment_analysis_batch.sh
8278
```
8379
8480
- submit the job
8581
```bash
86-
$ sbatch run.sh
82+
$ sbatch sentiment_analysis_batch.sh
8783
```
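Before submitting, it can help to confirm that the job script parses and to review the resources it requests. The following is a minimal sketch, not part of the original exercise: it writes a throwaway copy of the directives shown above to a temporary file (a hypothetical stand-in for `sentiment_analysis_batch.sh`), checks the syntax, and lists the `#SBATCH` lines.

```bash
# Write a throwaway job script so the checks below have something to inspect.
# The directives mirror the ones shown above; the temporary file is a
# hypothetical stand-in for sentiment_analysis_batch.sh.
job_script="$(mktemp)"
cat > "$job_script" <<'EOF'
#!/bin/bash -l
#SBATCH -p node
#SBATCH -N 1
#SBATCH -t 01:00:00
#SBATCH -J sentiment_analysis
#SBATCH -M snowy
#SBATCH --gres=gpu:1
echo "running sentiment_analysis.py"
EOF

# bash -n parses the script without executing it, so syntax errors show up early.
bash -n "$job_script" && echo "syntax OK"

# List the Slurm directives so the requested resources can be reviewed at a glance.
directives="$(grep '^#SBATCH' "$job_script")"
echo "$directives"
```

Running the same two checks against your real script only needs `job_script` pointed at its path instead of the temporary file.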
88-
84+
85+
- Similarly, run the `sentiment_analysis.ipynb` Jupyter notebook. Add matplotlib and seaborn to your environment. Start a Jupyter server on a snowy compute node and then tunnel to that node from your local machine.
86+
87+
??? tip "Answer"
88+
89+
* Install matplotlib and seaborn and start an interactive snowy session:
90+
91+
```console
92+
source torch_env/bin/activate
93+
pip install matplotlib seaborn
94+
interactive -A uppmax2025-3-5 -M snowy -p node -N 1 -t 1:01:00 --gres=gpu:1
95+
source torch_env/bin/activate
96+
jupyter notebook --ip 0.0.0.0 --no-browser
97+
```
98+
* Then tunnel:
99+
`ssh -L 8888:s123:8888 [email protected]`
100+
101+
* Copy the localhost URL printed by the Jupyter server and paste it into your local browser
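The tunnel command above follows the pattern `ssh -L <local_port>:<compute_node>:<remote_port> <login>`. As a rough sketch, a small helper can assemble it from the node name reported by your interactive session. The function name `tunnel_cmd` and the login address `user@login.example.org` are hypothetical placeholders; `s123` and port `8888` are taken from the example above.

```bash
# Build the ssh port-forwarding command for a Jupyter server on a compute node.
# The node name, port, and login address passed in are placeholders to replace
# with the values from your own session.
tunnel_cmd() {
    local node="$1" port="$2" login="$3"
    # -L local_port:remote_host:remote_port forwards the local port through the login host
    echo "ssh -L ${port}:${node}:${port} ${login}"
}

# Print the command for node s123 and port 8888 (hypothetical login address).
tunnel_cmd s123 8888 "user@login.example.org"
# prints: ssh -L 8888:s123:8888 user@login.example.org
```

If Jupyter reports a different port because 8888 is already in use, pass that port instead so both ends of the tunnel match.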
102+
103+
104+
### Cache management
105+
106+
* By default, HF (Hugging Face) stores downloaded models and temporary files in your `$HOME` folder, which is limited to 32 GB and 300k files.
107+
* To avoid filling that quota, set the `HF_HOME` and `HF_HUB_CACHE` environment variables to a location in your project folder. Follow the instructions at https://huggingface.co/docs/transformers/en/installation?cpu-only=PyTorch#cache-directory.
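As an illustration of the cache redirection described above, the variables can be exported before running any Hugging Face code, for example in the job script or in your shell startup file. This is a minimal sketch: `PROJECT_DIR` is a hypothetical variable standing in for your project storage path, and the `hf_home` and `hub` subfolder names are assumptions, not required values.

```bash
# Hypothetical project directory; replace with your own project storage path.
# Falls back to a temporary directory so this sketch runs anywhere.
PROJECT_DIR="${PROJECT_DIR:-$(mktemp -d)}"

# HF_HOME is the root for Hugging Face data; HF_HUB_CACHE holds downloaded models.
export HF_HOME="$PROJECT_DIR/hf_home"
export HF_HUB_CACHE="$HF_HOME/hub"

# Create the directories up front so that permission problems surface early.
mkdir -p "$HF_HUB_CACHE"

echo "HF_HOME=$HF_HOME"
echo "HF_HUB_CACHE=$HF_HUB_CACHE"
```

The exports only affect processes started from the same shell, so they need to appear before the `python` line in a batch script.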
108+
89109
<!-- ## Doing installations
90110
91111
