Repository preparation
git clone https://github.com/boat1603/SuperAI_LLM_FineTune.git
cd ./SuperAI_LLM_FineTune

ml Mamba
conda create -p ./env python=3.10.0 -y
conda activate ./env
pip install -e .

ml Apptainer
apptainer build ./llm-finetune.sif docker://boat1603/llm-finetune:latest

sbatch submit_multinode.sh
For Apptainer:
sbatch submit_multinode_apptainer.sh

Note:
- Change the training config via ./smultinode.sh or ./smultinode_apptainer.sh (for Apptainer).
- When training with DeepSpeed, the scheduler follows the DeepSpeed config.
- You can set up the training spec in ./submit_multinode.sh or ./submit_multinode_apptainer.sh following our guideline.
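
Since the scheduler is taken from the DeepSpeed config when DeepSpeed is used, it can help to check which scheduler that config declares before submitting a job. The sketch below is only an illustration: the file name ds_config_scheduler_example.json and all values in it are assumptions, not this repository's actual settings. It writes a small fragment using DeepSpeed's documented "scheduler" section (type WarmupLR) and prints the scheduler type.

```shell
# Hypothetical file name; the repository's real DeepSpeed config may live elsewhere.
cfg="${TMPDIR:-/tmp}/ds_config_scheduler_example.json"

# Write an illustrative fragment. The values are placeholders, not the project's settings.
cat > "$cfg" <<'EOF'
{
  "scheduler": {
    "type": "WarmupLR",
    "params": {
      "warmup_min_lr": 0,
      "warmup_max_lr": 2e-5,
      "warmup_num_steps": 100
    }
  }
}
EOF

# Extract the first "type" value so the scheduler can be checked before running sbatch.
scheduler_type=$(grep -o '"type": *"[^"]*"' "$cfg" | head -n 1 | cut -d'"' -f4)
echo "DeepSpeed scheduler: $scheduler_type"
```

To inspect the config actually used for training, point cfg at that file instead of the example path.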
sbatch ./submit_zero_to_fp32.sh