This directory contains training scripts and instructions for reproducing the VisCoder-3B and VisCoder-7B models using ms-swift.
conda create -n swift python=3.10 -y
conda activate swift
git clone https://github.com/modelscope/ms-swift.git
cd ms-swift
pip install -e .
sh requirements/install_all.sh
pip install flash-attn -U --no-build-isolation
# Optional: for logging
pip install wandb

Download the VisCode-200K dataset:
huggingface-cli download TIGER-Lab/VisCode-200K \
    --repo-type=dataset --resume-download --local-dir data

Launch training with the script for your target model size:

bash train_viscoder_3b.sh
bash train_viscoder_7b.sh

Each script launches full fine-tuning with DeepSpeed and FlashAttention, using Qwen2.5-Coder as the base model.
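The scripts wrap a `swift sft` invocation roughly along these lines. This is an illustrative sketch only, not the exact command: the model ID, dataset path, and flag values shown here are assumptions, so consult the train_viscoder_*.sh scripts for the real configuration.

```shell
# Illustrative sketch of a full fine-tuning launch (values are assumptions;
# see train_viscoder_7b.sh for the exact command used for VisCoder-7B).
swift sft \
    --model Qwen/Qwen2.5-Coder-7B-Instruct \
    --train_type full \
    --dataset data/viscode_200k.jsonl \
    --torch_dtype bfloat16 \
    --attn_impl flash_attn \
    --deepspeed zero3 \
    --output_dir output/viscoder-7b
```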
To start training quickly using ms-swift, you may need to remove the default dataset config:
rm ms-swift/swift/llm/dataset/data/dataset_info.json
echo "[]" > ms-swift/swift/llm/dataset/data/dataset_info.json

For detailed training options, refer to the ms-swift CLI documentation.
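As a quick sanity check, the reset file should still parse as valid JSON, now containing an empty dataset registry. A minimal sketch, using a temporary path rather than the real ms-swift checkout:

```python
import json
import pathlib
import tempfile

# Simulate the reset above on a temporary copy of dataset_info.json,
# then confirm the file parses as an empty JSON list.
path = pathlib.Path(tempfile.mkdtemp()) / "dataset_info.json"
path.write_text("[]")
registry = json.loads(path.read_text())
print(registry)  # -> []
```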