This guide covers the full pipeline of fine-tuning a language model with LLaMA-Factory and deploying it for inference with FastAPI.
First, clone the repository:

```bash
git clone https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
```

Optionally redirect conda's package and environment directories to a data disk (the `/root/autodl-tmp` paths below assume an AutoDL-style server):

```bash
mkdir -p /root/autodl-tmp/conda/pkgs
conda config --add pkgs_dirs /root/autodl-tmp/conda/pkgs
mkdir -p /root/autodl-tmp/conda/envs
conda config --add envs_dirs /root/autodl-tmp/conda/envs
```

Create and activate a dedicated environment, then install LLaMA-Factory:

```bash
conda create -n llama-factory python=3.10 -y
conda activate llama-factory
pip install -e ".[torch,metrics]"
```

Verify the installation by launching the WebUI:

```bash
llamafactory-cli webui
```

Next, set up a cache directory for model downloads:

```bash
mkdir -p /root/autodl-tmp/Hugging-Face
export HF_HOME=/root/autodl-tmp/Hugging-Face
pip install -U huggingface_hub
```
Download the base model:

```bash
huggingface-cli download --resume-download <your-model-name>
```

- Place your training data in the `data` directory:

  ```
  LLaMA-Factory/data/your_data.json
  ```

- Register the dataset in `data/dataset_info.json`:

  ```json
  "your_data": {
    "file_name": "your_data.json"
  }
  ```

- Launch the WebUI:
```bash
llamafactory-cli webui
```

- In the WebUI:
  - Set the model path to the downloaded model's snapshot folder, i.e. the hash-named directory under `$HF_HOME/hub/models--<org>--<name>/snapshots/`.
  - Select `your_data` as your training dataset.
  - Configure your training parameters as needed.
- Click Start Training.
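For reference, the training file registered above (`your_data.json`) is, under LLaMA-Factory's default Alpaca-style template, a JSON array of `instruction`/`input`/`output` records. The sample below is purely illustrative; other formats require matching column definitions in `dataset_info.json`:

```json
[
  {
    "instruction": "Translate the sentence into French.",
    "input": "Good morning!",
    "output": "Bonjour !"
  }
]
```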
After training completes:
Create a directory for the merged model:

```bash
mkdir -p Models/<your-model-name>-merged
```

- In the WebUI:
- Set the export path accordingly.
- Click Start Export.
Create a separate environment for serving:

```bash
conda create -n fastapi python=3.10 -y
conda activate fastapi
conda install -c conda-forge fastapi uvicorn transformers pytorch -y
pip install safetensors sentencepiece protobuf
```

Create the application directory and files:

```bash
mkdir App
cd App
touch main.py test.py
```

- Paste your FastAPI app code into `main.py`.
- Paste your test script into `test.py` (modify as needed for your use case).
Start the server:

```bash
uvicorn main:app --reload --host 0.0.0.0
```

In a new terminal, run the test script:

```bash
python test.py
```

The resulting project structure:

```
├── LLaMA-Factory/
│   ├── data/
│   │   ├── your_data.json
│   │   └── dataset_info.json
├── Models/
│   └── your-model-name-merged/
├── App/
│   ├── main.py
│   └── test.py
```
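The test script in `App/test.py` can be a minimal client using only the standard library. The `/generate` endpoint and the `prompt`/`max_new_tokens` payload fields below are assumptions; match them to whatever your FastAPI app actually exposes:

```python
# test.py -- minimal stdlib client sketch (endpoint and fields are assumptions).
import json
import urllib.request

API_URL = "http://127.0.0.1:8000/generate"


def build_payload(prompt: str, max_new_tokens: int = 128) -> dict:
    """Request body expected by the hypothetical /generate endpoint."""
    return {"prompt": prompt, "max_new_tokens": max_new_tokens}


def query(prompt: str) -> str:
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        API_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    print(query("Hello, who are you?"))
```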
- Adjust paths according to your environment setup.
- Ensure that ports are open for API access if running on a remote server.
- You can use `nohup` or `screen` to keep long-running services alive after the terminal closes.
This project combines open-source tools. Please refer to each respective repository for licensing details.