
Commit e4d47a5

zhangnju, adamlam2-amd, and danielholanda authored
Add a playbook of llama factory fine-tuning (#70)
* add llama factory finetuning playbook
* update llama factory playbook
* update llama factory playbook
* correct an typing error
* mark pytorch setup as an optional step
* add webUI tool info
* updated some content
* updated
* update finetuning time
* small updates
* remove bitsandbytes and docker setup
* add the dependency info
* correct typo issue
* ui and text formatting
* ui
* update playbook json file
* add supported platforms

---------

Co-authored-by: Adam Lam <adamlam2@amd.com>
Co-authored-by: Daniel Holanda <holand.daniel@gmail.com>
1 parent ba022b4 commit e4d47a5

5 files changed

Lines changed: 190 additions & 0 deletions


Lines changed: 167 additions & 0 deletions
# LLM Fine-tuning with LLaMA Factory

## Overview

Efficient fine-tuning is vital for adapting large language models (LLMs) to downstream tasks. LLaMA Factory is an open-source, user-friendly platform that streamlines the training and fine-tuning of LLMs and multimodal models. It allows users to customize hundreds of pre-trained models locally with minimal coding.

This playbook teaches you how to fine-tune LLMs using LLaMA Factory on your local AMD hardware.

## What you'll learn

- How to set up LLaMA Factory with ROCm support
- How to configure LLM fine-tuning parameters (using Qwen/Qwen3-4B-Instruct-2507 as an example)
- How to run LLaMA Factory fine-tuning
- How to run inference with the fine-tuned model
- How to export the fine-tuned model

## Estimated Time

- Duration: about 60 minutes to run this playbook, depending on your model/dataset size and network speed.
- See the [LLaMA Factory GitHub](https://github.com/hiyouga/LlamaFactory) for more information.

## Setting up the Environment

### Installing Basic Dependencies

<!-- @os:linux -->
<!-- @require:rocm,pytorch,driver -->
<!-- @os:end -->
<!-- @os:windows -->
<!-- @require:pytorch,driver -->
<!-- @os:end -->

### Installing Additional Dependencies

- **Python**: ensure the version is 3.11 or later
- **huggingface_hub**: install it with pip:

```bash
pip install huggingface_hub
```
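
To confirm the interpreter meets the version requirement, a quick check:

```bash
# Print the active Python version; it should report 3.11 or newer.
# (On Windows, the command may be `python --version`.)
python3 --version
```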

### Install LLaMA Factory

LLaMA Factory depends on PyTorch, which you should already have installed per the requirements above.

Download the source code from the [LLaMA Factory official GitHub repository](https://github.com/hiyouga/LlamaFactory) and install its dependencies:

```bash
git clone --depth 1 https://github.com/hiyouga/LlamaFactory.git
cd LlamaFactory
pip install -e .
pip install -r requirements/metrics.txt
```

With LLaMA Factory successfully installed, let's run fine-tuning with it.
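
As a quick sanity check that the CLI landed on your PATH (the `version` subcommand is available in recent LLaMA Factory releases — treat this as an assumption if yours is older):

```bash
# Should print the installed LLaMA Factory version banner.
llamafactory-cli version
```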

## Using LLaMA Factory CLI for Fine-Tuning

This section covers how to prepare fine-tuning datasets, configure LoRA/QLoRA parameters, and run LoRA fine-tuning.

### Dataset Preparation

LLaMA Factory supports fine-tuning datasets in the Alpaca format and the ShareGPT format. All available datasets are defined in [dataset_info.json](https://github.com/hiyouga/LlamaFactory/blob/main/data/dataset_info.json). If you are using a custom dataset, make sure to add a dataset description to dataset_info.json and specify the dataset name before training. Details can be found in the docs [here](https://llamafactory.readthedocs.io/en/latest/getting_started/data_preparation.html).

In this playbook, we will use the identity and alpaca_en_demo datasets as an example, and configure the dataset information in the next step.
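
For orientation, registering a custom Alpaca-format dataset roughly looks like the following (a sketch: `my_dataset` and `my_dataset.json` are hypothetical names, and the exact schema is described in the data preparation docs linked above). The entry goes into dataset_info.json:

```json
{
  "my_dataset": {
    "file_name": "my_dataset.json",
    "columns": {
      "prompt": "instruction",
      "query": "input",
      "response": "output"
    }
  }
}
```

Each record in the data file then follows the Alpaca instruction/input/output shape:

```json
[
  {
    "instruction": "Summarize the following text.",
    "input": "LLaMA Factory is an open-source platform for fine-tuning LLMs.",
    "output": "LLaMA Factory streamlines LLM fine-tuning."
  }
]
```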

### Fine-tuning parameter configuration

LLaMA Factory supports multiple fine-tuning schemes.

| Fine-tuning scheme | LLaMA Factory examples |
|--------------------|------------------------|
| Full-parameter | [examples/train_full](https://github.com/hiyouga/LlamaFactory/tree/main/examples/train_full) |
| LoRA fine-tuning | [examples/train_lora](https://github.com/hiyouga/LlamaFactory/tree/main/examples/train_lora) |
| QLoRA fine-tuning | [examples/train_qlora](https://github.com/hiyouga/LlamaFactory/tree/main/examples/train_qlora) |

These example configuration files specify model parameters, fine-tuning method parameters, dataset parameters, evaluation parameters, and more; you can adjust them to your needs. In this playbook, we will use [qwen3_lora_sft.yaml](https://github.com/hiyouga/LlamaFactory/blob/main/examples/train_lora/qwen3_lora_sft.yaml).

**Key parameters explained:**

- `model_name_or_path` - HuggingFace model name or local model path.
- `stage` - Training stage. Options: pt (pre-training), sft (supervised fine-tuning), rm (reward modeling), ppo, dpo, kto, orpo.
- `do_train` - true for training, false for evaluation.
- `finetuning_type` - Fine-tuning method. Options: freeze, lora, full.
- `lora_rank` - Dimensionality of the low-rank matrices used in LoRA. Typical values: 4, 6, 8, 16 (smaller values = fewer parameters = faster fine-tuning; larger values = better task adaptation but higher resource usage).
- `lora_target` - Target modules for the LoRA method. Default: all.
- `dataset` - Dataset(s) to use. Use "," to separate multiple datasets.
- `output_dir` - Fine-tuning output path.
- `logging_steps` - Logging interval in steps.
- `save_steps` - Model checkpoint saving interval.
- `overwrite_output_dir` - Whether to allow overwriting the output directory.
- `per_device_train_batch_size` - Training batch size per device.
- `gradient_accumulation_steps` - Number of gradient accumulation steps.
- `learning_rate` - Learning rate.
- `num_train_epochs` - Number of training epochs.
- `lr_scheduler_type` - Learning rate schedule. Options: linear, cosine, polynomial, constant, etc.
- `warmup_ratio` - Learning rate warmup ratio.
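
To see how these fit together, here is a condensed, illustrative sketch in the shape of qwen3_lora_sft.yaml (representative values, not the file's exact contents — consult the linked file for the real configuration):

```yaml
### model (values below are illustrative)
model_name_or_path: Qwen/Qwen3-4B-Instruct-2507

### method
stage: sft
do_train: true
finetuning_type: lora
lora_rank: 8
lora_target: all

### dataset
dataset: identity,alpaca_en_demo
template: qwen3

### output
output_dir: saves/qwen3-4b/lora/sft
logging_steps: 10
save_steps: 500
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
```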

We will lower the default value of lora_rank from 8 to 6 to run fine-tuning on AMD GPUs.

```bash
sed -i.bak 's/lora_rank: 8/lora_rank: 6/g' examples/train_lora/qwen3_lora_sft.yaml
```
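
Optionally, confirm the edit took effect:

```bash
# Should print "lora_rank: 6"; the .bak file keeps the original for reference.
grep lora_rank examples/train_lora/qwen3_lora_sft.yaml
```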

### Run LLaMA Factory fine-tuning

**llamafactory-cli** is the official command-line interface (CLI) for LLaMA Factory, developed to simplify end-to-end LLM workflows (data preparation → fine-tuning → evaluation → deployment) without writing complex code.

For training and fine-tuning, **llamafactory-cli train** is the core subcommand of the LLaMA Factory CLI. It abstracts the fine-tuning workflow (data preprocessing, hyperparameter tuning, hardware optimization) into a single CLI command, supports multiple fine-tuning paradigms (LoRA/QLoRA/full fine-tuning), and is optimized for low-resource GPUs (e.g., QLoRA on 16 GB of VRAM).

You can run LLaMA Factory fine-tuning with the command below, which uses the modified Qwen3 LoRA configuration file:

```bash
llamafactory-cli train examples/train_lora/qwen3_lora_sft.yaml
```

After fine-tuning completes, the output files, such as model checkpoints, model configuration files, and training metrics, can be found under the path set by "output_dir".
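
To inspect the artifacts (a sketch — the path below assumes output_dir matches the illustrative value shown earlier; adjust it to your config):

```bash
# List the fine-tuning outputs; exact file names vary by LLaMA Factory version.
ls saves/qwen3-4b/lora/sft
```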

<p align="center">
<img src="assets/qwen3_lora.png" alt="Qwen3 LoRA Fine-tuning" width="600"/>
</p>

### Test the fine-tuned model

**llamafactory-cli chat** is designed for interactive chat/inference with LLMs (both base models and LoRA-fine-tuned models). LLaMA Factory provides a sample configuration for running inference with fine-tuned models in [examples/inference](https://github.com/hiyouga/LlamaFactory/tree/main/examples/inference). You can also modify this sample configuration to change settings such as the inference backend.
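
Such an inference config roughly contains the following (an illustrative sketch, not the exact file; the adapter path is an assumption that must match your training output_dir):

```yaml
# Illustrative inference config — see examples/inference/qwen3_lora_sft.yaml
# for the real one; values below are assumptions.
model_name_or_path: Qwen/Qwen3-4B-Instruct-2507
adapter_name_or_path: saves/qwen3-4b/lora/sft
template: qwen3
infer_backend: huggingface  # the inference backend can be swapped here
```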

Use the following command to test the fine-tuned Qwen3 model:

```bash
llamafactory-cli chat examples/inference/qwen3_lora_sft.yaml
```

An example chat using the fine-tuned model is shown below:

<p align="center">
<img src="assets/qwen3_chat.png" alt="Test Qwen3 Finetuned model" width="600"/>
</p>

### Export the fine-tuned model

For production use cases, the pre-trained model and the LoRA adapter need to be merged and exported as a single model, which can then be used like a normal HuggingFace model. LLaMA Factory provides sample configurations in [examples/merge_lora](https://github.com/hiyouga/LlamaFactory/tree/main/examples/merge_lora).
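
A merge config of this kind roughly looks as follows (an illustrative sketch; `output/qwen3_lora_merged` is a hypothetical export path, and the adapter path must match your training output):

```yaml
# Illustrative merge/export config — see examples/merge_lora for the real
# files; values below are assumptions.
model_name_or_path: Qwen/Qwen3-4B-Instruct-2507
adapter_name_or_path: saves/qwen3-4b/lora/sft
template: qwen3
export_dir: output/qwen3_lora_merged
export_size: 5              # maximum shard size in GB
export_legacy_format: false # write safetensors rather than .bin
```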

Use the following command to export the fine-tuned Qwen3 model:

```bash
llamafactory-cli export examples/merge_lora/qwen3_lora_sft.yaml
```

The result of exporting the fine-tuned model is shown below.

<p align="center">
<img src="assets/qwen3_export.png" alt="Export Qwen3 Finetuned model" width="600"/>
</p>

## Using LLaMA Factory GUI

LLaMA Factory also supports zero-code fine-tuning of large language models through a web UI in the browser.

Use the following command to launch it:

```bash
llamafactory-cli webui
```
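
The UI is served by Gradio, so the standard Gradio environment variables should control the port and sharing behavior (an assumption worth verifying against the LLaMA Factory docs):

```bash
# Serve on a specific port and create a shareable Gradio link (optional).
GRADIO_SERVER_PORT=7860 GRADIO_SHARE=1 llamafactory-cli webui
```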

## Next Steps

- Try different models, such as gpt-oss and other state-of-the-art models.
- Experiment with different inference backends for the fine-tuned model.

For more documentation, please visit: https://llamafactory.readthedocs.io/en/latest/

(Three image assets added under assets/: qwen3_lora.png, qwen3_chat.png, qwen3_export.png.)
Lines changed: 23 additions & 0 deletions

{
  "id": "llama-factory-finetuning",
  "title": "LLM Fine-tuning with LLaMA Factory",
  "description": "Fine-tune large language models using Llama Factory and LoRA techniques on your STX Halo™",
  "time": 60,
  "supported_platforms": {
    "halo": [
      "linux"
    ]
  },
  "tested_platforms": {
    "halo": [
      "linux"
    ]
  },
  "platforms": ["linux"],
  "difficulty": "intermediate",
  "isNew": false,
  "isFeatured": false,
  "developed": false,
  "published": true,
  "tags": ["Llama Factory", "lora", "fine-tuning"]
}
