# LLM Fine-tuning with LLaMA Factory

## Overview

Efficient fine-tuning is vital for adapting large language models (LLMs) to downstream tasks. LLaMA Factory is an open-source, user-friendly platform that streamlines the training and fine-tuning of LLMs and multimodal models, letting users customize hundreds of pre-trained models locally with minimal coding.

This playbook teaches you how to fine-tune LLMs using LLaMA Factory on your local AMD hardware.

## What you'll learn

- How to set up LLaMA Factory with ROCm support
- How to configure LLM fine-tuning parameters (using Qwen/Qwen3-4B-Instruct-2507 as an example)
- How to run LLaMA Factory fine-tuning
- How to run inference with the fine-tuned model
- How to export the fine-tuned model

## Estimated Time
- Duration: about 60 minutes to run this playbook, depending on your model/dataset size and network speed.
- See the [LLaMA Factory GitHub](https://github.com/hiyouga/LLaMA-Factory) for more information.

## Setting up the Environment

### Installing Basic Dependencies
<!-- @os:linux -->
<!-- @require:rocm,pytorch,driver -->
<!-- @os:end -->
<!-- @os:windows -->
<!-- @require:pytorch,driver -->
<!-- @os:end -->

### Installing Additional Dependencies
- **Python**: ensure the minimum version is 3.11
```bash
pip install huggingface_hub
```

### Install LLaMA Factory

LLaMA Factory depends on PyTorch, which you should already have installed per the requirements above.

Download the source code from the [LLaMA Factory official GitHub repository](https://github.com/hiyouga/LLaMA-Factory) and install its dependencies:

```bash
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e .
pip install -r requirements/metrics.txt
```

With LLaMA Factory installed, let's run fine-tuning.
| 52 | + |
| 53 | +## Using LLaMA Factory CLI for Fine Tuning |
| 54 | + |
| 55 | +This section will cover how to prepare finetuning datasets, configure LoRA/QLoRA parameters, and run LoRA finetuning. |
| 56 | + |
| 57 | +### Dataset Preparation |
| 58 | + |
| 59 | +LLaMA Factory supports finetuning datasets in the Alpaca format and ShareGPT format. All the available datasets have been defined in the [dataset_info.json](https://github.com/hiyouga/LlamaFactory/blob/main/data/dataset_info.json). If you are using a custom dataset, please make sure to add a dataset description in dataset_info.json and specify the dataset name before training. Details can be found in their docs [here](https://llamafactory.readthedocs.io/en/latest/getting_started/data_preparation.html). |
| 60 | + |
| 61 | +In this playbook, we will use the identity and alpaca_en_demo datasets as an example, and configure the dataset information in the next step. |
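
For reference, registering a custom Alpaca-format dataset in dataset_info.json looks roughly like the sketch below. The dataset name and file name here are hypothetical placeholders; see the data preparation docs linked above for the full set of supported fields.

```json
{
  "my_custom_dataset": {
    "file_name": "my_custom_dataset.json",
    "columns": {
      "prompt": "instruction",
      "query": "input",
      "response": "output"
    }
  }
}
```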

### Fine-tuning Parameter Configuration

LLaMA Factory supports multiple fine-tuning schemes.

| Fine-tuning scheme | LLaMA Factory Examples |
|-----------|------|
| Full-Parameter | [examples/train_full](https://github.com/hiyouga/LLaMA-Factory/tree/main/examples/train_full) |
| LoRA fine-tuning | [examples/train_lora](https://github.com/hiyouga/LLaMA-Factory/tree/main/examples/train_lora) |
| QLoRA fine-tuning | [examples/train_qlora](https://github.com/hiyouga/LLaMA-Factory/tree/main/examples/train_qlora) |

These example configuration files specify model parameters, fine-tuning method parameters, dataset parameters, evaluation parameters, and more. You can adjust them to your own needs. In this playbook, we will use [qwen3_lora_sft.yaml](https://github.com/hiyouga/LLaMA-Factory/blob/main/examples/train_lora/qwen3_lora_sft.yaml).
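
As an abridged sketch (not the exact file, whose contents may differ between LLaMA Factory versions; the `output_dir` path and training values here are illustrative), a LoRA SFT config combines model, method, dataset, and training sections along these lines:

```yaml
### model
model_name_or_path: Qwen/Qwen3-4B-Instruct-2507

### method
stage: sft
do_train: true
finetuning_type: lora
lora_rank: 8
lora_target: all

### dataset
dataset: identity,alpaca_en_demo

### output
output_dir: saves/qwen3-4b/lora/sft   # illustrative path
logging_steps: 10
save_steps: 500
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
```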

**Key parameters explained:**
- `model_name_or_path` - Hugging Face model name or local model path.
- `stage` - Training stage. Options: pt (pretraining), sft (supervised fine-tuning), rm (reward modeling), ppo, dpo, kto, orpo.
- `do_train` - true for training, false for evaluation.
- `finetuning_type` - Fine-tuning method. Options: freeze, lora, full.
- `lora_rank` - Dimensionality of the low-rank matrices used in LoRA. Typical values: 4, 6, 8, 16 (smaller values = fewer parameters = faster fine-tuning; larger values = better task adaptation but higher resource usage).
- `lora_target` - Target modules for the LoRA method. Default: all.
- `dataset` - Dataset(s) to use. Separate multiple datasets with ",".
- `output_dir` - Fine-tuning output path.
- `logging_steps` - Logging interval in steps.
- `save_steps` - Model checkpoint saving interval.
- `overwrite_output_dir` - Whether to allow overwriting the output directory.
- `per_device_train_batch_size` - Training batch size per device.
- `gradient_accumulation_steps` - Number of gradient accumulation steps.
- `learning_rate` - Learning rate.
- `num_train_epochs` - Number of training epochs.
- `lr_scheduler_type` - Learning rate schedule. Options: linear, cosine, polynomial, constant, etc.
- `warmup_ratio` - Learning rate warmup ratio.

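To make the `lora_rank` trade-off concrete, here is a quick back-of-envelope comparison of trainable parameters for a single projection matrix (the 4096×4096 shape is purely illustrative, not Qwen3's actual dimensions):

```python
# Trainable parameters added by LoRA for one d_in x d_out weight matrix:
# LoRA trains two low-rank factors A (d_in x r) and B (r x d_out)
# instead of updating the full matrix.
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    return d_in * rank + rank * d_out

d_in = d_out = 4096              # illustrative projection size
full = d_in * d_out              # parameters touched by full fine-tuning
for r in (4, 6, 8, 16):
    p = lora_params(d_in, d_out, r)
    print(f"rank={r:2d}: {p:,} trainable params ({p / full:.2%} of the full matrix)")
```

Doubling the rank doubles the adapter size, which is why dropping from rank 8 to rank 6 (as we do next) shrinks the trainable footprint proportionally.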
We will lower the default value of lora_rank before running fine-tuning on AMD GPUs:

```bash
sed -i.bak 's/lora_rank: 8/lora_rank: 6/g' examples/train_lora/qwen3_lora_sft.yaml
```
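
If you want to sanity-check what `sed -i.bak` does before touching the real config, you can rehearse the edit on a throwaway file (the file name below is made up for the demo); the `.bak` suffix keeps an unmodified backup:

```shell
# Rehearse the in-place edit on a scratch file; demo_sft.yaml is illustrative.
printf 'lora_rank: 8\nlora_target: all\n' > demo_sft.yaml
sed -i.bak 's/lora_rank: 8/lora_rank: 6/g' demo_sft.yaml
cat demo_sft.yaml          # the edited file now has lora_rank: 6
cat demo_sft.yaml.bak      # the .bak file preserves the original
```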

### Run LLaMA Factory Fine-tuning

**llamafactory-cli** is the official command-line interface (CLI) for LLaMA Factory, developed to simplify end-to-end LLM workflows (data preparation → fine-tuning → evaluation → deployment) without writing complex code.

For training and fine-tuning, **llamafactory-cli train** is the core subcommand. It abstracts the fine-tuning workflow (data preprocessing, hyperparameter tuning, hardware optimization) into a single CLI command, supports multiple fine-tuning paradigms (LoRA/QLoRA/full fine-tuning), and is optimized for low-resource GPUs (e.g., QLoRA on 16 GB of VRAM).
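
To see why QLoRA fits models of this size into limited VRAM, a rough weight-memory estimate helps. The figures below cover weights only and are lower bounds; activations, gradients, and optimizer state add more on top:

```python
# Rough weight-memory estimate for a 4B-parameter model at different
# precisions. Lower bounds only: activations, KV cache, and optimizer
# state consume additional memory during training.
PARAMS = 4_000_000_000
BYTES_PER_PARAM = {"fp16/bf16": 2.0, "int8": 1.0, "nf4 (QLoRA)": 0.5}

for precision, nbytes in BYTES_PER_PARAM.items():
    gb = PARAMS * nbytes / 1e9
    print(f"{precision:12s}: ~{gb:.1f} GB for weights alone")
```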

You can run LLaMA Factory fine-tuning with the command below, which uses the modified Qwen3 LoRA configuration file:

```bash
llamafactory-cli train examples/train_lora/qwen3_lora_sft.yaml
```

After fine-tuning completes, the output files can be found under "output_dir", including the model checkpoint files, model configuration files, and training metrics files.

<p align="center">
  <img src="assets/qwen3_lora.png" alt="Qwen3 LoRA Fine-tuning" width="600"/>
</p>


### Test the Fine-tuned Model

**llamafactory-cli chat** provides interactive chat/inference with LLMs (both base models and LoRA-fine-tuned models). LLaMA Factory provides sample configurations for running inference on fine-tuned models in [examples/inference](https://github.com/hiyouga/LLaMA-Factory/tree/main/examples/inference). You can also modify these sample configurations to change settings such as the inference backend.

Use the following command to test the fine-tuned Qwen3 model:

```bash
llamafactory-cli chat examples/inference/qwen3_lora_sft.yaml
```

An example chat using the fine-tuned model is shown below:

<p align="center">
  <img src="assets/qwen3_chat.png" alt="Test Qwen3 Finetuned model" width="600"/>
</p>


### Export the Fine-tuned Model

For production use cases, the pre-trained model and the LoRA adapter need to be merged and exported as a single model, which can then be used as a normal Hugging Face model. LLaMA Factory provides sample configurations in [examples/merge_lora](https://github.com/hiyouga/LLaMA-Factory/tree/main/examples/merge_lora).

Use the following command to export the fine-tuned Qwen3 model:

```bash
llamafactory-cli export examples/merge_lora/qwen3_lora_sft.yaml
```

The result of exporting the fine-tuned model is shown below:

<p align="center">
  <img src="assets/qwen3_export.png" alt="Export Qwen3 Finetuned model" width="600"/>
</p>


## Using the LLaMA Factory GUI
LLaMA Factory also supports zero-code fine-tuning of large language models through a web UI in the browser.

Use the following command to launch it:

```bash
llamafactory-cli webui
```

## Next Steps
- Try different models, such as gpt-oss and other state-of-the-art models.
- Experiment with different inference backends for the fine-tuned model.

For more documentation, please visit: https://llamafactory.readthedocs.io/en/latest/