Code to fine-tune "famous" LLMs, written from scratch in PyTorch.
Currently, only GPT-2 models are supported.
```python
import torch
from src.load_gpt2 import GPT2_model

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load GPT-2 models: "gpt2", "gpt2-medium", "gpt2-large", "gpt2-xl"
model = GPT2_model("gpt2").to(device)
# Or load the pretrained version
model = GPT2_model.from_pretrained("gpt2").to(device)
```
You can choose from the following datasets:
- The Alpaca dataset is a synthetic dataset developed by Stanford researchers, who used OpenAI's text-davinci-003 model to generate 52K instruction/output pairs.

  ```bash
  wget https://raw.githubusercontent.com/tatsu-lab/stanford_alpaca/main/alpaca_data.json
  ```
- The Alpaca-GPT4 dataset uses GPT-4, instead of text-davinci-003 (GPT-3), to answer the same prompts. As a result, it contains higher-quality and longer responses.

  ```bash
  wget https://raw.githubusercontent.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM/main/data/alpaca_gpt4_data.json
  ```
- The AlpaGasus dataset is a filtered version of Alpaca containing only 9K high-quality examples. Models trained on it have been shown to significantly outperform models trained on the original Alpaca, as evaluated by GPT-4.

  ```bash
  wget https://raw.githubusercontent.com/gpt4life/alpagasus/main/data/filtered/chatgpt_9k.json
  ```
Each dataset is a single JSON file containing prompts in the Alpaca format:
- `instruction`: str, describes the task the model should perform. Each of the 52K instructions is unique.
- `input`: str, optional context or input for the task.
- `output`: str, the answer to the instruction, generated by the teacher model (text-davinci-003, GPT-4, or ChatGPT, depending on the dataset).
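For example, a downloaded file can be inspected like this (file name taken from the Alpaca `wget` command above):

```python
import json

# Load the Alpaca-style JSON file downloaded above
with open("alpaca_data.json") as f:
    records = json.load(f)      # a list of dicts

print(len(records))             # ~52K examples for the original Alpaca
print(records[0].keys())        # dict_keys(['instruction', 'input', 'output'])
```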
Default features:
- Mixed precision: `bfloat16` for general operations and `TF32` for matrix multiplications (see the training-loop sketch after this list).
- A learning rate scheduler with linear warmup followed by cosine annealing.
- Gradient accumulation.
- Low-Rank Adaptation (LoRA).
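A rough sketch of how these defaults fit together, assuming a `model` built as above and a `train_loader` yielding `(input_ids, labels)` batches; the forward signature and the step counts are illustrative, not the repository's exact API:

```python
import math
import torch
import torch.nn.functional as F

torch.set_float32_matmul_precision("high")      # use TF32 for matrix multiplications

grad_accum_steps, warmup_steps, total_steps = 8, 100, 1000   # illustrative values

def lr_lambda(step):
    # Linear warmup followed by cosine annealing
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4, weight_decay=0.0)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

model.train()
for step, (input_ids, labels) in enumerate(train_loader):
    # Run the forward/backward pass in bfloat16 autocast
    with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
        logits = model(input_ids)               # forward signature assumed
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)), labels.view(-1))
    (loss / grad_accum_steps).backward()        # gradient accumulation
    if (step + 1) % grad_accum_steps == 0:
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()
```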
```python
# Fine-tuning hyperparameters
batch_size = 4
grad_accum_steps = 8
num_epochs = 3
lr = 5e-4
weight_decay = 0
warmup_ratio = 0.1
eval_freq = 10
save_steps = 200
use_lora = True
lora_r = 16
lora_alpha = 16
```
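When `use_lora` is enabled, the pretrained weights are frozen and only small low-rank adapter matrices are trained. A minimal sketch of a LoRA-wrapped linear layer using the `lora_r` and `lora_alpha` values above (an illustration, not the repository's implementation):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update (illustrative)."""

    def __init__(self, linear: nn.Linear, r: int = 16, alpha: int = 16):
        super().__init__()
        self.linear = linear
        for p in self.linear.parameters():
            p.requires_grad_(False)               # freeze the pretrained weights
        self.lora_A = nn.Parameter(torch.randn(r, linear.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(linear.out_features, r))
        self.scaling = alpha / r                  # scale of the low-rank update

    def forward(self, x):
        # y = W x + (alpha / r) * B A x
        return self.linear(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

layer = LoRALinear(nn.Linear(768, 768), r=16, alpha=16)
```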
```python
# Generation hyperparameters
max_new_tokens = 500
do_sample = True
temperature = 0.9
top_k = 40
top_p = 0.9
eos_id = 50256      # GPT-2's <|endoftext|> token
use_lora = True     # LoRA settings must match the fine-tuned checkpoint being loaded
lora_r = 16
lora_alpha = 16
```
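During sampling, the logits are divided by `temperature`, restricted to the `top_k` most likely tokens and the smallest nucleus whose cumulative probability reaches `top_p`, and then one token is drawn; generation stops after `max_new_tokens` or when `eos_id` is produced. A rough sketch of this filtering step (not the repository's exact code):

```python
import torch
import torch.nn.functional as F

def sample_next_token(logits, temperature=0.9, top_k=40, top_p=0.9):
    """Sample one token id from last-step logits of shape [vocab_size] (illustrative)."""
    logits = logits / temperature

    # Top-k: keep only the k highest-scoring tokens (returned in descending order)
    topk_vals, topk_idx = torch.topk(logits, top_k)
    probs = F.softmax(topk_vals, dim=-1)

    # Top-p (nucleus): keep the smallest prefix with cumulative probability >= top_p
    cumulative = torch.cumsum(probs, dim=-1)
    keep = (cumulative - probs) < top_p    # a token stays if the mass before it is < top_p
    probs = probs * keep
    probs = probs / probs.sum()

    # Draw one token from the filtered distribution
    choice = torch.multinomial(probs, num_samples=1)
    return topk_idx[choice].item()
```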