YugAjmera/fine-tune-llm
Fine-tune LLMs from Scratch

Code to fine-tune "famous" LLMs from scratch in PyTorch.

Load a model

Currently, only GPT-2 models are supported.

# Load one of the GPT-2 models: "gpt2", "gpt2-medium", "gpt2-large", "gpt2-xl"
import torch
from src.load_gpt2 import GPT2_model

device = "cuda" if torch.cuda.is_available() else "cpu"
model = GPT2_model("gpt2").to(device)

# Load the pretrained version
model = GPT2_model.from_pretrained("gpt2").to(device)

Instruction Fine-tuning on Alpaca or variants

You can choose from the following datasets:

  1. The Alpaca dataset is a synthetic dataset developed by Stanford researchers using OpenAI's text-davinci-003 model to generate 52k instruction/output pairs.
wget https://raw.githubusercontent.com/tatsu-lab/stanford_alpaca/main/alpaca_data.json
  2. The Alpaca-GPT4 dataset uses GPT-4, instead of text-davinci-003 (GPT-3), to answer the same prompts. As a result, it contains higher-quality and longer responses.
wget https://raw.githubusercontent.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM/main/data/alpaca_gpt4_data.json
  3. The AlpaGasus dataset is a filtered version of Alpaca, containing only 9k high-quality examples. Models trained on it have been shown to significantly outperform models trained on the original Alpaca, as evaluated by GPT-4.
wget https://raw.githubusercontent.com/gpt4life/alpagasus/main/data/filtered/chatgpt_9k.json
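Whichever file you download, it parses to a plain Python list of dicts. A minimal sketch of the structure, using an inline stand-in record rather than the full downloaded file:

```python
import json

# A tiny stand-in for one of the downloaded files: each dataset is a
# single JSON list of {"instruction", "input", "output"} dicts.
sample = """[
  {"instruction": "Name three primary colors.",
   "input": "",
   "output": "Red, blue and yellow."}
]"""

data = json.loads(sample)
print(len(data), "example(s)")
print(data[0]["instruction"])
```

The real files load the same way, e.g. `json.load(open("alpaca_data.json"))`.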

Alpaca style

Each dataset is a single JSON file containing prompts in the Alpaca style:

instruction: str, describes the task the model should perform. 
                  Each of the 52K instructions is unique.
input:       str, optional context or input for the task.
output:      str, the answer to the instruction, as generated by the
                  teacher model (text-davinci-003, GPT-4 or ChatGPT,
                  depending on the dataset).
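At training time these fields are typically assembled into a single prompt string. A sketch using the standard Alpaca template (whether fine_tune.py uses this exact wording is an assumption; `format_alpaca_prompt` is a hypothetical helper, not a function from the repo):

```python
def format_alpaca_prompt(example: dict) -> str:
    """Render an {instruction, input, output} record into the standard
    Alpaca prompt template (two variants: with and without an input)."""
    if example["input"]:
        return (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response "
        "that appropriately completes the request.\n\n"
        f"### Instruction:\n{example['instruction']}\n\n"
        "### Response:\n"
    )

example = {"instruction": "Translate to French.", "input": "Hello", "output": "Bonjour"}
# The training target is the prompt followed by the expected output.
print(format_alpaca_prompt(example) + example["output"])
```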

Training (fine_tune.py)

Default features:

  • Mixed precision with bfloat16 for general operations and TF32 for matrix multiplication.
  • A learning rate scheduler with linear warmup followed by cosine annealing.
  • Gradient accumulation.
  • Low-Rank Adaptation (LoRA).
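The LoRA idea can be sketched as a frozen linear layer plus a trainable low-rank update, scaled by alpha/r. This is an illustrative module, not the repo's actual implementation:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer + trainable low-rank delta:
    y = base(x) + (alpha / r) * x @ A.T @ B.T  (sketch of LoRA)."""
    def __init__(self, base: nn.Linear, r: int = 16, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at step 0
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T

layer = LoRALinear(nn.Linear(768, 768), r=16, alpha=16)
out = layer(torch.randn(2, 768))
print(out.shape)  # torch.Size([2, 768])
```

Only A and B are trained, which is why lora_r and lora_alpha appear in the hyperparameters below.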
# Fine-tuning Hyperparameters
batch_size = 4
grad_accum_steps = 8
num_epochs = 3
lr = 5e-4
weight_decay = 0
warmup_ratio = 0.1
eval_freq = 10
save_steps = 200
use_lora = True
lora_r = 16
lora_alpha = 16
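The learning-rate schedule (linear warmup, then cosine annealing) can be sketched as a function of the step count, using the lr and warmup_ratio values above; min_lr and the exact step bookkeeping are assumptions, not the repo's code:

```python
import math

def lr_at_step(step, total_steps, max_lr=5e-4, warmup_ratio=0.1, min_lr=0.0):
    """Linear warmup to max_lr, then cosine annealing down to min_lr."""
    warmup_steps = int(warmup_ratio * total_steps)
    if step < warmup_steps:
        return max_lr * (step + 1) / warmup_steps       # linear ramp
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))

total = 1000
print(lr_at_step(0, total))    # small value early in warmup
print(lr_at_step(100, total))  # max_lr at the end of warmup
print(lr_at_step(999, total))  # near min_lr at the end
```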

Inference (chat.py)

# Generation hyperparameters
max_new_tokens = 500
do_sample = True
temperature = 0.9
top_k = 40
top_p = 0.9
eos_id = 50256
use_lora = True
lora_r = 16
lora_alpha = 16
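A sketch of what the sampling hyperparameters above control during one decoding step: temperature scaling, top-k filtering, then nucleus (top-p) filtering. This is an illustration of the standard technique, not chat.py's exact code:

```python
import torch

def sample_next_token(logits, temperature=0.9, top_k=40, top_p=0.9):
    """Sample one token id from a (batch, vocab) logits tensor."""
    logits = logits / temperature
    # top-k: drop everything below the k-th highest logit
    kth = torch.topk(logits, top_k).values[..., -1, None]
    logits = logits.masked_fill(logits < kth, float("-inf"))
    # top-p: keep the smallest set of tokens whose cumulative prob >= p
    probs = torch.softmax(logits, dim=-1)
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    cum = torch.cumsum(sorted_probs, dim=-1)
    sorted_probs[cum - sorted_probs > top_p] = 0.0      # tokens past the nucleus
    sorted_probs /= sorted_probs.sum(dim=-1, keepdim=True)
    choice = torch.multinomial(sorted_probs, num_samples=1)
    return sorted_idx.gather(-1, choice)

next_id = sample_next_token(torch.randn(1, 50257))
print(next_id.shape)  # torch.Size([1, 1])
```

Generation stops either after max_new_tokens steps or when eos_id (50256, GPT-2's end-of-text token) is sampled.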

About

Bare-bones code for training and fine-tuning LLMs.
