Code to fine-tune "famous" LLMs, written from scratch in PyTorch.
Currently, only GPT-2 models are supported.
```python
import torch
from src.load_gpt2 import GPT2_model

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load GPT-2 models: "gpt2", "gpt2-medium", "gpt2-large", "gpt2-xl"
model = GPT2_model("gpt2").to(device)
# Or load the pretrained version
model = GPT2_model.from_pretrained("gpt2").to(device)
```
You can choose from the following datasets:
- The Alpaca dataset is a synthetic dataset developed by Stanford researchers, who used OpenAI's text-davinci-003 model to generate 52K instruction/output pairs.

  ```bash
  wget https://raw.githubusercontent.com/tatsu-lab/stanford_alpaca/main/alpaca_data.json
  ```
- The Alpaca-GPT4 dataset uses GPT-4, instead of text-davinci-003 (GPT-3), to answer the same prompts. As a result, it contains higher-quality and longer responses.

  ```bash
  wget https://raw.githubusercontent.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM/main/data/alpaca_gpt4_data.json
  ```
- The AlpaGasus dataset is a filtered version of Alpaca containing only 9K high-quality examples. Models trained on it have been shown to significantly outperform models trained on the original Alpaca, as evaluated by GPT-4.

  ```bash
  wget https://raw.githubusercontent.com/gpt4life/alpagasus/main/data/filtered/chatgpt_9k.json
  ```
Each dataset is a single JSON file containing prompts in the Alpaca format:
- `instruction`: str, describes the task the model should perform. Each of the 52K instructions is unique.
- `input`: str, optional context or input for the task.
- `output`: str, the answer to the instruction, generated by the teacher model (text-davinci-003, GPT-4, or ChatGPT, depending on the dataset).
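For example, a downloaded file can be inspected like this (file name taken from the Alpaca `wget` command above):

```python
import json

# Load the Alpaca-style JSON file downloaded above
with open("alpaca_data.json") as f:
    records = json.load(f)      # a list of dicts

print(len(records))             # ~52K examples for the original Alpaca
print(records[0].keys())        # dict_keys(['instruction', 'input', 'output'])
```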
Default features:
- Mixed precision: `bfloat16` for general operations and `TF32` for matrix multiplications (see the training-loop sketch after this list).
- A learning rate scheduler with linear warmup followed by cosine annealing.
- Gradient accumulation.
- Low-Rank Adaptation (LoRA).
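A rough sketch of how these defaults fit together, assuming a `model` built as above and a `train_loader` yielding `(input_ids, labels)` batches; the forward signature and the step counts are illustrative, not the repository's exact API:

```python
import math
import torch
import torch.nn.functional as F

torch.set_float32_matmul_precision("high")      # use TF32 for matrix multiplications

grad_accum_steps, warmup_steps, total_steps = 8, 100, 1000   # illustrative values

def lr_lambda(step):
    # Linear warmup followed by cosine annealing
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4, weight_decay=0.0)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

model.train()
for step, (input_ids, labels) in enumerate(train_loader):
    # Run the forward/backward pass in bfloat16 autocast
    with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
        logits = model(input_ids)               # forward signature assumed
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)), labels.view(-1))
    (loss / grad_accum_steps).backward()        # gradient accumulation
    if (step + 1) % grad_accum_steps == 0:
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()
```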
```python
# Fine-tuning hyperparameters
batch_size = 4
grad_accum_steps = 8
num_epochs = 3
lr = 5e-4
weight_decay = 0
warmup_ratio = 0.1
eval_freq = 10
save_steps = 200
use_lora = True
lora_r = 16
lora_alpha = 16
```
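When `use_lora` is enabled, the pretrained weights are frozen and only small low-rank adapter matrices are trained. A minimal sketch of a LoRA-wrapped linear layer using the `lora_r` and `lora_alpha` values above (an illustration, not the repository's implementation):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update (illustrative)."""

    def __init__(self, linear: nn.Linear, r: int = 16, alpha: int = 16):
        super().__init__()
        self.linear = linear
        for p in self.linear.parameters():
            p.requires_grad_(False)               # freeze the pretrained weights
        self.lora_A = nn.Parameter(torch.randn(r, linear.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(linear.out_features, r))
        self.scaling = alpha / r                  # scale of the low-rank update

    def forward(self, x):
        # y = W x + (alpha / r) * B A x
        return self.linear(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

layer = LoRALinear(nn.Linear(768, 768), r=16, alpha=16)
```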
```python
# Generation hyperparameters
max_new_tokens = 500
do_sample = True
temperature = 0.9
top_k = 40
top_p = 0.9
eos_id = 50256      # GPT-2's <|endoftext|> token
use_lora = True     # LoRA settings must match the fine-tuned checkpoint being loaded
lora_r = 16
lora_alpha = 16
```
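During sampling, the logits are divided by `temperature`, restricted to the `top_k` most likely tokens and the smallest nucleus whose cumulative probability reaches `top_p`, and then one token is drawn; generation stops after `max_new_tokens` or when `eos_id` is produced. A rough sketch of this filtering step (not the repository's exact code):

```python
import torch
import torch.nn.functional as F

def sample_next_token(logits, temperature=0.9, top_k=40, top_p=0.9):
    """Sample one token id from last-step logits of shape [vocab_size] (illustrative)."""
    logits = logits / temperature

    # Top-k: keep only the k highest-scoring tokens (returned in descending order)
    topk_vals, topk_idx = torch.topk(logits, top_k)
    probs = F.softmax(topk_vals, dim=-1)

    # Top-p (nucleus): keep the smallest prefix with cumulative probability >= top_p
    cumulative = torch.cumsum(probs, dim=-1)
    keep = (cumulative - probs) < top_p    # a token stays if the mass before it is < top_p
    probs = probs * keep
    probs = probs / probs.sum()

    # Draw one token from the filtered distribution
    choice = torch.multinomial(probs, num_samples=1)
    return topk_idx[choice].item()
```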