# LLM Fine-Tuning with Unsloth


A collection of parameter-efficient LLM fine-tuning experiments using Unsloth, LoRA, and QLoRA — covering REST API knowledge injection, resume generation, and Text-to-SQL translation.


## Overview

This repository demonstrates end-to-end LLM fine-tuning workflows built on top of the Unsloth framework for fast, memory-efficient training. Each project targets a different domain and showcases how small, domain-specific datasets can meaningfully improve large language model behavior using parameter-efficient fine-tuning (PEFT) techniques.

Key themes across all projects:

- 4-bit quantization via bitsandbytes for GPU memory efficiency
- LoRA / QLoRA with consistent hyperparameter configurations
- Structured evaluation comparing base vs. fine-tuned model outputs
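As a rough illustration of the first theme, the weight-memory savings from 4-bit loading can be estimated with back-of-the-envelope arithmetic (a sketch that ignores activations, KV cache, optimizer state, and the LoRA adapters themselves):

```python
# Rough weight-memory estimate for an 8.04B-parameter model at two precisions.
# Only the raw weights are counted here; real usage is higher.

def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate GiB needed to hold the raw weights."""
    return n_params * bits_per_param / 8 / 1024**3

N_PARAMS = 8.04e9  # Llama-3.1-8B

fp16_gb = weight_memory_gb(N_PARAMS, 16)  # roughly 15 GiB
int4_gb = weight_memory_gb(N_PARAMS, 4)   # roughly 3.7 GiB

print(f"fp16: {fp16_gb:.1f} GiB, 4-bit: {int4_gb:.1f} GiB")
```

The 4x reduction in weight memory is what makes fitting an 8B model, its gradients for the LoRA adapters, and optimizer state on a single GPU practical.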

## Projects

| # | Project | Model | Technique | Domain |
|---|---------|-------|-----------|--------|
| 1 | Basic LoRA Fine-Tuning | Llama-3.1-8B | LoRA | REST API Knowledge |
| 2 | Resume Bullet Optimizer | Llama-3.1-8B | LoRA | Resume Writing |
| 3 | Phi-3 Text-to-SQL | Phi-3 | QLoRA | SQL Generation |

### 1. Basic LoRA Fine-Tuning

**Notebook:** `LLM_Finetuning_UnSloth_Basic.ipynb`

#### Goal

A proof-of-concept demonstrating the full LoRA fine-tuning pipeline on a minimal custom dataset about REST APIs.

#### Setup

| Parameter | Value |
|-----------|-------|
| Base Model | meta-llama/Meta-Llama-3.1-8B |
| Quantization | 4-bit (bitsandbytes) |
| LoRA Rank (r) | 16 |
| LoRA Alpha | 16 |
| Target Modules | q_proj, k_proj, v_proj, o_proj |
| Trainable Parameters | 13.6M / 8.04B (0.17%) |
| Max Sequence Length | 2048 |
| Training Steps | 30 |
| Learning Rate | 2e-4 |
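The 13.6M trainable-parameter figure can be sanity-checked by hand: a LoRA adapter on a weight of shape (out, in) adds two low-rank factors totaling r·(in + out) parameters. A sketch using Llama-3.1-8B's published shapes (hidden size 4096, 32 layers, and 1024-wide k/v projections from grouped-query attention):

```python
# Sanity-check the "13.6M trainable parameters" entry in the setup table.
# LoRA adds factors A (r x in) and B (out x r) per target weight,
# i.e. r * (in + out) trainable parameters.

R = 16            # LoRA rank from the setup table
HIDDEN = 4096     # Llama-3.1-8B hidden size
KV_DIM = 1024     # k/v projection width (8 KV heads x 128 head dim)
LAYERS = 32

def lora_params(out_dim: int, in_dim: int, r: int = R) -> int:
    return r * (in_dim + out_dim)

per_layer = (
    lora_params(HIDDEN, HIDDEN)    # q_proj
    + lora_params(KV_DIM, HIDDEN)  # k_proj
    + lora_params(KV_DIM, HIDDEN)  # v_proj
    + lora_params(HIDDEN, HIDDEN)  # o_proj
)
total = per_layer * LAYERS
print(f"{total / 1e6:.1f}M trainable parameters")  # 13.6M
```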

#### Dataset

4 custom instruction-output pairs covering REST API concepts (explanations, use cases, GET vs. POST, request mechanics), formatted with Unsloth's Alpaca chat template.
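A minimal sketch of the kind of formatting involved (the `instruction`/`output` field names and template text follow the common Alpaca layout; the exact template string Unsloth applies may differ):

```python
# Format instruction-output pairs with an Alpaca-style prompt template.
# The template text below is the common Alpaca layout, not copied from Unsloth.

ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n{output}"
)

def format_example(example: dict) -> str:
    return ALPACA_TEMPLATE.format(
        instruction=example["instruction"],
        output=example["output"],
    )

sample = {
    "instruction": "Explain what a REST API is.",
    "output": "A REST API exposes resources over HTTP using standard verbs.",
}
prompt = format_example(sample)
print(prompt)
```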

#### Results

| Metric | Value |
|--------|-------|
| Final Training Loss | 1.010 |
| Training Runtime | 36.3 seconds |
| Samples/sec | 6.6 |

Sample inference:

```text
Input:  "What is the difference between REST and GraphQL?"
Output: "REST sends simple requests to URLs, while GraphQL sends a single
         request that allows a client to fetch or send data in a simple way."
```

### 2. Resume Bullet Optimizer

**Notebook:** `Resume_FineTuning/Resume_LLM_FineTuning_UnSloth_fixed.ipynb`

#### Goal

Fine-tune Llama-3.1-8B to transform weak, generic resume bullet points into strong, impact-driven software engineering accomplishments.

#### Setup

| Parameter | Value |
|-----------|-------|
| Base Model | meta-llama/Meta-Llama-3.1-8B |
| Quantization | 4-bit (bitsandbytes) |
| LoRA Rank (r) | 16 |
| LoRA Alpha | 16 |
| Target Modules | q_proj, k_proj, v_proj, o_proj |
| Training Steps | 120 |
| Learning Rate | 2e-4 |
| Eval Steps | Every 10 steps |
| Random Seed | 42 |

#### Dataset

- Training: 30 instruction-input-output triplets across SE, AI, Cloud, and Frontend roles
- Evaluation: Held-out examples for base vs. fine-tuned comparison

Sample pair:

```text
Input:   "Built APIs for student platform."
Output:  "Developed scalable backend APIs for a student platform, improving
          service reliability and supporting high-volume user workflows."
```

#### Evaluation

Outputs were scored manually on 4 dimensions:

| Dimension | Description |
|-----------|-------------|
| Strength | Impact and action-orientation of the bullet |
| Clarity | Readability and precision |
| Conciseness | No filler words or redundancy |
| Realism | Plausibility as a real resume bullet |
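Manual scores like these are easy to aggregate per dimension. A minimal sketch with illustrative in-memory ratings (the dimension keys mirror the table above, but the actual column names in the rated CSV may differ):

```python
# Aggregate manual ratings across the four scoring dimensions.
# The rows below are illustrative; the repo's real scores live in a rated CSV.
from statistics import mean

ratings = [
    {"strength": 4, "clarity": 5, "conciseness": 4, "realism": 5},
    {"strength": 5, "clarity": 4, "conciseness": 5, "realism": 4},
    {"strength": 3, "clarity": 4, "conciseness": 4, "realism": 5},
]

dimensions = ["strength", "clarity", "conciseness", "realism"]
averages = {dim: mean(row[dim] for row in ratings) for dim in dimensions}
print(averages)
```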

Results across 44 evaluation examples are saved in `Resume_FineTuning/base_vs_finetuned_comparison_rated.csv`.
### 3. Phi-3 Text-to-SQL with QLoRA

**Directory:** `phi_3_Text2Sql_QLoRA/`

#### Goal

Fine-tune Microsoft's Phi-3 model on a Text-to-SQL task using QLoRA, enabling the model to translate natural language questions into executable SQL queries given a table schema.

#### Technique

QLoRA (Quantized LoRA) combines 4-bit NF4 quantization with low-rank adapter training, enabling fine-tuning with significantly less GPU memory than standard LoRA.

#### Dataset

208 examples of natural-language question + table schema → SQL query pairs.

Sample:

```text
Question: "Which player had a To par of 13?"
Schema:   Players table with columns: name, score, to_par, rank, ...
Output:   SELECT name FROM players WHERE to_par = '13'
```

#### Evaluation Metrics

| Metric | Description |
|--------|-------------|
| exact_match | Exact string match with the ground-truth SQL |
| parses | SQL syntax validity (parseable by a SQL parser) |
| strict_match | Normalized canonical SQL match |
| canonical_match | Semantic equivalence after normalization |
| error_type | Failure classification (column_mismatch, invalid_sql, etc.) |
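The quote-handling and normalization issues these metrics probe can be illustrated with a small sketch (this is not the repo's actual evaluation code; the normalization rules here are assumptions):

```python
# Minimal SQL normalization for a canonical-match style comparison:
# lowercase everything, collapse whitespace, and unify quote style.
import re

def normalize_sql(sql: str) -> str:
    sql = sql.strip().rstrip(";")
    sql = sql.replace('"', "'")     # unify double quotes to single quotes
    sql = re.sub(r"\s+", " ", sql)  # collapse runs of whitespace
    return sql.lower()

pred = 'SELECT name FROM players WHERE to_par = "13"'
gold = "select name from players where to_par = '13';"

# The raw strings differ (quote style, casing, trailing semicolon),
# but they normalize to the same canonical form.
assert normalize_sql(pred) == normalize_sql(gold)
print("canonical match:", normalize_sql(pred))
```

Real canonical-match metrics usually go further (e.g. parsing the SQL and comparing ASTs), but quote and whitespace normalization alone already closes much of the gap described in the results below.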

#### Results

Evaluation outputs are in the CSV files under `phi_3_Text2Sql_QLoRA/` (see Repository Structure below).

Key observations:

- The fine-tuned model significantly improves quote handling (single vs. double quotes in SQL)
- It produces fewer invalid_sql errors than the base model
- 177/208 examples (85%) show improvement over base Phi-3

## Tech Stack

| Library | Purpose |
|---------|---------|
| Unsloth | Fast LoRA/QLoRA fine-tuning engine |
| Transformers | Model loading, tokenization, inference |
| TRL | SFTTrainer for supervised fine-tuning |
| PEFT | LoRA adapter management |
| bitsandbytes | 4-bit quantization |
| Datasets | Dataset loading and formatting |
| xformers | Memory-efficient attention |
| Accelerate | Distributed/mixed-precision training |
| pandas | Evaluation result analysis |

## Hardware Requirements

These notebooks were developed and tested on:

| Requirement | Minimum | Tested On |
|-------------|---------|-----------|
| GPU | 80–96 GB VRAM | NVIDIA H100 / RTX Pro 6000 Blackwell |
| CUDA | Compute capability 7.5+ | CUDA 12.x |
| RAM | 16 GB | |
| Python | 3.10+ | 3.10 |

Experiments were conducted on an NVIDIA H100 and an NVIDIA RTX Pro 6000 Blackwell. Even on these large-memory GPUs, 4-bit quantization is used for memory efficiency and faster training. For consumer GPUs (8–16 GB VRAM), reduce `max_seq_length` or `per_device_train_batch_size`.


## Getting Started

### 1. Clone the Repository

```bash
git clone https://github.com/<your-username>/LLM_FineTuning.git
cd LLM_FineTuning
```

### 2. Install Dependencies

Install Unsloth and required packages (run inside the notebook or a virtual environment):

```bash
pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
pip install --no-deps trl peft accelerate bitsandbytes xformers
pip install transformers datasets pandas
```

For Colab environments, Unsloth auto-detects the CUDA version and installs the correct wheels.

### 3. Authenticate with Hugging Face

Llama-3.1-8B is a gated model. You need to:

1. Accept the model license at [meta-llama/Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B)
2. Set your HF token:

```python
from huggingface_hub import login
login(token="hf_your_token_here")
```

### 4. Run a Notebook

Open any of the notebooks in Jupyter or Google Colab and run all cells:

```bash
jupyter notebook LLM_Finetuning_UnSloth_Basic.ipynb
```

## Repository Structure

```text
LLM_FineTuning/
│
├── LLM_Finetuning_UnSloth_Basic.ipynb          # Project 1: Basic LoRA demo (Llama-3.1-8B)
│
├── Resume_FineTuning/
│   ├── Resume_LLM_FineTuning_UnSloth_fixed.ipynb   # Project 2: Resume bullet optimizer
│   ├── base_vs_finetuned_comparison.csv            # Raw model output comparison
│   └── base_vs_finetuned_comparison_rated.csv      # Manually scored results
│
└── phi_3_Text2Sql_QLoRA/
    ├── phi3_text2sql_comparison.csv                # Base comparison (v1 metrics)
    ├── phi3_text2sql_comparison_v2.csv             # Full comparison (v2 metrics, 208 rows)
    ├── phi3_text2sql_ft_beats_base_v2.csv          # FT > Base cases (177 rows)
    ├── phi3_text2sql_ft_eval.csv                   # FT eval subset (100 rows)
    ├── phi3_text2sql_base_recovered_v2.csv         # Base model recovered outputs
    └── phi3_text2sql_ft_recovered_v2.csv           # FT model recovered outputs
```

## Results Summary

| Project | Model | Dataset Size | Key Result |
|---------|-------|--------------|------------|
| Basic LoRA | Llama-3.1-8B | 4 examples | Training loss 1.010 in 36 s |
| Resume Optimizer | Llama-3.1-8B | 30 train / 44 eval | Improved strength, clarity, conciseness vs. base |
| Text-to-SQL | Phi-3 + QLoRA | 208 examples | FT beats base on 85% of eval examples |

## License

This project is licensed under the MIT License.


## Contributing

Contributions, issues, and feature requests are welcome. Feel free to open an issue or submit a pull request.

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/new-experiment`)
3. Commit your changes (`git commit -m 'Add new fine-tuning experiment'`)
4. Push to the branch (`git push origin feature/new-experiment`)
5. Open a Pull Request
