A collection of parameter-efficient LLM fine-tuning experiments using Unsloth, LoRA, and QLoRA — covering REST API knowledge injection, resume generation, and Text-to-SQL translation.
This repository demonstrates end-to-end LLM fine-tuning workflows built on top of the Unsloth framework for fast, memory-efficient training. Each project targets a different domain and showcases how small, domain-specific datasets can meaningfully improve large language model behavior using parameter-efficient fine-tuning (PEFT) techniques.
Key themes across all projects:
- 4-bit quantization via `bitsandbytes` for GPU memory efficiency
- LoRA / QLoRA with consistent hyperparameter configurations
- Structured evaluation comparing base vs. fine-tuned model outputs
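The shared setup behind these themes can be sketched with Unsloth's `FastLanguageModel` API. This is a minimal sketch, not the repository's exact code — it assumes a CUDA GPU with the `unsloth` package installed, and the values mirror the configuration tables below:

```python
from unsloth import FastLanguageModel

# Load the base model in 4-bit (bitsandbytes NF4 under the hood).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="meta-llama/Meta-Llama-3.1-8B",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters to the attention projections only; the base
# weights stay frozen, so only the low-rank matrices are trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                 # LoRA rank
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```

With this configuration only the adapter weights are trainable, which is how the projects reach the ~0.17% trainable-parameter fraction reported below.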
| # | Project | Model | Technique | Domain |
|---|---|---|---|---|
| 1 | Basic LoRA Fine-Tuning | Llama-3.1-8B | LoRA | REST API Knowledge |
| 2 | Resume Bullet Optimizer | Llama-3.1-8B | LoRA | Resume Writing |
| 3 | Phi-3 Text-to-SQL | Phi-3 | QLoRA | SQL Generation |
Notebook: `LLM_Finetuning_UnSloth_Basic.ipynb`
A proof-of-concept demonstrating the full LoRA fine-tuning pipeline on a minimal custom dataset about REST APIs.
| Parameter | Value |
|---|---|
| Base Model | meta-llama/Meta-Llama-3.1-8B |
| Quantization | 4-bit (bitsandbytes) |
| LoRA Rank (r) | 16 |
| LoRA Alpha | 16 |
| Target Modules | q_proj, k_proj, v_proj, o_proj |
| Trainable Parameters | 13.6M / 8.04B (0.17%) |
| Max Sequence Length | 2048 |
| Training Steps | 30 |
| Learning Rate | 2e-4 |
4 custom instruction-output pairs covering REST API concepts (explanations, use cases, GET vs. POST, request mechanics), formatted using Unsloth's Alpaca-style chat template.
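The Alpaca-style formatting amounts to plain string templating. The sketch below is illustrative — the exact template and field names Unsloth uses may differ slightly:

```python
# Hypothetical Alpaca-style prompt template for instruction-output pairs.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{output}"
)

def format_example(example: dict) -> str:
    """Render one instruction-output pair into a single training string."""
    return ALPACA_TEMPLATE.format(
        instruction=example["instruction"],
        output=example["output"],
    )

sample = {
    "instruction": "What does GET do in a REST API?",
    "output": "GET retrieves a resource from the server without modifying it.",
}
print(format_example(sample))
```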
| Metric | Value |
|---|---|
| Final Training Loss | 1.010 |
| Training Runtime | 36.3 seconds |
| Samples/sec | 6.6 |
Sample inference:

```
Input:  "What is the difference between REST and GraphQL?"
Output: "REST sends simple requests to URLs, while GraphQL sends a single
         request that allows a client to fetch or send data in a simple way."
```
Notebook: `Resume_FineTuning/Resume_LLM_FineTuning_UnSloth_fixed.ipynb`
Fine-tune Llama-3.1-8B to transform weak, generic resume bullet points into strong, impact-driven software engineering accomplishments.
| Parameter | Value |
|---|---|
| Base Model | meta-llama/Meta-Llama-3.1-8B |
| Quantization | 4-bit (bitsandbytes) |
| LoRA Rank (r) | 16 |
| LoRA Alpha | 16 |
| Target Modules | q_proj, k_proj, v_proj, o_proj |
| Training Steps | 120 |
| Learning Rate | 2e-4 |
| Eval Steps | Every 10 steps |
| Random Seed | 42 |
- Training: 30 instruction-input-output triplets across SE, AI, Cloud, and Frontend roles
- Evaluation: Held-out examples for base vs. fine-tuned comparison
Sample pair:

```
Input:  "Built APIs for student platform."
Output: "Developed scalable backend APIs for a student platform, improving
         service reliability and supporting high-volume user workflows."
```
Outputs were scored manually on 4 dimensions:
| Dimension | Description |
|---|---|
| Strength | Impact and action-orientation of the bullet |
| Clarity | Readability and precision |
| Conciseness | No filler words or redundancy |
| Realism | Plausibility as a real resume bullet |
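Aggregating the manual scores per dimension can be sketched with the standard library alone. The row dicts and score scale below are illustrative, not the actual CSV headers or ratings:

```python
from statistics import mean

# Illustrative manually-rated rows: one dict per evaluation example,
# scored 1-5 on each of the four dimensions (values are hypothetical).
rated = [
    {"strength": 4, "clarity": 5, "conciseness": 4, "realism": 5},
    {"strength": 3, "clarity": 4, "conciseness": 5, "realism": 4},
    {"strength": 5, "clarity": 4, "conciseness": 4, "realism": 5},
]

dimensions = ["strength", "clarity", "conciseness", "realism"]
averages = {dim: mean(row[dim] for row in rated) for dim in dimensions}
print(averages)
```

The same aggregation run over the base-model and fine-tuned columns side by side is what supports a claim like "improved strength and clarity vs. base".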
Results across 44 evaluation examples are saved in:
- `base_vs_finetuned_comparison.csv` — raw model outputs
- `base_vs_finetuned_comparison_rated.csv` — manually rated comparison
Directory: `phi_3_Text2Sql_QLoRA/`
Fine-tune Microsoft's Phi-3 model on a Text-to-SQL task using QLoRA — enabling the model to translate natural language questions into executable SQL queries given a table schema.
QLoRA (Quantized LoRA) — combines 4-bit NF4 quantization with low-rank adapter training, enabling fine-tuning with significantly reduced GPU memory compared to standard LoRA.
208 examples of natural language question + table schema → SQL query pairs.
Sample:

```
Question: "Which player had a To par of 13?"
Schema:   Players table with columns: name, score, to_par, rank, ...
Output:   SELECT name FROM players WHERE to_par = '13'
```
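Each training example pairs a schema description and a question into a single prompt. The function and prompt layout below are hypothetical — a sketch of the pattern, not the notebook's exact formatting:

```python
# Hypothetical prompt construction for the Text-to-SQL task: the model
# sees the table schema plus a natural-language question, then completes
# the SQL after the "SQL:" cue.
def build_prompt(question: str, table: str, columns: list[str]) -> str:
    schema = f"Table {table} with columns: {', '.join(columns)}"
    return f"Schema: {schema}\nQuestion: {question}\nSQL:"

prompt = build_prompt(
    "Which player had a To par of 13?",
    "players",
    ["name", "score", "to_par", "rank"],
)
print(prompt)
```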
| Metric | Description |
|---|---|
| `exact_match` | Exact string match with ground truth SQL |
| `parses` | SQL syntax validity (parseable by an SQL parser) |
| `strict_match` | Normalized canonical SQL match |
| `canonical_match` | Semantic equivalence after normalization |
| `error_type` | Failure classification (`column_mismatch`, `invalid_sql`, etc.) |
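The gap between `exact_match` and `strict_match` can be illustrated with a simplified normalizer. This is a sketch of the idea, not the repository's evaluation code — the actual normalization is presumably more thorough:

```python
import re

def normalize_sql(sql: str) -> str:
    """Lightweight canonicalization: collapse whitespace, drop a trailing
    semicolon, unify quote style, and lowercase (an approximation of the
    strict/canonical matching described above)."""
    sql = sql.strip().rstrip(";")
    sql = re.sub(r"\s+", " ", sql)
    sql = sql.replace('"', "'")   # unify single vs. double quotes
    return sql.lower()

def exact_match(pred: str, gold: str) -> bool:
    return pred.strip() == gold.strip()

def strict_match(pred: str, gold: str) -> bool:
    return normalize_sql(pred) == normalize_sql(gold)

pred = 'SELECT name FROM players WHERE to_par = "13"'
gold = "select name from players where to_par = '13';"
print(exact_match(pred, gold))   # False: quote/case differences
print(strict_match(pred, gold))  # True: equal after normalization
```

Quote unification matters here because inconsistent single vs. double quoting is exactly the failure mode the fine-tuned model improves on (see the key observations below).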
Evaluation outputs are in:
- `phi3_text2sql_comparison_v2.csv` — full comparison (208 examples, all metrics)
- `phi3_text2sql_ft_beats_base_v2.csv` — 177 examples where fine-tuned outperformed base
- `phi3_text2sql_ft_eval.csv` — 100-example validation subset
Key observations:
- Fine-tuned model significantly improves quote handling (single vs. double quotes in SQL)
- Reduces `invalid_sql` errors vs. the base model
- 177/208 examples (85%) show improvement over base Phi-3
| Library | Purpose |
|---|---|
| Unsloth | Fast LoRA/QLoRA fine-tuning engine |
| Transformers | Model loading, tokenization, inference |
| TRL | SFTTrainer for supervised fine-tuning |
| PEFT | LoRA adapter management |
| bitsandbytes | 4-bit quantization |
| Datasets | Dataset loading and formatting |
| xformers | Memory-efficient attention |
| Accelerate | Distributed/mixed-precision training |
| pandas | Evaluation result analysis |
These notebooks were developed and tested on:
| Requirement | Minimum | Tested On |
|---|---|---|
| GPU | 80-96 GB VRAM | NVIDIA H100 / RTX Pro 6000 Blackwell |
| CUDA Compute Capability | 7.5+ | CUDA 12.x |
| RAM | 16 GB | — |
| Python | 3.10+ | 3.10 |
Experiments were conducted on an NVIDIA H100 and an NVIDIA RTX Pro 6000 Blackwell. 4-bit quantization is still used for memory efficiency and faster training. For consumer GPUs (8–16 GB VRAM), reduce `max_seq_length` or `per_device_train_batch_size`.
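A rough back-of-envelope shows why 4-bit quantization is what makes consumer GPUs feasible at all. This counts weights only — activations, KV cache, optimizer state, and LoRA gradients add further overhead:

```python
# Weight-memory estimate for Llama-3.1-8B (8.04B parameters, per the
# configuration tables above). 4-bit = 0.5 bytes per weight.
params = 8.04e9

fp16_gb = params * 2 / 1024**3    # 2 bytes per weight in fp16
int4_gb = params * 0.5 / 1024**3  # 0.5 bytes per weight in 4-bit

print(f"fp16 weights: ~{fp16_gb:.1f} GB")
print(f"4-bit weights: ~{int4_gb:.1f} GB")
```

At roughly 4 GB of 4-bit weights plus adapter and activation overhead, the model fits on mid-range cards once `max_seq_length` and batch size are kept modest.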
```
git clone https://github.com/<your-username>/LLM_FineTuning.git
cd LLM_FineTuning
```

Install Unsloth and required packages (run inside the notebook or a virtual environment):
```
pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
pip install --no-deps trl peft accelerate bitsandbytes xformers
pip install transformers datasets pandas
```

For Colab environments, Unsloth auto-detects the CUDA version and installs the correct wheels.
Llama-3.1-8B is a gated model. You need to:
- Accept the model license at meta-llama/Meta-Llama-3.1-8B
- Set your HF token:
```python
from huggingface_hub import login
login(token="hf_your_token_here")
```

Open any of the notebooks in Jupyter or Google Colab and run all cells:
```
jupyter notebook LLM_Finetuning_UnSloth_Basic.ipynb
```

```
LLM_FineTuning/
│
├── LLM_Finetuning_UnSloth_Basic.ipynb              # Project 1: Basic LoRA demo (Llama-3.1-8B)
│
├── Resume_FineTuning/
│   ├── Resume_LLM_FineTuning_UnSloth_fixed.ipynb   # Project 2: Resume bullet optimizer
│   ├── base_vs_finetuned_comparison.csv            # Raw model output comparison
│   └── base_vs_finetuned_comparison_rated.csv      # Manually scored results
│
└── phi_3_Text2Sql_QLoRA/
    ├── phi3_text2sql_comparison.csv                # Base comparison (v1 metrics)
    ├── phi3_text2sql_comparison_v2.csv             # Full comparison (v2 metrics, 208 rows)
    ├── phi3_text2sql_ft_beats_base_v2.csv          # FT > Base cases (177 rows)
    ├── phi3_text2sql_ft_eval.csv                   # FT eval subset (100 rows)
    ├── phi3_text2sql_base_recovered_v2.csv         # Base model recovered outputs
    └── phi3_text2sql_ft_recovered_v2.csv           # FT model recovered outputs
```
| Project | Model | Dataset Size | Key Result |
|---|---|---|---|
| Basic LoRA | Llama-3.1-8B | 4 examples | Training loss: 1.010 in 36s |
| Resume Optimizer | Llama-3.1-8B | 30 train / 44 eval | Improved strength, clarity, conciseness vs. base |
| Text-to-SQL | Phi-3 + QLoRA | 208 examples | 85% of eval examples: FT beats base |
This project is licensed under the MIT License.
Contributions, issues, and feature requests are welcome. Feel free to open an issue or submit a pull request.
- Fork the repository
- Create a feature branch (`git checkout -b feature/new-experiment`)
- Commit your changes (`git commit -m 'Add new fine-tuning experiment'`)
- Push to the branch (`git push origin feature/new-experiment`)
- Open a Pull Request