A compact, reproducible toolkit for fine-tuning, evaluating, and comparing
TinyLlama-1.1B-Chat-v1.0 using Full FT, LoRA, and QLoRA.
The project is designed for small-GPU environments, research experimentation,
and transparent ablation studies.
- Full Fine-Tuning
- LoRA
- QLoRA
- Shared training utilities (training_utils.py)
- RAG evaluation (LlamaIndex / LangChain)
- Retrieval-only metrics
- Model comparison (FT / LoRA / QLoRA)
- Local inference script (local_hf_chat_model.py)
- RAG document samples
- Small QA datasets
Note: Prefix Tuning is intentionally excluded.
ft_lab/
├── app_rag_compare.py
├── app_rag_compare_langchain.py
├── app_rag_compare_llamaindex.py
├── compare_adapters.py
├── eval_models.py
├── eval_retrieval.py
├── local_hf_chat_model.py
├── requirements.txt
├── training_utils.py
├── train_full.py
├── train_lora.py
├── train_qlora.py
│
├── models/
│   ├── ft_full/
│   ├── ft_lora/
│   └── ft_qlora/
│
├── data/
│   ├── toy_qa.jsonl
│   └── sample_eval.jsonl
│
├── docs/
│   ├── sample1.txt
│   └── sample2.txt
│
└── examples/
    └── FT-Lab.ipynb
Updates all parameters.
python train_full.py

Parameter-efficient training with injected low-rank matrices.

python train_lora.py

4-bit quantized base model + LoRA adapters.

python train_qlora.py

training_utils.py includes:
- dataset loading
- tokenizer setup
- model initialization
- training arguments
- evaluation hooks
All training scripts share this module for consistent behavior.
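A minimal sketch of what such a shared helper might look like (function and argument names are illustrative, not the actual training_utils.py API):

```python
# Illustrative sketch only -- the real training_utils.py may differ.
# Assumes transformers, peft, and bitsandbytes are installed (see requirements.txt).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

BASE_MODEL = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

def build_model(mode: str = "full"):
    """Hypothetical helper: load the base model for full FT, LoRA, or QLoRA."""
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)

    if mode == "qlora":
        # 4-bit NF4 quantization for the frozen base weights.
        bnb_config = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype=torch.bfloat16,
            bnb_4bit_use_double_quant=True,
        )
        model = AutoModelForCausalLM.from_pretrained(
            BASE_MODEL, quantization_config=bnb_config, device_map="auto"
        )
        model = prepare_model_for_kbit_training(model)
    else:
        model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, device_map="auto")

    if mode in ("lora", "qlora"):
        # Inject low-rank adapter matrices into the attention projections.
        lora_config = LoraConfig(
            r=8,
            lora_alpha=16,
            lora_dropout=0.05,
            target_modules=["q_proj", "v_proj"],
            task_type="CAUSAL_LM",
        )
        model = get_peft_model(model, lora_config)

    return model, tokenizer
```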
File: app_rag_compare_llamaindex.py
python app_rag_compare.py --docs_dir docs --question "Explain LoRA."

File: app_rag_compare_langchain.py
Compatible with LangChain 0.2+ (Runnable / LCEL).
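A minimal LCEL sketch of the Runnable composition pattern (illustrative; the prompt text, model choice, and chain wiring in app_rag_compare_langchain.py may differ):

```python
# Illustrative LCEL sketch -- not the actual app_rag_compare_langchain.py wiring.
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI  # swap in a local model wrapper if preferred

prompt = ChatPromptTemplate.from_template(
    "Answer the question using only the context.\n\n"
    "Context:\n{context}\n\nQuestion: {question}"
)
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # model name is an example
chain = prompt | llm | StrOutputParser()

context = open("docs/sample1.txt").read()
print(chain.invoke({"question": "Explain LoRA.", "context": context}))
```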
Compare FT / LoRA / QLoRA generations:
python compare_adapters.py

Outputs:
- aligned generations
- qualitative differences
- optional latency comparison
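A rough sketch of the comparison flow (illustrative; compare_adapters.py may load and prompt the models differently):

```python
# Illustrative sketch -- compare base vs. LoRA-adapted generations side by side.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
prompt = "Explain LoRA in one sentence."

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, device_map="auto")
device = base.device

def generate(model):
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(out[0], skip_special_tokens=True)

print("base :", generate(base))

# Attach the LoRA adapter produced by train_lora.py (path from the project tree).
lora = PeftModel.from_pretrained(base, "models/ft_lora")
print("lora :", generate(lora))
```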
python eval_retrieval.py --data data/sample_eval.jsonl

Metrics:
- recall@k
- precision@k
- hit-rate
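Straightforward reference implementations of these metrics (illustrative; eval_retrieval.py may use slightly different conventions):

```python
# Illustrative metric definitions -- eval_retrieval.py may differ.
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant documents that appear in the top-k retrieved list."""
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant) if relevant else 0.0

def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / k if k else 0.0

def hit_rate(retrieved, relevant, k):
    """1.0 if at least one relevant document appears in the top-k, else 0.0."""
    return 1.0 if set(retrieved[:k]) & set(relevant) else 0.0

# Example: gold document "sample1.txt" found at rank 2 of 3 retrieved docs.
retrieved = ["sample2.txt", "sample1.txt", "other.txt"]
relevant = ["sample1.txt"]
print(recall_at_k(retrieved, relevant, 3),     # 1.0
      precision_at_k(retrieved, relevant, 3),  # ~0.33
      hit_rate(retrieved, relevant, 3))        # 1.0
```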
python eval_models.py --data_path data/sample_eval.jsonl

Metrics:
- BERTScore-F1
- exact-match accuracy
- relaxed-match accuracy
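Illustrative implementations of the string-match metrics; BERTScore-F1 is assumed to come from the bert-score package, which is not pinned in requirements.txt:

```python
# Illustrative metric sketch -- eval_models.py may normalize or score differently.
import string

def normalize(text):
    """Lowercase, strip punctuation and collapse whitespace."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())

def exact_match(pred, gold):
    return float(normalize(pred) == normalize(gold))

def relaxed_match(pred, gold):
    """Credit the prediction if the normalized gold answer is contained in it."""
    return float(normalize(gold) in normalize(pred))

# BERTScore-F1 (assumes the bert-score package is installed):
# from bert_score import score
# _, _, f1 = score(predictions, references, lang="en")
```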
data/toy_qa.jsonl
data/sample_eval.jsonl
docs/sample1.txt
docs/sample2.txt
Useful for RAG demonstrations and baseline evaluations.
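The record schema is not documented here; the data files are standard JSONL (one JSON object per line) and can be inspected directly. The field names in the comment below are hypothetical:

```python
# Peek at the QA dataset -- field names such as "question"/"answer" are assumptions.
import json

with open("data/toy_qa.jsonl") as f:
    for line in f:
        record = json.loads(line)  # e.g. {"question": "...", "answer": "..."}
        print(record)
```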
A runnable notebook is available under:
examples/FT-Lab.ipynb
This notebook:
- uses only dummy data
- demonstrates the end-to-end pipeline
- is designed for Colab / T4 / small VRAM
- can be fully replaced with real datasets
torch>=2.1.0
transformers>=4.39.0
accelerate>=0.27.0
sentencepiece>=0.1.99
einops>=0.7.0
datasets>=2.18.0
peft>=0.10.0
bitsandbytes>=0.42.0
langchain>=0.2.0
langchain-openai>=0.1.0
llama-index>=0.10.0
llama-index-embeddings-huggingface
sentence-transformers
python-dotenv>=1.0.0
pip install -r requirements.txt