💼 FinSight AI — Phi-4 Mini Finance Fine-Tuning

Fine-tuning Microsoft Phi-4 Mini 3.8B to answer complex SEC 10-K financial questions with 69% better accuracy than the base model — trained on free Kaggle GPUs in under 3 hours.

Why This Project Exists

Financial analysts spend hours reading SEC 10-K filings to extract insights — revenue trends, risk factors, operational efficiency signals. General-purpose LLMs answer these questions poorly because they lack deep financial domain grounding.

This project fine-tunes Microsoft's Phi-4 Mini (3.8B parameters) on real SEC 10-K Q&A data using QLoRA — a parameter-efficient method that trains only ~1% of the model's weights. The result: a compact, deployable finance expert that outperforms the base model by 69% on ROUGE-L, running on hardware anyone can access for free.

📊 Results

Metric	Base Model	Fine-Tuned	Improvement
ROUGE-1	0.4657	0.7523	+61.6%
ROUGE-2	0.3560	0.6106	+71.5%
ROUGE-L	0.4242	0.7168	+69.0%

Evaluated on held-out SEC 10-K Q&A pairs from the financial-qa-10K dataset.

💡 What the Fine-Tuning Actually Changed

Q: What does it mean when revenue grows but net income declines?

Base model:

"When revenue grows but net income declines, it means the company's expenses increased at a faster rate than its revenue..."

(Generic — no financial specificity, no actionable insight)

Fine-tuned model:

"This pattern indicates margin compression — rising costs outpacing revenue growth. In a 10-K, look at: COGS trends (gross margin erosion), SG&A increases (scaling inefficiency), higher interest expense (leverage risk), and one-time charges. Analysts treat persistent margin compression as a red flag for pricing power and operational efficiency — especially when it occurs despite top-line growth."

(Domain-specific, cites 10-K sections, analyst-grade reasoning)

🔧 Training Configuration

Base Model:        microsoft/Phi-4-mini-instruct (3.8B params)
Method:            QLoRA — 4-bit NF4 quantization
LoRA Config:       r=16, alpha=32, dropout=0.05
Trainable Params:  ~40M / 3.8B (≈1% of total)
Dataset:           virattt/financial-qa-10K (6,300 train samples)
Hardware:          Kaggle 2x Tesla T4 (free tier)
Training Time:     ~3 hours | 3 epochs
Final Train Loss:  1.07

Why QLoRA? Full fine-tuning a 3.8B model requires 30+ GB of VRAM. QLoRA quantizes the base model to 4-bit and trains small low-rank adapter layers, making it possible to fine-tune on 2x T4s (16GB each) — hardware available for free on Kaggle. This is a practical, reproducible approach for domain adaptation on a budget.

🏗️ Project Structure

phi4-finance-finetuning/
├── notebooks/
│   └── phi4_finance_finetuning.ipynb  # Full pipeline: prep → train → eval
├── app/
│   ├── app.py                         # FastAPI inference server
│   ├── requirements.txt
│   └── static/index.html              # FinSight AI dashboard
├── results/
│   └── eval_results.json              # ROUGE scores + evaluation output
└── README.md

🖥️ Run Locally

Prerequisites: Python 3.10+, 8GB+ RAM (16GB recommended)

# Clone the repo
git clone https://github.com/Emart29/phi4-finance-finetuning.git
cd phi4-finance-finetuning

# Install dependencies
cd app
pip install -r requirements.txt

# Start the inference server
uvicorn app:app --host 0.0.0.0 --port 8000

# Open the dashboard
# → http://localhost:8000

The app loads the fine-tuned model from Hugging Face automatically on first run. Subsequent runs use the cached model.

📓 Reproduce the Training

The full training pipeline is in notebooks/phi4_finance_finetuning.ipynb.

To run it yourself:

Open the notebook on Kaggle (free T4 GPUs)
Add your Hugging Face token as a Kaggle secret (HF_TOKEN)
Run all cells — total time ~3 hours

The notebook covers:

Dataset loading and prompt formatting
QLoRA config and LoRA adapter setup
Training with trl SFTTrainer
ROUGE evaluation against the base model
Pushing the adapter to Hugging Face Hub

🔗 Links

Resource	Link
🤗 Fine-tuned Model	Emar7/phi4-finance-finetuned
🌐 Live Demo	HuggingFace Spaces
📓 Training Notebook	Kaggle
📦 Base Model	microsoft/Phi-4-mini-instruct
📊 Dataset	virattt/financial-qa-10K

🛠️ Tech Stack

Category	Tools
Base Model	Microsoft Phi-4 Mini Instruct (3.8B)
Fine-tuning	QLoRA, PEFT, trl SFTTrainer
Quantization	bitsandbytes (4-bit NF4)
Evaluation	ROUGE-1/2/L (rouge-score)
Serving	FastAPI, Uvicorn
Hardware	Kaggle 2x Tesla T4 (free)
Model Hub	Hugging Face Hub

👤 Author

Emmanuel Nwanguma — ML Engineer & Data Scientist

📄 License

MIT License — see LICENSE for details.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

💼 FinSight AI — Phi-4 Mini Finance Fine-Tuning

Why This Project Exists

📊 Results

💡 What the Fine-Tuning Actually Changed

🔧 Training Configuration

🏗️ Project Structure

🖥️ Run Locally

📓 Reproduce the Training

🔗 Links

🛠️ Tech Stack

👤 Author

📄 License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
app		app
notebooks		notebooks
results		results
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

Folders and files

Latest commit

History

Repository files navigation

💼 FinSight AI — Phi-4 Mini Finance Fine-Tuning

Why This Project Exists

📊 Results

💡 What the Fine-Tuning Actually Changed

🔧 Training Configuration

🏗️ Project Structure

🖥️ Run Locally

📓 Reproduce the Training

🔗 Links

🛠️ Tech Stack

👤 Author

📄 License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages