Banking Intent Classifier

An end-to-end AI engineering project that classifies customer banking queries into 77 intent categories using transformer-based deep learning. Built as a capstone project for Codecademy while learning AI and ML engineering fundamentals.

Overview

When a customer contacts their bank and types something like "my card hasn't arrived yet" or "I was charged twice for the same transaction", a bank's support system needs to figure out what the customer actually wants so it can route them to the right place automatically. That's intent classification.

This project builds that system by training and comparing two models on the Banking77 dataset, then wrapping the better model in a production-ready inference pipeline with PII protection.
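The README does not say how the PII protection is implemented. As a rough sketch only, one common approach is to mask obvious identifiers with regular expressions before the query reaches the model. The function name `mask_pii`, the placeholder tokens, and the patterns below are illustrative assumptions rather than the notebook's code, and simple regexes like these will miss many real-world formats.

```python
import re

# Hypothetical patterns, applied in order. Not taken from the notebook.
PII_PATTERNS = [
    ("[EMAIL]", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    # 13 to 16 digits, optionally separated by single spaces or hyphens
    ("[CARD]", re.compile(r"\b\d(?:[ -]?\d){12,15}\b")),
    # Loose phone pattern: optional +, then at least 9 digits/separators
    ("[PHONE]", re.compile(r"\+?\d[\d\s().-]{7,}\d")),
]


def mask_pii(text: str) -> str:
    """Replace each matched identifier with a placeholder token."""
    for placeholder, pattern in PII_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


print(mask_pii("My card 4111 1111 1111 1111 was charged twice"))
print(mask_pii("Contact me at jane.doe@example.com please"))
```

Masking before tokenization keeps raw identifiers out of both the model input and any downstream logs, at the cost of occasionally masking harmless digit sequences.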

Results

Model	Accuracy	Macro F1	Trainable Params	Train Time
MLP Baseline	88.9%	0.889	5.27M	~2 min
LoRA-RoBERTa	92.5%	0.925	1.24M	~5 min
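To give a feel for where a trainable-parameter count of this size can come from, here is a small arithmetic sketch. Every configuration value in it is an assumption, since the README does not state them: a `roberta-base` backbone (hidden size 768, 12 layers), rank-16 LoRA adapters on the query and value projections, and a fully trainable classification head. This combination happens to land near 1.24M, but other configurations could as well.

```python
def linear_params(n_in: int, n_out: int, bias: bool = True) -> int:
    """Parameter count of a dense layer."""
    return n_in * n_out + (n_out if bias else 0)


def lora_params(n_in: int, n_out: int, rank: int) -> int:
    """LoRA adds two bias-free matrices: A (rank x n_in) and B (n_out x rank)."""
    return rank * n_in + n_out * rank


hidden, layers, num_labels = 768, 12, 77  # roberta-base sizes (assumed); 77 intents
rank = 16                                 # assumed LoRA rank
targets_per_layer = 2                     # assumed targets: query and value

adapter = layers * targets_per_layer * lora_params(hidden, hidden, rank)
# Classification head: dense (hidden -> hidden) followed by output (hidden -> labels)
head = linear_params(hidden, hidden) + linear_params(hidden, num_labels)
total = adapter + head

print(f"{adapter:,} + {head:,} = {total:,}")  # 589,824 + 649,805 = 1,239,629
```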

LoRA-RoBERTa improved on the MLP baseline in 61 of the 77 intent classes. The biggest gains were on semantically similar intents that share surface vocabulary but differ in meaning.
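A per-class comparison like this can be sketched as follows. It assumes that "improved" means a higher per-class F1 score, which the README does not specify, and the helper names and toy labels are illustrative. The same per-class values can be obtained from scikit-learn's `f1_score` with `average=None`.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} computed from true/false positives and false negatives."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1
    scores = {}
    for label in set(y_true) | set(y_pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def improved_classes(y_true, baseline_pred, candidate_pred):
    """Labels whose F1 is strictly higher for the candidate than the baseline."""
    base = per_class_f1(y_true, baseline_pred)
    cand = per_class_f1(y_true, candidate_pred)
    return sorted(
        label for label in set(y_true)
        if cand.get(label, 0.0) > base.get(label, 0.0)
    )


# Toy data, not real model outputs
y_true = ["card_arrival", "card_arrival", "lost_or_stolen_card", "lost_or_stolen_card"]
baseline = ["card_arrival", "lost_or_stolen_card", "lost_or_stolen_card", "lost_or_stolen_card"]
candidate = list(y_true)

better = improved_classes(y_true, baseline, candidate)
print(f"{len(better)} of {len(set(y_true))} classes improved: {better}")
```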

Project Structure

banking-intent-classifier/
│
├── banking_classification_BERT.ipynb   # main project notebook
├── requirements.txt                    # project dependencies
│
└── datasets/
    ├── banking77_train.csv             # 10,003 training queries
    └── banking77_test.csv              # 3,080 test queries

Setup

Prerequisites

Python 3.10 or higher
Google Colab with T4 GPU (recommended) or a local GPU machine
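Before running the notebook, it can help to confirm that the environment meets these prerequisites. The helper below is a minimal sketch, not part of the repository; the name `check_environment` is hypothetical. It checks the Python version and, only if PyTorch is already installed, whether a CUDA GPU is visible.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 10)):
    """Report whether the interpreter and GPU setup match the prerequisites."""
    report = {
        "python_ok": sys.version_info >= min_python,
        "torch_installed": False,
        "cuda_available": False,
    }
    # Import torch only if it is present, so the check works before installation
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch_installed"] = True
        report["cuda_available"] = torch.cuda.is_available()
    return report


print(check_environment())
```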

Clone the repository

git clone https://github.com/ovesa/banking-intent-classifier.git
cd banking-intent-classifier

Install dependencies

pip install -r requirements.txt
pip install peft

Run the notebook

Open banking_classification_BERT.ipynb in Google Colab or Jupyter and run cells from top to bottom. The first cell handles cloning and dependency installation automatically.

Quick Inference Demo

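# Run this after the notebook cells that define predict_intent,
# lora_model, tokenizer, label_encoder, and device.
result = predict_intent(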
    "My card was charged twice for the same transaction",
    model=lora_model,
    tokenizer=tokenizer,
    label_encoder=label_encoder,
    device=device
)

print(result['predicted_intent'])  # transaction_charged_twice
print(result['confidence'])        # 0.994
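The README does not show the body of `predict_intent`. As a hedged illustration of the final step only, the sketch below turns a vector of raw class scores (logits) into a result dictionary with the same two keys used above. The function name `logits_to_result`, the two-label list, and the logit values are assumptions made for the example.

```python
import math


def logits_to_result(logits, labels):
    """Apply a numerically stable softmax and return the top class."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"predicted_intent": labels[best], "confidence": probs[best]}


# Toy two-class example, not real model output
labels = ["card_arrival", "transaction_charged_twice"]
result = logits_to_result([0.0, 5.0], labels)
print(result["predicted_intent"], round(result["confidence"], 3))
```

Reporting the softmax probability alongside the label lets a routing system fall back to a human agent when confidence is low.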

Tech Stack

Tool	Purpose
PyTorch	Model training and inference
Hugging Face Transformers	RoBERTa model and tokenizer
PEFT	LoRA configuration and adapter training
scikit-learn	TF-IDF vectorization, label encoding, evaluation metrics
pandas and numpy	Data loading and manipulation
matplotlib	EDA visualizations

Dataset

The Banking77 dataset was obtained from Codecademy for the "Classifying Banking Intent From Customer Queries" capstone project.

The dataset was originally published by PolyAI:

Casanueva et al. (2020). Efficient Intent Detection with Dual Sentence Encoders. Proceedings of the 2nd Workshop on NLP for ConvAI, ACL 2020. https://arxiv.org/abs/2003.04807
