Commit ee74caa

clean repo
1 parent 7b13980 commit ee74caa

7 files changed

+197
-43
lines changed

Dockerfile

Whitespace-only changes.

README.md

Lines changed: 197 additions & 0 deletions
@@ -0,0 +1,197 @@
# 🚀 Sentiment Analysis MLOps Pipeline

An **end-to-end MLOps project** demonstrating model deployment, observability, and performance optimization using **FastAPI**, **AWS S3**, and **GitHub Actions**.

This repository shows how to take an NLP model from **fine-tuning to scalable inference**, with dynamic quantization and CI/CD automation.

---

## 🧠 Project Overview

This project serves a fine-tuned **DistilBERT sentiment classifier** through a FastAPI-based API.
It includes model loading from **AWS S3**, optional **quantized inference**, structured **logging**, **load testing**, and **CI/CD automation**.

In the load test summarized below, dynamic quantization reduced average latency by **≈60%**.

| Mode                 | Avg Latency (ms) | P95 Latency (ms) | Improvement            |
| -------------------- | ---------------- | ---------------- | ---------------------- |
| Without Quantization | 3274.15          | 5912.65          | baseline               |
| With Quantization    | **1302.57**      | **2581.50**      | ~60% lower avg latency |

---

## 🧩 Tech Stack

| Category         | Tools Used                                 |
| ---------------- | ------------------------------------------ |
| **Modeling**     | Hugging Face Transformers (DistilBERT)     |
| **Serving**      | FastAPI, Uvicorn                           |
| **Deployment**   | AWS EC2, S3                                |
| **Automation**   | GitHub Actions (CI/CD)                     |
| **Monitoring**   | Custom structured logging (`logs/app.log`) |
| **Testing**      | Pytest                                     |
| **Load Testing** | Async load simulator with aiohttp          |
| **Optimization** | PyTorch Dynamic Quantization               |

---

## ⚙️ Key Features

* 🔁 **Automated model fetch from S3** during startup
* ⚙️ **Dynamic quantization toggle** for faster CPU inference
* 📈 **Structured request logging** (latency, client IP, text length, sentiment)
* 🧪 **Pytest-based CI pipeline** for stability
* 🌐 **FastAPI endpoint** for real-time predictions
* 📊 **Load simulator** to measure performance under concurrent requests
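
As an illustration of the structured request logging listed above, here is a minimal sketch of what one log entry could contain. It assumes one JSON object per line; the logger name and the `configure_logging`, `build_log_record`, and `log_request` helpers are hypothetical and are not taken from `app/serve.py`.

```python
import json
import logging
import time
from pathlib import Path

logger = logging.getLogger("sentiment_api")  # hypothetical logger name


def configure_logging(path: str = "logs/app.log") -> None:
    """Attach a file handler; creates the log directory if it is missing."""
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    handler = logging.FileHandler(path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)


def build_log_record(client_ip: str, text: str, sentiment: str,
                     started_at: float, quantized: bool) -> dict:
    """Collect the fields named in the feature list into one dictionary."""
    return {
        "client_ip": client_ip,
        "text_length": len(text),
        "sentiment": sentiment,
        "latency_ms": round((time.perf_counter() - started_at) * 1000, 2),
        "quantized": quantized,
    }


def log_request(client_ip: str, text: str, sentiment: str,
                started_at: float, quantized: bool) -> dict:
    """Emit the record as a single JSON line and return it for inspection."""
    record = build_log_record(client_ip, text, sentiment, started_at, quantized)
    logger.info(json.dumps(record))
    return record
```

Logging the text length rather than the text itself keeps entries small and avoids writing user input to disk.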

---

## 🧰 API Usage

### **Health Check**

```http
GET /ping
```

✅ Response:

```json
{"status": 200, "quantized": false}
```

### **Predict Endpoint**

```http
POST /predict
```

**Body:**

```json
{
  "text": "The movie was absolutely fantastic!",
  "quantize": true
}
```

✅ Response:

```json
{
  "sentiment": "positive",
  "latency_ms": "1320.45",
  "quantized": true
}
```
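
For reference, a small client for the predict endpoint might look like the sketch below. It uses only the standard library. The base URL `http://localhost:8000` is an assumption (the port matches the Uvicorn command in the Deployment section), and the helper names `build_predict_request` and `predict` are illustrative.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: server running locally on port 8000


def build_predict_request(text: str, quantize: bool = False,
                          base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST /predict request with the JSON body shown above."""
    payload = json.dumps({"text": text, "quantize": quantize}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(text: str, quantize: bool = False,
            base_url: str = BASE_URL, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    request = build_predict_request(text, quantize, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call, once the API is up:
#   predict("The movie was absolutely fantastic!", quantize=True)
```

Separating request construction from sending makes the payload easy to inspect without a live server.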

---

## 📦 Project Structure

```
sentiment-mlops/
├── .github/
│   └── workflows/
│       ├── ci.yml                    # Continuous Integration: pytest, lint checks
│       └── cd.yml                    # Continuous Deployment: deploy to EC2

├── app/
│   ├── serve.py                      # FastAPI app for serving predictions
│   └── utils.py                      # Model loading, quantization, and inference logic

├── logs/
│   ├── app.log                       # Application logs
│   ├── latency-stats.txt             # Performance summary
│   └── simulator.log                 # Load test results

├── scripts/
│   ├── load_simulator.py             # Simulates concurrent requests for stress testing
│   └── analyze_simulator_logs.py     # Parses and visualizes latency results

├── deploy.sh                         # Shell script for EC2 deployment
├── Makefile                          # Unified dev commands (test, run, deploy)
├── requirements.txt                  # Project dependencies
├── test_setup.py                     # Basic API and health-check tests
└── README.md                         # Documentation (you’re reading it!)
```

---

## 🔄 CI/CD Pipeline

GitHub Actions automates:

* ✅ Environment setup (Python + dependencies)
* ✅ Lint checks and tests via `pytest`
* ✅ Failure alerts on PRs
* ✅ (Optional) Deployment to EC2 after tests pass

This ensures the API is **always tested before merging**, as it would be in a production MLOps pipeline.

---

## 📊 Load Testing

Run the async load simulation to test stability under concurrent requests:

```bash
python scripts/load_simulator.py
```

Results are logged to `logs/simulator.log` and visualized with a latency-over-time plot.
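
The general shape of such a simulator can be sketched as follows. This is not the code in `scripts/load_simulator.py`: to stay runnable without a server, it replaces the real aiohttp call with a stand-in coroutine (`fake_send`), and `run_load` and `summarize` are hypothetical names. P95 is computed with the nearest-rank method, which may differ from the method used for the figures in this README.

```python
import asyncio
import statistics
import time
from typing import Awaitable, Callable


async def run_load(send: Callable[[], Awaitable[None]],
                   total: int, concurrency: int) -> list[float]:
    """Issue `total` requests, at most `concurrency` at a time; return latencies in ms."""
    semaphore = asyncio.Semaphore(concurrency)
    latencies: list[float] = []

    async def one_request() -> None:
        async with semaphore:  # caps the number of in-flight requests
            started = time.perf_counter()
            await send()
            latencies.append((time.perf_counter() - started) * 1000)

    await asyncio.gather(*(one_request() for _ in range(total)))
    return latencies


def summarize(latencies_ms: list[float]) -> dict:
    """Return min, average, nearest-rank P95, and max of the latencies."""
    ordered = sorted(latencies_ms)
    rank = max(1, (95 * len(ordered) + 99) // 100)  # integer ceil(0.95 * n)
    return {
        "min": ordered[0],
        "avg": statistics.fmean(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }


async def fake_send() -> None:
    """Stand-in for an HTTP POST to /predict; sleeps for 10 ms instead."""
    await asyncio.sleep(0.01)


# Example: asyncio.run(run_load(fake_send, total=50, concurrency=10))
```

In a real run, `send` would wrap an aiohttp request against the `/predict` endpoint.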

---

## ⚡ Quantization Impact

| Metric               | Without Quantization | With Quantization |
| -------------------- | -------------------- | ----------------- |
| **Min Latency (ms)** | 331.52               | 291.97            |
| **Avg Latency (ms)** | 3274.15              | **1302.57**       |
| **P95 Latency (ms)** | 5912.65              | **2581.50**       |
| **Max Latency (ms)** | 9143.29              | **4830.67**       |

👉 Shows how **PyTorch dynamic quantization** reduces model size and speeds up CPU inference: average latency dropped by about 60% in this test.
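
The headline figure can be checked directly from the table. The short sketch below recomputes the relative latency reduction for each row. The numbers are copied from the table above, and the helper name `percent_reduction` is illustrative.

```python
def percent_reduction(before_ms: float, after_ms: float) -> float:
    """Return how much lower `after_ms` is than `before_ms`, in percent."""
    return (before_ms - after_ms) / before_ms * 100.0


# (without quantization, with quantization), in milliseconds
results = {
    "min": (331.52, 291.97),
    "avg": (3274.15, 1302.57),
    "p95": (5912.65, 2581.50),
    "max": (9143.29, 4830.67),
}

reductions = {name: percent_reduction(before, after)
              for name, (before, after) in results.items()}

for name, pct in reductions.items():
    print(f"{name}: {pct:.1f}% lower latency")
```

The average improves by roughly 60% and the P95 by roughly 56%, while the minimum latency changes much less, so the gain is concentrated in the slower requests.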

---

## 🧪 Run Tests

```bash
make test
```

or

```bash
PYTHONPATH=. pytest -v
```

---

## ☁️ Deployment

The app is designed for **AWS EC2 deployment**.
Model files are fetched from **S3** on first run and cached locally.

Start the API with Uvicorn:

```bash
uvicorn app.serve:app --host 0.0.0.0 --port 8000
```
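
The fetch-once-then-cache behaviour described above could look roughly like this sketch. It is not the code in `app/utils.py`: the function names, the cache layout, and the bucket/key arguments are assumptions. The default downloader uses boto3's `download_file`; it is imported lazily so the caching logic can be exercised without AWS credentials.

```python
from pathlib import Path
from typing import Callable


def download_from_s3(bucket: str, key: str, destination: Path) -> None:
    """Download one object from S3 to `destination` using boto3."""
    import boto3  # lazy import: only needed when a real download happens

    boto3.client("s3").download_file(bucket, key, str(destination))


def ensure_model_file(
    bucket: str,
    key: str,
    cache_dir: str,
    download: Callable[[str, str, Path], None] = download_from_s3,
) -> Path:
    """Return the local path of the model file, downloading it only if missing."""
    destination = Path(cache_dir) / Path(key).name
    if destination.exists():
        return destination  # cache hit: skip the network entirely
    destination.parent.mkdir(parents=True, exist_ok=True)
    download(bucket, key, destination)
    return destination
```

Passing the downloader as a parameter lets tests substitute a fake that writes a local file instead of calling S3.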

---

## 🧠 MLOps Pitch

This project demonstrates a **production-style NLP deployment pipeline** with performance optimization and automation at its core.
By integrating **AWS**, **FastAPI**, and **GitHub Actions**, it shows how MLOps turns research models into **reliable, scalable services**, cutting average latency by about **60%** through dynamic quantization.

---

## 👨‍💻 Author

**M. Farrukh Mehmood**

🔗 [LinkedIn](https://www.linkedin.com/in/sfarrukhm) | [GitHub](https://github.com/sfarrukhm)

ab_output.txt

Lines changed: 0 additions & 42 deletions
This file was deleted.

payload.json

Lines changed: 0 additions & 1 deletion
This file was deleted.
File renamed without changes.
