Commit 01e06bf

Merge pull request #33 from jkrishna2511/gcp_compute_engine_lab_2
GCP Compute engine lab 2
2 parents 1e6f65d + 0d21a3f commit 01e06bf

19 files changed: 50,451 additions and 0 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -168,3 +168,4 @@ cython_debug/
 # and can be added to the global gitignore or merged into this file. For a more nuclear
 # option (not recommended) you can uncomment the following to ignore the entire idea folder.
 #.idea/
+.DS_Store

Labs/GCP_Labs/Compute_Engine_Labs/Lab2/IMDb_Reviews.csv

Lines changed: 50001 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 339 additions & 0 deletions
@@ -0,0 +1,339 @@
# **Google Cloud Platform Compute Engine - Lab 2**

## **Objective**

This lab focuses on intermediate-level skills in GCP Compute Engine: creating custom VM images, using snapshots, working with instance templates and managed instance groups (MIGs), configuring networking, auto-scaling, load balancing, and load testing. Participants set up a scalable sentiment analysis web service that uses FastAPI to classify IMDb reviews.

## **Lab Steps**

### **Step 1: Create a VM Instance and Set Up Environment**

We start by creating a virtual machine (VM) on Google Cloud Platform (GCP). This VM is the environment where we install our dependencies and deploy the sentiment analysis service. A VM in GCP gives us scalable, on-demand computing resources.

1. **Create a VM instance:**
   - Go to the Google Cloud Console: [https://console.cloud.google.com/](https://console.cloud.google.com/).
   - Navigate to Compute Engine > VM instances.
   - Click "Create Instance".
   - Configure the instance:
     - **Name:** `imdb-sentiment-analysis-vm`
     - **Region:** `us-central1`
     - **Zone:** `us-central1-a` (or any other compatible zone in the `us-central1` region)
     - **Machine type:** `e2-micro` (1 vCPU, 1 GB memory)
     - **Boot disk:** Debian GNU/Linux 10 (buster)
     - **Boot disk size:** 10 GB
   - Click "Create".

   ![Create VM](assets/step-1-create-vm.png)

2. **Set up the environment:**
   - SSH into the VM instance.
   - Execute the environment setup script `Lab2/setup.sh` to install the necessary packages and set up the virtual environment:

   ```sh
   sudo apt-get update
   # python3-venv follows the distribution's default Python 3, so it does not depend on a specific minor version
   sudo apt-get install -y python3 python3-pip python3-venv git
   # /home is owned by root, so creating the project directory requires sudo
   sudo mkdir -p /home/imdb-sentiment-analysis
   sudo chmod -R 777 /home/imdb-sentiment-analysis
   cd /home/imdb-sentiment-analysis
   git clone https://github.com/raminmohammadi/MLOps.git
   cd /home/imdb-sentiment-analysis/MLOps/Labs/GCP_Labs/Compute_Engine_Labs/Lab2
   python3 -m venv env
   . env/bin/activate
   pip install -r requirements.txt
   ```

This step prepares the VM by installing Python and the other required packages, cloning the project repository, and creating a Python virtual environment. The virtual environment keeps the project's dependencies isolated from the system Python and makes the setup reproducible.
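
To confirm that the isolation described above is in effect, the interpreter itself can be asked whether it is running inside a virtual environment. The snippet below is a minimal sketch and not part of the lab's repository; the helper name `in_virtualenv` is hypothetical, while `sys.prefix` and `sys.base_prefix` are standard library attributes.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-created environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the system-wide Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

Running it once before and once after `. env/bin/activate` should show the flag change, assuming the environment was created with `python3 -m venv env` as in the setup script.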

### **Step 2: Create and Use a VM Snapshot**

Snapshots back up the state of a VM's disk so that the instance can be recovered quickly after a failure. Using snapshots in GCP helps maintain business continuity and minimize downtime.

1. **Create a snapshot of the VM instance:**
   - Stop the VM instance.
   - Go to Compute Engine > Snapshots.
   - Click "Create Snapshot".
   - Configure the snapshot:
     - **Name:** `imdb-sentiment-analysis-vm-snapshot`
     - **Source disk:** `imdb-sentiment-analysis-vm` boot disk
   - Click "Create".

   ![Create Snapshot](assets/step-2-create-snapshot.png)

2. **Use the snapshot to restore the application in case of failure:**
   - Simulate a failure by deleting the VM instance and confirming the deletion.
   - Restore the snapshot to a new VM instance:
     - Go to Compute Engine > VM instances.
     - Click "Create Instance".
     - Select `New VM instance from snapshot`.
     - Choose the snapshot `imdb-sentiment-analysis-vm-snapshot`.
     - Configure the instance:
       - **Name:** `imdb-sentiment-analysis-vm-restored`
       - **Region:** `us-central1`
       - **Zone:** `us-central1-a` (or the zone you used for the original VM)
       - **Machine type:** `e2-micro` (1 vCPU, 1 GB memory)
     - Click "Create".

   ![Restore Snapshot](assets/step-2-create-vm-from-snapshot.png)

This step demonstrates how to back up the VM's state with a snapshot and how to restore a VM from that snapshot after a failure, so that our work can be recovered quickly.

### **Step 3: Create a Custom VM Image**

Creating a custom image from a VM captures the VM's configuration and installed software, which can then be reused to create new VM instances with the same setup. This is useful for keeping multiple instances consistent and for scaling.

1. **Create a custom image from the VM instance:**
   - Deactivate the virtual environment:

   ```sh
   deactivate
   ```

   - Stop the VM instance (`imdb-sentiment-analysis-vm-restored`).
   - Go to Compute Engine > Images.
   - Click "Create Image".
   - Configure the image:
     - **Name:** `imdb-sentiment-analysis-image`
     - **Source:** `Disk` > `imdb-sentiment-analysis-vm-restored` boot disk
   - Click "Create".

Any new instance created from this image has the same configuration, software, and environment as the original VM, which keeps deployments uniform.

### **Step 4: Configure Networking and Security**

A Virtual Private Cloud (VPC) and its subnets segment your network into smaller, manageable parts, which helps in organizing and securing network resources. Firewall rules control the traffic flowing to and from your instances, so that only authorized access is allowed.

1. **Set up VPC and subnets:**
   - Navigate to VPC Network > VPC Networks.
   - Click "Create VPC Network".
   - Configure the VPC:
     - **Name:** `imdb-sentiment-analysis-vpc`
     - **Subnets:** Custom
     - Click "Add subnet" to add a new subnet:
       - **Name:** `imdb-sentiment-analysis-vpc-subnet`
       - **Region:** `us-central1`
       - **IP address range:** `10.0.0.0/24`
       - **Purpose:** Private
     - Click "Done".
   - Click "Create".

2. **Create a proxy-only subnet:**
   - Navigate to VPC Network > VPC Networks.
   - Click on the `imdb-sentiment-analysis-vpc` network you created.
   - Click "Add subnet".
   - Configure the subnet:
     - **Name:** `imdb-sentiment-analysis-proxy-subnet`
     - **Region:** `us-central1`
     - **IP address range:** `10.0.1.0/24`
     - **Purpose:** Regional Managed Proxy
   - Click "Done".
   - Click "Save".

3. **Set up firewall rules:**
   - Navigate to VPC Network > Firewall Rules.
   - Click "Create Firewall Rule".
   - Configure the rule:
     - **Name:** `imdb-sentiment-analysis-vpc-allow-custom`
     - **Network:** `imdb-sentiment-analysis-vpc`
     - **Direction of traffic:** Ingress
     - **Action on match:** Allow
     - **Targets:** All instances in the network
     - **Source IP ranges:** `0.0.0.0/0`
     - **Protocols and ports:**
       - Specified protocols and ports:
         - `tcp:22` (SSH)
         - `tcp:80` (HTTP)
         - `tcp:8000` (for FastAPI)
   - Click "Create".

The VPC and subnets organize our network, and the firewall rule restricts inbound traffic to the three ports the lab needs (22, 80, and 8000). Because the source range is `0.0.0.0/0`, any address can reach those ports; outside a lab setting, narrow the source range to protect the application and its data from unauthorized access.
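
The effect of the subnet ranges and the firewall rule can be modelled in a few lines of standard-library Python. This is an illustrative sketch only: `IngressAllowRule` is a hypothetical stand-in, not a GCP API object, and it ignores rule priorities, targets, and protocols other than TCP. The addresses in the usage lines come from a range reserved for documentation.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressAllowRule:
    """Simplified model of a VPC ingress allow rule (hypothetical, not a GCP object)."""

    source_range: str
    tcp_ports: frozenset

    def allows(self, source_ip: str, port: int) -> bool:
        # A packet is admitted only if its source address falls inside the
        # rule's range AND its destination port is one of the listed ports.
        in_range = ipaddress.ip_address(source_ip) in ipaddress.ip_network(self.source_range)
        return in_range and port in self.tcp_ports


# Values taken from the firewall rule configured in this step.
lab_rule = IngressAllowRule("0.0.0.0/0", frozenset({22, 80, 8000}))

# The two subnet ranges from this step; ranges inside one VPC must not overlap.
vpc_subnet = ipaddress.ip_network("10.0.0.0/24")
proxy_subnet = ipaddress.ip_network("10.0.1.0/24")

if __name__ == "__main__":
    print("port 8000 allowed:", lab_rule.allows("203.0.113.7", 8000))
    print("port 5432 allowed:", lab_rule.allows("203.0.113.7", 5432))
    print("subnets overlap:", vpc_subnet.overlaps(proxy_subnet))
```

In the real network, traffic that matches no allow rule falls through to the implied deny-ingress rule, which is what the `False` result for an unlisted port stands for here.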

### **Step 5: Create an Instance Template and Managed Instance Group (MIG)**

An instance template allows you to define a configuration for VM instances that can be reused. A managed instance group (MIG) uses an instance template to create and manage a group of identical instances, providing scalability and high availability. This helps in managing large-scale deployments efficiently.

1. **Create a startup script:**
   Create a shell script that activates the virtual environment and runs the FastAPI server. Save this script as `startup-script.sh`:

   ```sh
   #!/bin/bash

   # Navigate to the project directory
   cd /home/imdb-sentiment-analysis/MLOps/Labs/GCP_Labs/Compute_Engine_Labs/Lab2

   # Activate the virtual environment
   . env/bin/activate

   # Start the FastAPI service
   nohup python3 imdb_sentiment_analysis_service.py > /tmp/startup.log 2>&1 &
   ```

2. **Create an instance template:**
   - Navigate to Compute Engine > Instance Templates.
   - Click "Create Instance Template".
   - Configure the template:
     - **Name:** `imdb-sentiment-analysis-template`
     - **Machine type:** `e2-micro` (1 vCPU, 1 GB memory)
     - **Boot disk:** Custom image `imdb-sentiment-analysis-image`
     - **Management, security, disks, networking, sole tenancy:**
       - Click "Networking".
       - In the "Networking" tab, select the VPC network `imdb-sentiment-analysis-vpc` and the subnet `imdb-sentiment-analysis-vpc-subnet` you created earlier.
       - In the "Management, security, disks, networking, sole tenancy" section, find the "Automation" tab.
       - In the "Startup script" section, paste the contents of `startup-script.sh`.
   - Click "Create".

   ![Create Instance Template](assets/step-5-create-instance-template.png)

3. **Create a managed instance group (MIG):**
   - Go to Compute Engine > Instance Groups.
   - Click "Create Instance Group".
   - Configure the group:
     - **Name:** `imdb-mig`
     - **Location:** Single zone
     - **Zone:** `us-central1-c` (any zone in `us-central1`, the region of the subnet, works)
     - **Instance template:** `imdb-sentiment-analysis-template`
     - **Autoscaling policy:**
       - Target CPU utilization: 60%
       - Minimum number of instances: 1
       - Maximum number of instances: 3
   - Click "Create".

   ![Create Managed Instance Group](assets/step-5-create-mig.png)

This step gives us a template for creating identical instances and a managed group that can scale with load. The startup script makes each instance start the FastAPI server automatically when it boots.

### **Step 6: Configure Load Balancer**

A load balancer distributes incoming traffic across multiple instances, ensuring high availability and reliability by sending requests only to healthy instances. This helps in managing traffic effectively, providing a better user experience.

1. **Set up a load balancer:**
   - Navigate to **Network Services** > **Load balancing**.
   - Click "Create Load Balancer".
   - Select `HTTP(S) Load Balancing`.
   - Click `Start configuration`.
   - Choose `Global`.

2. **Configure the backend service:**
   - Select `Backend services & backend buckets`.
   - Click `Create a backend service`.
   - Configure the backend service:
     - **Name:** `imdb-backend-service`
     - **Backend type:** `Instance group`
     - **Instance group:** `imdb-mig`
     - **Port numbers:** `8000`
   - Configure the health check:
     - **Protocol:** `HTTP`
     - **Port:** `8000`
     - **Request path:** `/health` (should respond with status: ok)
   - Click `Create`.

3. **Configure the frontend:**
   - Click `Frontends`.
   - Click `Create a frontend IP and port`.
   - Configure the frontend:
     - **Name:** `imdb-frontend`
     - **Protocol:** `HTTP`
     - **IP version:** `IPv4`
     - **Port:** `80`
   - Click `Done`.

4. **Finalize and create the load balancer:**
   - Review the configuration.
   - Click `Create`.

Using a load balancer ensures that our application can handle high traffic loads by distributing requests across multiple instances, providing a seamless user experience.
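
The idea of sending requests only to instances that pass the health check can be sketched as follows. This is a toy model under stated assumptions: it uses plain round-robin, whereas the real load balancer distributes traffic according to its configured balancing mode and backend capacity, and the instance names are hypothetical.

```python
from itertools import cycle
from typing import Dict, List


def healthy_backends(health: Dict[str, bool]) -> List[str]:
    """Return the names of instances whose last /health probe succeeded."""
    return sorted(name for name, ok in health.items() if ok)


def route_requests(health: Dict[str, bool], n_requests: int) -> List[str]:
    """Assign n_requests to healthy instances in round-robin order."""
    pool = healthy_backends(health)
    if not pool:
        raise RuntimeError("no healthy backends available")
    rotation = cycle(pool)
    return [next(rotation) for _ in range(n_requests)]


if __name__ == "__main__":
    # Hypothetical probe results: vm-b failed its health check and gets no traffic.
    probes = {"vm-a": True, "vm-b": False, "vm-c": True}
    print(route_requests(probes, 4))
```

The unhealthy instance is skipped entirely, which mirrors why the `/health` endpoint on port 8000 has to respond before an instance in `imdb-mig` receives any requests.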

### **Step 7: Auto-scaling and Load Testing**

Auto-scaling automatically adjusts the number of VM instances in a group based on load or other metrics, ensuring that your application can handle varying amounts of traffic. Load testing simulates user traffic to ensure the system can handle real-world usage.

1. **Verify the setup:**
   - The startup script in the MIG will automatically activate the virtual environment and start the FastAPI server.
   - You can SSH into a VM instance from the MIG to check the logs and verify that the server is running.
   - Check the log file `/tmp/startup.log` for any errors.

2. **Test the load balancer:**
   - Send HTTP requests to the load balancer's IP to the `/predict/` endpoint with sample reviews.

   ![Test Load Balancer](assets/step-7-test-load-balancer.png)

3. **Load Testing with Locust:**
   - Install locust load testing package:
     - `pip install locust`
   - Create a file named `load_test.py` with the following code:

   ```python
   from locust import HttpUser, TaskSet, task, between


   class UserBehavior(TaskSet):
       @task
       def predict(self):
           # Each simulated user repeatedly posts a sample review to the service
           self.client.post("/predict/", json={"review": "This movie was fantastic!"})


   class WebsiteUser(HttpUser):
       tasks = [UserBehavior]
       # Wait between 1 and 2 seconds after each task
       wait_time = between(1, 2)
   ```

   - Run the load test:

   ```sh
   locust -f load_test.py
   ```

   - Open the Locust web interface at `http://localhost:8089` (Locust listens on `0.0.0.0:8089` by default), enter the load balancer's address (for example `http://<load-balancer-ip>`) as the host, and start the test.

   ![Load Testing with Locust](assets/step-7-locust-load-test.png)

4. **Monitor Auto-scaling:**
   - With a simulated load of 200 users and a spawn rate of 200 users per second, observe the group scaling out to 3 instances.
   - Monitor the instances and performance through the GCP Console to see whether new instances are created and traffic is balanced.
   - After the load test, observe the scale-down behavior when the traffic decreases.

   ![Monitor Auto-scaling](assets/step-8-auto-scaling.png)

This step verifies that our application can handle varying loads by automatically scaling the number of instances and distributing the load across them.
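
The scaling behaviour observed in this step can be approximated with a small calculation. The function below is a deliberately simplified model, not the autoscaler's actual algorithm: it assumes the group is sized so that average CPU utilization returns to the target, and it ignores stabilization periods, initialization delays, and any other signals. The defaults mirror the policy from Step 5 (60% target, 1 to 3 instances); the function name is hypothetical.

```python
import math


def recommended_instances(
    current_instances: int,
    avg_cpu_utilization: float,
    target_utilization: float = 0.60,
    min_instances: int = 1,
    max_instances: int = 3,
) -> int:
    """Estimate the group size that brings average CPU back to the target."""
    if current_instances < 1:
        raise ValueError("current_instances must be at least 1")
    # Total CPU demand, expressed in "instances running at the target utilization".
    needed = math.ceil(current_instances * avg_cpu_utilization / target_utilization)
    # The policy's minimum and maximum always bound the result.
    return max(min_instances, min(max_instances, needed))


if __name__ == "__main__":
    print(recommended_instances(1, 0.95))  # one busy instance
    print(recommended_instances(2, 0.95))  # demand exceeds the maximum
    print(recommended_instances(3, 0.10))  # load test finished
```

Under this model a single instance at 95% CPU asks for a second one, sustained load pushes the group to its cap of 3, and a drop to 10% lets it shrink back to the minimum of 1.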

### **Sentiment Analysis Service Code Explanation**

The provided FastAPI code sets up a sentiment analysis service for IMDb reviews. Here's a breakdown of the key components:

```python
from fastapi import FastAPI
from pydantic import BaseModel
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
import uvicorn

app = FastAPI()

# Load IMDb dataset
data = pd.read_csv('IMDb_Reviews.csv')

# Preprocess data
X = data['review']
y = data['sentiment']

# Vectorize text data and train the model once, at startup
vectorizer = TfidfVectorizer(max_features=5000)
X_train = vectorizer.fit_transform(X)
model = LogisticRegression()
model.fit(X_train, y)


class Review(BaseModel):
    # Request body schema: {"review": "<text>"}
    review: str


@app.post("/predict/")
def predict_sentiment(review: Review):
    # Reuse the fitted vectorizer so the features match the training data
    X_new = vectorizer.transform([review.review])
    prediction = model.predict(X_new)
    return {"sentiment": prediction[0]}


@app.get("/health")
def health_check():
    # Used by the load balancer's health check
    return {"status": "ok"}


if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
```

- **FastAPI Setup:** The `FastAPI` instance (`app`) is created to define routes for the application.
- **Data Loading:** The IMDb dataset is loaded into a pandas DataFrame.
- **Data Preprocessing:** The reviews (`X`) and their corresponding sentiments (`y`) are extracted.
- **Text Vectorization:** The text reviews are vectorized using `TfidfVectorizer` to convert them into numerical data suitable for machine learning.
- **Model Training:** A logistic regression model is trained on the vectorized reviews.
- **Prediction Endpoint:** The `/predict/` endpoint takes a review as input, vectorizes it, and returns the predicted sentiment.
- **Health Check Endpoint:** The `/health` endpoint returns a simple status message indicating the service is running.

This setup allows users to send HTTP POST requests to the `/predict/` endpoint with a review and receive a sentiment prediction in response.
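
As an illustration of such a request, the sketch below builds and sends the POST using only the Python standard library. It is not part of the lab's repository: the helper names `build_predict_request` and `predict` are hypothetical, and the base URL must be replaced with the address of your own load balancer or instance.

```python
import json
import urllib.request


def build_predict_request(base_url: str, review: str) -> urllib.request.Request:
    """Build a POST request for /predict/ with the JSON body the service expects."""
    payload = json.dumps({"review": review}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict/",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url: str, review: str, timeout: float = 10.0) -> dict:
    """Send the review to the service and return the decoded JSON response."""
    request = build_predict_request(base_url, review)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example call (requires a running service; the address is a placeholder):
# predict("http://<load-balancer-ip>", "This movie was fantastic!")
```

The response is a JSON object with a single `sentiment` key, matching the dictionary returned by `predict_sentiment` in the service code.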