Commit f627041

Add lambda article
1 parent ba61b0f commit f627041

6 files changed

+378
-1
lines changed

Lines changed: 377 additions & 0 deletions
@@ -0,0 +1,377 @@
# Saving Money With AWS Lambdas

![AWS Billing Dashboard](/static/aws-billing.png)

## I host this website and my Jenkins setup on AWS, and have detailed in the past how I [automate the setup of this](/2022-01-15_terraform-and-ansible). However, running an EC2 instance 24/7 to handle occasional CI builds for personal projects didn't seem like the best idea, and I wanted to see if I could save money by thinking the process through a bit. This explains how I used AWS Lambdas to automate starting an EC2 instance through the AWS API, and how to set this up with GitHub and Jenkins to drop your single EC2 instance bill by around 90-95%.

I host this website and my Jenkins setup on AWS, and have detailed in the past how I [automate the setup of this](/2022-01-15_terraform-and-ansible). However, running an EC2 instance 24/7 to handle occasional CI builds for personal projects didn't seem like the best idea, and I wanted to see if I could save money by thinking the process through a bit. This explains how I used AWS Lambdas to automate starting an EC2 instance through the AWS API, and how to set this up with GitHub and Jenkins to drop your single EC2 instance bill by around 90-95%.

### The Plan

To make sure this covered everything I used the Jenkins instance for while still saving some money, it needed some relatively clear requirements:

1. It can be triggered by a user (me) or via a webhook on GitHub
2. It must shut itself down when not performing any activity, but should try to avoid shutting down mid-build (this would be fairly undesirable)
3. In the case of being started by some other means (e.g. manually through the console), it should still be able to shut itself down
4. It should be able to start itself periodically to run scheduled jobs once a day

There were going to be a few additions to my AWS account to make this happen. Ultimately, the plan looked like the following:

![Plan for AWS EC2 Instance Starting Automation, Including Lambdas](/static/aws-start-plan.png)

- **API Gateway** - Any requests would head through an API Gateway first, which would be served from a separate DNS name that could be used to handle API requests, and that I could use in the first instance to ensure the instance was online
- **On-demand Lambda** - Checks if the instance is online or scheduled to start. If it is offline, it starts it and returns a holding page that refreshes automatically. If the instance is already online, it updates the scheduled stop time to keep it online for another 30 mins and transparently serves the request
- **CloudWatch Event** - Triggers the Cron Lambda once per day at 5am
- **Cron Lambda** - Works the same as the On-demand Lambda but only triggers the start, does not handle API requests
- **Start/stop Storage** - Records when the instance was scheduled to start and stop, and when it has actually started and stopped
- **Jenkins Instance Crons** - When the instance first boots, it makes sure there is a scheduled stop time in the database and records a default one for 30 mins' time if not. It then runs every 5 mins to check if the scheduled stop time has passed, and shuts the instance down if it has

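The Cron Lambda is only described above, so here is a minimal sketch of what it could look like. The `schedule_start` helper, the 30 minute window for scheduled runs and the reuse of the table and instance names from the On-demand Lambda are all assumptions for illustration, not the code I actually deployed.

```python
import datetime

TIME_FORMAT = '%Y-%m-%d %H:%M:%S'  # same format the On-demand Lambda writes


def schedule_start(instance, table, instance_name, now, minutes=30):
    """Record start/stop times and start the instance if it is stopped.

    instance and table are expected to behave like boto3's ec2.Instance and
    dynamodb.Table resources; they are passed in so the logic can be tested
    without AWS. This helper is hypothetical.
    """
    if instance.state['Code'] != 80:  # 80 = stopped
        return False
    stop = now + datetime.timedelta(minutes=minutes)
    table.put_item(Item={
        'InstanceName': instance_name,
        'StartRequestedAt': now.strftime(TIME_FORMAT),
        'StartCompletedAt': '',
        'StopRequestedAt': stop.strftime(TIME_FORMAT),
        'StopCompletedAt': '',
    })
    instance.start()
    return True


def lambda_handler(event, context):
    """Entry point for the daily scheduled event (assumed wiring)."""
    import boto3  # imported here so schedule_start can be used without the SDK

    ec2 = boto3.resource('ec2', region_name='eu-west-1')
    table = boto3.resource('dynamodb').Table('myjenkins-ec2-status')
    instances = ec2.instances.filter(
        Filters=[{'Name': 'tag:Name', 'Values': ['myjenkins']}]
    )
    now = datetime.datetime.now()
    started = [schedule_start(i, table, 'myjenkins', now) for i in instances]
    return {'started': any(started)}
```

Because there is no HTTP request to answer, the handler only needs to report whether a start was triggered.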
The process for starting will be as follows:

1. A request comes from either me or a GitHub Webhook
2. The On-demand Lambda records that it's starting the instance (and the default scheduled stop time), triggers a start through the AWS API and returns a holding page
3. The Lambda is re-requested and checks the instance state, waiting until it is online to pass the request through
4. When ready, the Lambda passes through the request and either the user or GitHub receives their response
5. GitHub won't wait and resend the request, so to get around this problem, all the GitHub multi-branch pipelines are set to refresh when the Jenkins instance comes online - this means it picks up any new branches or PRs and the builds are triggered immediately
6. In the background, the Jenkins instance checks every 5 mins to see if it is time to shut down yet
7. If any further GitHub webhooks or user requests are made, the On-demand Lambda updates the scheduled stop time so the instance will stay online to complete the new build
8. Once the scheduled stop time has elapsed, the instance shuts itself down

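Steps 6 and 8 run on the instance itself. A minimal sketch of that check is below, assuming the instance clock uses the same timezone as the Lambda (UTC) and that an OS-level shutdown stops the instance. The helper names and the `busy_executors` parameter are hypothetical, and this is not the script that runs on my instance.

```python
import datetime

TIME_FORMAT = '%Y-%m-%d %H:%M:%S'  # matches the format written by the Lambda


def should_shut_down(item, now, busy_executors=0):
    """Decide whether the instance should stop now.

    item: the DynamoDB item as a dict, or None if no entry exists yet.
    busy_executors: hypothetical count of Jenkins builds currently running.
    """
    if busy_executors > 0:
        return False  # avoid stopping mid-build
    if not item or not item.get('StopRequestedAt'):
        return False  # nothing scheduled yet; the boot step records a default
    stop_at = datetime.datetime.strptime(item['StopRequestedAt'], TIME_FORMAT)
    return now >= stop_at


def default_stop_time(now, minutes=30):
    """Default scheduled stop time recorded when none exists."""
    return (now + datetime.timedelta(minutes=minutes)).strftime(TIME_FORMAT)


def run_check(table_name='myjenkins-ec2-status', instance_name='myjenkins'):
    """Cron entry point (assumes boto3 and instance credentials are available)."""
    import subprocess
    import boto3  # imported here so the pure helpers above need no AWS SDK

    table = boto3.resource('dynamodb', region_name='eu-west-1').Table(table_name)
    item = table.get_item(Key={'InstanceName': instance_name}).get('Item')
    now = datetime.datetime.now()
    if item is None or not item.get('StopRequestedAt'):
        # No stop time recorded (e.g. started manually): record a default one
        table.update_item(
            Key={'InstanceName': instance_name},
            UpdateExpression='SET StopRequestedAt=:stop',
            ExpressionAttributeValues={':stop': default_stop_time(now)},
        )
    elif should_shut_down(item, now):
        # Assumption: an OS shutdown stops (rather than terminates) the instance
        subprocess.run(['sudo', 'shutdown', '-h', 'now'], check=True)
```

Running `run_check()` from a cron entry every 5 mins would cover both the default stop time and the shutdown itself.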
Now that the plan is in place, let's walk through the On-demand Lambda code and how it runs.

### Lambda Code

The Lambda code itself is a Python file with deliberately few dependencies. It depends on the data provided by the API Gateway during the initial request. A point of reference is the [Amazon documentation for Lambdas and API Gateways](https://docs.aws.amazon.com/lambda/latest/dg/services-apigateway.html#apigateway-proxy), as this shows how the event is presented to the Lambda function.

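To make the event shape concrete, here is a small sketch of the fields the handler reads from the proxy event. The sample values and the `describe_request` helper are made up for illustration; real events carry many more fields.

```python
import urllib.parse

# Hypothetical sample of an API Gateway proxy event, trimmed to the
# fields the handler reads. The values are made up for illustration.
SAMPLE_EVENT = {
    'httpMethod': 'GET',
    'path': '/job/example/',
    'queryStringParameters': {'delay': '0sec'},
    'headers': {'Host': 'some-example-gw.myjenkins.com'},
    'body': None,
}


def describe_request(event):
    """Rebuild the requested URL from the event (hypothetical helper)."""
    params = event.get('queryStringParameters')
    query_string = '?' + urllib.parse.urlencode(params) if params else ''
    return (
        f"{event['httpMethod']} "
        f"https://{event['headers']['Host']}{event['path']}{query_string}"
    )


print(describe_request(SAMPLE_EVENT))
```

Passing a dictionary like this to `lambda_handler` is also a convenient way to exercise the function locally before deploying it.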
```python
import boto3
import datetime
import time
import urllib.parse
import urllib3

HOST_OUTPUT = 'some-example.myjenkins.com'
EC2_INSTANCE_NAME = 'myjenkins'
DYNAMO_TABLE = 'myjenkins-ec2-status'

def lambda_handler(event, context):

    # Collect query string and event parameters
    queryString = '' if not event['queryStringParameters'] else '?' + urllib.parse.urlencode(event['queryStringParameters'])
    print(f"Received request: {event['httpMethod']} https://{event['headers']['Host']}{event['path']}{queryString}")
    instance_id = None
    instance_state = None

    # Check if instance is started and running
    print('Loading boto client libraries')
    ec2 = boto3.resource('ec2', region_name='eu-west-1')
    ec2_client = boto3.client('ec2', region_name='eu-west-1')
    dynamodb = boto3.resource('dynamodb')

    # Get current status from DynamoDB
    print('Loading DynamoDB table and entry')
    table = dynamodb.Table(DYNAMO_TABLE)
    existing = table.get_item(
        Key={
            'InstanceName': EC2_INSTANCE_NAME
        }
    )

    print('Loading instance details')
    instances = ec2.instances.filter(
        Filters=[{'Name': 'tag:Name', 'Values': [EC2_INSTANCE_NAME]}]
    )

    # Only one instance will be returned, so this loop will only occur once but is required for the iterable instances object
    for instance in instances:
        instance_id = instance.id
        instance_state = instance.state['Code']

    # If instance is not started, start it (state codes: 0 = pending, 16 = running, anything higher is stopping or stopped)
    print(f'Instance ID: {instance_id}, state: {instance_state}')
    if instance_state > 16:
        print('Instance needs to be started')
        start = datetime.datetime.now()
        stop = start + datetime.timedelta(minutes=30)

        # Update or create the DynamoDB table entry
        if existing is not None and 'Item' in existing:
            print('Updating DynamoDB entry')
            table.update_item(
                Key={
                    'InstanceName': EC2_INSTANCE_NAME
                },
                UpdateExpression='SET StartRequestedAt=:start, StartCompletedAt=:empty, StopRequestedAt=:stop, StopCompletedAt=:empty',
                ExpressionAttributeValues={
                    ':start': start.strftime('%Y-%m-%d %H:%M:%S'),
                    ':stop': stop.strftime('%Y-%m-%d %H:%M:%S'),
                    ':empty': ''
                }
            )
        else:
            print('Creating DynamoDB entry')
            # DynamoDB cannot store datetime objects, so the times are written as strings
            table.put_item(
                Item={
                    'InstanceName': EC2_INSTANCE_NAME,
                    'StartRequestedAt': start.strftime('%Y-%m-%d %H:%M:%S'),
                    'StartCompletedAt': '',
                    'StopRequestedAt': stop.strftime('%Y-%m-%d %H:%M:%S'),
                    'StopCompletedAt': ''
                }
            )
        print('Starting instance')
        instance.start()
    else:

        # Instance is already online, but this is a new request so the stop time must be updated
        print('Updating stop time based on new request')
        print('Updating DynamoDB entry')
        start = datetime.datetime.now()
        stop = start + datetime.timedelta(minutes=30)
        table.update_item(
            Key={
                'InstanceName': EC2_INSTANCE_NAME
            },
            UpdateExpression='SET StopRequestedAt=:stop, StopCompletedAt=:empty',
            ExpressionAttributeValues={
                ':stop': stop.strftime('%Y-%m-%d %H:%M:%S'),
                ':empty': ''
            }
        )

    # Describe the instance to check its deeper status
    response = None if instance_state != 16 else ec2_client.describe_instance_status(InstanceIds=[instance_id])
    if response is None or len(response['InstanceStatuses']) < 1 or response['InstanceStatuses'][0]['InstanceStatus']['Status'] != 'ok':
        print('Instance is not ready, received this state data:')
        print(response)

        # Depending on request type, either return a refresh or simply wait for the instance to be available
        if event['httpMethod'] == 'GET':
            print('Returning GET HTML response')
            return {
                'statusCode': 200,
                'headers': {
                    'Content-Type': 'text/html'
                },
                'body': '<html><head><meta http-equiv="refresh" content="30"><title>Instance Start/Stop</title></head><body><h1>Instance is starting, please wait...</h1><h2>This page will refresh every 30 seconds...</h2></body></html>'
            }
        else:
            # Wait for instance to become available
            print('Waiting for instance to be OK')
            while response is None or len(response['InstanceStatuses']) < 1 or response['InstanceStatuses'][0]['InstanceStatus']['Status'] != 'ok':
                time.sleep(5)
                response = ec2_client.describe_instance_status(InstanceIds=[instance_id])

    print('Instance is running, returning response')

    # Forward the request either as a GET, or proxy the POST response back (for GitHub Webhooks)
    if event['httpMethod'] == 'GET':
        print(f'Returning redirect response to {HOST_OUTPUT}')
        return {
            'statusCode': 302,
            'headers': {
                'Location': 'https://' + HOST_OUTPUT + event['path'] + queryString
            }
        }

    print(f'POSTing to end state and returning response to {HOST_OUTPUT}')
    http = urllib3.PoolManager()
    response = http.request('POST', 'https://' + HOST_OUTPUT + event['path'] + queryString, headers=event['headers'], body=event['body'])
    return {
        'statusCode': response.status,
        'headers': {
            'Content-Type': 'application/json'
        },
        # The Lambda response must be JSON serialisable, so the bytes are decoded to a string
        'body': response.data.decode('utf-8')
    }
```

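The `instance_state > 16` check in the handler treats every state above "running" the same way. As a sketch of how the individual EC2 state codes could be told apart, the hypothetical `next_action` helper below maps each code to a decision. The handling of the "stopping" state assumes a start call is rejected until the stop has completed, which is worth verifying against the EC2 documentation before relying on it.

```python
# EC2 instance state codes as returned in instance.state['Code']
PENDING, RUNNING, SHUTTING_DOWN, TERMINATED, STOPPING, STOPPED = 0, 16, 32, 48, 64, 80


def next_action(state_code):
    """Map a state code to what the handler could do (hypothetical helper)."""
    code = state_code & 0xFF  # only the low byte of the code is meaningful
    if code == RUNNING:
        return 'check-status'     # running: confirm the status checks report ok
    if code == PENDING:
        return 'wait'             # already starting: keep the visitor waiting
    if code == STOPPED:
        return 'start'            # safe to call start()
    if code == STOPPING:
        return 'wait-then-start'  # assumed: start() fails until the stop completes
    return 'unavailable'          # shutting-down or terminated cannot be restarted


for code in (PENDING, RUNNING, STOPPING, STOPPED, TERMINATED):
    print(code, next_action(code))
```

For a single long-lived Jenkins instance the simpler comparison is usually enough, but a request that arrives while the instance is still stopping is the case most likely to hit an error.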
### Deploying with Terraform

Once your code is ready, you can deploy it using a Terraform setup similar to the following (this isn't a complete TF file). This presumes you already have a Route53 entry available for your API Gateway and the EC2 instance itself. The EC2 instance will also need permissions to read/write to the DynamoDB table and to be able to shut itself down.

```hcl
resource "aws_dynamodb_table" "start_stop" {
  name         = "myjenkins-ec2-status"
  hash_key     = "InstanceName"
  billing_mode = "PAY_PER_REQUEST"

  attribute {
    name = "InstanceName"
    type = "S"
  }
}

resource "aws_iam_role" "lambda_start_stop" {
  name = "iam-role-lambda-start-stop"

  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Effect": "Allow"
    }
  ]
}
POLICY
}

resource "aws_iam_policy" "lambda_start_stop" {
  name   = "iam-policy-lambda-start-stop"
  policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:StartInstances",
        "ec2:StopInstances",
        "dynamodb:PutItem",
        "dynamodb:DescribeTable",
        "dynamodb:GetItem",
        "dynamodb:UpdateItem"
      ],
      "Resource": [
        "arn:aws:ec2:eu-west-1:890879110541:instance/*",
        "${aws_dynamodb_table.start_stop.arn}"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances",
        "ec2:DescribeInstanceStatus"
      ],
      "Resource": "*"
    }
  ]
}
POLICY
}

resource "aws_iam_role_policy_attachment" "lambda_start_stop" {
  role       = aws_iam_role.lambda_start_stop.name
  policy_arn = aws_iam_policy.lambda_start_stop.arn
}

data "archive_file" "lambda_ci_start_on_request" {
  type        = "zip"
  source_file = "lambda/ci-start-on-request/lambda_function.py"
  output_path = "lambda_ci_start_on_request.zip"
}

resource "aws_lambda_function" "lambda_ci_start_on_request" {
  filename      = "lambda_ci_start_on_request.zip"
  function_name = "instance-ci-start-on-request-handler"
  role          = aws_iam_role.lambda_start_stop.arn
  handler       = "lambda_function.lambda_handler"
  runtime       = "python3.9"
  timeout       = 300

  source_code_hash = data.archive_file.lambda_ci_start_on_request.output_base64sha256
}

resource "aws_lambda_permission" "lambda_apigw_permission" {
  statement_id  = "AllowExecutionFromApiGW"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.lambda_ci_start_on_request.function_name
  principal     = "apigateway.amazonaws.com"
}

resource "aws_acm_certificate" "ci_gateway" {
  domain_name       = "some-example-gw.myjenkins.com"
  validation_method = "DNS"

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_apigatewayv2_domain_name" "ci_gateway" {
  domain_name = "some-example-gw.myjenkins.com"

  domain_name_configuration {
    certificate_arn = aws_acm_certificate.ci_gateway.arn
    endpoint_type   = "REGIONAL"
    security_policy = "TLS_1_2"
  }
}

resource "aws_apigatewayv2_api" "ci_gateway" {
  name          = "ci-gateway"
  protocol_type = "HTTP"
}

resource "aws_apigatewayv2_integration" "ci_gateway" {
  api_id           = aws_apigatewayv2_api.ci_gateway.id
  integration_type = "AWS_PROXY"

  integration_method = "POST"
  integration_uri    = aws_lambda_function.lambda_ci_start_on_request.invoke_arn
}

resource "aws_apigatewayv2_route" "ci_gateway" {
  api_id    = aws_apigatewayv2_api.ci_gateway.id
  route_key = "$default"
  target    = "integrations/${aws_apigatewayv2_integration.ci_gateway.id}"
}

resource "aws_apigatewayv2_deployment" "ci_gateway" {
  api_id      = aws_apigatewayv2_api.ci_gateway.id
  description = "$default"

  lifecycle {
    create_before_destroy = true
  }

  triggers = {
    redeployment = sha1(join(",", tolist([
      jsonencode(aws_apigatewayv2_integration.ci_gateway),
      jsonencode(aws_apigatewayv2_route.ci_gateway),
    ])))
  }

  depends_on = [aws_apigatewayv2_route.ci_gateway]
}

resource "aws_apigatewayv2_stage" "ci_gateway" {
  api_id      = aws_apigatewayv2_api.ci_gateway.id
  name        = "$default"
  auto_deploy = true
}

resource "aws_apigatewayv2_api_mapping" "example" {
  api_id      = aws_apigatewayv2_api.ci_gateway.id
  domain_name = aws_apigatewayv2_domain_name.ci_gateway.id
  stage       = aws_apigatewayv2_stage.ci_gateway.id
}
```

### Configuring Jenkins and GitHub

The final pieces are to ensure that Jenkins and GitHub are able to work effectively with the new setup.

To configure Jenkins, it's best to ensure any multi-branch pipelines connected to GitHub are using a frequent scan of the remote repository, so it can easily detect new branches and PRs when the Jenkins instance comes online:

![Jenkins Configuration for Multi-branch Pipeline Showing Scan Triggers](/static/jenkins-scan-triggers.png)

For GitHub, ensure your webhook for the repository is using the new DNS name that connects to the API Gateway:

![GitHub Webhook Configuration](/static/github-webhook.png)

And that's it! This should be enough to give you an idea of how this could be adapted to your setup.

### Potential Improvements

While the solution I have has been saving me a fair bit of money on my Jenkins setup so far, there are some improvements I'd like to add when I have a bit more free time:

- **Single DNS**: using a single DNS name by efficiently passing requests through the Lambda when the instance is online would be a better user experience overall
- **Faster Startups**: the current startup detection is not fast, as it waits for the EC2 instance status to be reported through the AWS API; checking the API directly might be better
- **Faster Shutdowns**: tying the shutdowns more directly to the Jenkins builds would allow the system to shut down more promptly when finished to save even more pennies

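For the faster startup idea, one reading is to probe Jenkins over HTTP rather than waiting for the EC2 status checks. The sketch below is an untested illustration of that approach. The `jenkins_is_ready` helper and the choice of URL are assumptions, and any 5xx response is treated as "not ready yet".

```python
import urllib.error
import urllib.request


def jenkins_is_ready(url, timeout=5, opener=urllib.request.urlopen):
    """Return True once the given URL answers with a non-5xx HTTP status.

    url is an assumption, e.g. 'https://some-example.myjenkins.com/login'.
    opener is injectable so the check can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status < 500
    except urllib.error.HTTPError as error:
        # A 4xx (e.g. 403 for anonymous users) still means Jenkins is answering
        return error.code < 500
    except (urllib.error.URLError, OSError):
        # Connection refused or timed out: the instance is still booting
        return False
```

The Lambda could call this in place of `describe_instance_status` once the instance reports the running state, although the Lambda would need network access to the instance for it to work.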
If you have any comments or want to know more about any of this, please feel free to ping me on [Mastodon](https://howdee.social/@jamiefdhurst), which tends to be the best way to get in contact with me these days. I don't have any plans to add comments onto the blog anytime soon.

blog/static/aws-billing.png

42.9 KB

blog/static/aws-start-plan.png

76.5 KB

blog/static/github-webhook.png

173 KB
134 KB

blog/templates/index.html

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ <h1><a href="/{{ article.get_name() }}">{{ article.get_title() }}</a></h1>
 <div class="summary">
 {% if article.get_image() %}<p>{{ article.get_image()|safe }}</p>{% endif %}
 {% if article.get_summary() %}
-<p>{{ article.get_summary() }}</p><p><a href="/{{ article.get_name() }}">Read More</a></p>
+<p>{{ article.get_summary()|safe }}</p><p><a href="/{{ article.get_name() }}">Read More</a></p>
 {% else %}
 <p><a href="/{{ article.get_name() }}">Read</a></p>
 {% endif %}
