
Commit feb7421

Merge pull request #20 from aelzeiny/ami-helper
Ami helper
2 parents 54c74ce + 2332b4a commit feb7421

3 files changed (+74 −22 lines)


.github/workflows/main.yml (+1 −1)
@@ -1,7 +1,7 @@
 # This workflow will install Python dependencies, run tests and lint with a variety of Python versions
 # For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions
 
-name: AWS Airflow Executor Tester
+name: AWS Airflow Executors
 
 on:
   push:

batch_ami_helper.py (+42)
@@ -0,0 +1,42 @@
+"""
+It turns out that the #1 source of grief is setting up AWS correctly.
+This file will help folks debug their AWS Permissions
+"""
+import logging
+import time
+import unittest
+from airflow_aws_executors.batch_executor import AwsBatchExecutor
+from airflow.utils.state import State
+
+
+class BatchAMIHelper(unittest.TestCase):
+    def setUp(self):
+        self._log = logging.getLogger("FargateAMIHelper")
+        self.executor = AwsBatchExecutor()
+        self.executor.start()
+
+    def test_boto_submit_job(self):
+        self.executor._submit_job(None, ['airflow', 'version'], None)
+
+    def test_boto_describe_job(self):
+        job_id = self.executor._submit_job(None, ['airflow', 'version'], None)
+        self.executor._describe_tasks([job_id])
+
+    def test_boto_terminate_job(self):
+        job_id = self.executor._submit_job(None, ['airflow', 'version'], None)
+        self.executor.batch.terminate_job(
+            jobId=job_id,
+            reason='Testing AMI permissions'
+        )
+
+    def test_sample_airflow_task(self):
+        job_id = self.executor._submit_job(None, ['airflow', 'version'], None)
+        job = None
+        while job is None or job.get_job_state() == State.QUEUED:
+            responses = self.executor._describe_tasks([job_id])
+            assert responses, 'No response received'
+            job = responses[0]
+            time.sleep(1)
+
+        self.assertEqual(job.get_job_state(), State.SUCCESS, 'AWS Batch Job did not run successfully!')
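For readers who want to sanity-check the same permissions without going through the executor's internal methods, here is a rough equivalent using boto3 directly. This sketch is not part of the commit: the job queue and job definition names are placeholders, and it assumes AWS credentials are already configured on the scheduler machine.

```python
# A minimal sketch of the AWS Batch calls the helper above exercises, via boto3 directly.
# 'my-airflow-job-queue' and 'my-airflow-job-def' are placeholder names for illustration.
import time

import boto3

batch = boto3.client('batch')

# batch:SubmitJob -- mirrors test_boto_submit_job
response = batch.submit_job(
    jobName='ami-helper-smoke-test',
    jobQueue='my-airflow-job-queue',        # placeholder
    jobDefinition='my-airflow-job-def',     # placeholder
    containerOverrides={'command': ['airflow', 'version']},
)
job_id = response['jobId']

# batch:DescribeJobs -- mirrors test_boto_describe_job / test_sample_airflow_task
while True:
    job = batch.describe_jobs(jobs=[job_id])['jobs'][0]
    if job['status'] in ('SUCCEEDED', 'FAILED'):
        break
    time.sleep(5)

# batch:TerminateJob -- mirrors test_boto_terminate_job (harmless on a finished job)
batch.terminate_job(jobId=job_id, reason='Testing IAM permissions')

print('Final job status:', job['status'])
```

Each call maps onto one of the batch:SubmitJob, batch:DescribeJobs, and batch:TerminateJob actions in the policy shown in the getting-started guide, so a permission error here typically points at the scheduler's IAM policy rather than at the executor itself.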

getting_started_batch.md (+31 −21)
@@ -48,38 +48,48 @@ Build this image and upload it to a private repository. You may want to use Dock
 the container. *BE SURE YOU HAVE THIS*! It's critically important that commands like
 `["airflow", "run", <dag_id>, <task_id>, <execution_date>]` are accepted by your container's entrypoint script.
 
-4. Run through the AWS Batch Creation Wizard on AWS Console. The executor does not have any
+4. Run through the AWS Batch Creation Wizard on AWS Console. The executor does not have any
 prerequisites to how you create your Job Queue or Compute Environment. Go nuts; have at it. I'll refer you to the
 [AWS Docs' Getting Started with Batch](https://docs.aws.amazon.com/batch/latest/userguide/Batch_GetStarted.html).
 You will need to assign the right IAM roles for the remote S3 logging.
 Also, your dynamically provisioned EC2 instances do not need to be connected to the public internet;
 private subnets in private VPCs are encouraged. However, be sure that all instances have access to your Airflow MetaDB.
 
-5. When creating a Job Definition, choose the 'container' type and point to the private repo. The 'commands' array is
+5. When creating a Job Definition, choose the 'container' type and point to the private repo. The 'commands' array is
 optional on the task-definition level. At runtime, Airflow commands will be injected here by the AwsBatchExecutor!
 
-6. Let's go back to that machine in step #1 that's running the Scheduler. We'll use the same docker container as
+6. Let's go back to that machine in step #1 that's running the Scheduler. We'll use the same docker container as
 before, except we'll do something like `docker run ... airflow webserver` and `docker run ... airflow scheduler`.
 Here are the minimum IAM permissions that the executor needs to launch tasks; feel free to tighten the resources around the
 job-queues and compute environments that you'll use.
-```json
-{
-    "Version": "2012-10-17",
-    "Statement": [
-        {
-            "Sid": "AirflowBatchRole",
-            "Effect": "Allow",
-            "Action": [
-                "batch:SubmitJob",
-                "batch:DescribeJobs",
-                "batch:TerminateJob"
-            ],
-            "Resource": "*"
-        }
-    ]
-}
-```
-7. You're done. Configure & launch your scheduler. However, maybe you did something real funky with your AWS Batch compute
+```json
+{
+    "Version": "2012-10-17",
+    "Statement": [
+        {
+            "Sid": "AirflowBatchRole",
+            "Effect": "Allow",
+            "Action": [
+                "batch:SubmitJob",
+                "batch:DescribeJobs",
+                "batch:TerminateJob"
+            ],
+            "Resource": "*"
+        }
+    ]
+}
+```
+7. Let's test your IAM configuration before we launch the webserver or scheduler.
+[Copy & run this python file somewhere on the machine that has your scheduler](batch_ami_helper.py).
+In short, this file will make sure that your scheduler and your cluster have the correct IAM permissions. Your scheduler
+should be able to launch, describe, and terminate Batch jobs, and connect to your Airflow MetaDB of choice.
+Meanwhile, your cluster should be able to pull the docker container and also
+connect to your Airflow MetaDB of choice.
+```bash
+python3 -m unittest batch_ami_helper.py
+```
+
+8. You're done. Configure & launch your scheduler. However, maybe you did something real funky with your AWS Batch compute
 environment. The good news is that you have full control over how the executor submits jobs.
 See the [#Extensibility](./readme.md) section in the readme. Thank you for taking the time to set this up!
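Step 5 above (creating the 'container'-type Job Definition) can also be scripted instead of clicked through the console. Below is a minimal sketch using boto3's register_job_definition, not part of this commit: the job definition name, image URI, and resource sizes are placeholders, and the command list is deliberately omitted because the executor supplies it at runtime.

```python
# A rough sketch of step 5: registering a 'container'-type Job Definition with boto3.
# The name, image URI, vCPU, and memory values below are placeholders for illustration.
import boto3

batch = boto3.client('batch')

batch.register_job_definition(
    jobDefinitionName='airflow-batch-job-def',  # placeholder name
    type='container',
    containerProperties={
        # Point at the private repository that holds your Airflow image.
        'image': '123456789012.dkr.ecr.us-east-1.amazonaws.com/my-airflow:latest',
        'vcpus': 1,
        'memory': 1024,
        # No 'command' here: the AwsBatchExecutor passes the Airflow command
        # (e.g. ['airflow', 'run', <dag_id>, <task_id>, <execution_date>])
        # via containerOverrides when it submits each job.
    },
)
```

Leaving the command out keeps the job definition generic; each individual task's command is injected per submission through containerOverrides, which is exactly what the batch_ami_helper tests exercise.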
