
Commit 5f94669

Merge pull request #25 from alan-eu/fernst/update_airflow_210
Update local MWAA to 2.10
2 parents: 2f14747 + d5df31d

14 files changed: 677 additions & 624 deletions

README.md

Lines changed: 13 additions & 7 deletions
@@ -1,9 +1,15 @@
+## Note
+Starting from Airflow version 2.9, MWAA has open-sourced the original Docker image used in our production deployments. You can refer to our open-source image repository at https://github.com/aws/amazon-mwaa-docker-images to create a local environment identical to that of MWAA.
+You can also continue to use the MWAA Local Runner for testing and packaging requirements for all Airflow versions supported on MWAA.
+
 # About aws-mwaa-local-runner
 
 This repository provides a command line interface (CLI) utility that replicates an Amazon Managed Workflows for Apache Airflow (MWAA) environment locally.
 
-*Please note: MWAA/AWS/DAG/Plugin issues should be raised through AWS Support or the Airflow Slack #airflow-aws channel. Issues here should be focused on this local-runner repository.*
+_Please note: MWAA/AWS/DAG/Plugin issues should be raised through AWS Support or the Airflow Slack #airflow-aws channel. Issues here should be focused on this local-runner repository._
 
+_Please note: The dynamic configurations which are dependent on the class of an environment are
+aligned with the Large environment class in this repository._
 
 ## About the CLI
 
@@ -14,7 +20,7 @@ The CLI builds a Docker container image locally that’s similar to a MWAA produ
 ```text
 dags/
   example_lambda.py
-  example_dag_with_taskflow_api.py
+  example_dag_with_taskflow_api.py
   example_redshift_data_execute_sql.py
 docker/
   config/
@@ -34,7 +40,7 @@ docker/
   Dockerfile
 plugins/
 README.md
-requirements/
+requirements/
   requirements.txt
 .gitignore
 CODE_OF_CONDUCT.md
@@ -102,7 +108,7 @@ The following section describes where to add your DAG code and supporting files.
 
 #### Requirements.txt
 
-1. Add Python dependencies to `requirements/requirements.txt`.
+1. Add Python dependencies to `requirements/requirements.txt`.
 2. To test a requirements.txt without running Apache Airflow, use the following script:
 
 ```bash
@@ -117,7 +123,7 @@ Collecting aws-batch (from -r /usr/local/airflow/dags/requirements.txt (line 1))
 Downloading https://files.pythonhosted.org/packages/5d/11/3aedc6e150d2df6f3d422d7107ac9eba5b50261cf57ab813bb00d8299a34/aws_batch-0.6.tar.gz
 Collecting awscli (from aws-batch->-r /usr/local/airflow/dags/requirements.txt (line 1))
 Downloading https://files.pythonhosted.org/packages/07/4a/d054884c2ef4eb3c237e1f4007d3ece5c46e286e4258288f0116724af009/awscli-1.19.21-py2.py3-none-any.whl (3.6MB)
-    100% |████████████████████████████████| 3.6MB 365kB/s
+    100% |████████████████████████████████| 3.6MB 365kB/s
 ...
 ...
 ...
@@ -136,7 +142,7 @@ For example usage see [Installing Python dependencies using PyPi.org Requirement
 
 #### Custom plugins
 
-- There is a directory at the root of this repository called plugins.
+- There is a directory at the root of this repository called plugins.
 - In this directory, create a file for your new custom plugin.
 - Add any Python dependencies to `requirements/requirements.txt`.
 
@@ -165,7 +171,7 @@ The following section contains common questions and answers you may encounter wh
 ### Can I test execution role permissions using this repository?
 
 - You can setup the local Airflow's boto with the intended execution role to test your DAGs with AWS operators before uploading to your Amazon S3 bucket. To setup aws connection for Airflow locally see [Airflow | AWS Connection](https://airflow.apache.org/docs/apache-airflow-providers-amazon/stable/connections/aws.html)
-To learn more, see [Amazon MWAA Execution Role](https://docs.aws.amazon.com/mwaa/latest/userguide/mwaa-create-role.html).
+To learn more, see [Amazon MWAA Execution Role](https://docs.aws.amazon.com/mwaa/latest/userguide/mwaa-create-role.html).
 - You can set AWS credentials via environment variables set in the `docker/config/.env.localrunner` env file. To learn more about AWS environment variables, see [Environment variables to configure the AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html) and [Using temporary security credentials with the AWS CLI](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_use-resources.html#using-temp-creds-sdk-cli). Simply set the relevant environment variables in `.env.localrunner` and `./mwaa-local-env start`.
 
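The environment-variable route above can be sketched concretely. The variable names are the standard AWS CLI ones from the linked docs; the values are placeholders, and in practice these lines belong in `docker/config/.env.localrunner` (one `KEY=value` per line) rather than in Python code:

```python
import os

# Standard AWS CLI credential variables (placeholder values, not real keys).
# In the local runner these would be set as KEY=value lines in
# docker/config/.env.localrunner before running ./mwaa-local-env start.
os.environ["AWS_ACCESS_KEY_ID"] = "AKIA...EXAMPLE"
os.environ["AWS_SECRET_ACCESS_KEY"] = "example-secret-key"
os.environ["AWS_SESSION_TOKEN"] = "example-session-token"  # only needed for temporary credentials
os.environ["AWS_DEFAULT_REGION"] = "eu-west-1"
```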
 ### How do I add libraries to requirements.txt and test install?

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-2.8.1
+2.10.1

docker/Dockerfile

Lines changed: 3 additions & 3 deletions
@@ -8,9 +8,9 @@ LABEL maintainer="amazon"
 
 # Airflow
 ## Version specific ARGs
-ARG AIRFLOW_VERSION=2.8.1
-ARG WATCHTOWER_VERSION=3.0.1
-ARG PROVIDER_AMAZON_VERSION=8.16.0
+ARG AIRFLOW_VERSION=2.10.1
+ARG WATCHTOWER_VERSION=3.3.1
+ARG PROVIDER_AMAZON_VERSION=8.28.0
 
 ## General ARGs
 ARG AIRFLOW_USER_HOME=/usr/local/airflow
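For context on the version ARGs above: Airflow publishes a constraints file per (Airflow version, Python version) pair, which is the usual way to pin a compatible dependency set when bumping `AIRFLOW_VERSION`. A minimal sketch of the documented URL convention, assuming a Python 3.11 base image (whether this Dockerfile uses exactly this mechanism is not shown in the diff):

```python
# Derive the Airflow constraints URL matching the bumped ARG values.
# The URL convention is Airflow's documented one; the Python minor
# version of the base image is an assumption for illustration.
AIRFLOW_VERSION = "2.10.1"
PYTHON_VERSION = "3.11"

constraints_url = (
    "https://raw.githubusercontent.com/apache/airflow/"
    f"constraints-{AIRFLOW_VERSION}/constraints-{PYTHON_VERSION}.txt"
)
print(constraints_url)
```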

docker/config/airflow.cfg

Lines changed: 22 additions & 4 deletions
@@ -157,7 +157,7 @@ sensitive_var_conn_names =
 # Task Slot counts for ``default_pool``. This setting would not have any effect in an existing
 # deployment where the ``default_pool`` is already created. For existing deployments, users can
 # change the number of slots using Webserver, API or the CLI
-default_pool_task_slot_count = 10000
+default_pool_task_slot_count = 200
 
 [database]
 # Collation for ``dag_id``, ``task_id``, ``key`` columns in case they have different encoding.
@@ -342,7 +342,7 @@ backend = airflow.providers.amazon.aws.secrets.secrets_manager.SecretsManagerBac
 # See documentation for the secrets backend you are using. JSON is expected.
 # Example for AWS Systems Manager ParameterStore:
 # ``{{"connections_prefix": "/airflow/connections", "profile_name": "default"}}``
-backend_kwargs = {"connections_prefix" : "airflow-prod/connection", "variables_prefix" : "airflow-prod/variable", "config_prefix": "airflow-prod/config"}
+backend_kwargs = {"connections_prefix" : "airflow-prod/connection", "variables_prefix" : "airflow-prod/variable", "config_prefix": "airflow-prod/config", "connections_lookup_pattern":"^(?!aws_default$).*$"}
 
 [cli]
 # In what way should the cli access the API. The LocalClient will use the
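The new `connections_lookup_pattern` key restricts which connection IDs the Secrets Manager backend is queried for. The value is a plain regular expression; a minimal sketch of how it behaves under standard Python `re` semantics (the backend's internal matching call is not shown here):

```python
import re

# The pattern added to backend_kwargs above: a negative lookahead that
# matches every connection id EXCEPT the literal "aws_default", so the
# secrets backend is skipped for that one id.
pattern = re.compile(r"^(?!aws_default$).*$")

print(pattern.match("aws_default"))       # None: not looked up in Secrets Manager
print(pattern.match("redshift_default"))  # match object: looked up in Secrets Manager
```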
@@ -815,7 +815,7 @@ catchup_by_default = True
 # complexity of query predicate, and/or excessive locking.
 # Additionally, you may hit the maximum allowable query length for your db.
 # Set this to 0 for no limit (not advised)
-max_tis_per_query = 512
+max_tis_per_query = 16
 
 # Should the scheduler issue ``SELECT ... FOR UPDATE`` in relevant queries.
 # If this is set to False then you should not run more than a single
@@ -832,7 +832,7 @@ max_dagruns_per_loop_to_schedule = 20
 # Should the Task supervisor process perform a "mini scheduler" to attempt to schedule more tasks of the
 # same DAG. Leaving this on will mean tasks in the same DAG execute quicker, but might starve out other
 # dags in some circumstances
-schedule_after_task_execution = True
+schedule_after_task_execution = False
 
 # The scheduler can run multiple processes in parallel to parse dags.
 # This defines how many processes will run.
@@ -1031,3 +1031,21 @@ shards = 5
 
 # comma separated sensor classes support in smart_sensor.
 sensors_enabled = NamedHivePartitionSensor
+
+[usage_data_collection]
+# Airflow integrates `Scarf <https://about.scarf.sh/>`__ to collect basic platform and usage data
+# during operation. This data assists Airflow maintainers in better understanding how Airflow is used.
+# Insights gained from this telemetry are critical for prioritizing patches, minor releases, and
+# security fixes. Additionally, this information supports key decisions related to the development road map.
+# Check the FAQ doc for more information on what data is collected.
+#
+# Deployments can opt-out of analytics by setting the ``enabled`` option
+# to ``False``, or the ``SCARF_ANALYTICS=false`` environment variable.
+# Individual users can easily opt-out of analytics in various ways documented in the
+# `Scarf Do Not Track docs <https://docs.scarf.sh/gateway/#do-not-track>`__.
+
+# Enable or disable usage data collection and sending.
+#
+# Variable: AIRFLOW__USAGE_DATA_COLLECTION__ENABLED
+#
+enabled = False
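The cfg block above disables Scarf telemetry in the config file; per the config comments, the same opt-out can also be expressed through environment variables before Airflow starts. A minimal sketch:

```python
import os

# Both variable names come from the config comments above:
# the Airflow config env override and Scarf's own do-not-track variable.
os.environ["AIRFLOW__USAGE_DATA_COLLECTION__ENABLED"] = "False"
os.environ["SCARF_ANALYTICS"] = "false"
```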
