Add AWS SageMaker Unified Studio Workflow Operator #45726

Merged
merged 45 commits into from
Mar 4, 2025
45 commits
1118295
Add sagemaker_unified_studio notebook operator, sensor, triggers, and…
agupta01 Dec 17, 2024
44e45c8
Fix sagemaker_unified_studio unit tests
agupta01 Dec 17, 2024
6c627ef
Add basic system test for sagemaker_unified_studio
agupta01 Dec 17, 2024
33c4a3d
Add setup/teardown stubs for sagemaker_unified_studio system test
agupta01 Dec 19, 2024
a1e0766
Add more specifics to SUS system test
agupta01 Jan 7, 2025
5ee6aad
Update name of SUS helper
agupta01 Jan 13, 2025
e378358
Cleanup and format SUS system test
agupta01 Jan 14, 2025
b369756
Merge branch 'apache:main' into main
agupta01 Jan 14, 2025
209cb3c
Merge branch 'apache:main' into main
agupta01 Jan 16, 2025
092c786
Merge branch 'apache:main' into main
agupta01 Jan 17, 2025
55f5486
Merge branch 'apache:main' into main
agupta01 Jan 22, 2025
c640efa
Fix notebook path in SUS system test
agupta01 Jan 22, 2025
41bd997
Merge branch 'apache:main' into main
agupta01 Jan 22, 2025
7acc649
Merge branch 'apache:main' into main
agupta01 Jan 27, 2025
b7ad9db
Update SUS docs
agupta01 Jan 27, 2025
7ca5022
Update SUS docs to include vETL
agupta01 Jan 27, 2025
13a10ce
Clarity updates on SUS docs
agupta01 Jan 28, 2025
756fcad
Merge branch 'apache:main' into main
agupta01 Jan 30, 2025
ffd3363
Merge branch 'apache:main' into main
agupta01 Feb 10, 2025
beaadda
Update SUS operator files after providers refactor
agupta01 Feb 10, 2025
654f4e1
Add failure on timeout
agupta01 Feb 11, 2025
1450dc6
Merge branch 'apache:main' into main
agupta01 Feb 11, 2025
ad56227
Set public sagemaker studio SDK as dependency for SUS operator
agupta01 Feb 11, 2025
1faaf92
Remove private _openapi usage from SageMaker Studio SDK
agupta01 Feb 13, 2025
ae9e2a0
Add extra link for SUS operator
agupta01 Feb 14, 2025
f1721c9
Merge branch 'apache:main' into main
agupta01 Feb 20, 2025
40a2fb5
Move SUS operator unit tests to new location
agupta01 Feb 20, 2025
237f78d
Update SUS system test to use executor_config for environment variables
agupta01 Feb 20, 2025
d60d2e9
Fix linting and formatting
agupta01 Feb 21, 2025
b23c3d3
Fix system test for localexecutor
agupta01 Feb 21, 2025
8e02fbd
Fix formatting
agupta01 Feb 21, 2025
af2dc1a
Merge branch 'apache:main' into main
agupta01 Feb 27, 2025
f001638
Merge branch 'apache:main' into main
agupta01 Feb 27, 2025
99e820e
Update sagemaker-studio lower bound dependency
agupta01 Feb 27, 2025
7373926
Fix SMUS system test
agupta01 Feb 27, 2025
b581b2e
Fix broken link in SMUS documentation
agupta01 Feb 27, 2025
7001393
Merge branch 'apache:main' into main
agupta01 Feb 28, 2025
f52762c
Merge branch 'apache:main' into main
agupta01 Feb 28, 2025
1307ed3
Fix pre-commit violations
agupta01 Feb 28, 2025
de85e5c
Convert tests to pytest
agupta01 Feb 28, 2025
692f3f8
Register hook + add license file
agupta01 Feb 28, 2025
96468ee
Merge branch 'apache:main' into main
agupta01 Feb 28, 2025
3bd17ca
Merge branch 'apache:main' into main
agupta01 Mar 3, 2025
ffcd453
Merge branch 'main' into main
o-nikolas Mar 4, 2025
2e05bcf
Merge branch 'apache:main' into main
agupta01 Mar 4, 2025
10 changes: 10 additions & 0 deletions docs/spelling_wordlist.txt
Original file line number Diff line number Diff line change
@@ -319,6 +319,7 @@ connectTimeoutMS
connexion
containerConfiguration
containerd
ContainerEntrypoint
ContainerGroup
containerinstance
ContainerPort
@@ -693,6 +694,7 @@ Gantt
gantt
gapic
gapped
gb
gbq
gcc
gcloud
@@ -826,6 +828,7 @@ ImageAnnotatorClient
imageORfile
imagePullPolicy
imagePullSecrets
ImageUri
imageVersion
Imap
imap
@@ -859,6 +862,7 @@ InstanceFlexibilityPolicy
InstanceGroupConfig
InstanceSelection
instanceTemplates
InstanceType
instantiation
integrations
interdependencies
@@ -876,6 +880,7 @@ IPv4
ipv4
IPv6
ipv6
ipynb
iPython
irreproducible
IRSA
@@ -1050,6 +1055,7 @@ masterType
Matomo
matomo
Maxime
MaxRuntimeInSeconds
mb
md
mediawiki
@@ -1373,6 +1379,8 @@ Qubole
qubole
QuboleCheckHook
Quboles
querybook
Querybooks
queryParameters
querystring
queueing
@@ -1887,8 +1895,10 @@ views
virtualenv
virtualenvs
vm
VolumeKmsKeyId
VolumeMount
volumeMounts
VolumeSizeInGB
vpc
WaiterModel
wape
1 change: 1 addition & 0 deletions generated/provider_dependencies.json
@@ -38,6 +38,7 @@
"jsonpath_ng>=1.5.3",
"python3-saml>=1.16.0",
"redshift_connector>=2.0.918",
"sagemaker-studio>=1.0.9",
"watchtower>=3.0.0,!=3.3.0,<4"
],
"devel-deps": [
1 change: 1 addition & 0 deletions providers/amazon/README.rst
@@ -67,6 +67,7 @@ PIP package Version required
``PyAthena`` ``>=3.0.10``
``jmespath`` ``>=0.7.0``
``python3-saml`` ``>=1.16.0``
``sagemaker-studio`` ``>=1.0.9``
========================================== ======================

Cross provider package dependencies
60 changes: 60 additions & 0 deletions providers/amazon/docs/operators/sagemakerunifiedstudio.rst
@@ -0,0 +1,60 @@
.. Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

.. http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.

===============================
Amazon SageMaker Unified Studio
===============================

`Amazon SageMaker Unified Studio <https://aws.amazon.com/sagemaker/unified-studio/>`__ is a unified development experience that
brings together AWS data, analytics, artificial intelligence (AI), and machine learning (ML) services.
It provides a place to build, deploy, execute, and monitor end-to-end workflows from a single interface.
This helps drive collaboration across teams and facilitate agile development.

Airflow provides operators to orchestrate Notebooks, Querybooks, and Visual ETL jobs within SageMaker Unified Studio Workflows.

Prerequisite Tasks
------------------

To use these operators, you must do a few things:

* Create a SageMaker Unified Studio domain and project, following the instructions in the `AWS documentation <https://docs.aws.amazon.com/sagemaker-unified-studio/latest/userguide/getting-started.html>`__.
* Within your project:

  * Navigate to the "Compute > Workflow environments" tab, and click "Create" to create a new MWAA environment.
  * Create a Notebook, Querybook, or Visual ETL job and save it to your project.

Operators
---------

.. _howto/operator:SageMakerNotebookOperator:

Create an Amazon SageMaker Unified Studio Workflow
==================================================

To create an Amazon SageMaker Unified Studio workflow to orchestrate your notebook, querybook, and visual ETL runs, you can use
:class:`~airflow.providers.amazon.aws.operators.sagemaker_unified_studio.SageMakerNotebookOperator`.

.. exampleinclude:: /../../providers/amazon/tests/system/amazon/aws/example_sagemaker_unified_studio.py
:language: python
:dedent: 4
:start-after: [START howto_operator_sagemaker_unified_studio_notebook]
:end-before: [END howto_operator_sagemaker_unified_studio_notebook]
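
As a rough illustration of how the operator above might be wired into a DAG, the sketch below assembles the task arguments in a plain function and imports the operator lazily. The import path is the one named in this document; the argument names (``input_config``, ``output_config``, ``wait_for_completion``), the dictionary keys, the helper names, and the notebook path are assumptions made for illustration and should be checked against the operator's docstring.

```python
from __future__ import annotations


def notebook_task_kwargs(notebook_path: str, params: dict | None = None) -> dict:
    """Build keyword arguments for a notebook task.

    Hypothetical helper, not part of the provider. The argument names and
    dictionary keys are assumptions about the operator's interface.
    """
    return {
        "task_id": "run_notebook",
        # Which notebook to run and the parameters passed to it (assumed keys).
        "input_config": {"input_path": notebook_path, "input_params": params or {}},
        # Requested output format (assumed key and value).
        "output_config": {"output_formats": ["NOTEBOOK"]},
        # Block the task until the notebook run finishes (assumed flag).
        "wait_for_completion": True,
    }


def make_notebook_task(notebook_path: str, params: dict | None = None):
    """Create the operator instance; call this inside a DAG definition."""
    # Imported lazily so this module loads even without the Amazon provider installed.
    from airflow.providers.amazon.aws.operators.sagemaker_unified_studio import (
        SageMakerNotebookOperator,
    )

    return SageMakerNotebookOperator(**notebook_task_kwargs(notebook_path, params))
```

Keeping the import inside ``make_notebook_task`` lets the argument-building part be exercised without the Amazon provider installed.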


Reference
---------

* `What is Amazon SageMaker Unified Studio <https://docs.aws.amazon.com/sagemaker-unified-studio/latest/userguide/what-is-sagemaker-unified-studio.html>`__
20 changes: 19 additions & 1 deletion providers/amazon/provider.yaml
@@ -234,6 +234,12 @@ integrations:
how-to-guide:
- /docs/apache-airflow-providers-amazon/operators/sagemaker.rst
tags: [aws]
- integration-name: Amazon SageMaker Unified Studio
external-doc-url: https://aws.amazon.com/sagemaker/unified-studio/
logo: /docs/integration-logos/[email protected]
how-to-guide:
- /docs/apache-airflow-providers-amazon/operators/sagemakerunifiedstudio.rst
tags: [aws]
- integration-name: Amazon SecretsManager
external-doc-url: https://aws.amazon.com/secrets-manager/
logo: /docs/integration-logos/[email protected]
@@ -402,6 +408,9 @@ operators:
- integration-name: Amazon SageMaker
python-modules:
- airflow.providers.amazon.aws.operators.sagemaker
- integration-name: Amazon SageMaker Unified Studio
python-modules:
- airflow.providers.amazon.aws.operators.sagemaker_unified_studio
- integration-name: Amazon Simple Notification Service (SNS)
python-modules:
- airflow.providers.amazon.aws.operators.sns
@@ -503,6 +512,9 @@ sensors:
- integration-name: Amazon SageMaker
python-modules:
- airflow.providers.amazon.aws.sensors.sagemaker
- integration-name: Amazon SageMaker Unified Studio
python-modules:
- airflow.providers.amazon.aws.sensors.sagemaker_unified_studio
- integration-name: Amazon Simple Queue Service (SQS)
python-modules:
- airflow.providers.amazon.aws.sensors.sqs
@@ -627,6 +639,9 @@ hooks:
- integration-name: Amazon SageMaker
python-modules:
- airflow.providers.amazon.aws.hooks.sagemaker
- integration-name: Amazon SageMaker Unified Studio
python-modules:
- airflow.providers.amazon.aws.hooks.sagemaker_unified_studio
- integration-name: Amazon Simple Email Service (SES)
python-modules:
- airflow.providers.amazon.aws.hooks.ses
@@ -699,6 +714,9 @@ triggers:
- integration-name: Amazon SageMaker
python-modules:
- airflow.providers.amazon.aws.triggers.sagemaker
- integration-name: Amazon SageMaker Unified Studio
python-modules:
- airflow.providers.amazon.aws.triggers.sagemaker_unified_studio
- integration-name: AWS Glue
python-modules:
- airflow.providers.amazon.aws.triggers.glue
@@ -734,7 +752,6 @@ triggers:
python-modules:
- airflow.providers.amazon.aws.triggers.dms


transfers:
- source-integration-name: Amazon DynamoDB
target-integration-name: Amazon Simple Storage Service (S3)
@@ -837,6 +854,7 @@ extra-links:
- airflow.providers.amazon.aws.links.glue.GlueJobRunDetailsLink
- airflow.providers.amazon.aws.links.logs.CloudWatchEventsLink
- airflow.providers.amazon.aws.links.sagemaker.SageMakerTransformJobLink
- airflow.providers.amazon.aws.links.sagemaker_unified_studio.SageMakerUnifiedStudioLink
- airflow.providers.amazon.aws.links.step_function.StateMachineDetailsLink
- airflow.providers.amazon.aws.links.step_function.StateMachineExecutionsDetailsLink
- airflow.providers.amazon.aws.links.comprehend.ComprehendPiiEntitiesDetectionLink
1 change: 1 addition & 0 deletions providers/amazon/pyproject.toml
@@ -75,6 +75,7 @@ dependencies = [
"PyAthena>=3.0.10",
"jmespath>=0.7.0",
"python3-saml>=1.16.0",
"sagemaker-studio>=1.0.9",
]

# The optional dependencies should be modified in place in the generated file
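
The new `sagemaker-studio>=1.0.9` lower bound can be checked against a local environment with something like the sketch below. It uses only the standard library. The helper names are hypothetical, and the numeric comparison is a simplification that does not implement full PEP 440 ordering.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version

MINIMUM = "1.0.9"  # lower bound taken from the dependency entry in this change


def release_tuple(version_string: str) -> tuple[int, ...]:
    """Turn '1.0.10' into (1, 0, 10); stops at the first non-numeric component."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            # Pre-release or local suffixes end the numeric prefix, so such
            # versions compare as older than the plain release (conservative).
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_lower_bound(installed: str, minimum: str = MINIMUM) -> bool:
    """Return True when the installed release is at least the minimum."""
    return release_tuple(installed) >= release_tuple(minimum)


def check_sagemaker_studio() -> str:
    """Report whether the locally installed sagemaker-studio satisfies the bound."""
    try:
        installed = version("sagemaker-studio")
    except PackageNotFoundError:
        return "sagemaker-studio is not installed"
    if meets_lower_bound(installed):
        return f"sagemaker-studio {installed} satisfies >={MINIMUM}"
    return f"sagemaker-studio {installed} is older than {MINIMUM}"


if __name__ == "__main__":
    print(check_sagemaker_studio())
```

Comparing integer tuples rather than raw strings matters here because a string comparison would rank `1.0.10` below `1.0.9`.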