feat: add deployer API for airflow, config params for sfn/batch/k8s #3106
Status: Open — npow wants to merge 5 commits into `master` from `npow/core-deployer-changes-clean`.
Commits (5):
- `ee41919` feat: add deployer API for airflow, config params for sfn/batch/k8s, …
- `89c52c1` style: fix black formatting, remove .omc files
- `d1fb3a2` chore: remove .omc files from tracking
- `a368b2b` test: remove xfails now that deployer code is included
- `446e38e` test: remove BATCH_CLIENT_PARAMS xfail, now in core
New file (`@@ -0,0 +1,187 @@`):

```python
"""
Thin wrapper around the Airflow 2.x REST API (api/v1).

All methods raise ``AirflowClientError`` on non-2xx responses so callers
don't have to inspect status codes themselves.
"""

import base64
import json
import time
import urllib.error
import urllib.request
from typing import Any, Dict, List, Optional

from .exception import AirflowException


class AirflowClientError(AirflowException):
    headline = "Airflow REST API error"


class AirflowClient:
    """
    Minimal Airflow REST API client (Airflow >= 2.0).

    Parameters
    ----------
    rest_api_url : str
        Base URL of the Airflow REST API, e.g. ``http://localhost:8090/api/v1``.
    username : str
        Basic-auth username (default: ``"admin"``).
    password : str
        Basic-auth password (default: ``"admin"``).
    """

    def __init__(
        self,
        rest_api_url: str,
        username: str = "admin",
        password: str = "admin",
    ):
        self._base = rest_api_url.rstrip("/")
        credentials = base64.b64encode(
            ("%s:%s" % (username, password)).encode()
        ).decode()
        self._auth_header = "Basic %s" % credentials

    # ------------------------------------------------------------------
    # Internal helpers
    # ------------------------------------------------------------------

    def _request(
        self,
        method: str,
        path: str,
        body: Optional[Dict] = None,
    ) -> Any:
        url = "%s/%s" % (self._base, path.lstrip("/"))
        data = json.dumps(body).encode() if body is not None else None
        req = urllib.request.Request(
            url,
            data=data,
            method=method,
            headers={
                "Authorization": self._auth_header,
                "Content-Type": "application/json",
                "Accept": "application/json",
            },
        )
        try:
            with urllib.request.urlopen(req) as resp:
                raw = resp.read()
                return json.loads(raw) if raw else {}
        except urllib.error.HTTPError as e:
            body = e.read().decode(errors="replace")
            raise AirflowClientError(
                "Airflow API %s %s returned HTTP %d: %s" % (method, url, e.code, body)
            )

    # ------------------------------------------------------------------
    # DAG operations
    # ------------------------------------------------------------------

    def get_dag(self, dag_id: str) -> Optional[Dict]:
        """Return DAG metadata dict, or None if not found."""
        try:
            return self._request("GET", "dags/%s" % dag_id)
        except AirflowClientError as e:
            if "HTTP 404" in str(e):
                return None
            raise

    def patch_dag(self, dag_id: str, **fields) -> Dict:
        """Patch DAG fields (e.g. ``is_paused=False``)."""
        return self._request("PATCH", "dags/%s" % dag_id, body=fields)

    def delete_dag(self, dag_id: str) -> bool:
        """Delete a DAG. Returns True on success."""
        try:
            self._request("DELETE", "dags/%s" % dag_id)
            return True
        except AirflowClientError:
            return False

    def list_dags(self, tags: Optional[List[str]] = None) -> List[Dict]:
        """List all visible DAGs, optionally filtered by tags."""
        params = ""
        if tags:
            params = "?" + "&".join("tags=%s" % t for t in tags)
        result = self._request("GET", "dags%s" % params)
        return result.get("dags", [])

    # ------------------------------------------------------------------
    # DAG run operations
    # ------------------------------------------------------------------

    def trigger_dag_run(
        self,
        dag_id: str,
        conf: Optional[Dict] = None,
        run_id: Optional[str] = None,
    ) -> Dict:
        """Trigger a DAG run. Returns the dag_run dict."""
        body: Dict[str, Any] = {}
        if conf:
            body["conf"] = conf
        if run_id:
            body["dag_run_id"] = run_id
        return self._request("POST", "dags/%s/dagRuns" % dag_id, body=body)

    def get_dag_run(self, dag_id: str, dag_run_id: str) -> Dict:
        """Return dag_run dict for a specific run."""
        return self._request("GET", "dags/%s/dagRuns/%s" % (dag_id, dag_run_id))

    def list_dag_runs(self, dag_id: str, limit: int = 25) -> List[Dict]:
        """List recent dag runs for a DAG."""
        result = self._request(
            "GET",
            "dags/%s/dagRuns?limit=%d&order_by=-execution_date" % (dag_id, limit),
        )
        return result.get("dag_runs", [])

    def patch_dag_run(self, dag_id: str, dag_run_id: str, **fields) -> Dict:
        """Patch a dag run (e.g. set state to 'failed' to terminate it)."""
        return self._request(
            "PATCH",
            "dags/%s/dagRuns/%s" % (dag_id, dag_run_id),
            body=fields,
        )

    # ------------------------------------------------------------------
    # Utility
    # ------------------------------------------------------------------

    def wait_for_dag(
        self,
        dag_id: str,
        timeout: int = 120,
        polling_interval: int = 5,
    ) -> Dict:
        """
        Poll until the DAG is visible in Airflow (after kubectl-cp / file copy).

        Returns the DAG metadata dict when found.

        Raises
        ------
        TimeoutError
            If the DAG is not discovered within *timeout* seconds.
        """
        deadline = time.time() + timeout
        while time.time() < deadline:
            try:
                dag = self.get_dag(dag_id)
            except OSError:
                # Transient connection error (e.g. RemoteDisconnected) —
                # the webserver may still be starting up. Retry silently.
                time.sleep(polling_interval)
                continue
            if dag is not None:
                return dag
            time.sleep(polling_interval)
        raise TimeoutError(
            "DAG '%s' did not appear in Airflow within %d seconds. "
            "Ensure the DAG file was copied to the dags folder and "
            "that the Airflow scheduler is running." % (dag_id, timeout)
        )
```
Review comment: `AIRFLOW_REST_API_USERNAME` and `AIRFLOW_REST_API_PASSWORD` default to `"admin"` and `"admin"`. While these are meant to be overridden for real deployments, the defaults could cause accidental authentication against a remote Airflow instance using well-known credentials. Consider defaulting to `None` and raising a clear error at call time when they are not set.
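A minimal sketch of the reviewer's suggestion, assuming the client shape above: credentials default to `None` and the auth header is built lazily, so a misconfigured deployment fails fast instead of authenticating with well-known defaults. The class name and structure here are illustrative, not the PR's code:

```python
import base64
from typing import Optional


class AirflowClientStrict:
    """Variant that refuses to fall back to default credentials."""

    def __init__(
        self,
        rest_api_url: str,
        username: Optional[str] = None,
        password: Optional[str] = None,
    ):
        self._base = rest_api_url.rstrip("/")
        self._username = username
        self._password = password

    @property
    def _auth_header(self) -> str:
        # Raise at call time, per the review suggestion, rather than
        # silently sending "admin"/"admin" to a remote instance.
        if self._username is None or self._password is None:
            raise ValueError(
                "Airflow REST API credentials are not set; provide "
                "username and password explicitly."
            )
        creds = base64.b64encode(
            ("%s:%s" % (self._username, self._password)).encode()
        ).decode()
        return "Basic %s" % creds
```

Building the header lazily (rather than in `__init__`) lets callers construct the client from partial config and only fail when a request actually needs authentication.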