diff --git a/Readme.md b/Readme.md index 58af483..baab905 100644 --- a/Readme.md +++ b/Readme.md @@ -1,11 +1,11 @@ # Introduction -This is an experimental CLI. It will allow you to manage user code deployments for a dagster instance that's deployed to kubernetes. It can package your code branch into a docker container, upload it to your ACR and update your existing Dagster instance (on kubernetes) to have your user code deployment. +This experimental CLI allows you to manage user code deployments for a Dagster instance deployed on Kubernetes. It packages your code branch into a Docker container, uploads it to your container registry, and updates your existing Dagster instance to enable your user code deployment. # Pre-requisites -* Kubectl + valid kubectl config +* Kubectl with a valid config * Helm3 * Podman * Python3.10+ @@ -14,29 +14,30 @@ This is an experimental CLI. It will allow you to manage user code deployments f # Installation * `pip install dagster-uc` -* Create a configuration file in the root of your repository or in your home directory named `.config_user_code_deployments.yaml`, similar to this example. You can also create one by doing `dagster-uc init-config -f '.config_user_code_deployments.yaml'` +* Create a configuration file named `.config_user_code_deployments.yaml` in the root of your repository or your home directory. You can also create one by running `dagster-uc init-config -f '.config_user_code_deployments.yaml'`. ```yaml dev: cicd: false code_path: dagster_pipelines/repo.py + image_prefix: 'team-alpha' container_registry: myacr.azurecr.io dagster_gui_url: null dagster_version: 1.8.4 docker_root: . dockerfile: ./Dockerfile environment: dev - image_prefix: 'team-alpha' kubernetes_context: "my-kubernetes-context" + namespace: dagster-dev limits: cpu: '2' memory: 2Gi - namespace: . node: small repository_root: . requests: cpu: '1' memory: 1Gi + use_project_name: true use_az_login: false user_code_deployment_env: - name: ON_K8S @@ -51,7 +52,93 @@ dev: verbose: false ``` -# Usage +# Instructions + +* To deploy the currently checked out Git branch, run `dagster-uc deployment deploy`. +* To see all possible commands, run `dagster-uc --help` + +## Environment Configuration + +Dagster-uc allows you to have specific user-code deployement configurations per environment. This enables different configurations for your Kubernetes cluster, container registry, resource usage, etc. + +The default environment used is `dev`, so you need to have `dev` in your configuration file. Other environment names are up to you. An example structure: + +```yaml +dev: + container_registry: dev-project.azurecr.io + ... +acc: + container_registry: acc-project.azurecr.io. + ... +prd: + container_registry: prd-project.azurecr.io + ... +``` + +Specify the environment with `dagster-uc --environment prd deployment deploy`, or `dagster-uc -e prd deployment deploy` to use the prd config for the deployment. + +### Overriding Config Settings Through Environment Variables + +It's possible to dynamically set different values for fields in one of the environment configurations, while loading the config. This can be achieved through environment variables, examples: + +* `export CICD=TRUE` +* `export VERBOSE=TRUE` + +## Branches + +Dagster-uc deploys a Git branch as a code location to Dagster. When `cicd: true` is set in the config_user_code_deployments.yaml, the deployment name of the code location is derived from the `environment` config variable. + +If `cicd: false` the deployment name is derived from the Git branch. The branch name is transformed by replacing non-alphanumeric characters with hyphens and removing any leading or trailing hyphens. + +Example: Git branch `feat: my amazing feature` becomes deployment `feat-my-amazing-feature` + +### Multiple Deployments of the Same Branch + +During deployment, you can provide a `--deployment-name-suffix` to add a suffix to your deployment name. This is useful for testing by deploying the same branch twice with different configurations. + +### Multi-Project Deployment in One Dagster Instance + +With the `use_project_name` flag in the dagster-uc configuration file, you can prefix the project name to the user-code deployment. The project name is taken from the `pyproject.toml`, so you need to call dagster-uc in the same directory as the `.toml` file. + +For example, if the project name in `pyproject.toml` is `my_dummy_project`, the deployment name will be `my-dummy-project--feat-my-amazing-feature`. + +> **Important:** Internally the deployment name will use `--` to seperate the project and branch, which is visible on Kubernetes and the container registry. However in the dagster UI it appears as `project:branch` + +## Containers + +Dagster-uc creates a container image from your existing codebase during deployment. + +An example Dockerfile: + +```Dockerfile +FROM python:3.11-slim +ARG BRANCH_NAME +ARG DIR="APP" +WORKDIR $DIR +COPY my_project my_project # Contains all code +COPY pyproject.toml uv.lock README.md ./ +RUN --mount=type=cache,target=/root/.cache/uv \ + uv sync --no-dev --link-mode=copy +ENV PATH="/$DIR/.venv/bin:$PATH" +ENV BRANCH_NAME=${BRANCH_NAME} +``` + +Set `code_path` in the configuration to the path of the Python executable containing the Dagster definitions. This is used to start the gRPC server. For more details, see the [Dagster K8S docs](https://docs.dagster.io/guides/deploy/deployment-options/kubernetes/deploying-to-kubernetes). + +Set `image_prefix` to prefix all the build images. Useful for grouping images under a prefix. + +### Versioning + +Dagster-uc deploys each image with a version number as a tag. The versioning is done by checking the latest version of that image in the container registry, and then increment by one. + +Without this, using the same image tag would cause Dagster to pull the latest image of that tag during existing jobs, potentially causing data inconsistencies. + +> **Important:** Use a custom garbage collection policy to remove old branches or keep only the last X tag versions to prevent your container registry from growing too large in size. + +### Requirements + +Dagster-uc passes a `build-arg=BRANCH_NAME` to the image building step.This is useful because you can script the use of the `BRANCH_NAME` environment variable in your Dagster project code to perform different tasks, such as using a custom IO manager or different secrets. The branch name is either the Git branch or the environment when `cicd` is `true`. + +## Kubernetes -* In order to deploy the currently checked out git branch, run `dagster-uc deployment deploy` -* In order to see all possible commands, run `dagster-uc --help` +Instruct dagster-uc to use the correct `kubernetes_context` that can access your `namespace`. Additionally, configure the pod to use specific compute resource `requests` and `limits`, and set secrets as environment variables using `user_code_deployment_env_secrets` or plain environment variables using `user_code_deployment_env`. diff --git a/dagster_uc/uc_handler.py b/dagster_uc/uc_handler.py index f76b28f..19c8c34 100644 --- a/dagster_uc/uc_handler.py +++ b/dagster_uc/uc_handler.py @@ -431,10 +431,10 @@ def get_deployment_name( # noqa: D102 branch += deployment_name_suffix branch = re.sub(r"[^a-zA-Z0-9]+", "-", branch).strip("-") # Strips double -- name = f"{project_name}--{branch}" if project_name is not None else branch - else: branch = self.config.environment name = f"{project_name}--{branch}" if project_name is not None else branch + return DagsterDeployment( full_name=name, branch_name=branch,