diff --git a/docs/docs.json b/docs/docs.json index d80134f7311b..aef1f3b39d35 100644 --- a/docs/docs.json +++ b/docs/docs.json @@ -282,6 +282,7 @@ "group": "Extensibility", "pages": [ "v3/advanced/api-client", + "v3/advanced/customize-base-job-templates", "v3/advanced/custom-blocks", "v3/advanced/developing-a-custom-worker", "v3/advanced/experimental-plugins" diff --git a/docs/v3/advanced/customize-base-job-templates.mdx b/docs/v3/advanced/customize-base-job-templates.mdx new file mode 100644 index 000000000000..d9b45b7f50cb --- /dev/null +++ b/docs/v3/advanced/customize-base-job-templates.mdx @@ -0,0 +1,248 @@ +--- +title: Customizing Base Job Templates +description: Learn how to customize Kubernetes base job templates for work pools +--- + +This guide provides comprehensive examples for customizing the Kubernetes base job template. These examples demonstrate common configuration patterns for environment variables, secrets, resource limits, and image pull secrets. + +## Understanding the Base Job Template Structure + +The base job template uses a two-part structure: + +1. **variables**: Define configurable parameters with defaults and descriptions +2. **job_configuration**: Reference variables using `{{ variable_name }}` syntax to apply them to the Kubernetes job manifest + + +Variables defined in the `variables` section must be explicitly referenced in `job_configuration` using `{{ variable_name }}` syntax to take effect. If you customize the template and remove a variable reference from `job_configuration`, that variable's value will not be passed to the worker, even if it's defined in `variables`. + + +## Accessing the Base Job Template + +You can customize the base job template in two ways: + +1. **Through the UI**: Navigate to your work pool → **Advanced** tab → Edit the JSON representation +2. 
**Through the CLI**: Get the default template to use as a starting point: +```bash +prefect work-pool get-default-base-job-template --type kubernetes +``` + +## Common Configuration Patterns + +### Environment Variables + +Configure environment variables to pass configuration to your flow runs: +```json +{ + "variables": { + "env": { + "title": "Environment Variables", + "description": "Environment variables to set in the container", + "default": {}, + "type": "object", + "additionalProperties": {"type": "string"} + } + }, + "job_configuration": { + "job_manifest": { + "spec": { + "template": { + "spec": { + "containers": [ + { + "name": "prefect-job", + "env": "{{ env }}" + } + ] + } + } + } + } + } +} +``` + +### Secret References + +Reference Kubernetes secrets to inject sensitive data: +```json +{ + "variables": { + "secret_name": { + "title": "Secret Name", + "description": "Name of the Kubernetes secret containing credentials", + "default": null, + "type": "string" + } + }, + "job_configuration": { + "job_manifest": { + "spec": { + "template": { + "spec": { + "containers": [ + { + "name": "prefect-job", + "envFrom": [ + { + "secretRef": { + "name": "{{ secret_name }}" + } + } + ] + } + ] + } + } + } + } + } +} +``` + +### Image Pull Secrets + +Configure authentication for private container registries: +```json +{ + "variables": { + "image_pull_secrets": { + "title": "Image Pull Secrets", + "description": "Names of Kubernetes secrets for pulling images from private registries", + "default": [], + "type": "array", + "items": {"type": "string"} + } + }, + "job_configuration": { + "job_manifest": { + "spec": { + "template": { + "spec": { + "imagePullSecrets": "{{ image_pull_secrets }}" + } + } + } + } + } +} +``` + +### Resource Limits and Requests + +Set CPU and memory resource constraints: +```json +{ + "variables": { + "cpu_request": { + "title": "CPU Request", + "description": "CPU allocation to request for this pod", + "default": "100m", + "type": "string" + 
}, + "cpu_limit": { + "title": "CPU Limit", + "description": "Maximum CPU allocation for this pod", + "default": "1000m", + "type": "string" + }, + "memory_request": { + "title": "Memory Request", + "description": "Memory allocation to request for this pod", + "default": "256Mi", + "type": "string" + }, + "memory_limit": { + "title": "Memory Limit", + "description": "Maximum memory allocation for this pod", + "default": "1Gi", + "type": "string" + } + }, + "job_configuration": { + "job_manifest": { + "spec": { + "template": { + "spec": { + "containers": [ + { + "name": "prefect-job", + "resources": { + "requests": { + "cpu": "{{ cpu_request }}", + "memory": "{{ memory_request }}" + }, + "limits": { + "cpu": "{{ cpu_limit }}", + "memory": "{{ memory_limit }}" + } + } + } + ] + } + } + } + } + } +} +``` + +## Combining Multiple Configurations + + +These examples show individual configurations. In practice, you'll combine multiple configurations in a single base job template. Remember that any modifications replace the entire default configuration, so include all necessary fields when customizing. + + +When combining configurations, merge the `variables` and `job_configuration` sections. 
For example, to combine environment variables with resource limits: ```json { "variables": { "env": { "title": "Environment Variables", "description": "Environment variables to set in the container", "default": {}, "type": "object", "additionalProperties": {"type": "string"} }, "cpu_request": { "title": "CPU Request", "description": "CPU allocation to request for this pod", "default": "100m", "type": "string" }, "memory_request": { "title": "Memory Request", "description": "Memory allocation to request for this pod", "default": "256Mi", "type": "string" } }, "job_configuration": { "job_manifest": { "spec": { "template": { "spec": { "containers": [ { "name": "prefect-job", "env": "{{ env }}", "resources": { "requests": { "cpu": "{{ cpu_request }}", "memory": "{{ memory_request }}" } } } ] } } } } } } ``` ## Next Steps - Learn more about [Kubernetes work pools](/v3/deploy/infrastructure-concepts/work-pools/) - See [how to run flows on Kubernetes](/v3/how-to-guides/deployment_infra/kubernetes/) - Explore [overriding job variables](/v3/deploy/infrastructure-concepts/customize/) \ No newline at end of file diff --git a/docs/v3/how-to-guides/deployment_infra/kubernetes.mdx b/docs/v3/how-to-guides/deployment_infra/kubernetes.mdx index 3bace2805944..294985b3901d 100644 --- a/docs/v3/how-to-guides/deployment_infra/kubernetes.mdx +++ b/docs/v3/how-to-guides/deployment_infra/kubernetes.mdx @@ -29,14 +29,13 @@ If you already have one, skip ahead to the next section. Node pools can be backed by either EC2 instances or FARGATE. Choose FARGATE so there's less to manage. The following command takes around 15 minutes and must not be interrupted: - - ```bash +```bash # Replace the cluster name with your own value eksctl create cluster --fargate --name <CLUSTER-NAME> # Authenticate to the cluster. 
aws eks update-kubeconfig --name <CLUSTER-NAME> - ``` +``` @@ -47,8 +46,7 @@ If you already have one, skip ahead to the next section. To deploy the cluster, your project must have a VPC network configured. First, authenticate to GCP by setting the following configuration options: - - ```bash +```bash # Authenticate to gcloud gcloud auth login @@ -56,12 +54,11 @@ If you already have one, skip ahead to the next section. # Replace the project name with your GCP project name gcloud config set project <PROJECT-NAME> gcloud config set compute/zone <ZONE> - ``` +``` Next, deploy the cluster. This command takes ~15 minutes to complete. Once the cluster has been created, authenticate to the cluster. - - ```bash +```bash # Create cluster # Replace the cluster name with your own value gcloud container clusters create <CLUSTER-NAME> --num-nodes=1 \ @@ -69,17 +66,15 @@ If you already have one, skip ahead to the next section. # Authenticate to the cluster gcloud container clusters get-credentials <CLUSTER-NAME> --region <ZONE> - ``` +``` **GCP potential errors** - ``` ERROR: (gcloud.container.clusters.create) ResponseError: code=400, message=Service account "000000000000-compute@developer.gserviceaccount.com" is disabled. ``` - You must enable the default service account in the IAM console, or specify a different service account with the appropriate permissions. - ``` creation failed: Constraint constraints/compute.vmExternalIpAccess violated for project 000000000000. Add instance projects//zones/us-east1-b/instances/gke-gke-guide-1-default-pool-c369c84d-wcfl to the constraint to use external IP with it." ``` @@ -97,15 +92,13 @@ page within IAM. or use the Cloud Shell directly from the Azure portal [shell.azure.com](https://shell.azure.com). First, authenticate to Azure if not already done. - - ```bash +```bash az login - ``` +``` Next, deploy the cluster - this command takes ~4 minutes to complete. Once the cluster is created, authenticate to the cluster. - - ```bash +```bash # Create a Resource Group at the desired location, e.g. 
westus az group create --name <RESOURCE-GROUP-NAME> --location <LOCATION> @@ -118,7 +111,7 @@ page within IAM. # Verify the connection by listing the cluster nodes kubectl get nodes - ``` +``` @@ -134,8 +127,7 @@ If you already have a registry, skip ahead to the next section. Create a registry using the AWS CLI and authenticate the docker daemon to that registry: - - ```bash +```bash # Replace the image name with your own value aws ecr create-repository --repository-name <IMAGE-NAME> @@ -143,13 +135,12 @@ If you already have a registry, skip ahead to the next section. # Replace the region and account ID with your own values aws ecr get-login-password --region <REGION> | docker login \ --username AWS --password-stdin <AWS-ACCOUNT-ID>.dkr.ecr.<REGION>.amazonaws.com - ``` +``` Create a registry using the gcloud CLI and authenticate the docker daemon to that registry: - - ```bash +```bash # Create artifact registry repository to host your custom image # Replace the repository name with your own value; it can be the # same name as your image @@ -158,13 +149,12 @@ If you already have a registry, skip ahead to the next section. # Authenticate to artifact registry gcloud auth configure-docker us-docker.pkg.dev - ``` +``` Create a registry using the Azure CLI and authenticate the docker daemon to that registry: - - ```bash +```bash # Name must be a lower-case alphanumeric # Tier SKU can easily be updated later, e.g. az acr update --name <REGISTRY-NAME> --sku Standard az acr create --resource-group <RESOURCE-GROUP-NAME> \ @@ -177,8 +167,7 @@ If you already have a registry, skip ahead to the next section. # You can verify AKS can now reach ACR az aks check-acr --resource-group <RESOURCE-GROUP-NAME> --name <CLUSTER-NAME> --acr <REGISTRY-NAME>.azurecr.io - - ``` +``` @@ -267,19 +256,17 @@ executing the flow runs. Select the **Advanced** tab and edit the JSON representation of the base job template. 
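If you prefer to keep the template under version control, the same JSON can be managed from the CLI instead of the UI. A minimal sketch, assuming the Prefect CLI is installed and an existing Kubernetes work pool named `my-k8s-pool` (a hypothetical name):

```shell
# Save the default Kubernetes base job template to a file as a starting point
prefect work-pool get-default-base-job-template --type kubernetes --file base-job-template.json

# After editing base-job-template.json, apply it to the work pool
prefect work-pool update my-k8s-pool --base-job-template base-job-template.json
```

Edits made to the JSON in the UI's **Advanced** tab work the same way in this file-based workflow.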
For example, to set a CPU request, add the following section under variables: - - ```json +```json "cpu_request": { "title": "CPU Request", "description": "The CPU allocation to request for this pod.", "default": "default", "type": "string" }, - ``` +``` Next add the following to the first `containers` item under `job_configuration`: - - ```json +```json ... "containers": [ { @@ -292,7 +279,7 @@ executing the flow runs. } ], ... - ``` +``` Running deployments with this work pool will request the specified CPU. @@ -303,6 +290,8 @@ Give the work pool a name and save. Your new Kubernetes work pool should appear in the list of work pools. +For comprehensive examples of customizing the base job template, see [Customizing Base Job Templates](/v3/advanced/customize-base-job-templates). + ## Create a Prefect Cloud API key If you already have a Prefect Cloud API key, you can skip these steps. @@ -325,7 +314,6 @@ The recommended method for deploying a worker is with the [Prefect Helm Chart](h ### Add the Prefect Helm repository Add the Prefect Helm repository to your Helm client: - ```bash helm repo add prefect https://prefecthq.github.io/prefect-helm helm repo update @@ -334,13 +322,11 @@ helm repo update ### Create a namespace Create a new namespace in your Kubernetes cluster to deploy the Prefect worker: - ```bash kubectl create namespace prefect ``` ### Create a Kubernetes secret for the Prefect API key - ```bash kubectl create secret generic prefect-api-key \ --namespace=prefect --from-literal=key=your-prefect-cloud-api-key @@ -350,7 +336,6 @@ kubectl create secret generic prefect-api-key \ Create a `values.yaml` file to customize the Prefect worker configuration. 
Add the following contents to the file: - ```yaml worker: cloudApiConfig: @@ -369,7 +354,6 @@ For example: \ Define your requirements in a `requirements.txt` file: - ``` prefect>=3.0.0 prefect-docker>=0.4.0 @@ -504,7 +484,6 @@ prefect-kubernetes>=0.3.1 ``` The directory should now look something like this: - ``` . ├── prefect.yaml @@ -522,7 +501,6 @@ an option with the Python deployment creation method. Use the `run_shell_script` command to grab the SHA and pass it to the `tag` parameter of `build_docker_image`: - ```yaml build: - prefect.deployments.steps.run_shell_script: @@ -539,7 +517,6 @@ build: ``` Set the SHA as a tag for easy identification in the UI: - ```yaml definitions: tags: &common_tags @@ -557,7 +534,6 @@ Before deploying the flows to Prefect, you need to authenticate through the Pref You also need to ensure that all of your flow's dependencies are present at `deploy` time. This example uses a virtual environment to ensure consistency across environments. - ```bash # Create a virtualenv & activate it virtualenv prefect-demo @@ -580,22 +556,19 @@ You have configured your `prefect.yaml` file to get the image name from the - - ```bash +```bash export PREFECT_IMAGE_NAME=<AWS-ACCOUNT-ID>.dkr.ecr.<REGION>.amazonaws.com/<IMAGE-NAME> - ``` +``` - - ```bash +```bash export PREFECT_IMAGE_NAME=us-docker.pkg.dev/<PROJECT-NAME>/<REPOSITORY-NAME>/<IMAGE-NAME> - ``` +``` - - ```bash +```bash export PREFECT_IMAGE_NAME=<REGISTRY-NAME>.azurecr.io/<IMAGE-NAME> - ``` +``` @@ -606,10 +579,9 @@ flows with `prefect deploy --all` or deploy them individually by name: `prefect ## Run the flows Once the deployments are successfully created, you can run them from the UI or the CLI: - ```bash prefect deployment run hello/default prefect deployment run hello/arthur ``` -You can now check the status of your two deployments in the UI. +You can now check the status of your two deployments in the UI. \ No newline at end of file
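You can also follow a run to completion from the terminal: in recent Prefect 3.x releases, `prefect deployment run` accepts a `--watch` flag that streams the flow run's state until it reaches a terminal state. A sketch using the deployments created above:

```shell
# Trigger a run and block until it finishes, printing state transitions
prefect deployment run hello/default --watch

# List recent flow runs to confirm both deployments executed
prefect flow-run ls --limit 5
```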