Add Modal orchestrator with step operator and orchestrator flavors #3733

---
description: Orchestrating your pipelines to run on Modal's serverless cloud platform.
---

# Modal Orchestrator

Using the ZenML `modal` integration, you can orchestrate and scale your ML pipelines on [Modal's](https://modal.com/) serverless cloud platform with minimal setup and maximum efficiency.

The Modal orchestrator is designed for speed and cost-effectiveness, running entire pipelines in single serverless functions to minimize cold starts and optimize resource utilization.

{% hint style="warning" %}
This component is only meant to be used within the context of a [remote ZenML deployment scenario](https://docs.zenml.io/getting-started/deploying-zenml/). Usage with a local ZenML deployment may lead to unexpected behavior!
{% endhint %}

## When to use it

You should use the Modal orchestrator if:

* you want a serverless solution that scales to zero when not in use.
* you're looking for fast pipeline execution with minimal cold start overhead.
* you want cost-effective ML pipeline orchestration without managing infrastructure.
* you need easy access to GPUs and high-performance computing resources.
* you prefer a simple setup process without complex Kubernetes configurations.

## How to deploy it

The Modal orchestrator runs on Modal's cloud infrastructure, so you don't need to deploy or manage any servers. You just need:

1. A [Modal account](https://modal.com/) (free tier available)
2. The Modal CLI installed and authenticated
3. A [remote ZenML deployment](https://docs.zenml.io/getting-started/deploying-zenml/) for production use

## How to use it

To use the Modal orchestrator, we need:

* The ZenML `modal` integration installed. If you haven't done so, run:
```shell
zenml integration install modal
```
* [Docker](https://www.docker.com) installed and running.
* A [remote artifact store](../artifact-stores/README.md) as part of your stack.
* A [remote container registry](../container-registries/README.md) as part of your stack.
* The Modal CLI installed and authenticated:
```shell
pip install modal

modal setup
```

### Setting up the orchestrator

You can register the orchestrator with or without explicit Modal credentials:

**Option 1: Using Modal CLI authentication (recommended for development)**

```shell
# Register the orchestrator (uses Modal CLI credentials)
zenml orchestrator register <ORCHESTRATOR_NAME> \
    --flavor=modal \
    --synchronous=true

# Register and activate a stack with the new orchestrator
zenml stack register <STACK_NAME> -o <ORCHESTRATOR_NAME> ... --set
```

**Option 2: Using a Modal API token (recommended for production)**

```shell
# Register the orchestrator with explicit credentials
zenml orchestrator register <ORCHESTRATOR_NAME> \
    --flavor=modal \
    --token=<MODAL_TOKEN> \
    --workspace=<MODAL_WORKSPACE> \
    --synchronous=true

# Register and activate a stack with the new orchestrator
zenml stack register <STACK_NAME> -o <ORCHESTRATOR_NAME> ... --set
```

You can get your Modal token from the [Modal dashboard](https://modal.com/settings/tokens).
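
Rather than passing the token in plain text, you may prefer to store it centrally. A minimal sketch using ZenML secrets and secret references (the secret name `modal_credentials` is illustrative; check that your ZenML version supports secret references for this flavor's `token` field):

```shell
# Store the Modal token as a ZenML secret (hypothetical secret name)
zenml secret create modal_credentials --token=<MODAL_TOKEN>

# Reference the secret when registering the orchestrator
zenml orchestrator register <ORCHESTRATOR_NAME> \
    --flavor=modal \
    --token='{{modal_credentials.token}}' \
    --synchronous=true
```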

{% hint style="info" %}
ZenML will build a Docker image called `<CONTAINER_REGISTRY_URI>/zenml:<PIPELINE_NAME>` that includes your code, and use it to run your pipeline steps in Modal functions. Check out [this page](https://docs.zenml.io/how-to/customize-docker-builds/) if you want to learn more about how ZenML builds these images and how you can customize them.
{% endhint %}
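
As an illustration, a minimal sketch of customizing that image via ZenML's `DockerSettings` (the extra requirement here is just an example):

```python
from zenml import pipeline
from zenml.config import DockerSettings

# Add extra pip requirements to the image ZenML builds for Modal
docker_settings = DockerSettings(requirements=["scikit-learn"])

@pipeline(settings={"docker": docker_settings})
def my_pipeline():
    ...
```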

You can now run any ZenML pipeline using the Modal orchestrator:

```shell
python file_that_runs_a_zenml_pipeline.py
```
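
Such a file is just an ordinary ZenML pipeline script that invokes the pipeline; a minimal sketch (the step and pipeline names are illustrative):

```python
from zenml import pipeline, step

@step
def say_hello() -> str:
    return "Hello from Modal!"

@pipeline
def hello_pipeline():
    say_hello()

if __name__ == "__main__":
    hello_pipeline()  # Submitted to Modal via the active stack's orchestrator
```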

### Modal UI

Modal provides an excellent web interface where you can monitor your pipeline runs in real time, view logs, and track resource usage.

You can access the Modal dashboard at [modal.com/apps](https://modal.com/apps) to see your running and completed functions.

### Configuration overview

The Modal orchestrator uses two types of settings following ZenML's standard pattern:

1. **`ResourceSettings`** (standard ZenML) - for hardware resource quantities:
   - `cpu_count` - Number of CPU cores
   - `memory` - Memory allocation (e.g., "16GB")
   - `gpu_count` - Number of GPUs to allocate

2. **`ModalOrchestratorSettings`** (Modal-specific) - for Modal platform configuration:
   - `gpu` - GPU type specification (e.g., "T4", "A100", "H100")
   - `region` - Cloud region preference
   - `cloud` - Cloud provider selection
   - `execution_mode` - How to run the pipeline
   - `timeout`, `min_containers`, `max_containers` - Performance settings

{% hint style="info" %}
**GPU Configuration**: Use `ResourceSettings.gpu_count` to specify how many GPUs you need, and `ModalOrchestratorSettings.gpu` to specify what type of GPU. Modal will combine these automatically (e.g., `gpu_count=2` + `gpu="A100"` becomes `"A100:2"`).
{% endhint %}

### Additional configuration

Here's how to configure both types of settings:

```python
from zenml import pipeline
from zenml.config import ResourceSettings
from zenml.integrations.modal.flavors.modal_orchestrator_flavor import (
    ModalOrchestratorSettings,
)

# Configure Modal-specific settings
modal_settings = ModalOrchestratorSettings(
    gpu="A100",                 # GPU type (optional)
    region="us-east-1",         # Preferred region
    cloud="aws",                # Cloud provider
    execution_mode="pipeline",  # or "per_step"
    timeout=3600,               # 1 hour timeout
    min_containers=1,           # Keep warm containers
    max_containers=10,          # Scale up to 10 containers
)

# Configure hardware resources (quantities)
resource_settings = ResourceSettings(
    cpu_count=16,    # Number of CPU cores
    memory="32GB",   # 32GB RAM
    gpu_count=1,     # Number of GPUs (combined with the GPU type above)
)

@pipeline(
    settings={
        "orchestrator": modal_settings,
        "resources": resource_settings,
    }
)
def my_modal_pipeline():
    # Your pipeline steps here
    ...
```

### Resource configuration

{% hint style="info" %}
**Pipeline-Level Resources**: The Modal orchestrator uses pipeline-level resource settings to configure the Modal function for the entire pipeline. All steps share the same Modal function resources. Configure resources at the `@pipeline` level for best results.
{% endhint %}

You can configure pipeline-wide resource requirements using `ResourceSettings` for hardware resources and `ModalOrchestratorSettings` for Modal-specific configuration:

```python
from zenml import pipeline, step
from zenml.config import ResourceSettings
from zenml.integrations.modal.flavors.modal_orchestrator_flavor import (
    ModalOrchestratorSettings,
)

@step
def first_step():
    # Uses the pipeline-level resource configuration
    ...

@step
def second_step():
    # Uses the same pipeline-level resource configuration
    ...

# Configure resources at the pipeline level (recommended)
@pipeline(
    settings={
        "resources": ResourceSettings(
            cpu_count=16,
            memory="32GB",
            gpu_count=1,  # These resources apply to the entire pipeline
        ),
        "orchestrator": ModalOrchestratorSettings(
            gpu="A100",  # GPU type for the entire pipeline
            region="us-west-2",
        ),
    }
)
def my_pipeline():
    first_step()   # Runs with pipeline resources: 16 CPUs, 32GB RAM, 1x A100
    second_step()  # Runs with the same resources: 16 CPUs, 32GB RAM, 1x A100
```

### Execution modes

The Modal orchestrator supports two execution modes:

1. **`pipeline` (default)**: Runs the entire pipeline in a single Modal function for maximum speed and cost efficiency.

2. **`per_step`**: Runs each step in a separate Modal function call for granular control and debugging.

{% hint style="info" %}
**Resource Sharing**: Both execution modes use the same Modal function with the same resource configuration (from pipeline-level settings). The difference is whether steps run sequentially in one function call (`pipeline`) or as separate function calls (`per_step`).
{% endhint %}

```python
# Fast execution (default) - entire pipeline in one function
modal_settings = ModalOrchestratorSettings(
    execution_mode="pipeline"
)

# Granular execution - each step separate (useful for debugging)
modal_settings = ModalOrchestratorSettings(
    execution_mode="per_step"
)
```

### Using GPUs

Modal makes it easy to use GPUs for your ML workloads. Use `ResourceSettings` to specify the number of GPUs and `ModalOrchestratorSettings` to specify the GPU type:

```python
from zenml import step
from zenml.config import ResourceSettings
from zenml.integrations.modal.flavors.modal_orchestrator_flavor import (
    ModalOrchestratorSettings,
)

@step(
    settings={
        "resources": ResourceSettings(
            gpu_count=1  # Number of GPUs to allocate
        ),
        "orchestrator": ModalOrchestratorSettings(
            gpu="A100",  # GPU type: "T4", "A10G", "A100", "H100"
            region="us-east-1",
        ),
    }
)
def train_model():
    # Your GPU-accelerated training code
    # Modal will provision 1x A100 GPU (gpu_count=1 + gpu="A100")
    import torch

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(f"Using device: {device}")
    ...
```

Available GPU types include:

- `T4` - Cost-effective for inference and light training
- `A10G` - Balanced performance for training and inference
- `A100` - High performance for large model training
- `H100` - Latest generation for maximum performance

**Examples of GPU configurations (applied to the entire pipeline):**

```python
# Pipeline with a single GPU - configure at the pipeline level
@pipeline(
    settings={
        "resources": ResourceSettings(gpu_count=1),
        "orchestrator": ModalOrchestratorSettings(gpu="A100"),
    }
)
def gpu_pipeline():
    # All steps in this pipeline will have access to 1x A100 GPU
    step_one()
    step_two()

# Multiple GPUs - configure at the pipeline level
@pipeline(
    settings={
        "resources": ResourceSettings(gpu_count=4),
        "orchestrator": ModalOrchestratorSettings(gpu="A100"),
    }
)
def multi_gpu_pipeline():
    # All steps in this pipeline will have access to 4x A100 GPUs
    training_step()
    evaluation_step()
```

### Synchronous vs. asynchronous execution

You can choose whether to wait for pipeline completion or run asynchronously:

```python
# Wait for completion (default)
modal_settings = ModalOrchestratorSettings(
    synchronous=True
)

# Fire-and-forget execution
modal_settings = ModalOrchestratorSettings(
    synchronous=False
)
```
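
With `synchronous=False`, the client returns as soon as the run is submitted. One way to check on the run later is via the ZenML client (a minimal sketch; the run name is illustrative):

```python
from zenml.client import Client

# Look up a previously submitted run by name (hypothetical name)
run = Client().get_pipeline_run("my_modal_pipeline-2025_01_01-12_00_00")
print(run.status)
```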

### Authentication with different environments

For production deployments, you can specify different Modal environments:

```python
modal_settings = ModalOrchestratorSettings(
    environment="production",  # or "staging", "dev", etc.
    workspace="my-company",
)
```
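
If you work with multiple Modal environments, one pattern is to keep a separate ZenML stack per environment, e.g. one for development and one for production. A sketch, assuming the `environment` option can be set at registration time like the other flavor options (stack and orchestrator names are illustrative):

```shell
# One orchestrator (and stack) per Modal environment
zenml orchestrator register modal_dev --flavor=modal --environment=dev
zenml orchestrator register modal_prod --flavor=modal --environment=production

zenml stack register dev_stack -o modal_dev ... --set
```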

### Warm containers for faster execution

The Modal orchestrator uses persistent apps with warm containers to minimize cold starts:

```python
modal_settings = ModalOrchestratorSettings(
    min_containers=2,   # Keep 2 containers warm
    max_containers=20,  # Scale up to 20 containers
)

@pipeline(
    settings={
        "orchestrator": modal_settings
    }
)
def my_pipeline():
    ...
```

This ensures your pipelines start executing immediately, without waiting for container initialization.

## Best practices

1. **Use pipeline mode for production**: The default `pipeline` execution mode runs your entire pipeline in one function, minimizing overhead and cost.

2. **Separate resource and orchestrator settings**: Use `ResourceSettings` for hardware (CPU, memory, GPU count) and `ModalOrchestratorSettings` for Modal-specific configuration (GPU type, region, etc.).

3. **Configure appropriate timeouts**: Set realistic timeouts for your workloads:
```python
modal_settings = ModalOrchestratorSettings(
    timeout=7200  # 2 hours
)
```

4. **Choose the right region**: Select regions close to your data sources to minimize transfer costs and latency.

5. **Use appropriate GPU types**: Match GPU types to your workload requirements - don't use A100s for simple inference tasks.

6. **Monitor resource usage**: Use Modal's dashboard to track your resource consumption and optimize accordingly.

## Troubleshooting

### Common issues

1. **Authentication errors**: Ensure your Modal token is correctly configured and has the necessary permissions.

2. **Image build failures**: Check that your Docker registry credentials are properly configured in your ZenML stack.

3. **Resource limits**: If you hit resource limits, consider breaking large steps into smaller ones or requesting quota increases from Modal.

4. **Network timeouts**: For long-running steps, ensure your timeout settings are appropriate.

### Getting help

- Check the [Modal documentation](https://modal.com/docs) for platform-specific issues
- Monitor your functions in the [Modal dashboard](https://modal.com/apps)
- Use `zenml logs` to view detailed pipeline execution logs

For more information and a full list of configurable attributes of the Modal orchestrator, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration_code_docs/integrations-modal.html#zenml.integrations.modal.orchestrators).

<figure><img src="https://static.scarf.sh/a.png?x-pxid=f0b4f458-0a54-4fcd-aa95-d5ee424815bc" alt="ZenML Scarf"><figcaption></figcaption></figure>