Unified Serving Capture, Memory-Only Isolation, and Realtime Hardening #3927
This commit introduces a new capture configuration system that simplifies the handling of capture modes in ZenML. The `Capture` class is now used to define capture settings, allowing for more explicit and typed configurations. The pipeline decorator and runtime management have been updated to support this new structure, enhancing clarity and usability. Additionally, the `MemoryStepRuntime` and `RealtimeStepRuntime` classes have been improved to better manage runtime states and error reporting, including the implementation of a circuit breaker for resilience under load. This refactor aims to streamline the serving architecture and improve the overall performance and maintainability of the codebase.
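The circuit breaker mentioned above is not shown in this description. A minimal sketch of the general pattern, assuming a breaker that opens after a number of consecutive failures and allows attempts again after a cooldown, might look like the following. The class name, parameters, and thresholds are hypothetical and are not taken from the ZenML codebase.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures.

    This is an illustrative sketch, not the RealtimeStepRuntime implementation.
    """

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock  # injectable so the cooldown can be tested
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow(self) -> bool:
        """Return True if a call may be attempted right now."""
        if self._opened_at is None:
            return True  # closed: everything passes
        # Open: only allow attempts once the cooldown has elapsed.
        # The next recorded failure re-opens the breaker immediately.
        return self._clock() - self._opened_at >= self.reset_after

    def record_success(self) -> None:
        """Close the breaker and reset the failure counter."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        """Count a failure and open the breaker at the threshold."""
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

A caller would check `allow()` before a background publish, then call `record_success()` or `record_failure()` depending on the outcome.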
This commit introduces significant enhancements to the ZenML serving architecture, focusing on a unified capture configuration and memory-only execution mode. The `Capture` class has been simplified to a single typed API, streamlining the capture process and eliminating confusion around capture modes. Key changes include the introduction of memory-only serving, which ensures no database or filesystem writes occur, and the implementation of a robust realtime runtime with improved resource management and error handling. Additionally, request parameter validation has been enhanced to ensure safe merging and type coercion, while logging and metrics have been refined for better observability. These updates aim to provide a more efficient and user-friendly experience for serving pipelines, paving the way for future enhancements and production readiness.
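As an illustration of the request parameter validation described above (allowlist, type coercion, size caps), here is a minimal sketch. It assumes the allowlist is the set of keys in the pipeline's default parameters and that target types are inferred from the non-`None` default values. The function name `validate_and_merge` and the length cap are hypothetical; the actual `_validate_and_merge_request_params` may behave differently.

```python
from typing import Any, Dict, Mapping

MAX_STRING_LENGTH = 1024  # hypothetical size cap


def validate_and_merge(
    defaults: Mapping[str, Any],
    request: Mapping[str, Any],
    max_string_length: int = MAX_STRING_LENGTH,
) -> Dict[str, Any]:
    """Merge request params over defaults without mutating the defaults."""
    merged = dict(defaults)
    for key, value in request.items():
        # Allowlist: only parameters the pipeline already declares.
        if key not in defaults:
            raise ValueError(f"Unknown parameter: {key!r}")
        # Size cap: reject oversized string payloads.
        if isinstance(value, str) and len(value) > max_string_length:
            raise ValueError(f"Parameter {key!r} exceeds the size cap")
        # Type coercion: convert to the type of the default value.
        target_type = type(defaults[key])
        if target_type is bool and isinstance(value, str):
            # bool("false") is True, so parse booleans explicitly.
            lowered = value.strip().lower()
            if lowered not in ("true", "false"):
                raise ValueError(f"Parameter {key!r} is not a boolean")
            value = lowered == "true"
        elif not isinstance(value, target_type):
            value = target_type(value)  # e.g. "5" -> 5; may raise ValueError
        merged[key] = value
    return merged
```

Rejecting unknown keys rather than silently dropping them is a design choice of this sketch; it makes client mistakes visible at request time.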
This commit introduces a new Alembic migration that creates the `pipeline_endpoint` table and modifies the `pipeline_deployment` table to include additional capture-related columns. The new schema supports enhanced capture configurations for pipeline endpoints, improving the overall functionality and flexibility of the ZenML framework. The migration includes the following changes:
- Creation of the `pipeline_endpoint` table with relevant fields and constraints.
- Addition of columns for capturing various aspects of pipeline deployments, such as memory usage, logs, and metrics.
This update lays the groundwork for improved pipeline management and monitoring capabilities.
This commit updates the weather pipeline example to include a memory-only capture configuration using the `Capture` class. This enhancement allows for improved management of pipeline execution without persisting data to a database or filesystem. Additionally, the `run_entity_manager.py` file has been modified to utilize `field(default_factory=...)` for better initialization of dataclass fields, ensuring that default values are generated correctly. The `step_launcher.py` file has also been updated to handle memory-only stubs gracefully during execution interruptions. These changes contribute to a more robust and flexible pipeline serving architecture, aligning with recent refactors in the ZenML framework.
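To show why `field(default_factory=...)` matters for dataclass fields, the sketch below uses a hypothetical `StepRunStub` class. The class and its fields are illustrative only and are not copied from `run_entity_manager.py`.

```python
from dataclasses import dataclass, field
from typing import Any, Dict
from uuid import UUID, uuid4


@dataclass
class StepRunStub:
    """Hypothetical in-memory stand-in for a step run record."""

    # A plain `id: UUID = uuid4()` would be evaluated once, at class
    # definition time, so every instance would share the same ID.
    id: UUID = field(default_factory=uuid4)
    # A plain `outputs: Dict[str, Any] = {}` is rejected by dataclasses
    # with a ValueError because the default is mutable.
    outputs: Dict[str, Any] = field(default_factory=dict)


a = StepRunStub()
b = StepRunStub()
a.outputs["result"] = 1  # does not leak into b.outputs
```

Each instance gets its own ID and its own dictionary, because the factory is called once per instantiation.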
This commit introduces a new `DefaultStepRuntime` class, consolidating the runtime logic previously scattered across multiple files. The `MemoryStepRuntime` and `DefaultStepRuntime` have been separated into their respective files, enhancing modularity and maintainability. Additionally, the `weather_pipeline.py` example has been updated to ensure proper execution of the pipeline. These changes aim to streamline the step execution process and improve the overall structure of the ZenML codebase, aligning with recent architectural enhancements.
This commit introduces a new document, `beta_todos.md`, outlining a comprehensive checklist for post-beta hardening efforts. The checklist includes key areas such as serving runtime, artifact write semantics, request parameter schema, monitoring, and resource management. Each section details specific tasks aimed at enhancing production readiness and scalability. This addition serves as a roadmap for future improvements and ensures that all necessary steps are documented for achieving a robust and reliable deployment of the ZenML framework.
Describe changes
This PR delivers a focused, pragmatic refactor to make serving reliable and easy to reason about for a beta release. It simplifies capture configuration to a single typed `Capture`, unifies the execution path, introduces memory-only isolation, and hardens the realtime runtime with bounded resources and better shutdown behavior.
Summary
`Capture(memory_only, code, logs, metadata, visualizations, metrics)`.
Below is a high-level architecture and request flow for clarity.
Motivation
Key Behavioral Changes
`Capture` type; dicts/strings disallowed in code paths.
File-Level Changes (Selected)
Capture & Compiler
- `src/zenml/capture/config.py`: Single `Capture` dataclass; removed BatchCapture/RealtimeCapture/CapturePolicy.
- `src/zenml/config/compiler.py`: Normalizes typed capture into canonical deployment fields.
- `src/zenml/models/v2/core/pipeline_deployment.py`: Adds canonical capture fields to deployment models.
- `src/zenml/zen_stores/schemas/pipeline_deployment_schemas.py`: Adds DB columns for canonical capture fields.
Orchestrator
- `src/zenml/orchestrators/step_launcher.py`:
  - `_validate_and_merge_request_params` (allowlist + type coercion + size caps).
  - `memory://` URIs.
- `src/zenml/orchestrators/run_entity_manager.py`: In-memory step_run stub with minimal config (`enable_*`, `substitutions`).
- `src/zenml/orchestrators/utils.py`: Serving context helpers and docstrings; removed request-level override plumbing.
Execution Runtimes
- `src/zenml/execution/step_runtime.py`:
  - `MemoryStepRuntime`: instance-scoped store/locks; per-run cleanup; no globals.
  - `DefaultStepRuntime.store_output_artifacts`: defensive batch create (retries/backoff), response count validation; TODO for server-side atomicity.
- `src/zenml/execution/realtime_runtime.py`:
Serving Service & Docs
- `src/zenml/deployers/serving/service.py`: Serving context handling; parameter exposure; cleanup.
- `docs/book/serving/*`: Updated to single Capture, serving async default, memory-only warning/behavior.
- `examples/serving/README.md`: Updated to reflect new serving model; memory-only usage.
Configuration & Tuning
- `ZENML_RT_CACHE_TTL_SECONDS` (default 60), `ZENML_RT_CACHE_MAX_ENTRIES` (default 256)
- `ZENML_RT_ERR_REPORT_INTERVAL` (default 15), circuit breaker envs unchanged
Testing & Validation
`/runs` or `/artifact_versions` calls; explicit log: `[Memory-only] … in-process handoff (no runs/artifacts)`.
Risk & Mitigations
Pre-requisites
Please ensure you have done the following:
`develop` and the open PR is targeting `develop`. If your branch wasn't based on develop read Contribution guide on rebasing branch to develop.
Types of changes