69 changes: 61 additions & 8 deletions plans/2026-01-06-custom-deployment-sdks.md
@@ -242,14 +242,67 @@ This matches how Prefect's server validates parameters (`actions.py:287-304`).
- `SDKData` - Complete data needed for generation (flows, work pools, metadata)

**Status**:
- [ ] Naming utilities created
- [ ] Safe identifier conversion (ASCII, keywords, digits)
- [ ] Safe class name conversion (PascalCase)
- [ ] Reserved name detection and avoidance
- [ ] Collision detection and suffix generation
- [ ] Data models created
- [ ] Unit tests for edge cases (emoji, all-unicode, empty, keywords)
- [ ] Unit tests pass
- [x] Naming utilities created
- [x] Safe identifier conversion (ASCII, keywords, digits)
- [x] Safe class name conversion (PascalCase)
- [x] Reserved name detection and avoidance
- [x] Collision detection and suffix generation
- [x] Data models created
- [x] Unit tests for edge cases (emoji, all-unicode, empty, keywords)
- [x] Unit tests pass

**Phase 2 Implementation Notes** (deviations from plan):

1. **Unicode handling differs from plan**
- Plan: "Strip/replace non-ASCII characters with underscores"
- Implementation: Unicode separators/punctuation (em-dash, non-breaking space) become underscores; other non-ASCII chars are dropped after NFKD normalization
- Rationale: Prevents word-merging (e.g., `a—b` → `a_b` not `ab`) while allowing accented chars to normalize (é → e)
- Result: `café-data` → `cafe_data` (not `caf_data`), `🚀-deploy` → `deploy` (not `_deploy`)

2. **Class names don't get underscore suffix for keywords**
- Plan: `class` → `Class_`
- Implementation: `class` → `Class`
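- Rationale: Python is case-sensitive, so `Class` is a valid identifier. PascalCase avoids every lowercase keyword; only the capitalized keywords `True`, `False`, and `None` remain possible collisions.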

3. **Expanded reserved names beyond plan**
- Plan: Flow=`{flows}`, Deployment=`{run, run_async}`
- Implementation: Flow=`{flows, deployments, DeploymentName}`, Deployment=`{run, run_async, with_options, with_infra}`, Module=`{all}`
- Rationale: Prevents conflicts with Phase 3 generated SDK surface

4. **Reserved names stored in normalized form**
- Plan doesn't specify
- Implementation: Reserved sets use normalized names (e.g., `"all"` not `"__all__"`) since `safe_identifier()` normalizes before checking
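- Rationale: Otherwise `safe_identifier("__all__", ..., "module")` would normalize the input to `"all"`, miss the un-normalized entry `"__all__"` in the reserved set, and return `"all"` without avoiding it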

5. **WorkPoolInfo.type renamed to pool_type**
- Plan: `WorkPoolInfo` has `type` field
- Implementation: Field named `pool_type`
- Rationale: Avoids shadowing Python built-in `type`

6. **SDKData.deployment_names is derived, not stored**
- Plan: `SDKData` has `deployment_names` as stored field
- Implementation: Computed property derived from `flows`
- Rationale: Single source of truth; prevents data divergence

7. **Deterministic ordering added**
- Plan doesn't specify ordering
- Implementation: `deployment_names` and `all_deployments()` return sorted results
- Rationale: Ensures deterministic code generation regardless of API response order

8. **Additional SDKData convenience methods**
- Plan doesn't specify
- Implementation: Added `all_deployments()`, `flow_count`, `deployment_count`, `work_pool_count`
- Rationale: Simplifies template rendering and statistics reporting

9. **SDKGenerationMetadata.api_url added**
- Plan doesn't include this field
- Implementation: Added `api_url` field
- Rationale: Better traceability of SDK generation source

10. **German ß limitation**
- Plan doesn't address
- Implementation: ß is dropped (NFKD doesn't decompose it to "ss"), so `straße` → `strae`
- Documented as known limitation
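
The Unicode behavior in notes 1 and 10 can be illustrated with a minimal sketch. This is not the PR's `safe_identifier()`; the helper names `ascii_fold` and `to_identifier_sketch` are hypothetical, and keyword, leading-digit, and reserved-name handling are left out. It assumes that "separators/punctuation" means the Unicode `Z*` and `P*` general categories.

```python
import re
import unicodedata


def ascii_fold(name: str) -> str:
    """Fold a string to ASCII (hypothetical helper, not the PR's code)."""
    out: list[str] = []
    for ch in name:
        if ch.isascii():
            out.append(ch)
            continue
        # Assumption: Unicode separators (Z*) and punctuation (P*) mark word
        # boundaries, so they become underscores instead of vanishing.
        if unicodedata.category(ch)[0] in ("Z", "P"):
            out.append("_")
            continue
        # Otherwise decompose with NFKD and keep only the ASCII pieces:
        # "é" becomes "e" plus a combining accent, and the accent is dropped.
        decomposed = unicodedata.normalize("NFKD", ch)
        out.append("".join(c for c in decomposed if c.isascii()))
    return "".join(out)


def to_identifier_sketch(name: str) -> str:
    """Collapse runs of non-alphanumerics to one underscore and trim the ends."""
    return re.sub(r"[^0-9A-Za-z]+", "_", ascii_fold(name)).strip("_")


print(to_identifier_sketch("café-data"))  # cafe_data
print(to_identifier_sketch("🚀-deploy"))  # deploy
print(to_identifier_sketch("a—b"))        # a_b
print(to_identifier_sketch("straße"))     # strae
```

The last call reproduces the ß limitation: NFKD leaves ß unchanged, so the character is dropped rather than expanded to "ss".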

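The ordering issue behind note 4 can be shown with a small sketch. Every name here is hypothetical: `normalize` stands in for whatever normalization `safe_identifier()` performs, and the trailing-underscore suffix is only an assumed avoidance strategy.

```python
# Hypothetical names throughout; this is not the PR's safe_identifier().
RESERVED_RAW = {"__all__"}     # reserved set written in source form
RESERVED_NORMALIZED = {"all"}  # reserved set written in normalized form


def normalize(name: str) -> str:
    """Stand-in normalization: strip leading and trailing underscores."""
    return name.strip("_")


def avoid_reserved(name: str, reserved: set[str]) -> str:
    """Normalize first, then check the reserved set (the order note 4 describes)."""
    ident = normalize(name)
    # Assumption: a trailing underscore is how a reserved name gets avoided.
    return f"{ident}_" if ident in reserved else ident


print(avoid_reserved("__all__", RESERVED_RAW))         # all  (collision missed)
print(avoid_reserved("__all__", RESERVED_NORMALIZED))  # all_ (collision avoided)
```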
---

139 changes: 139 additions & 0 deletions src/prefect/_sdk/models.py
@@ -0,0 +1,139 @@
"""
Data models for SDK generation.

This module contains the internal data models used to represent workspace data
fetched from the Prefect API, which are then passed to the template renderer.

Note: These models are internal to SDK generation and not part of the public API.
"""

from dataclasses import dataclass, field
from typing import Any


@dataclass
class WorkPoolInfo:
    """Information about a work pool needed for SDK generation.

    Attributes:
        name: The work pool name as it appears in Prefect.
        pool_type: The work pool type (e.g., "kubernetes", "docker", "process").
        job_variables_schema: JSON Schema dict for the work pool's job variables.
            This is the full schema object (e.g., {"type": "object", "properties": {...}})
            from base_job_template["variables"]. Can be empty dict if no job
            variables are defined.
    """

    name: str
    pool_type: str
    job_variables_schema: dict[str, Any] = field(default_factory=dict)


@dataclass
class DeploymentInfo:
    """Information about a deployment needed for SDK generation.

    Attributes:
        name: The deployment name (just the deployment part, not flow/deployment).
        flow_name: The name of the flow this deployment belongs to.
        full_name: The full deployment name in "flow-name/deployment-name" format.
        parameter_schema: JSON Schema dict for the flow's parameters.
            This comes from the deployment's parameter_openapi_schema field.
            Can be empty dict or None if the flow has no parameters.
        work_pool_name: Name of the work pool this deployment uses.
            Can be None if the deployment doesn't use a work pool.
        description: Optional deployment description for docstrings.
    """

    name: str
    flow_name: str
    full_name: str
    parameter_schema: dict[str, Any] | None = None
    work_pool_name: str | None = None
    description: str | None = None


@dataclass
class FlowInfo:
    """Information about a flow and its deployments.

    Groups deployments by their parent flow for organized SDK generation.

    Attributes:
        name: The flow name.
        deployments: List of deployments belonging to this flow.
    """

    name: str
    deployments: list[DeploymentInfo] = field(default_factory=list)


@dataclass
class SDKGenerationMetadata:
    """Metadata about the SDK generation process.

    Attributes:
        generation_time: ISO 8601 timestamp of when the SDK was generated.
        prefect_version: Version of Prefect used for generation.
        workspace_name: Name of the workspace (if applicable).
        api_url: The Prefect API URL used.
    """

    generation_time: str
    prefect_version: str
    workspace_name: str | None = None
    api_url: str | None = None


@dataclass
class SDKData:
    """Complete data needed for SDK generation.

    This is the top-level container passed to the template renderer.

    Attributes:
        metadata: Generation metadata (time, version, workspace).
        flows: Dictionary mapping flow names to FlowInfo objects.
        work_pools: Dictionary mapping work pool names to WorkPoolInfo objects.
    """

    metadata: SDKGenerationMetadata
    flows: dict[str, FlowInfo] = field(default_factory=dict)
    work_pools: dict[str, WorkPoolInfo] = field(default_factory=dict)

    @property
    def deployment_count(self) -> int:
        """Total number of deployments across all flows."""
        return sum(len(flow.deployments) for flow in self.flows.values())

    @property
    def flow_count(self) -> int:
        """Number of flows."""
        return len(self.flows)

    @property
    def work_pool_count(self) -> int:
        """Number of work pools."""
        return len(self.work_pools)

    @property
    def deployment_names(self) -> list[str]:
        """List of all deployment full names (derived from flows).

        Returns names sorted alphabetically for deterministic code generation.
        """
        names: list[str] = []
        for flow in self.flows.values():
            for deployment in flow.deployments:
                names.append(deployment.name)
        return sorted(names)
Comment on lines +120 to +129

Action Required

1. deployment_names returns wrong values 📘 Rule Violation

• `SDKData.deployment_names` is documented as returning each deployment's `full_name`, but it
  actually appends `deployment.name`, so the property's behavior contradicts its own docstring.
• This is misleading and can cause downstream code generation to use incomplete identifiers
  (deployment-only names) instead of `flow-name/deployment-name` names.
• The mismatch also undermines the deterministic-output expectation: the list is still sorted,
  but it contains values other than the intended ones.
Agent Prompt
## Issue description
`SDKData.deployment_names` claims to return deployment full names, but it currently appends `deployment.name`.

## Issue Context
This makes the code misleading and can break downstream consumers expecting `flow-name/deployment-name` strings.

## Fix Focus Areas
- src/prefect/_sdk/models.py[120-129]
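
One possible way to address this comment, sketched with trimmed stand-ins for the PR's dataclasses (only the fields the property needs are kept, and `SDKData.metadata` is omitted for brevity). It assumes the docstring states the intended behavior, so the property should collect `full_name` rather than `name`.

```python
from dataclasses import dataclass, field


@dataclass
class DeploymentInfo:
    name: str
    flow_name: str
    full_name: str


@dataclass
class FlowInfo:
    name: str
    deployments: list[DeploymentInfo] = field(default_factory=list)


@dataclass
class SDKData:
    flows: dict[str, FlowInfo] = field(default_factory=dict)

    @property
    def deployment_names(self) -> list[str]:
        """Sorted "flow-name/deployment-name" strings, as the docstring promises."""
        return sorted(
            deployment.full_name
            for flow in self.flows.values()
            for deployment in flow.deployments
        )


# Illustrative data only: one flow with two deployments, listed out of order.
data = SDKData(
    flows={
        "etl": FlowInfo(
            "etl",
            [
                DeploymentInfo("prod", "etl", "etl/prod"),
                DeploymentInfo("dev", "etl", "etl/dev"),
            ],
        )
    }
)
print(data.deployment_names)  # ['etl/dev', 'etl/prod']
```

If the deployment-only names were actually intended, the alternative fix would be to correct the docstring instead.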



    def all_deployments(self) -> list[DeploymentInfo]:
        """Return a flat list of all deployments.

        Returns deployments sorted by full_name for deterministic code generation.
        """
        deployments: list[DeploymentInfo] = []
        for flow in self.flows.values():
            deployments.extend(flow.deployments)
        return sorted(deployments, key=lambda d: d.full_name)