-
Notifications
You must be signed in to change notification settings - Fork 12
feat: Extra Node Selector and Tolerations #65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: Extra Node Selector and Tolerations #65
Conversation
📝 WalkthroughWalkthroughAdded optional node role scheduling via a Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Fix all issues with AI Agents 🤖
In @snakemake_executor_plugin_kubernetes/__init__.py:
- Around line 220-251: Wrap the parsing of resources_dict["node_taint"] (the
toleration = str(...), key, rest = toleration.split("=", 1), value, effect =
rest.split(":", 1)) in a try/except that catches ValueError and raises a
WorkflowError with an informative message about the expected "key=value:effect"
format; keep the existing checks for empty parts and effect validity, and when
constructing the kubernetes.client.V1Toleration use consistent keyword spacing
(e.g., key=key, operator="Equal", value=value, effect=effect) and retain the
self.logger.debug call to log the added toleration.
🧹 Nitpick comments (2)
snakemake_executor_plugin_kubernetes/__init__.py (2)
206-209: Consider documenting or making the node selector label key configurable.The implementation hardcodes
"role"as the node selector key. While this may work for specific cluster setups, Kubernetes doesn't enforce a standard "role" label. Consider:
- Documenting that users must label their nodes with
role=<value>for this feature to work- Alternatively, making the label key configurable (e.g.,
node_role_labelsetting) to support different cluster labeling schemesExample of configurable approach
If configurability is desired, you could allow users to specify the full label in the resource:
# Example: node_selector_label="custom-pool=snakemake-exec" if "node_selector_label" in resources_dict: label_spec = str(resources_dict["node_selector_label"]) if "=" in label_spec: key, value = label_spec.split("=", 1) node_selector[key] = value else: node_selector[label_spec] = "true"However, the current implementation is acceptable if the
rolelabel is standard in your environment.
229-238: Optional: Consider the static analysis hint about exception messages.The static analysis tool suggests avoiding long messages outside the exception class (TRY003). However, for this use case, the contextual error messages are appropriate and clear. You may safely ignore this hint, or if you prefer to address it, you could define custom exception subclasses.
This is a low-priority style suggestion and doesn't affect functionality. Based on coding guidelines for this project.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
snakemake_executor_plugin_kubernetes/__init__.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
snakemake_executor_plugin_kubernetes/__init__.py
🪛 Ruff (0.14.10)
snakemake_executor_plugin_kubernetes/__init__.py
230-233: Avoid specifying long messages outside the exception class
(TRY003)
235-238: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (1)
snakemake_executor_plugin_kubernetes/__init__.py (1)
220-251: Integration with existing GPU tolerations is correct.The node_taint toleration handling integrates well with the existing GPU toleration logic (lines 252-298). Both properly initialize
pod_spec.tolerationsif needed and can coexist, allowing pods to tolerate both custom taints and GPU-specific taints.
…s missing characters in the format
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
snakemake_executor_plugin_kubernetes/__init__.py (1)
220-258: LGTM! Error handling and validation are solid.The implementation correctly handles the
node_taintresource with proper error handling for parsing failures and comprehensive validation. The try/except block at lines 227-234 addresses the parsing concern from the previous review.Optionally, you can improve the exception chaining by using
raise ... from Noneto indicate that the ValueError is intentionally being replaced:🔎 Optional improvement for exception chaining
try: key, rest = toleration.split("=", 1) value, effect = rest.split(":", 1) except ValueError: - raise WorkflowError( + raise WorkflowError( f"Invalid node_taint format: '{toleration}'. " "Expected format: key=value:effect" - ) + ) from NoneThis makes it clearer that the ValueError is expected and being intentionally replaced with a more user-friendly WorkflowError.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
snakemake_executor_plugin_kubernetes/__init__.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
snakemake_executor_plugin_kubernetes/__init__.py
🪛 Ruff (0.14.10)
snakemake_executor_plugin_kubernetes/__init__.py
231-234: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
231-234: Avoid specifying long messages outside the exception class
(TRY003)
237-240: Avoid specifying long messages outside the exception class
(TRY003)
242-245: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (1)
snakemake_executor_plugin_kubernetes/__init__.py (1)
206-209: No changes needed — the hardcoded "role" label key is intentional.The code comment explicitly documents that this feature is for scheduling on GKE burst-pool nodes using the
rolelabel key (e.g.,role=snakemake-exec). The hardcoding is not a flexibility issue but rather the correct implementation for this specific use case. Therolelabel is the standard convention for GKE burst pools, and the design is properly documented in the code.Likely an incorrect or invalid review comment.
Added additional functionality in the plugin to account for new resource keys: node_role and node_taint.
node_role: Allows users to define in either default-resources or in rule specific resources to schedule jobs on labeled nodes.
cluster configuration: ie. node-labels=role=snakemake-exec
snakemake resource usage: ie. node_role=snakemake-exec
node_taint: Introduces a new resource key that enables users to specify custom taints on their nodes in order to craft the necessary tolerations for the pod to be scheduled on the tainted node(s). This can be initialized in either default-resources or within rule specific resources. The expected format of this resource is key=value:effect. It is essential to note that the only supported operator is "=" but perhaps in the future this can be expanded upon. Also, there are error mechanisms in place to account for proper formatting and ensuring that the effect is one of the supported NoSchedule, PreferNoSchedule, NoExecute.
cluster configuration: ie. node-taints=workload=snakemake:NoSchedule
snakemake resource usage: ie. node_taint=workload=snakemake:NoSchedule
These extra resources are useful for custom node taints and selectors on a GKE cluster so jobs can land properly on specific nodes/pools and have the associated tolerations.
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.