-
Notifications
You must be signed in to change notification settings - Fork 13
periodic sync upstream KF to midstream ODH #291
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
periodic sync upstream KF to midstream ODH #291
Conversation
* add support to async job for huggingface source This enables the model storage async job to have as the source a hf URI in the following forms: hf://repo-name/model-name hf://repo-name/model-name:hash Signed-off-by: Jon Burdo <[email protected]> * use uri as source type for hf Signed-off-by: Jon Burdo <[email protected]> --------- Signed-off-by: Jon Burdo <[email protected]>
This is a temporary file written to in some Makefiles when running clusters locally for testing. Signed-off-by: Jon Burdo <[email protected]>
* chore(async-job): add script to setup and run sample job Signed-off-by: Eric Dobroveanu <[email protected]> * chore: adjust readiness probe for faster tests Signed-off-by: Eric Dobroveanu <[email protected]> * test(async-job): convert bash-based test to python-based Signed-off-by: Eric Dobroveanu <[email protected]> * test(async-job): add readme for integration tests Signed-off-by: Eric Dobroveanu <[email protected]> * chore(async-job): ensure correct make target is run in GH action Signed-off-by: Eric Dobroveanu <[email protected]> * chore(async-job): update lockfile and convert to use boto3 Signed-off-by: Eric Dobroveanu <[email protected]> * test(async-job): simplify the integration tests Signed-off-by: Eric Dobroveanu <[email protected]> * chore(async-job): remove unused job-values.yaml Signed-off-by: Eric Dobroveanu <[email protected]> * chore(async-job): ensure async job has a separate env var from mr service Signed-off-by: Eric Dobroveanu <[email protected]> * chore(async-job): adjust e2e tests to be able to build the images Signed-off-by: Eric Dobroveanu <[email protected]> * chore(async-job): move env vars to the top level Signed-off-by: Eric Dobroveanu <[email protected]> --------- Signed-off-by: Eric Dobroveanu <[email protected]>
* ci: fix root Make image/push Signed-off-by: Matteo Mortari <[email protected]> * to trigger ci Signed-off-by: Matteo Mortari <[email protected]> * revert 8356c82 Signed-off-by: Matteo Mortari <[email protected]> * fix async-job Makefile Signed-off-by: Matteo Mortari <[email protected]> * invoke script with expected IMG var Signed-off-by: Matteo Mortari <[email protected]> --------- Signed-off-by: Matteo Mortari <[email protected]>
Resolving conflict On .github/workflows/async-upload-test.yml Makefile Signed-off-by: Matteo Mortari <[email protected]>
|
@tarilabs: you cannot LGTM your own PR. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
WalkthroughThis update introduces support for a new source type ("uri") in the async-upload job, enabling Hugging Face model downloads, and adds comprehensive integration tests for the async-upload workflow within Kubernetes. It refactors Makefile and workflow image variable logic, updates ignore files, and enhances documentation and test configuration for integration scenarios. Changes
Sequence Diagram(s)sequenceDiagram
participant Tester as Integration Test
participant K8s as Kubernetes Cluster
participant MR as Model Registry
participant MinIO as S3 Storage
participant Job as Async-Upload Job
Tester->>MR: Create model, version, placeholder artifact
Tester->>MinIO: Upload model file
Tester->>K8s: Apply async-upload Job manifest
K8s->>Job: Schedule Job pod
Job->>MinIO: Download model file from S3
Job->>MR: Update artifact with URI/state
Tester->>K8s: Poll for Job completion
Tester->>MR: Verify artifact updated
Tester->>K8s: Cleanup Job resource
sequenceDiagram
participant Job as Async-Upload Job
participant HF as Hugging Face Hub
Job->>Job: Parse config (source type: "uri")
Job->>HF: Download model via huggingface_hub
Job->>Job: Store model in destination
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: tarilabs The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
2 similar comments
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: tarilabs The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: tarilabs The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 8
🧹 Nitpick comments (5)
jobs/async-upload/samples/sample_job_s3_to_oci.yaml (1)
5-7: Hard-codednamespace: defaultlimits sample portabilityEmbedding
namespace: defaultforces users to edit the file before applying it in any namespace-scoped environment (e.g., GitOps overlays, multi-tenant clusters).Optional refactor:
- namespace: default + # namespace: defaultor document that users should patch the namespace when applying.
Also applies to: 17-19, 28-30
.github/workflows/async-upload-test.yml (1)
78-80: Fix misleading step name.The step name says "Execute Sample Job E2E test" but it's actually running integration tests via
make test-integration. Consider updating the name for clarity.- - name: Execute Sample Job E2E test + - name: Execute Sample Job Integration testjobs/async-upload/Makefile (1)
21-21: Track the consistency issue in an issueThe TODO mentions an important consistency issue (RHOAIENG-30453). Consider creating a GitHub issue to track this if not already done.
Would you like me to create an issue to track this IMG tag consistency problem?
jobs/async-upload/tests/integration/test_integration_async_upload.py (2)
224-224: Use explicit exception chainingFollow Python best practices for exception handling.
Apply this diff:
- raise Exception(f"Failed to upload to MinIO: {e}") + raise Exception(f"Failed to upload to MinIO: {e}") from e
296-296: Remove unnecessary f-string prefixThe string doesn't contain any placeholders.
- key = f"my-model/mnist-8.onnx" + key = "my-model/mnist-8.onnx"
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
jobs/async-upload/poetry.lockis excluded by!**/*.lock
📒 Files selected for processing (15)
.github/workflows/async-upload-test.yml(2 hunks).gitignore(1 hunks)Makefile(1 hunks)clients/python/.gitignore(0 hunks)jobs/async-upload/.dockerignore(1 hunks)jobs/async-upload/Makefile(3 hunks)jobs/async-upload/job/config.py(8 hunks)jobs/async-upload/job/download.py(2 hunks)jobs/async-upload/pyproject.toml(2 hunks)jobs/async-upload/samples/sample_job_s3_to_oci.yaml(3 hunks)jobs/async-upload/tests/conftest.py(2 hunks)jobs/async-upload/tests/integration/README.md(1 hunks)jobs/async-upload/tests/integration/__init__.py(1 hunks)jobs/async-upload/tests/integration/test_integration_async_upload.py(1 hunks)manifests/kustomize/base/model-registry-deployment.yaml(1 hunks)
💤 Files with no reviewable changes (1)
- clients/python/.gitignore
🧰 Additional context used
🪛 Ruff (0.12.2)
jobs/async-upload/tests/integration/test_integration_async_upload.py
3-3: json imported but unused
Remove unused import: json
(F401)
9-9: typing.Dict imported but unused
Remove unused import
(F401)
9-9: typing.Any imported but unused
Remove unused import
(F401)
13-13: yaml imported but unused
Remove unused import: yaml
(F401)
224-224: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
245-245: Local variable mr_host_url is assigned to but never used
Remove assignment to unused variable mr_host_url
(F841)
296-296: f-string without any placeholders
Remove extraneous f prefix
(F541)
🔇 Additional comments (18)
jobs/async-upload/tests/integration/__init__.py (1)
1-1: Docstring-only module looks fineNo further action required.
jobs/async-upload/.dockerignore (1)
1-2: Good call excluding the runtime PID fileExcluding
.port-forwards.pidprevents unnecessary cache invalidation during image builds.manifests/kustomize/base/model-registry-deployment.yaml (1)
68-71: Verify increased readiness-probe frequency
periodSecondswas reduced from the default 60 s to 20 s.
If/readyz/isDirtyperforms I/O-heavy checks, a 3× increase in probe rate could create avoidable load spikes or DOS the service under stress.Consider benchmarking the endpoint latency and, if needed, tune
failureThreshold/timeoutSecondsaccordingly..gitignore (1)
27-29: Consistent ignore ruleAdding
.port-forwards.pidat repo root keeps the ignore logic in one place; 👍Makefile (1)
34-40: LGTM! Well-structured image variable logic.The conditional logic properly prioritizes explicitly set
IMGvalues while providing sensible fallbacks. This improves flexibility for different deployment scenarios and CI/CD workflows.jobs/async-upload/pyproject.toml (2)
30-31: LGTM! Proper pytest marker configuration.The integration marker addition is correctly formatted and aligns with the test configuration changes in conftest.py.
18-20: Dependency versions are up to date and secure
- kubernetes: latest 33.1.0 falls within “^33.0.0”
- requests: latest 2.32.4 falls within “^2.31.0”
- No known vulnerabilities reported for these versions (consider adding a
safety checkstep in CI)No changes required.
.github/workflows/async-upload-test.yml (1)
20-29: LGTM! Well-organized environment variables.The separation of async job image variables (
JOB_IMG_*) from model registry server variables (IMG_*) provides clear distinction and aligns with the new image management approach.jobs/async-upload/job/config.py (1)
22-23: LGTM! Proper integration of new URI source type.The addition of "uri" to the source type choices and the corresponding argument is well-integrated with the existing argument structure.
jobs/async-upload/tests/conftest.py (2)
11-33: LGTM! Well-structured integration test configuration.The integration test support follows the same pattern as the existing e2e test configuration, providing consistent behavior for:
- Selective test execution with
--integration- Proper skip markers and reasons
- Clean separation between test types
The conditional logic correctly handles all scenarios for running integration, e2e, or default tests.
43-48: LGTM! Consistent command-line option addition.The
--integrationoption follows the same pattern as the existing--e2eoption with appropriate defaults and help text.jobs/async-upload/tests/integration/README.md (3)
16-16: Documentation inconsistency with implementationThe README states "No external CLI tools required - everything is pure Python!" but the actual test implementation uses
subprocesscalls to executekubectlcommands.Likely an incorrect or invalid review comment.
32-39: Update documentation to reflect actual implementationThe "Pure Python Approach" section is misleading. The integration test actually uses:
- Subprocess calls to execute
kubectlcommandskubectl apply -kwhich invokes kustomize CLI- Shell commands for job management
Either update the documentation to accurately reflect the implementation or refactor the test to truly use a pure Python approach.
Likely an incorrect or invalid review comment.
73-73: Correct the misleading statement about kustomizeThe documentation states "pure Python YAML patching (no kustomize CLI)" but the test actually uses
kubectl apply -kwhich invokes the kustomize CLI.Likely an incorrect or invalid review comment.
jobs/async-upload/job/download.py (1)
47-75: Well-implemented Hugging Face download functionThe function has excellent validation logic and error handling. Good attribution to the kserve source.
jobs/async-upload/Makefile (2)
1-13: Good separation of job and server image variablesThe clear distinction between
JOB_IMG_*for the async-upload job andIMGfor the model registry server improves maintainability.
70-83: Well-structured integration test targetsThe new targets follow established patterns and correctly pass the job image URI to the tests.
jobs/async-upload/tests/integration/test_integration_async_upload.py (1)
167-200: Well-implemented job status pollingGood implementation with proper timeout handling and status checking.
| def _validate_uri_config(cfg: Dict[str, Any]) -> None: | ||
| if not ("uri" in cfg and isinstance(cfg["uri"], str)): | ||
| raise ValueError("uri must be set to a string") | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Enhance URI validation for security.
The current validation only checks if the URI is a string, which is insufficient for security. Consider adding proper URL validation and scheme restrictions.
def _validate_uri_config(cfg: Dict[str, Any]) -> None:
+ import urllib.parse
if not ("uri" in cfg and isinstance(cfg["uri"], str)):
raise ValueError("uri must be set to a string")
+
+ # Validate URI format and allowed schemes
+ try:
+ parsed = urllib.parse.urlparse(cfg["uri"])
+ if not parsed.scheme:
+ raise ValueError("URI must include a scheme (e.g., https://)")
+ if parsed.scheme not in ["http", "https", "hf"]: # Add allowed schemes
+ raise ValueError(f"URI scheme '{parsed.scheme}' is not allowed")
+ except Exception as e:
+ raise ValueError(f"Invalid URI format: {e}")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def _validate_uri_config(cfg: Dict[str, Any]) -> None: | |
| if not ("uri" in cfg and isinstance(cfg["uri"], str)): | |
| raise ValueError("uri must be set to a string") | |
| def _validate_uri_config(cfg: Dict[str, Any]) -> None: | |
| import urllib.parse | |
| if not ("uri" in cfg and isinstance(cfg["uri"], str)): | |
| raise ValueError("uri must be set to a string") | |
| # Validate URI format and allowed schemes | |
| try: | |
| parsed = urllib.parse.urlparse(cfg["uri"]) | |
| if not parsed.scheme: | |
| raise ValueError("URI must include a scheme (e.g., https://)") | |
| if parsed.scheme not in ["http", "https", "hf"]: # Add allowed schemes | |
| raise ValueError(f"URI scheme '{parsed.scheme}' is not allowed") | |
| except Exception as e: | |
| raise ValueError(f"Invalid URI format: {e}") |
🤖 Prompt for AI Agents
In jobs/async-upload/job/config.py around lines 194 to 197, the
_validate_uri_config function only checks if the "uri" key exists and is a
string, which is insufficient for security. Enhance the validation by verifying
that the URI is a properly formatted URL and restrict allowed schemes (e.g.,
only allow "http" and "https"). Use a URL parsing library or regex to validate
the URI format and raise a ValueError if the URI is invalid or uses a disallowed
scheme.
|
|
||
| # TODO: Maybe clean this up, its a little manual | ||
| # Override with command-line arguments if provided. configargparse will prioritize CLI > ENV | ||
| cfg["source"]["uri"] = args.source_uri |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider URI sanitization for logging.
The URI is assigned directly without sanitization. If this contains sensitive information (like tokens), it should be sanitized in the logging function.
Update the _sanitize_config_for_logging function to handle URI sanitization:
def _sanitize_config_for_logging(cfg: Dict[str, Any]) -> Dict[str, Any]:
"""
Create a sanitized copy of the config for logging purposes, masking sensitive values.
"""
import copy
sanitized = copy.deepcopy(cfg)
# ... existing sanitization code ...
+ # Mask sensitive parts of URIs (e.g., tokens in query parameters)
+ if sanitized["source"].get("uri"):
+ import urllib.parse
+ try:
+ parsed = urllib.parse.urlparse(sanitized["source"]["uri"])
+ if parsed.query:
+ # Mask query parameters that might contain tokens
+ sanitized["source"]["uri"] = f"{parsed.scheme}://{parsed.netloc}{parsed.path}?***"
+ except:
+ sanitized["source"]["uri"] = "***"
return sanitized📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| cfg["source"]["uri"] = args.source_uri | |
| def _sanitize_config_for_logging(cfg: Dict[str, Any]) -> Dict[str, Any]: | |
| """ | |
| Create a sanitized copy of the config for logging purposes, masking sensitive values. | |
| """ | |
| import copy | |
| sanitized = copy.deepcopy(cfg) | |
| # ... existing sanitization code ... | |
| # Mask sensitive parts of URIs (e.g., tokens in query parameters) | |
| if sanitized["source"].get("uri"): | |
| import urllib.parse | |
| try: | |
| parsed = urllib.parse.urlparse(sanitized["source"]["uri"]) | |
| if parsed.query: | |
| # Mask query parameters that might contain tokens | |
| sanitized["source"]["uri"] = f"{parsed.scheme}://{parsed.netloc}{parsed.path}?***" | |
| except: | |
| sanitized["source"]["uri"] = "***" | |
| return sanitized |
🤖 Prompt for AI Agents
In jobs/async-upload/job/config.py at line 343, the source URI is assigned
directly without sanitization, which risks exposing sensitive information in
logs. Update the _sanitize_config_for_logging function to detect and sanitize
the URI field by removing or masking sensitive parts such as tokens or
credentials before logging. This ensures that any logged configuration data does
not leak sensitive information contained in the URI.
| uri = config["source"]["uri"] | ||
| if uri.startswith(HF_URI_PREFIX): | ||
| download_from_hf(config["source"]["uri"], config["storage"]["path"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Remove redundant URI extraction
The URI is already extracted on line 84 but then accessed again from config on line 86.
Apply this diff:
- download_from_hf(config["source"]["uri"], config["storage"]["path"])
+ download_from_hf(uri, config["storage"]["path"])📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| uri = config["source"]["uri"] | |
| if uri.startswith(HF_URI_PREFIX): | |
| download_from_hf(config["source"]["uri"], config["storage"]["path"]) | |
| uri = config["source"]["uri"] | |
| if uri.startswith(HF_URI_PREFIX): | |
| download_from_hf(uri, config["storage"]["path"]) |
🤖 Prompt for AI Agents
In jobs/async-upload/job/download.py around lines 84 to 86, the variable uri is
extracted from config["source"]["uri"] on line 84 but then the code redundantly
accesses config["source"]["uri"] again on line 86. To fix this, replace
config["source"]["uri"] on line 86 with the already extracted uri variable to
avoid redundant dictionary access.
| @@ -0,0 +1,365 @@ | |||
| """Integration tests for async-upload job functionality.""" | |||
|
|
|||
| import json | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove unused imports
Static analysis correctly identified unused imports that should be removed.
Apply this diff:
-import json
import os
import tempfile
import time
import uuid
from pathlib import Path
-from typing import Dict, Any
import pytest
import requests
-import yamlAlso applies to: 9-9, 13-13
🧰 Tools
🪛 Ruff (0.12.2)
3-3: json imported but unused
Remove unused import: json
(F401)
🤖 Prompt for AI Agents
In jobs/async-upload/tests/integration/test_integration_async_upload.py at lines
3, 9, and 13, there are unused import statements that should be removed to clean
up the code. Identify and delete these unused imports from the specified lines
to improve code quality and maintainability.
| def apply_job_with_strategic_merge( | ||
| rm_id: str, | ||
| mv_id: str, | ||
| ma_id: str, | ||
| job_name: str, | ||
| container_image_uri: str, | ||
| k8s_client | ||
| ) -> str: | ||
| """Apply job using Kustomize strategic merge patches.""" | ||
| import subprocess | ||
| import tempfile | ||
| import time | ||
|
|
||
| # Strategic merge patch template - only patch the image and env vars, keep original job name | ||
| patch_template = f"""apiVersion: batch/v1 | ||
| kind: Job | ||
| metadata: | ||
| name: my-async-upload-job | ||
| spec: | ||
| template: | ||
| spec: | ||
| containers: | ||
| - name: async-upload | ||
| image: {container_image_uri} | ||
| env: | ||
| - name: MODEL_SYNC_MODEL_ID | ||
| value: "{rm_id}" | ||
| - name: MODEL_SYNC_MODEL_VERSION_ID | ||
| value: "{mv_id}" | ||
| - name: MODEL_SYNC_MODEL_ARTIFACT_ID | ||
| value: "{ma_id}" | ||
| """ | ||
|
|
||
| # Get the path to the sample job file | ||
| base_job_path = Path(__file__).parent.parent.parent / "samples" / "sample_job_s3_to_oci.yaml" | ||
|
|
||
| # Kustomization template using relative path and modern patches syntax | ||
| kustomization_template = """apiVersion: kustomize.config.k8s.io/v1beta1 | ||
| kind: Kustomization | ||
| resources: | ||
| - sample_job_s3_to_oci.yaml | ||
| patches: | ||
| - path: patch.yaml | ||
| target: | ||
| kind: Job | ||
| name: my-async-upload-job | ||
| """ | ||
|
|
||
| with tempfile.TemporaryDirectory() as temp_dir: | ||
| temp_path = Path(temp_dir) | ||
|
|
||
| # Copy the base job file into temp directory | ||
| import shutil | ||
| base_job_copy = temp_path / "sample_job_s3_to_oci.yaml" | ||
| shutil.copy2(base_job_path, base_job_copy) | ||
|
|
||
| # Write the patch file | ||
| patch_file = temp_path / "patch.yaml" | ||
| with open(patch_file, "w") as f: | ||
| f.write(patch_template) | ||
|
|
||
| # Write the kustomization file | ||
| kustomize_file = temp_path / "kustomization.yaml" | ||
| with open(kustomize_file, "w") as f: | ||
| f.write(kustomization_template) | ||
|
|
||
| # Delete existing job if it exists (Jobs are immutable) | ||
| delete_result = subprocess.run( | ||
| ["kubectl", "delete", "job", "my-async-upload-job", "-n", "default", "--ignore-not-found=true"], | ||
| capture_output=True, | ||
| text=True, | ||
| check=False | ||
| ) | ||
| if delete_result.returncode == 0: | ||
| print(f"Deleted existing job: {delete_result.stdout.strip()}") | ||
| # Wait a moment for deletion to complete | ||
| time.sleep(3) | ||
|
|
||
| # Apply resources using kubectl apply -k | ||
| result = subprocess.run( | ||
| ["kubectl", "apply", "-k", "."], | ||
| capture_output=True, | ||
| text=True, | ||
| cwd=temp_path, | ||
| check=False | ||
| ) | ||
|
|
||
| if result.returncode != 0: | ||
| raise Exception(f"kubectl apply failed: {result.stderr}") | ||
|
|
||
| # Return the original job name since we're not changing it | ||
| return "my-async-upload-job" | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider using Kubernetes Python client instead of subprocess
This function uses subprocess calls to kubectl, which contradicts the README's claim of a "pure Python approach". Consider refactoring to use the Kubernetes Python client that's already available in the fixtures.
Would you like me to help refactor this to use the Kubernetes Python client for applying resources?
🤖 Prompt for AI Agents
In jobs/async-upload/tests/integration/test_integration_async_upload.py from
lines 70 to 164, the function apply_job_with_strategic_merge uses subprocess
calls to kubectl for deleting and applying Kubernetes resources, which conflicts
with the README's claim of a pure Python approach. Refactor this function to use
the Kubernetes Python client instead: load the YAML manifests, patch the Job
resource programmatically by modifying the container image and environment
variables, delete the existing Job if present using the client API, and create
or replace the Job resource via the client. This removes the need for subprocess
calls and external kubectl commands, aligning with the pure Python approach.
| rm_id: str, | ||
| mv_id: str, | ||
| ma_id: str, | ||
| job_name: str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove unused job_name parameter
The job_name parameter is never used - the function always returns the hardcoded "my-async-upload-job".
Either use the parameter or remove it:
def apply_job_with_strategic_merge(
rm_id: str,
mv_id: str,
ma_id: str,
- job_name: str,
container_image_uri: str,
k8s_client
) -> str:Also applies to: 163-163
🤖 Prompt for AI Agents
In jobs/async-upload/tests/integration/test_integration_async_upload.py at lines
74 and 163, the job_name parameter is declared but never used since the function
returns a hardcoded string. Remove the unused job_name parameter from the
function signature to clean up the code and avoid confusion.
| """ | ||
|
|
||
| # Configuration | ||
| mr_host_url = os.environ.get("MR_HOST_URL", "http://localhost:8080") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove unused variable
The mr_host_url variable is assigned but never used.
Remove the unused variable:
- mr_host_url = os.environ.get("MR_HOST_URL", "http://localhost:8080")
container_image_uri = os.environ.get(📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| mr_host_url = os.environ.get("MR_HOST_URL", "http://localhost:8080") | |
| container_image_uri = os.environ.get( |
🧰 Tools
🪛 Ruff (0.12.2)
245-245: Local variable mr_host_url is assigned to but never used
Remove assignment to unused variable mr_host_url
(F841)
🤖 Prompt for AI Agents
In jobs/async-upload/tests/integration/test_integration_async_upload.py at line
245, the variable mr_host_url is assigned but never used. Remove the entire line
that assigns mr_host_url to clean up the code and avoid unused variable
warnings.
| actual_job_name = apply_job_with_strategic_merge( | ||
| rm_id=rm.id, | ||
| mv_id=mv.id, | ||
| ma_id=ma.id, | ||
| job_name=job_name, | ||
| container_image_uri=container_image_uri, | ||
| k8s_client=k8s_client | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Fix function call after removing job_name parameter
If you accept the suggestion to remove the unused job_name parameter from apply_job_with_strategic_merge, update this call accordingly.
actual_job_name = apply_job_with_strategic_merge(
rm_id=rm.id,
mv_id=mv.id,
ma_id=ma.id,
- job_name=job_name,
container_image_uri=container_image_uri,
k8s_client=k8s_client
)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| actual_job_name = apply_job_with_strategic_merge( | |
| rm_id=rm.id, | |
| mv_id=mv.id, | |
| ma_id=ma.id, | |
| job_name=job_name, | |
| container_image_uri=container_image_uri, | |
| k8s_client=k8s_client | |
| ) | |
| actual_job_name = apply_job_with_strategic_merge( | |
| rm_id=rm.id, | |
| mv_id=mv.id, | |
| ma_id=ma.id, | |
| container_image_uri=container_image_uri, | |
| k8s_client=k8s_client | |
| ) |
🤖 Prompt for AI Agents
In jobs/async-upload/tests/integration/test_integration_async_upload.py around
lines 301 to 308, the function call to apply_job_with_strategic_merge still
includes the job_name parameter, which has been removed from the function
definition. Remove the job_name argument from this call to match the updated
function signature and avoid errors.
f83d358
into
opendatahub-io:main
Signed-off-by: Alex Creasy <[email protected]>
labels are applied
/ok-to-test
/lgtm
/approve
Summary by CodeRabbit
New Features
Bug Fixes
Chores
Documentation