[async-job] E2E Test with Sample Job #1326
Changes from 10 commits
5f30dc6
180ed2b
f1683a0
da6bd7a
08a7afd
04650d3
d5535c9
82b9161
71969a0
26ddd2a
f28329e
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| .port-forwards.pid |
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -1,15 +1,22 @@ | ||||||
| IMG_VERSION ?= latest | ||||||
| IMG_REGISTRY ?= ghcr.io | ||||||
| IMG_ORG ?= kubeflow | ||||||
| IMG_NAME ?= model-registry/job/async-upload | ||||||
| IMG ?= $(IMG_REGISTRY)/$(IMG_ORG)/$(IMG_NAME):$(IMG_VERSION) | ||||||
| JOB_IMG_VERSION ?= latest | ||||||
| JOB_IMG_REGISTRY ?= ghcr.io | ||||||
| JOB_IMG_ORG ?= kubeflow | ||||||
| JOB_IMG_NAME ?= model-registry/job/async-upload | ||||||
| JOB_IMG ?= $(JOB_IMG_REGISTRY)/$(JOB_IMG_ORG)/$(JOB_IMG_NAME):$(JOB_IMG_VERSION) | ||||||
| BUILD_IMAGE ?= true # whether to build the MR server image | ||||||
| CLUSTER_NAME ?= mr-e2e | ||||||
|
|
||||||
| # MR Server Params | ||||||
| IMG_VERSION ?= latest | ||||||
| IMG ?= ghcr.io/kubeflow/model-registry/server:$(IMG_VERSION) | ||||||
|
Member
I'm really tempted to
Suggested change
but I don't want to diverge on the scope of the PR too much :) |
||||||
|
|
||||||
| .PHONY: deploy-latest-mr | ||||||
| deploy-latest-mr: | ||||||
| cd ../../ && \ | ||||||
| $(if $(filter true,$(BUILD_IMAGE)),\ | ||||||
| IMG_VERSION=${IMG_VERSION} make image/build ARGS="--load$(if ${DEV_BUILD}, --target dev-build)" && \ | ||||||
| IMG_VERSION=${IMG_VERSION} IMG=${IMG} make image/build ARGS="--load$(if ${DEV_BUILD}, --target dev-build)" && \ | ||||||
| ,\ | ||||||
| docker pull $(IMG) && \ | ||||||
|
||||||
| ) \ | ||||||
| LOCAL=1 ./scripts/deploy_on_kind.sh | ||||||
| kubectl port-forward -n kubeflow services/model-registry-service 8080:8080 & echo $$! >> .port-forwards.pid | ||||||
|
|
@@ -28,8 +35,8 @@ deploy-local-registry: | |||||
|
|
||||||
| .PHONY: dev-load-image | ||||||
| dev-load-image: | ||||||
| docker buildx build --load -t $(IMG) . | ||||||
| kind load docker-image $(IMG) -n mr-e2e | ||||||
| docker buildx build --load -t $(JOB_IMG) . | ||||||
| kind load docker-image $(JOB_IMG) -n $(CLUSTER_NAME) | ||||||
|
|
||||||
| .PHONY: test | ||||||
| test: | ||||||
|
|
@@ -60,6 +67,20 @@ test-e2e-cleanup: | |||||
| rm -f .port-forwards.pid; \ | ||||||
| fi | ||||||
|
|
||||||
| .PHONY: test-integration | ||||||
| test-integration: deploy-latest-mr deploy-local-registry deploy-test-minio dev-load-image | ||||||
| @echo "Starting test-integration" | ||||||
| @$(MAKE) test-integration-run; STATUS=$$?; \ | ||||||
| $(MAKE) test-e2e-cleanup; \ | ||||||
| exit $$STATUS | ||||||
|
|
||||||
| .PHONY: test-integration-run | ||||||
| test-integration-run: | ||||||
| @echo "Ensuring all extras are installed..." | ||||||
| poetry install --all-extras --with integration | ||||||
| @echo "Running integration tests..." | ||||||
| CONTAINER_IMAGE_URI=$(JOB_IMG) poetry run pytest --integration tests/integration/ -vs | ||||||
|
|
||||||
| .PHONY: install | ||||||
| install: | ||||||
| poetry install | ||||||
Large diffs are not rendered by default.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -3,6 +3,7 @@ apiVersion: v1 | |
| kind: Secret | ||
| metadata: | ||
| name: my-s3-credentials | ||
| namespace: default | ||
| stringData: | ||
| AWS_ACCESS_KEY_ID: minioadmin | ||
| AWS_SECRET_ACCESS_KEY: minioadmin | ||
|
|
@@ -14,6 +15,7 @@ apiVersion: v1 | |
| kind: Secret | ||
| metadata: | ||
| name: my-oci-credentials | ||
| namespace: default | ||
| type: kubernetes.io/dockerconfigjson | ||
| stringData: | ||
| .dockerconfigjson: '{"auths": {"distribution-registry-test-service.default.svc.cluster.local:5001": {"auth": "","email": "[email protected]"}}}' | ||
|
|
@@ -24,6 +26,7 @@ apiVersion: batch/v1 | |
| kind: Job | ||
| metadata: | ||
| name: my-async-upload-job | ||
| namespace: default | ||
| labels: | ||
| app.kubernetes.io/name: model-registry-async-job | ||
| app.kubernetes.io/component: async-job | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,83 @@ | ||
| # Integration Tests for Async-Upload Job | ||
|
|
||
| This directory contains integration tests for the async-upload job functionality, specifically testing the complete workflow from model creation through job execution and validation. | ||
|
|
||
| ## Installation | ||
|
|
||
| To run the integration tests, you need to install the integration test dependencies: | ||
|
|
||
| ```bash | ||
| # Install all dependencies including integration test dependencies | ||
| poetry install --with integration | ||
|
|
||
| # Or install just the integration group | ||
| poetry install --only integration | ||
|
|
||
| # No external CLI tools required - everything is pure Python! | ||
| ``` | ||
|
|
||
| ## Dependencies Added | ||
|
|
||
| The integration tests require the following additional dependencies: | ||
|
|
||
| ### Main Dependencies (added to `[tool.poetry.dependencies]`) | ||
|
|
||
| - **`requests`**: For HTTP calls (downloading models, uploading to MinIO) | ||
| - **`pyyaml`**: For YAML processing (kustomization files) | ||
|
|
||
| ### Integration Test Dependencies (added to `[tool.poetry.group.integration.dependencies]`) | ||
|
|
||
| - **`kubernetes`**: Official Python client for Kubernetes API operations | ||
|
|
||
| ### Pure Python Approach | ||
|
|
||
| The integration tests use a pure Python approach without external CLI tools: | ||
|
|
||
| - **No subprocess calls**: All operations use Python libraries | ||
| - **No kustomize CLI**: YAML patching is done using pure Python dict operations | ||
| - **No shell commands**: Everything is handled through Python APIs | ||
|
|
||
| ## Running the Tests | ||
|
|
||
| ```bash | ||
| # Run integration tests only | ||
| poetry run pytest --integration tests/integration/ -v | ||
|
|
||
| # Run with environment variables | ||
| MR_HOST_URL=http://my-registry:8080 poetry run pytest --integration tests/integration/ -v | ||
|
|
||
| # Run all tests including integration | ||
| poetry run pytest --integration tests/ -v | ||
| ``` | ||
|
|
||
| ## Test Requirements | ||
|
|
||
| The integration tests require: | ||
|
|
||
| 1. **Model Registry service** running (default: `http://localhost:8080`) | ||
| 2. **MinIO service** running (default: `http://localhost:9000`) | ||
| 3. **Kubernetes cluster** with kubectl configured | ||
| 4. **OCI registry** for job artifact storage | ||
|
|
||
| ## Environment Variables | ||
|
|
||
| - `MR_HOST_URL`: Model Registry URL (default: `http://localhost:8080`) | ||
| - `CONTAINER_IMAGE_URI`: Container image for the async-upload job (default: `ghcr.io/kubeflow/model-registry/job/async-upload:latest`) | ||
|
|
||
| ## What the Tests Do | ||
|
|
||
| The integration tests validate the complete async-upload job workflow: | ||
|
|
||
| 1. **Model Registry Setup**: Creates RegisteredModel, ModelVersion, and placeholder ModelArtifact | ||
| 2. **File Operations**: Downloads ONNX model and uploads to MinIO using pure Python | ||
| 3. **Kubernetes Job**: Creates and applies job using pure Python YAML patching (no kustomize CLI) | ||
| 4. **Validation**: Verifies job completion and artifact state updates using kubernetes client | ||
|
|
||
| ## Debugging Failed Tests | ||
|
|
||
| If tests fail, check: | ||
|
|
||
| 1. **Services are running**: Model Registry, MinIO, Kubernetes cluster | ||
| 2. **Connectivity**: Can reach all required services | ||
| 3. **Permissions**: Kubernetes permissions for job creation | ||
| 4. **Logs**: Integration test captures and displays pod logs on failure |
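The README's "Kubernetes Job" and "Validation" steps describe patching the job manifest with plain dict operations and then checking job completion. A minimal sketch of what that could look like follows. It is not the code from this PR: `patch_job_manifest`, `job_outcome`, the container name, and the replacement image and namespace are all hypothetical, and the manifest is assumed to be already parsed into a dict (for example by `yaml.safe_load`). The only Kubernetes detail relied on is that a Job reports `Complete` or `Failed` entries in `status.conditions`.

```python
import copy
from typing import Optional


def patch_job_manifest(manifest: dict, image: str, namespace: str) -> dict:
    """Return a patched copy of a Job manifest; the input is left untouched."""
    patched = copy.deepcopy(manifest)
    patched.setdefault("metadata", {})["namespace"] = namespace
    # Point every container in the pod template at the image under test.
    for container in patched["spec"]["template"]["spec"]["containers"]:
        container["image"] = image
    return patched


def job_outcome(job: dict) -> Optional[str]:
    """Map Job status conditions to 'succeeded', 'failed', or None (not finished)."""
    for cond in job.get("status", {}).get("conditions") or []:
        if cond.get("status") != "True":
            continue
        if cond.get("type") == "Complete":
            return "succeeded"
        if cond.get("type") == "Failed":
            return "failed"
    return None


# Hypothetical, trimmed-down manifest in the shape a YAML loader would return.
base_job = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {"name": "my-async-upload-job", "namespace": "default"},
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "async-upload",  # assumed container name
                        "image": "ghcr.io/kubeflow/model-registry/job/async-upload:latest",
                    }
                ]
            }
        }
    },
}

patched_job = patch_job_manifest(
    base_job, image="localhost:5001/async-upload:dev", namespace="e2e"
)
```

Returning a deep copy keeps the base manifest reusable across test cases, which is roughly the role a kustomize overlay would otherwise play.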
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| """Integration tests for async-upload job functionality.""" |
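The README documents two environment variables, `MR_HOST_URL` and `CONTAINER_IMAGE_URI`, along with their defaults. One way the test package might resolve them in a single place is sketched below. `IntegrationConfig` and `load_config` are hypothetical names, not part of this PR; only the variable names and default values come from the README.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Defaults as documented in the integration test README.
DEFAULT_MR_HOST_URL = "http://localhost:8080"
DEFAULT_CONTAINER_IMAGE_URI = "ghcr.io/kubeflow/model-registry/job/async-upload:latest"


@dataclass(frozen=True)
class IntegrationConfig:
    mr_host_url: str
    container_image_uri: str


def load_config(env: Mapping[str, str] = os.environ) -> IntegrationConfig:
    """Read test settings from the environment, falling back to the defaults."""
    return IntegrationConfig(
        mr_host_url=env.get("MR_HOST_URL", DEFAULT_MR_HOST_URL),
        container_image_uri=env.get("CONTAINER_IMAGE_URI", DEFAULT_CONTAINER_IMAGE_URI),
    )


# Passing a plain mapping instead of os.environ keeps the lookup easy to test.
overridden = load_config({"MR_HOST_URL": "http://my-registry:8080"})
```

Keeping the defaults in one module means a change to the image reference only has to be made once on the Python side.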
To ease maintenance in m/s, I would really prefer to have these at the top; that way we just adjust the OCI reference in a single place when porting. Can you restore these Envs here, please?
Yeah, there are some issues with how the ENV vars are being passed around to downstream make commands, etc. So I think a more comprehensive refactor of all the makefiles will be needed. This was really done to get all the tests to actually use the variables they should be using. For example, even providing an `IMG` to the makefile for the mr-server image will not always take the correct `IMG_VERSION`. There are some hard-coded instances in some of these makefiles.

Let me take a look and see if I can do a minimal refactor with the env vars here restored.