Skip to content

ITEP-64780 weights uploader #225

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
May 19, 2025
Merged

Conversation

mgumowsk
Copy link
Contributor

@mgumowsk mgumowsk commented May 15, 2025

📝 Description

In order to overcome the problem with installation timeouts due to prolonged initial weights download, we changed the way they are copied to Geti.
In details:

  • Instead of having all weights inside of an image, we keep there only the pretrained_models_v2.json file. This file should be sent to S3 as the first one based on a content of this file it should download the file with weights from the location given in the WEIGHTS_URL parameter passed to the pod. Each file should be then sent to S3 - as it is done now.
  • If certain model already exists in S3, the algorithm skips it and moves to another one, adding information about this fact to logs.
  • The weights download job waits until seaweedfs is been installed, but it does not block the installer.
  • WEIGHTS_URL env is passed to the job to indicate the location of the weights. The default value set to the https://storage.geti.infra-host.com/weights value. Additionally, this job isn't executed if the DISABLE_WEIGHT_UPLOADING parameter is set to true.
  • As irrelevant anymore, local_mirror has been removed from pretrained_models_v2.json.

Fixes #147

✨ Type of Change

Select the type of change your PR introduces:

  • 🐞 Bug fix – Non-breaking change which fixes an issue
  • 🚀 New feature – Non-breaking change which adds functionality
  • 🔨 Refactor – Non-breaking change which refactors the code base
  • 💥 Breaking change – Changes that break existing functionality
  • 📚 Documentation update
  • 🔒 Security update
  • 🧪 Tests

🧪 Testing Scenarios

  • ✅ Tested manually
  • 🤖 Run automated end-to-end tests

✅ Checklist

Before submitting the PR, ensure the following:

  • 🔍 PR title is clear and descriptive
  • 📝 For internal contributors: If applicable, include the JIRA ticket number (e.g., ITEP-123456) in the PR title. Do not include full URLs
  • 💬 I have commented my code, especially in hard-to-understand areas
  • 📄 I have made corresponding changes to the documentation
  • ✅ I have added tests that prove my fix is effective or my feature works

@mgumowsk mgumowsk requested review from a team, jcchr and piotrgrubicki as code owners May 15, 2025 11:50
@github-actions github-actions bot added IAI Interactive AI backend Platform labels May 15, 2025
@michalad1 michalad1 requested a review from Copilot May 15, 2025 12:01
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors and implements the weights uploader service (ITEP-64780), adding configuration, unit tests, Helm charts, and updating related helper scripts.

  • Added unit tests for main in test_weights_uploader.py
  • Defined project metadata and dependencies in pyproject.toml
  • Introduced Helm chart templates for ServiceEntry, RBAC, proxy, and Job resources
  • Implemented core upload/download logic in app/weights_uploader.py and app/pretrained_models/pretrained_models.py
  • Updated interactive AI workflows to use the new download utilities and exposed WEIGHTS_URL in pod definitions

Reviewed Changes

Copilot reviewed 15 out of 19 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
platform/services/weights_uploader/tests/unit/test_weights_uploader.py Added auto-mocked env var fixture and success/failure tests
platform/services/weights_uploader/pyproject.toml Project metadata, dependencies, and tooling config
platform/services/weights_uploader/chart/templates/*.yaml Added Helm templates for ServiceEntry, RBAC, proxy, and Job
platform/services/weights_uploader/app/weights_uploader.py Core S3 upload logic and main workflow
platform/services/weights_uploader/app/pretrained_models/pretrained_models.py Download, unzip, retry, and SHA‐256 verification logic
interactive_ai/.../scripts/run.py Switched calls from download_file_from_s3 to download_file
interactive_ai/.../scripts/minio_util.py Refactored download utilities, added URL fallback
interactive_ai/.../trainer_pod_definition.py Exposed WEIGHTS_URL in Kubernetes container environment
Files not reviewed (4)
  • platform/services/weights_uploader/.dockerignore: Language not supported
  • platform/services/weights_uploader/Dockerfile: Language not supported
  • platform/services/weights_uploader/Makefile: Language not supported
  • platform/services/weights_uploader/app/pretrained_models/pretrained_models_v2.json: Language not supported
Comments suppressed due to low confidence (3)

platform/services/weights_uploader/pyproject.toml:6

  • There's a typo in the description—"Wights" should be "Weights".
description = "Wights uploader service for Geti"

interactive_ai/workflows/otx_domain/trainer/otx_v2/scripts/minio_util.py:16

  • The download_file_from_url function uses bucket_name inside but does not accept it as a parameter, causing an undefined variable at runtime.
def download_file_from_url(client: Minio, object_name: str, file_path: str) -> None:

interactive_ai/workflows/otx_domain/trainer/otx_v2/scripts/minio_util.py:23

  • requests is used here but not imported at the top of the module, leading to a NameError.
resp = requests.get(url, timeout=600)

jcchr
jcchr previously approved these changes May 15, 2025
piotrgrubicki
piotrgrubicki previously approved these changes May 15, 2025
@piotrgrubicki piotrgrubicki enabled auto-merge May 15, 2025 14:52
@mgumowsk mgumowsk dismissed stale reviews from piotrgrubicki and jcchr via de590dd May 16, 2025 07:50
@mgumowsk mgumowsk force-pushed the mgumowsk/ITEP-64780-weights-uploader branch from 16bf548 to de590dd Compare May 16, 2025 07:50
@mgumowsk mgumowsk force-pushed the mgumowsk/ITEP-64780-weights-uploader branch from de590dd to c2961ee Compare May 16, 2025 08:17
Copy link
Contributor

@AlexanderBarabanov AlexanderBarabanov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

jcchr
jcchr previously approved these changes May 16, 2025
@mgumowsk mgumowsk force-pushed the mgumowsk/ITEP-64780-weights-uploader branch from 2399ec7 to 8a3830e Compare May 19, 2025 08:11
@piotrgrubicki piotrgrubicki added this pull request to the merge queue May 19, 2025
Merged via the queue into main with commit 90e00eb May 19, 2025
10 checks passed
@piotrgrubicki piotrgrubicki deleted the mgumowsk/ITEP-64780-weights-uploader branch May 19, 2025 09:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IAI Interactive AI backend Platform
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] Failed to install geti using the installer.sh
7 participants