ITEP-32416 Add FP16 inference with feature flag #233

Draft
itallix wants to merge 3 commits into main

Conversation

@itallix itallix commented May 16, 2025

📝 Description

This PR introduces FEATURE_FLAG_FP16_INFERENCE to control which model precision is used for inference operations. The change allows for more efficient resource utilization while maintaining backward compatibility.

Changes:

  • Added new feature flag FEATURE_FLAG_FP16_INFERENCE to control model precision selection (see the enum sketch after this list)
  • Updated model selection logic to prioritize models based on feature flag setting
  • Implemented fallback mechanism between FP16 and FP32 models
  • Ensured appropriate XAI-enabled models are exported based on selected precision
  • Added tests covering both feature flag states
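As a rough illustration of the first item (not the exact implementation), the flag would be declared next to the existing entries in feature_flag_provider.py. The FeatureFlag class name and the env-based check below are assumptions; only the FEATURE_FLAG_FP16_INFERENCE name comes from this PR.

import os
from enum import Enum

class FeatureFlag(Enum):
    # ... existing flags ...
    FEATURE_FLAG_FP16_INFERENCE = "FEATURE_FLAG_FP16_INFERENCE"

def is_enabled(flag: FeatureFlag) -> bool:
    # Hypothetical lookup: treat the flag as enabled when its env var is "true"
    return os.environ.get(flag.name, "false").lower() == "true"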

Details

When enabled, the system will:

  • Use FP16 models for inference operations
  • Fall back to FP32 models if FP16 isn't available, and vice versa (see the selection sketch below)
  • Export appropriate XAI-enabled models according to selected precision

This change optimizes resource utilization by deploying more compact FP16 models that consume less memory and storage while maintaining inference performance.
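A minimal sketch of the selection order described above, following the fallback pattern visible in the review snippets further down; the function name and the model.precision attribute are illustrative assumptions, not the exact ModelRepo code.

def pick_inference_model(models, fp16_enabled: bool):
    # Precision preference is driven by FEATURE_FLAG_FP16_INFERENCE
    primary, fallback = ("FP16", "FP32") if fp16_enabled else ("FP32", "FP16")
    # Try the preferred precision first
    model = next((m for m in models if primary in m.precision), None)
    if model is not None:
        return model
    # Otherwise fall back to the other precision (may still be None if neither exists)
    return next((m for m in models if fallback in m.precision), None)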

JIRA: ITEP-32416, ITEP-66504, ITEP-66505

✨ Type of Change

Select the type of change your PR introduces:

  • 🐞 Bug fix – Non-breaking change which fixes an issue
  • 🚀 New feature – Non-breaking change which adds functionality
  • 🔨 Refactor – Non-breaking change which refactors the code base
  • 💥 Breaking change – Changes that break existing functionality
  • 📚 Documentation update
  • 🔒 Security update
  • 🧪 Tests

🧪 Testing Scenarios

Describe how the changes were tested and how reviewers can test them too:

  • ✅ Tested manually
  • 🤖 Run automated end-to-end tests

✅ Checklist

Before submitting the PR, ensure the following:

  • 🔍 PR title is clear and descriptive
  • 📝 For internal contributors: If applicable, include the JIRA ticket number (e.g., ITEP-123456) in the PR title. Do not include full URLs
  • 💬 I have commented my code, especially in hard-to-understand areas
  • 📄 I have made corresponding changes to the documentation
  • ✅ I have added tests that prove my fix is effective or my feature works

@itallix itallix requested a review from a team as a code owner May 16, 2025 11:40
@itallix itallix marked this pull request as draft May 16, 2025 11:41
@github-actions github-actions bot added the IAI Interactive AI backend label May 16, 2025

@leoll2 leoll2 left a comment


Looks great, minor comments

Comment on lines 426 to 427
  # Use ascending order sorting to retrieve the oldest matching document
- return self.get_one(extra_filter=query, earliest=True)
+ models = self.get_all(extra_filter=query)

Comment at L426 is no longer applicable

logger.warning(f"{primary_precision} model requested but not found. Falling back to {fallback_precision}.")
fallback_model = next((model for model in models if fallback_precision in model.precision), None)
if fallback_model:
    return fallback_model

Suggested change
      return fallback_model
+ logger.warning(f"Fallback {fallback_precision} model also not found.")

Comment on lines 489 to 490
# If we get here, we have matched_docs but none with the expected precisions
return ID()

Suggested change
  # If we get here, we have matched_docs but none with the expected precisions
+ logger.warning(f"Fallback {fallback_precision} model also not found.")
  return ID()

@itallix itallix requested a review from Copilot May 16, 2025 15:01

@Copilot Copilot AI left a comment


Pull Request Overview

This PR adds a feature flag to toggle FP16 inference and updates model creation, selection logic, and tests to support an FP16-first pipeline with FP32 fallback.

  • Introduces FEATURE_FLAG_FP16_INFERENCE and uses it in prepare_train and ModelRepo to choose precision order.
  • Renames mo_fp32_with_xai to mo_with_xai across code, fixtures, and tests.
  • Updates ModelRepo.get_latest_model_for_inference* to fetch all matching precisions and implement FP16/FP32 fallback.

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

Summary per file:
  • interactive_ai/workflows/geti_domain/train/job/tasks/prepare_and_train/train_helpers.py: Use feature flag to set FP16 or FP32 for XAI model
  • interactive_ai/workflows/geti_domain/train/job/tasks/evaluate_and_infer/evaluate_and_infer.py: Swap mo_fp32_with_xai references to mo_with_xai
  • interactive_ai/workflows/geti_domain/common/jobs_common/features/feature_flag_provider.py: Add FEATURE_FLAG_FP16_INFERENCE enum entry
  • interactive_ai/workflows/geti_domain/common/jobs_common_extras/mlflow/utils/train_output_models.py: Rename mo_fp32_with_xai to mo_with_xai in IDs/parse
  • interactive_ai/libs/iai_core_py/iai_core/repos/model_repo.py: Update inference query to include both precisions and implement fallback logic
  • Tests and fixtures (multiple files): Rename fields/tests for mo_with_xai and cover both flag states
Comments suppressed due to low confidence (2)

interactive_ai/libs/iai_core_py/iai_core/repos/model_repo.py:450

  • Aggregation pipeline lacks a $sort stage to ensure the latest model is returned first; this can lead to selecting an older model ID when multiple precisions exist—add sorting by version or _id before projecting.
matched_docs = list(self.aggregate_read(aggr_pipeline))

interactive_ai/libs/iai_core_py/tests/repos/test_model_repo.py:431

  • [nitpick] Update this docstring to reflect the new FP16-first behavior when the feature flag is enabled, e.g. mention M6_FP16 as expected under fp16-enabled.
The latest model for inference is M4 (the first one generated after the base model).

@@ -420,15 +423,34 @@ def get_latest_model_for_inference(
      base_model_id=base_model_id, model_status_filter=model_status_filter
  )

  # Use ascending order sorting to retrieve the oldest matching document
- return self.get_one(extra_filter=query, earliest=True)
+ models = self.get_all(extra_filter=query)

Copilot AI May 16, 2025


get_all returns all matching models in unspecified order, so next(...) may pick an older FP16 model instead of the latest; recommend sorting by version or _id descending (or using a limited aggregate with sort) before selecting primary or fallback precision.

Suggested change
- models = self.get_all(extra_filter=query)
+ # Ensure models are sorted by version (or _id) in descending order
+ models = self.get_all(extra_filter=query).sort([("version", DESCENDING), ("_id", DESCENDING)])
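If get_all returns a plain in-memory list rather than a cursor (an assumption; the chained .sort(...) above is cursor-style), the same ordering could be achieved with a standard Python sort before the precision lookup, for example:

# Assumed attribute name (version); sort newest first so the precision
# lookup below picks the latest matching model rather than an arbitrary one
models = self.get_all(extra_filter=query)
models.sort(key=lambda model: model.version, reverse=True)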

