Support export_model and import_model for MULTI device mode #5

Open
andersendsa wants to merge 37 commits into master from
support-multi-export-import-6287962269027407743

Conversation

@andersendsa
Owner

Support export_model and import_model for MULTI device mode

Motivation:
Exporting a compiled model in MULTI mode was previously not supported, returning OPENVINO_NOT_IMPLEMENTED. This prevented users from caching or serializing MULTI-device models.

Changes:

  • Implemented AutoCumuCompiledModel::export_model in src/plugins/auto/src/cumulative_compiled_model.cpp to serialize the MULTI configuration (XML) and delegate model export to sub-devices.
  • Implemented Plugin::import_model in src/plugins/auto/src/plugin.cpp to deserialize the MULTI configuration and reconstruct the distributed model state by importing sub-models.
  • Updated CumuSchedule::init in src/plugins/auto/src/cumulative_schedule.cpp to handle initialization during import (skipping compilation tasks when ov::Model is null).
  • Added openvino/pass/serialize.hpp and openvino/util/xml_parse_utils.hpp includes where necessary.
  • Added a functional test CanExportImportMultiModel in src/plugins/auto/tests/functional/behavior/multi_export_import_test.cpp.

Verification:

  • Compiled openvino_auto_plugin and ov_auto_func_tests.
  • Ran AutoFuncTests.CanExportImportMultiModel test, which passed successfully.
  • Verified that the exported model can be imported and used for inference with correct results and device utilization.
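Illustratively, the export/import round trip described above (an XML configuration header followed by the sub-device blobs) could be laid out on a stream as sketched below. The chunk format, function names, and layout here are assumptions for the sketch, not the plugin's actual serialization:

```cpp
// Hypothetical sketch: length-prefixed XML header, then length-prefixed
// sub-device blobs. The real AUTO/MULTI stream layout is an implementation
// detail of the plugin.
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Write one length-prefixed chunk to the stream.
static void write_chunk(std::ostream& os, const std::string& data) {
    uint64_t size = data.size();
    os.write(reinterpret_cast<const char*>(&size), sizeof(size));
    os.write(data.data(), static_cast<std::streamsize>(size));
}

// Read one length-prefixed chunk back.
static std::string read_chunk(std::istream& is) {
    uint64_t size = 0;
    is.read(reinterpret_cast<char*>(&size), sizeof(size));
    std::string data(size, '\0');
    is.read(data.data(), static_cast<std::streamsize>(size));
    return data;
}

// Export: XML config header first, then each sub-device blob.
std::string export_multi(const std::string& config_xml,
                         const std::vector<std::string>& blobs) {
    std::ostringstream os(std::ios::binary);
    write_chunk(os, config_xml);
    uint64_t count = blobs.size();
    os.write(reinterpret_cast<const char*>(&count), sizeof(count));
    for (const auto& b : blobs)
        write_chunk(os, b);
    return os.str();
}

// Import: parse the header, then recover each sub-device blob.
std::pair<std::string, std::vector<std::string>> import_multi(const std::string& stream) {
    std::istringstream is(stream, std::ios::binary);
    std::string xml = read_chunk(is);
    uint64_t count = 0;
    is.read(reinterpret_cast<char*>(&count), sizeof(count));
    std::vector<std::string> blobs;
    for (uint64_t i = 0; i < count; ++i)
        blobs.push_back(read_chunk(is));
    return {xml, blobs};
}
```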

PR created automatically by Jules for task 6287962269027407743 started by @andersendsa

@google-labs-jules

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

andersendsa and others added 28 commits January 30, 2026 16:31
…able Affine Parameters (openvinotoolkit#33861)

### Details:
- This PR enhances RMS normalization fusion to support the pattern without a
learnable affine parameter (gamma), enabling optimization of transformer
architectures like LTX-Video
- The existing RMS fusion pass only supported the pattern with a constant
gamma parameter. However, some transformer models (e.g., LTX-Video's
attention layers) use RMS normalization followed by a dynamic scaling
operation where the scale factor is non-constant. This pattern was
previously left unfused, missing an optimization opportunity
- When `elementwise_affine=False` (equivalent to the [PyTorch RMSNorm
attribute](https://docs.pytorch.org/docs/stable/generated/torch.nn.modules.normalization.RMSNorm.html)),
RMS normalization does not include a learnable gamma parameter. The gamma
is implicitly fixed to ones, reducing the decomposed graph pattern
from:
`x → Power(2) → ReduceMean → Add(eps) → Sqrt → Divide(1/√) → Multiply(x, 1/√) → Multiply(gamma)`
to:
`x → Power(2) → ReduceMean → Add(eps) → Sqrt → Divide(1/√) → Multiply(x, 1/√)` [no gamma multiplication]
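The gamma-free computation the fused pattern represents can be sketched as a plain scalar routine (an illustrative reference computation, not the OpenVINO kernel):

```cpp
// RMSNorm without a learnable gamma (elementwise_affine=False) is just
// x / sqrt(mean(x^2) + eps); the trailing Multiply(gamma) from the decomposed
// pattern disappears.
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

std::vector<float> rms_norm_no_gamma(const std::vector<float>& x, float eps) {
    double mean_sq = 0.0;
    for (float v : x)
        mean_sq += static_cast<double>(v) * v;  // Power(2) -> ReduceMean
    mean_sq /= static_cast<double>(x.size());
    const float inv_rms =
        1.0f / std::sqrt(static_cast<float>(mean_sq) + eps);  // Add(eps) -> Sqrt -> Divide
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        out[i] = x[i] * inv_rms;  // Multiply(x, 1/sqrt); no gamma multiply
    return out;
}
```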

### Tickets:
 - [CVS-179953](https://jira.devtools.intel.com/browse/CVS-179953)

---------

Signed-off-by: Andrew Park <andrew.park@intel.com>
### Details:
Remove 'using namespace *' in the snippets related part of the code base
(tests excluded)

### Tickets:
 - N/A
…otoolkit#33940)

### Description of the issue (symptom, root cause, how it was resolved)
 - Fixed intermediate calculations to use float accumulators.
 - Corrected MAD operations to use float inputs.

#### The code and line that caused this issue (if it is not changed
directly)
- src\plugins\intel_gpu\src\kernel_selector\cl_kernels\gemm_tiled_opt.cl

#### Reproduction step and snapshot (if applicable. Do not attach for
customer model)
 - reproducer is attached in the ticket.

#### Checklist
 - [x] Is it a proper fix? (not a workaround)
 - [x] Did you include test case for this fix, if necessary?
- [ ] Did you review existing test that can be extended to cover this
scenario? Which test did you review?


### Tickets:
 - 179229
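The fix above amounts to "accumulate partial sums in a wider type than the inputs". A portable analogy (a hypothetical demo, not the OpenCL kernel itself, with half-vs-float replaced by float-vs-double) shows the failure mode:

```cpp
// A narrow accumulator silently drops addends smaller than half an ulp of the
// running sum; a wider accumulator keeps them.
#include <cassert>

template <typename Acc>
Acc accumulate_constant(Acc start, Acc addend, int times) {
    Acc sum = start;
    for (int i = 0; i < times; ++i)
        sum += addend;  // lost entirely once `sum` is large enough
    return sum;
}
```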
…change in pattern (openvinotoolkit#33984)

### Details:
- *Due to a recent change, the Reshape `special_zero` attribute flipped its
bool value, causing a mismatch*
 - *Fix e2e test skip*

### Tickets:
 - *CVS-180693*
 - *CVS-180696*
 - *CVS-180665*

---------

Co-authored-by: Mikhail Ryzhov <mikhail.ryzhov@intel.com>
…notoolkit#34058)

### Details:
Rebase the oneDNN change to the latest v3.8 HEAD

### Tickets:
 - N/A
### Details:
 - *item1*
 - *...*

### Tickets:
 - *ticket-id*
### Details:
 - ACL is upgraded to 52.8.0
 - Android ACL scons command has been changed to match the ACL team setup:
   - Switched to target-triple compiler prefix `<triple><api>-` and empty
     `toolchain_prefix`
   - Added Android ABI -> target triple mapping
   - Added API level resolution/validation (`ANDROID_PLATFORM_LEVEL` with
     fallback from `ANDROID_PLATFORM`).

### Tickets:
 - CVS-180218
### Details:
ITT traces were initially enabled only on the target platform in
openvinotoolkit#31499

This patch extends support to all available platforms

### Tickets:
 - N/A
…openvinotoolkit#33979)

### Tickets:
 - CVS-179708

---------

Signed-off-by: Tomasz Jankowski <tomasz1.jankowski@intel.com>
[About]

This PR enables u8 kv cache precision for the SDPA operator and optimizes it
with NEON and SVE.

- Improves the performance of the OSS master version [where a reference
implementation is available] by 27%.

- But we are slower by 2.7% compared with non-quantized f16 cache precision,
due to the additional overhead of quantization and dequantization, for smaller
models like TinyLlama-1.1B in the single-inference case.

- Such a performance benefit [from u8 quantization] can be seen only when the
inference is more memory bound. We see speedups of around 3-5% when inferencing
the LLama-70B int8-quantized model in the single-inference case.

- Therefore, even though we achieve a speedup of 27% over the reference
implementation, we assume the general case to be compute bound and currently
keep the default as f16 only.

- As models get larger, and in multiple-batch scenarios, setting kv_cache to
"u8" gives a significant boost at the inference level.


| OSS ref impl - u8 | This PR |
|----------|:----------:|
| 10.8 tokens/sec  | 13.7 tokens/sec  |

Single inference performance on LLAMA2-7B model on 32c graviton machine.
The values are in TPS [ Tokens per second ].

This work is contributed by @ashwins990  & @abhijain1204fujitsu
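The u8 storage scheme can be sketched as per-tensor affine quantization (an illustrative sketch with an assumed min/max scale choice, not necessarily the plugin's exact scheme): values are stored as uint8 with a scale and zero point, quartering memory traffic versus f32, and dequantized on read.

```cpp
// Per-tensor u8 quantize/dequantize round trip for a cache-like buffer.
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

struct QuantizedTensor {
    std::vector<uint8_t> data;  // 1 byte per element instead of 4 (f32)
    float scale;
    float zero_point;
};

QuantizedTensor quantize_u8(const std::vector<float>& x) {
    auto [mn, mx] = std::minmax_element(x.begin(), x.end());
    float lo = *mn;
    float scale = (*mx - lo) / 255.0f;
    if (scale == 0.0f) scale = 1.0f;  // constant tensor: avoid division by zero
    QuantizedTensor q{{}, scale, lo};
    q.data.reserve(x.size());
    for (float v : x)
        q.data.push_back(static_cast<uint8_t>(std::lround((v - lo) / scale)));
    return q;
}

std::vector<float> dequantize_u8(const QuantizedTensor& q) {
    std::vector<float> out;
    out.reserve(q.data.size());
    for (uint8_t b : q.data)
        out.push_back(b * q.scale + q.zero_point);  // rebuild the f32 value
    return out;
}
```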
### Details:
The ReshapePRelu transformation crashes when the PRelu input has rank=0
(scalar).

The existing check `prelu_rank.get_length() == 1` skips rank-1 inputs
but allows rank-0 scalars to pass through. The code then attempts to
access dimension index 1 (channel_dim_idx), causing an "Accessing
out-of-range dimension" exception.

Changed the condition from `== 1` to `< 2` to skip both scalar (rank=0)
and 1D (rank=1) inputs, since the transformation requires at least
rank-2 tensors to access the channel dimension.

### Tickets:
 - 179013
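A minimal standalone sketch of the guard logic described above (the predicates are illustrative, not the transformation's actual code): the pass needs at least rank 2 to index the channel dimension, so both rank-0 and rank-1 inputs must be skipped, and `== 1` missed the scalar case.

```cpp
#include <cassert>
#include <cstdint>

constexpr int64_t channel_dim_idx = 1;  // dimension the transformation accesses

// Old predicate: skipped only rank-1, let rank-0 scalars through (crash).
bool old_skips(int64_t rank) { return rank == 1; }

// Fixed predicate: skip anything without a channel dimension.
bool fixed_skips(int64_t rank) { return rank < 2; }

// The access that threw "Accessing out-of-range dimension".
bool channel_dim_accessible(int64_t rank) { return channel_dim_idx < rank; }
```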
…tures (openvinotoolkit#34062)

### Details:
- Move shared CPU snippets shape infer registrations to
`transformations/snippets/common/shape_inference.cpp`
- Make x64/aarch64 shape inference files extend the common registry with
arch-specific ops only
- Exclude `transformations/snippets/x64/*` for all non-x64 builds (not
only aarch64) in CPU plugin CMake, so RISC-V build **does not** depend
on x64 snippets transformations directory


### Tickets:
 - N/A
Basics unit tests are skipped until CVS-180810 is fixed.

---------

Signed-off-by: Kirill Suvorov <kirill.suvorov@intel.com>
### Details:
- Guard InteractionNode-dependent logic in constructor and
`isSupportedOperation` with `OPENVINO_ARCH_X86_64`

The Interaction node source is compiled on non-x64 targets, while
`transformations/cpu_opset/x64/op/interaction.cpp` (which defines
`ov::intel_cpu::InteractionNode`) is excluded by CMake when x64 is OFF.
As a result, non-x64 builds could fail to link with
"undefined reference to typeinfo for ov::intel_cpu::InteractionNode"
(e.g. for `ov_cpu_unit_tests` on RISCV64).

### Tickets:
 - N/A
…#33244)

### Details:
- *Changed some `ov::parallel_for` calls to `CpuParallel::parallel_for`,
which can set the TBB partitioner to AUTO or STATIC*

### Tickets:
 - *CVS-177452*
…penvinotoolkit#33963)

### Details:
 - Make `jit_fill_emitter` not responsible for in-place operations
- Introduce `InsertTailFill` pass as part of the `ReduceDecomposition`
pass to support conditional tail insertion, removing the workaround on the
emitter side

### Tickets:
 - 126270
### Details:
- Adding missing L0 extension headers
```
[2026-02-10T13:14:32.621Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/level-zero-ext/ze_graph_ext.h
[2026-02-10T13:14:32.622Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/level-zero-ext/ze_graph_profiling_ext.h
[2026-02-10T13:14:32.622Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/level-zero-ext/ze_command_queue_npu_ext.h
[2026-02-10T13:14:32.622Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/level-zero-ext/ze_intel_npu_uuid.h
[2026-02-10T13:14:32.622Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/level-zero-ext/ze_context_npu_ext.h
[2026-02-10T13:14:32.622Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/level-zero-ext/ze_driver_npu_ext.h
``` 
- Adding ittnotify headers
```
[2026-02-10T13:14:17.185Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/ittnotify
[2026-02-10T13:14:17.185Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/ittnotify/disable_warnings.h
[2026-02-10T13:14:17.185Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/ittnotify/ittnotify_config.h
[2026-02-10T13:14:17.185Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/ittnotify/ittnotify_static.c
[2026-02-10T13:14:17.185Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/ittnotify/ittnotify_static.h
[2026-02-10T13:14:17.185Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/ittnotify/ittnotify_types.h
[2026-02-10T13:14:17.185Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/ittnotify/ittptmark32.asm
[2026-02-10T13:14:17.185Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/ittnotify/ittptmark32.S
[2026-02-10T13:14:17.185Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/ittnotify/ittptmark64.asm
[2026-02-10T13:14:17.185Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/ittnotify/ittptmark64.S
[2026-02-10T13:14:17.186Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/ittnotify/jitprofiling.c
[2026-02-10T13:14:17.186Z] -- Up-to-date: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify
[2026-02-10T13:14:17.546Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/advisor-annotate.h
[2026-02-10T13:14:17.546Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/AdvisorAnnotate.cs
[2026-02-10T13:14:17.546Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/fortran
[2026-02-10T13:14:17.546Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/fortran/advisor_annotate.f90
[2026-02-10T13:14:17.546Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/fortran/posix
[2026-02-10T13:14:17.546Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/fortran/posix/ittnotify.f90
[2026-02-10T13:14:17.546Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/fortran/win32
[2026-02-10T13:14:17.546Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/fortran/win32/ittnotify.f90
[2026-02-10T13:14:17.546Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/ittnotify-zca.h
[2026-02-10T13:14:17.546Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/ittnotify.h
[2026-02-10T13:14:17.546Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/jitprofiling.h
[2026-02-10T13:14:17.546Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/legacy
[2026-02-10T13:14:17.546Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/legacy/ittnotify.h
[2026-02-10T13:14:17.546Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/include/ittnotify/libittnotify.h

...

[2026-02-10T13:14:44.069Z] -- Installing: C:\jenkins\workspace\openVINO-builder\dev_package/developer_package/lib/libittnotify.lib
```

### Tickets:
 - E#164884

### Tests:
- OpenVINO-Windows10/78895/
- Updated Python API stub (.pyi) files from the latest available nightly

Auto-generated by GitHub Actions
Currently, copyright header consistency is checked manually during code
review (e.g.
[1](openvinotoolkit#32393 (comment)),
[2](openvinotoolkit#33072 (comment)),
[3](openvinotoolkit#32983 (comment)),
[4](openvinotoolkit#31191 (comment))).
The idea of this PR is to automate this process and save reviewers'
time.

### Details:

- *Introduced GHA workflow that validates copyright headers in C++ and
Python files on PRs. If issues are found, it generates a patch file and
fails the check with clear instructions.*
- *Added some changes to the files which have copyright inconsistency in
order to verify the added scripts*

### Tickets:
 - *N/A*
openvinotoolkit#33992)

### Details:
 - *Remove `std::vector<unit_8> compiledNetwork`*
- *Add support for `make_tensor_from_aligned_addr`, which creates
`ov::Tensor` from aligned allocated memory*

### Tickets:
 - *CVS-180882*

---------

Signed-off-by: Kang, Wenjing <wenjing.kang@intel.com>
### Details:
The PR enables Convolution non-i32 bias support. Before this PR only i32
Convolution bias was supported (due to ACL limitations).
- `ConvertConvolutionBias` has been introduced. This transformation
detects specific quantized Convolution patterns followed by Multiply and
Add and inserts a Convert to i32 between the constant bias and the Add
node.
- `AddTransformation` is called for non-convolution bias only on ARM.
Convolution bias is handled by `ConvertConvolutionBias transformation`
on ARM.
- The order of applying scales and shifts has been changed on ARM:
`bias, scale, fq` on ARM vs `scale, bias, fq` on x86. It's needed to get
specific postops order.
- `ConvertConvolutionBias` transformation tests have been added to test
the transformation.

### Tickets:
 - CVS-180491

---------

Co-authored-by: Vladislav Golubev <vladislav.golubev@intel.com>
### Details:
 - ITT allocates memory from the call stack below.
 ```
000001d4`82618d30 00007ffd`12845ff9
vfbasics!AVrfpInitializeCriticalSectionCommon+0x13d
000001d4`82618d38 00007ffc`d277ac4c
openvino!__itt_get_collection_state+0x2c
[C:\Jenkins\workspace\private-ci\ie\build-windows-vs2022@2\b\repos\openvino\thirdparty\ittapi\ittapi\src\ittnotify\ittnotify_static.c
@ 1665]
000001d4`82618d40 00007ffc`d1d68949
openvino!openvino::itt::internal::`dynamic initializer for 'state''+0x9
[src\common\itt\src\itt.cpp @ 22]
000001d4`82618d48  00007ffd`2f8de716 ucrtbase!initterm+0x36
000001d4`82618d50 00007ffc`d2782eea
openvino!dllmain_crt_process_attach+0x9a
[D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\dll_dllmain.cpp @
66]
000001d4`82618d58 00007ffc`d2783057 openvino!dllmain_dispatch+0x6f
[D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\dll_dllmain.cpp @
276]
000001d4`82618d60 00007ffd`13d20ec4
verifier!AVrfpStandardDllEntryPointRoutine+0xf4
000001d4`82618d68 00007ffd`20dfb704
vrfcore!VfCoreStandardDllEntryPointRoutine+0x184
000001d4`82618d70 00007ffd`12848694
vfbasics!AVrfpStandardDllEntryPointRoutine+0xf4
000001d4`82618d78 00007ffd`322df86e
ntdll!LdrpCallInitRoutineInternal+0x22
000001d4`82618d80  00007ffd`3218bcae ntdll!LdrpCallInitRoutine+0x10e
000001d4`82618d88  00007ffd`321897ac ntdll!LdrpInitializeNode+0x19c
000001d4`82618d90 00007ffd`322176ea
ntdll!LdrpInitializeGraphRecurse+0x6a
000001d4`82618d98 00007ffd`32217716
ntdll!LdrpInitializeGraphRecurse+0x96
```
 - Need to call `__itt_release_resources` when `openvino.dll` is unloaded.

 - The solution
We create a class that stores all resource-deallocation methods, then create a static object. Each release method registers itself with this static object; the object is destroyed when the DLL unloads, and its destructor calls all registered release functions. This way, no code in DLLMain/unload_library needs to change; a macro defines the function pointer, as in the code below.
```
static void shutdown_frontend_resources() {
    google::protobuf::ShutdownProtobufLibrary();
}

OV_REGISTER_SHUTDOWN_CALLBACK(shutdown_frontend_resources)
```
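A minimal sketch of such a registry (names and layout are hypothetical, not OpenVINO's actual implementation): callbacks registered during static initialization run in the destructor of a function-local static object, i.e. when the shared library is unloaded or the process exits.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

class ShutdownRegistry {
public:
    static ShutdownRegistry& instance() {
        static ShutdownRegistry reg;  // destroyed at DLL unload / process exit
        return reg;
    }
    void add(std::function<void()> cb) { callbacks_.push_back(std::move(cb)); }
    std::size_t size() const { return callbacks_.size(); }
    ~ShutdownRegistry() {
        for (auto& cb : callbacks_)  // release every registered resource
            cb();
    }

private:
    std::vector<std::function<void()>> callbacks_;
};

// Registration macro: a namespace-scope constant whose initializer registers `fn`.
#define OV_REGISTER_SHUTDOWN_CALLBACK(fn) \
    static const bool fn##_registered = (ShutdownRegistry::instance().add(fn), true);

int g_released = 0;
void release_demo_resources() { ++g_released; }
OV_REGISTER_SHUTDOWN_CALLBACK(release_demo_resources)
```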

### Tickets:
 - [CVS-179009](https://jira.devtools.intel.com/browse/CVS-179009)
 - [CVS-180657](https://jira.devtools.intel.com/browse/CVS-180657)

---------

Co-authored-by: Michal Lukaszewski <michal.lukaszewski@intel.com>
…kit#34047)

### Details:
Switching Debian 10 ARM64 CPU Functional Tests to a newer generation of
ARM64 runner - it's faster and cheaper. Leaving Linux ARM64 and
cross-compilation on the old one - those tests are failing, and they're not
really worth fixing unless it's quick.

Switching Python API tests in the Linux ARM64 workflow to a less powerful
runner - they run just as well on it.

### Tickets:
 - *CVS-158878*
…rs (openvinotoolkit#34030)

### Details:
- *Implement test RNG via `SeededRandom`, defaulting to FP32 outputs and
consistent seed usage to stabilize test inputs.*
- *Update layer tests to use RNG helpers and dtype parameters directly
instead of ad‑hoc casts.*
- *Remove unused imports and minor cleanup across PyTorch tests*
- *Adjust string constant handling in `fx_decoder.py`.*

### Tickets:
 - *ticket-id*
aobolensk and others added 8 commits February 13, 2026 19:03
…4117)

### Details:
Remove NOLINT and template instantiations

### Tickets:
 - N/A
### Details:
 - Fix some Coverity issues

### Tickets:
 - *ticket-id*
### Details:
 - *Fix set_value attr size validation and add related testcase*


### Tickets:
 - *CVS-181028*
…otoolkit#34109)

out_high can be read out of bounds because of an incorrect stride

CVS-181020
…ding (openvinotoolkit#33961)

### Details:
- Changed the for-each loop over `model->get_ordered_ops()` to an
index-based for loop, using `std::move` to transfer ownership of each
node out of the `nodes` vector. This allows destroying unused
constants immediately and minimizes the peak memory allocation.

### Tickets:
 - CVS-176571

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Katarzyna Mitrus <katarzyna.mitrus@intel.com>
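The index-based walk with `std::move` can be illustrated generically (a sketch, not OpenVINO's actual folding code): moving each `shared_ptr` out of its vector slot lets every possibly huge constant die right after it is processed, instead of keeping all of them alive until the end of the walk.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Blob {
    std::vector<char> bytes;  // stands in for a big Constant's payload
};

// Count vector slots that still own a node.
std::size_t alive(const std::vector<std::shared_ptr<Blob>>& nodes) {
    std::size_t n = 0;
    for (const auto& p : nodes)
        n += (p != nullptr);
    return n;
}

// Index-based walk: each node is moved out of its slot, processed, and
// destroyed at the end of the iteration, freeing its memory immediately.
void process(std::vector<std::shared_ptr<Blob>>& nodes) {
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        std::shared_ptr<Blob> node = std::move(nodes[i]);  // slot becomes null
        // ... constant-fold / process `node` here ...
    }  // `node` destructed -> payload released before the next iteration
}
```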
Motivation:
Enable `export_model` and `import_model` functionality for the MULTI device plugin to allow saving and loading compiled models for Cumulative Throughput mode. Also fix a critical CI failure in `Smart_CI` action.

Changes:
- Implemented `AutoCumuCompiledModel::export_model` to serialize device configurations to XML and sub-device binaries to the stream.
- Implemented `Plugin::import_model` to parse the XML and reconstruct the distributed compiled model.
- Updated `CumuSchedule` initialization to support loading without an initial `ov::Model`.
- Added functional test `CanExportImportMultiModel` covering export/import flow.
- Fixed race condition in `export_model` by capturing device state under lock.
- Removed faulty `StreamSerialize` fallback.
- Updated `.github/actions/smart-ci/action.yml` to use isolated paths for internal checkouts, preventing workspace cleanup issues.

Verification:
- The added `AutoFuncTests.CanExportImportMultiModel` test passes.
- Verified fix for race condition and XML formatting.
- The CI fix addresses the `requirements.txt` not found error.

Co-authored-by: andersendsa <199610634+andersendsa@users.noreply.github.com>