@ahnyoung-paul ahnyoung-paul commented Mar 12, 2025

Since the Whisper commit (07dfba94a36f88f0614b8bb61ebb2d9b8ef6324c) was merged, beam_idx and encoder_hidden_states are created as ov::Tensor objects in the Whisper pipeline. Creating these tensors in GenAI can hurt performance: in the GPU plugin, a tensor obtained from infer_request is a host tensor backed by usm_host memory, whereas a newly created tensor is an ov::AllocatedTensor.

When an ov::AllocatedTensor is passed as an input to prepare_input, it is neither a remote tensor nor a usm_host tensor. The GPU plugin therefore creates a usm_device tensor internally and copies the values into it, an unnecessary step that degrades performance.

To fix this issue, use the create_host_tensor method instead of creating ov::Tensor directly.

This is my issue ticket: CVS-162818
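
The copy described above can be pictured with a toy model. This is only a sketch under stated assumptions: `MemKind`, `Tensor`, `ToyGpuPlugin`, and `count_copies` are hypothetical stand-ins invented for illustration, not OpenVINO types, and the real plugin logic is more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; none of these are OpenVINO types.
enum class MemKind { Plain, UsmHost, Remote };

struct Tensor {
    MemKind kind;
    std::vector<float> data;
};

// Toy model of the input-preparation step described in the PR text.
struct ToyGpuPlugin {
    std::size_t staged_copies = 0;

    void prepare_input(const Tensor& t) {
        // Remote and usm_host tensors are consumed as-is; anything else is
        // first copied into a freshly created device-side buffer.
        if (t.kind != MemKind::Remote && t.kind != MemKind::UsmHost) {
            std::vector<float> device_copy(t.data);  // the extra copy
            (void)device_copy;
            ++staged_copies;
        }
    }
};

// Feeds the same kind of input tensor to the toy plugin `steps` times and
// returns how many extra copies were made.
std::size_t count_copies(MemKind kind, std::size_t steps) {
    ToyGpuPlugin plugin;
    Tensor input{kind, std::vector<float>(8, 0.0f)};
    for (std::size_t i = 0; i < steps; ++i) {
        plugin.prepare_input(input);
    }
    return plugin.staged_copies;
}
```

In this model, `count_copies(MemKind::Plain, 10)` makes ten copies and `count_copies(MemKind::UsmHost, 10)` makes none, which is the effect the switch to create_host_tensor aims for.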

@ahnyoung-paul ahnyoung-paul added the "category: LLM" label Mar 12, 2025
@github-actions github-actions bot added the "category: whisper" label and removed the "category: LLM" label Mar 12, 2025
Collaborator

When I did something similar for VLM, calling m_request.get_compiled_model().get_context() didn't improve performance, but reusing the context did. My guess was that m_request.get_compiled_model().get_context() takes too much time.

Did you verify that the issue is gone with your patch?
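
The difference between looking the context up on every allocation and reusing it can be sketched as follows. This is a minimal illustration, not the VLM or Whisper code: `Context`, `Request`, `create_host_buffer`, and both `lookups_*` functions are hypothetical stand-ins, and whether the real lookup is actually expensive is only the guess stated above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; not OpenVINO types.
struct Context {
    std::vector<float> create_host_buffer(std::size_t n) const {
        return std::vector<float>(n, 0.0f);
    }
};

struct Request {
    // Counts lookups so the two strategies can be compared.
    mutable std::size_t context_lookups = 0;

    Context get_context() const {
        ++context_lookups;
        return Context{};
    }
};

// Looks the context up again on every allocation.
std::size_t lookups_without_cache(std::size_t steps) {
    Request request;
    for (std::size_t i = 0; i < steps; ++i) {
        std::vector<float> buffer = request.get_context().create_host_buffer(4);
        (void)buffer;
    }
    return request.context_lookups;
}

// Looks the context up once and reuses it for every allocation.
std::size_t lookups_with_cache(std::size_t steps) {
    Request request;
    const Context cached = request.get_context();
    for (std::size_t i = 0; i < steps; ++i) {
        std::vector<float> buffer = cached.create_host_buffer(4);
        (void)buffer;
    }
    return request.context_lookups;
}
```

With 100 allocations, the first function performs 100 lookups and the second performs one, so any per-lookup cost is paid once instead of per decoding step.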

Contributor

BTW, get_context() is not available for CPU
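
One way to cope with a device that has no context is to fall back to a plain allocation, as in the sketch below. The types are hypothetical stand-ins rather than OpenVINO classes, and modelling the missing context as a thrown exception is an assumption made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; not OpenVINO types.
enum class MemKind { Plain, UsmHost };

struct Tensor {
    MemKind kind;
    std::vector<float> data;
};

struct Context {
    Tensor create_host_tensor(std::size_t n) const {
        return Tensor{MemKind::UsmHost, std::vector<float>(n, 0.0f)};
    }
};

struct Request {
    std::string device;

    // Assumption for this sketch: only "GPU" exposes a context, and every
    // other device reports the missing context by throwing.
    Context get_context() const {
        if (device != "GPU") {
            throw std::runtime_error("no context on " + device);
        }
        return Context{};
    }
};

// Prefers a host tensor from the device context and falls back to a plain
// allocation when the device has no context.
Tensor allocate_input(const Request& request, std::size_t n) {
    try {
        return request.get_context().create_host_tensor(n);
    } catch (const std::exception&) {
        return Tensor{MemKind::Plain, std::vector<float>(n, 0.0f)};
    }
}
```

On the fallback path the exception is thrown and caught on every call, so it is worth measuring that path on its own or deciding once up front which allocator to use.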

@ahnyoung-paul Did you verify that the issue is gone with your patch?

Collaborator

Exception handling is known to be costly in C++. With that try/catch, you also need to verify that CPU performance didn't degrade.

Contributor Author

> Exception handling is known to be costly in C++. With that try/catch, you also need to verify that CPU performance didn't degrade.

I verified that there is no regression on CPU; please see the ticket for details. Thanks.

Contributor Author

> @ahnyoung-paul Did you verify that the issue is gone with your patch?

The regression has several causes rather than a single one. Applying this PR on top of the Whisper commit (07dfba9) resolved most of it, but a further performance degradation was found in the latest commit. This PR removes unnecessary memory creation and copying on the enqueue path, including the enqueue memcpy, which reduces device time and somewhat improves latency. Further improvement requires additional analysis from the GenAI team. Details are in the ticket (CVS-162818). Thanks.

@ilya-lavrenov ilya-lavrenov added this to the 2025.1 milestone Mar 13, 2025
@ahnyoung-paul ahnyoung-paul force-pushed the trial_whisper_tiny_perf_drop branch from 59d488e to 4b8f74b Compare March 13, 2025 09:11
@ilya-lavrenov ilya-lavrenov enabled auto-merge March 13, 2025 10:02
@ilya-lavrenov ilya-lavrenov added this pull request to the merge queue Mar 13, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 13, 2025
@ilya-lavrenov ilya-lavrenov merged commit 2b94f73 into openvinotoolkit:master Mar 13, 2025
54 checks passed
github-merge-queue bot pushed a commit that referenced this pull request Jun 13, 2025
Bumps [librosa](https://github.com/librosa/librosa) from 0.10.2.post1 to 0.11.0.