LLM: release plugin once pipeline is removed and WA for GPU #1846


Open: wants to merge 8 commits into master

Conversation

p-durandin

Based on #1627, with a WA (workaround) for the GPU oneDNN cache cleanup added.
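For context, here is a minimal sketch of what releasing a device plugin looks like with the OpenVINO C++ API. The helper name mirrors the utils::release_core_plugin() discussed below; the actual implementation in openvino.genai may differ, and the ov::Core reference is assumed to be the pipeline's core instance:

    #include <openvino/runtime/core.hpp>
    #include <string>

    // Hypothetical helper mirroring utils::release_core_plugin(): once the last
    // pipeline for a device is gone, unloading the device plugin frees its
    // resources (e.g. the GPU oneDNN cache this PR adds a workaround for).
    void release_core_plugin(ov::Core& core, const std::string& device) {
        core.unload_plugin(device);
    }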

@github-actions bot added the category: continuous batching (Continuous batching), category: LLM (LLM pipeline (stateful, static)), category: tokenizers (Tokenizer class or submodule update), category: GenAI C++ API (Changes in GenAI C++ public headers), and no-match-files labels on Mar 5, 2025
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;

ContinuousBatchingPipeline::IContinuousBatchingPipeline::~IContinuousBatchingPipeline() {
    m_tokenizer = {};
}
Collaborator


IContinuousBatchingPipeline is ContinuousBatchingImpl's parent, so the destructors are called in reverse order:

  1. ~ContinuousBatchingImpl()
  2. ~IContinuousBatchingPipeline()

By the time ~IContinuousBatchingPipeline() is called, utils::release_core_plugin(m_device) from ~ContinuousBatchingImpl() has already been executed. Given that, if this solution is satisfactory, you could have defined ~IContinuousBatchingPipeline() = default; and gotten the same result. But maybe a better solution would be to move utils::release_core_plugin(m_device) to ~IContinuousBatchingPipeline() (m_device is already there). This would fix the call order, remove the need to manually clear ContinuousBatchingImpl's members, and enable clearing for the other children: PromptLookupImpl and SpeculativeDecodingImpl.
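
A minimal self-contained sketch of the suggested refactor (class and member names follow the comment above; the real openvino.genai classes carry more state, and release_core_plugin() is stubbed out for illustration). Placing the release call in the base-class destructor guarantees it runs after every derived destructor, for PromptLookupImpl and SpeculativeDecodingImpl as well:

    #include <iostream>
    #include <string>

    // Stub standing in for utils::release_core_plugin().
    static void release_core_plugin(const std::string& device) {
        std::cout << "release plugin for " << device << "\n";
    }

    struct IContinuousBatchingPipeline {
        std::string m_device = "GPU";
        virtual ~IContinuousBatchingPipeline() {
            // The base destructor body runs after every derived destructor has
            // finished, so the plugin is always released last.
            release_core_plugin(m_device);
        }
    };

    struct ContinuousBatchingImpl : IContinuousBatchingPipeline {
        ~ContinuousBatchingImpl() override {
            // Derived members are destroyed right after this body; no manual
            // clearing and no per-child release call is needed anymore.
            std::cout << "~ContinuousBatchingImpl()\n";
        }
    };

    int main() {
        ContinuousBatchingImpl impl;
    }   // prints "~ContinuousBatchingImpl()" then "release plugin for GPU"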

Contributor


@Wovchena I don't have the rights to modify that PR, so I have applied your comment here: #2102
