LLM: release plugin once pipeline is removed and WA for GPU #1846
base: master
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;
ContinuousBatchingPipeline::IContinuousBatchingPipeline::~IContinuousBatchingPipeline() {
    m_tokenizer = {};
}
IContinuousBatchingPipeline is ContinuousBatchingImpl's parent, so the destructors run in reverse order of construction:
- ~ContinuousBatchingImpl()
- ~IContinuousBatchingPipeline()
By the time ~IContinuousBatchingPipeline() is called, utils::release_core_plugin(m_device) from ~ContinuousBatchingImpl() has already been executed. Given that the solution is satisfactory, you could have defined ~IContinuousBatchingPipeline() = default; and gotten the same result. But maybe a better solution would be to move utils::release_core_plugin(m_device) to ~IContinuousBatchingPipeline() (m_device is already there). This would fix the call order, remove the need to manually clear ContinuousBatchingImpl's members, and enable clearing for the other children: PromptLookupImpl and SpeculativeDecodingImpl.
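To make the ordering argument concrete, here is a minimal standalone sketch. It does not use the real OpenVINO GenAI classes: Resource, release_plugin, BaseA/DerivedA and BaseB/DerivedB are hypothetical stand-ins for the tokenizer/model members, utils::release_core_plugin(m_device), and the parent/child pipeline classes. Variant A mirrors the layout in this diff as I read it (release in the child destructor, tokenizer cleared in the parent destructor); variant B is the suggested layout (release in the parent destructor).

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Records teardown steps in the order they happen.
static std::vector<std::string> g_log;

// Hypothetical stand-in for a member that keeps the plugin alive.
struct Resource {
    std::string name;
    explicit Resource(std::string n) : name(std::move(n)) {}
    ~Resource() { g_log.push_back("destroy " + name); }
};

// Hypothetical stand-in for utils::release_core_plugin(m_device).
void release_plugin() { g_log.push_back("release plugin"); }

// Variant A: release in the child destructor, tokenizer cleared in the parent.
struct BaseA {
    std::unique_ptr<Resource> tokenizer = std::make_unique<Resource>("tokenizer");
    virtual ~BaseA() { tokenizer.reset(); }  // runs after ~DerivedA()
};
struct DerivedA : BaseA {
    std::unique_ptr<Resource> model = std::make_unique<Resource>("model");
    ~DerivedA() override {
        model.reset();     // child members must be cleared by hand
        release_plugin();  // the parent's tokenizer is still alive here
    }
};

// Variant B: release in the parent destructor.
struct BaseB {
    std::unique_ptr<Resource> tokenizer = std::make_unique<Resource>("tokenizer");
    virtual ~BaseB() {
        tokenizer.reset();  // parent members still need an explicit reset
        release_plugin();   // child members are already destroyed here
    }
};
struct DerivedB : BaseB {
    // No user-written destructor: model is destroyed before ~BaseB() runs.
    std::unique_ptr<Resource> model = std::make_unique<Resource>("model");
};

// Constructs and destroys a Pipeline, returning the recorded teardown order.
template <class Pipeline>
std::vector<std::string> teardown_order() {
    g_log.clear();
    assert(g_log.empty());
    { Pipeline p; }  // destroyed at the end of this scope
    return g_log;
}
```

In this sketch, teardown_order<DerivedA>() records the release before the tokenizer is destroyed, while teardown_order<DerivedB>() records it last, and any other child of BaseB gets the same ordering without writing its own destructor.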
Based on #1627, with a workaround (WA) for GPU oneDNN cache cleanup added.