LLM: release plugin once pipeline is removed and WA for GPU #2102

Open

sbalandi wants to merge 4 commits into master

Conversation

sbalandi (Contributor)

No description provided.

@sbalandi sbalandi marked this pull request as ready for review April 23, 2025 10:30
@github-actions bot added the following labels on Apr 23, 2025: category: continuous batching (Continuous batching), category: LLM (LLM pipeline: stateful, static), category: tokenizers (Tokenizer class or submodule update), category: GenAI C++ API (Changes in GenAI C++ public headers), and no-match-files
sbalandi (Contributor, Author)

Checked the memory on Linux and Windows and got the same results as in the task: most of the memory is released after removing the pipeline, but 40-60 MB remain (that tail is not the goal of this PR).
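
For reference, a minimal sketch of the check described above, assuming the OpenVINO GenAI C++ API (`ov::genai::LLMPipeline`); the `rss_kb()` helper is an illustrative, Linux-only invention and the model path is a placeholder:

```cpp
#include <openvino/genai/llm_pipeline.hpp>

#include <fstream>
#include <iostream>
#include <string>

// Hypothetical Linux-only helper: parse resident set size (VmRSS, in kB)
// from /proc/self/status.
static long rss_kb() {
    std::ifstream status("/proc/self/status");
    for (std::string line; std::getline(status, line);) {
        if (line.rfind("VmRSS:", 0) == 0)
            return std::stol(line.substr(6));  // value after "VmRSS:" is in kB
    }
    return -1;
}

int main() {
    std::cout << "RSS before pipeline: " << rss_kb() << " kB\n";
    {
        // Scope the pipeline so its destructor runs at the closing brace.
        ov::genai::LLMPipeline pipe("/path/to/model", "GPU");  // placeholder path
        std::cout << pipe.generate("Hello", ov::genai::max_new_tokens(16)) << "\n";
        std::cout << "RSS with pipeline: " << rss_kb() << " kB\n";
    }
    // With this change the GPU plugin should be released together with the
    // pipeline, so most of the memory above is returned here; a 40-60 MB
    // tail remains, as noted in the comment.
    std::cout << "RSS after pipeline destruction: " << rss_kb() << " kB\n";
}
```

The relevant runtime call for releasing a device plugin is `ov::Core::unload_plugin(device_name)`; per the PR title, the idea is that destroying the pipeline also releases the plugin for the device it compiled on.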

@Wovchena, please take a look.
