Description
In part, I am offering an observation in case it helps others. I'd also be grateful if you could comment on the expected run duration.
In short, I was facing issues with llama-cpp-python using the CPU rather than the GPU in another LLM node, and the script provided with MiniCPM was perfect for fixing the problem! I'm unsure, however, whether I have your MiniCPM node running on the GPU properly.
For reference, the issue I had with the other node is here: SeargeDP/ComfyUI_Searge_LLM#54
As I had your MiniCPM node installed, I found your script for installing llama-cpp-python, which is superb, as it helped me get llama-cpp-python to use CUDA:
https://github.com/1038lab/ComfyUI-MiniCPM/blob/main/llama_cpp_install/llama_cpp_install.py
The problem I found is that once llama-cpp-python is installed in an environment (perhaps by another node, without the flag to use CUDA), any later attempt to install llama-cpp-python, whether directly or through a requirements.txt, is treated as already fulfilled. So your script sees the GPU and specifies cmake.args="-DGGML_CUDA=on", but pip does not update llama-cpp-python because it considers the requirement already met.
BUT uninstalling llama-cpp-python first results in your script calling pip and building a wheel, which gets llama-cpp-python installed and correctly using CUDA! Thank you.
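For anyone who prefers not to uninstall by hand, an alternative that should have the same effect is to ask pip to rebuild even though the requirement is already met. The sketch below is my own and is not taken from llama_cpp_install.py: the helper names `rebuild_command` and `rebuild_env` are hypothetical, and it assumes the `CMAKE_ARGS` environment variable is honoured by the llama-cpp-python build, as its README describes. It only prints the command and does not run it.

```python
import os
import shlex
import sys


def rebuild_command(package="llama-cpp-python"):
    """Build a pip command that reinstalls the package even if pip thinks it is satisfied."""
    return [
        sys.executable, "-m", "pip", "install",
        "--force-reinstall",  # ignore the "requirement already satisfied" shortcut
        "--no-cache-dir",     # avoid reusing a previously built CPU-only wheel
        package,
    ]


def rebuild_env(cmake_args="-DGGML_CUDA=on"):
    """Copy the current environment and add the CMake flags for a CUDA build (assumed flag)."""
    env = dict(os.environ)
    env["CMAKE_ARGS"] = cmake_args
    return env


if __name__ == "__main__":
    # Dry run: show what would be executed, without touching the environment.
    print("CMAKE_ARGS=" + shlex.quote(rebuild_env()["CMAKE_ARGS"]),
          shlex.join(rebuild_command()))
```

To actually run it, something like `subprocess.run(rebuild_command(), env=rebuild_env(), check=True)` inside the activated venv should work. Note that `--force-reinstall` also reinstalls the package's dependencies, so uninstalling first, as I did, may be the gentler route.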
In my case (Ubuntu, venv, cu128, Python 3.12, 9th-gen i7, 5090), I did the following:
cd ComfyUI
source ./venv/bin/activate
cd custom_nodes/ComfyUI-MiniCPM/llama_cpp_install/
pip uninstall llama-cpp-python
python3 llama_cpp_install.py
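After the steps above, a quick way to check the result from inside the same venv might look like the sketch below. It is a rough check, not part of the MiniCPM script: the helper names are hypothetical, and it assumes the installed llama-cpp-python exposes `llama_supports_gpu_offload` at the top level of the `llama_cpp` module. If it does not, the function returns None rather than guessing.

```python
import importlib.metadata
import importlib.util


def installed_version(package="llama-cpp-python"):
    """Return the version pip has recorded for the package, or None if it is not installed."""
    try:
        return importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        return None


def gpu_offload_supported():
    """Return True/False as reported by llama_cpp, or None if that cannot be determined."""
    if importlib.util.find_spec("llama_cpp") is None:
        return None  # llama-cpp-python is not importable in this environment
    try:
        import llama_cpp
    except Exception:
        return None  # e.g. the compiled library failed to load
    check = getattr(llama_cpp, "llama_supports_gpu_offload", None)  # assumed to exist
    return bool(check()) if callable(check) else None


if __name__ == "__main__":
    print("installed version:", installed_version())
    print("GPU offload supported:", gpu_offload_supported())
```

A result of False would suggest the wheel was built for CPU only. A result of True only says the build can offload to the GPU, not that a given node actually requests it.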
For the Searge node I reference above, the llama-cpp-python reinstall cut run duration from 50s (I see all CPU cores being used) to 1.5s (GPU at 100%).
Sadly, this hasn't sped up runs with MiniCPM, which remain at around 28 seconds (I see only one CPU core in use and a few flashes at 20% on the GPU). I can't tell what is being used, as it taxes nothing but takes its time.
I'd be keen to know whether 30 seconds is an expected runtime with the specs I mention and, if not, whether you have any pointers on resolving it. Other than that, feel free to close this as resolved. Thanks.