[OpenVINO] Support openbmb/MiniCPM-o-2_6 for image-text-to-text task #1454
LGTM! Left a question and a nit suggestion.
Thanks for the addition!
"minicpm": "katuni4ka/tiny-random-minicpm",
"minicpm3": "katuni4ka/tiny-random-minicpm3",
"minicpmv": "katuni4ka/tiny-random-minicpmv-2_6",
"minicpmo": "rkazants/tiny-random-MiniCPM-o-2_6",
This model will slow down our CI greatly; it is 400MB 🫨
https://huggingface.co/rkazants/tiny-random-MiniCPM-o-2_6/tree/main
This is the smallest size I managed to get. minicpmv is about 300MB and it is tested: https://huggingface.co/katuni4ka/tiny-random-minicpmv-2_6/tree/main
That one should be reduced as well.
I reduced it to 144MB. The minimal hidden_size for the LLM part is 128: https://huggingface.co/rkazants/tiny-random-MiniCPM-o-2_6/blob/main/modeling_minicpmo.py#L209
That also affects the size of the apm and tts modules.
@IlyasMoutawwakil, @echarlaix, I propose doing further reduction in follow-up PR(s) if there are any ideas. My colleagues are waiting for this PR to be merged, so let us not block the merge over the tiny model size. We know that the implemented logic passes the tests in GHA.
Completely agree with @IlyasMoutawwakil's comment: we should be super careful with the size of our tiny random models so as not to slow down the CI. Could you expand on the constraints on the different model parameters, @rkazants? https://huggingface.co/rkazants/tiny-random-MiniCPM-o-2_6/blob/main/config.json#L20 For example, I see d_model / decoder_ffn_dim / encoder_ffn_dim set to 1024, 1024 and 4096 respectively.
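To give a rough sense of why these three values matter for checkpoint size, here is a minimal back-of-the-envelope sketch. It assumes a standard Transformer encoder layer (four attention projections with biases, a two-layer feed-forward block, two LayerNorms) stored in float32; the actual MiniCPM-o modules may differ in details. The helper names and the smaller values 128 / 512 are hypothetical and have not been checked against the model's constraints.

```python
def encoder_layer_params(d_model: int, ffn_dim: int) -> int:
    """Approximate parameter count of one standard Transformer encoder layer.

    Assumption: q/k/v/out projections all carry biases; real models may drop some.
    """
    # Four d_model x d_model projections (q, k, v, out), each with a bias vector.
    attention = 4 * (d_model * d_model + d_model)
    # Feed-forward block: d_model -> ffn_dim -> d_model, both with biases.
    feed_forward = (d_model * ffn_dim + ffn_dim) + (ffn_dim * d_model + d_model)
    # Two LayerNorms, each with a weight and a bias vector.
    layer_norms = 2 * (2 * d_model)
    return attention + feed_forward + layer_norms


def size_mb(params: int, bytes_per_param: int = 4) -> float:
    """Size in megabytes (10^6 bytes), assuming float32 weights by default."""
    return params * bytes_per_param / 1e6


# Values quoted from config.json in the comment above.
current = encoder_layer_params(d_model=1024, ffn_dim=4096)
# Hypothetical smaller setting, for comparison only.
smaller = encoder_layer_params(d_model=128, ffn_dim=512)

print(f"d_model=1024, ffn=4096: {current} params, {size_mb(current):.1f} MB per layer")
print(f"d_model=128,  ffn=512:  {smaller} params, {size_mb(smaller):.2f} MB per layer")
```

Because the dominant terms scale with d_model squared and d_model times ffn_dim, shrinking both by 8x cuts the per-layer size by roughly 64x in this simplified model.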
Also, if the PR really needs to be merged ASAP, I'm OK with keeping this model, but I would like a follow-up PR that switches to a smaller model. If that cannot be done due to modeling constraints, I would like more information on what the constraints are and why it cannot be done. Would that sound reasonable, @rkazants?
Discussed offline with @echarlaix; we agreed to proceed with the merge.
I will take this AR for further optimization. Indeed, there is room for optimization in parameters such as d_model and encoder_ffn_dim, but it will take some time because changing these values requires adjusting several parameters of the other modalities. It requires a somewhat deeper understanding of the model.
Thanks!
Co-authored-by: Nikita Savelyev <[email protected]>
What does this PR do?
Command to export the model:
optimum-cli export openvino -m openbmb/MiniCPM-o-2_6 MiniCPM-o-2_6 --task=image-text-to-text --trust-remote-code
Example of inference:
Before submitting