Description
Feature request description
If you say this:
$ ramalama serve --generate quadlet gpt-oss:120b
$ grep -c ^Mount gpt-oss-120b-GGUF.container
2
…the output should be 4 because this is a 3-part model, plus a small (17k) metadata file, all of which need to be bind-mounted into the container.
Since #1989, Ramalama can do this for file:// paths of the pattern $MODEL-0000N-of-0000M.gguf, but it fails in this instance, where the parts are stored in OCI artifact form and are therefore named by opaque SHA256 hashes rather than by that pattern.
Suggest potential solution
Parse the small metadata file and generate Mount= declarations for each model part found, named so that the #1989 feature recognizes them by glob pattern.
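A rough sketch of the second half of that idea, in Python, might look like the following. It assumes the ordered list of part blob paths has already been extracted from the metadata file (that file's format is not described here, so the parsing step is left out). The function names, the /mnt/models target directory, and the exact Mount= options are illustrative assumptions, not Ramalama's actual code.

```python
from pathlib import PurePosixPath


def split_part_name(model: str, index: int, total: int) -> str:
    """Build a name matching the $MODEL-0000N-of-0000M.gguf pattern."""
    return f"{model}-{index:05d}-of-{total:05d}.gguf"


def mount_lines(model: str, blob_paths: list[str],
                target_dir: str = "/mnt/models") -> list[str]:
    """Return one Quadlet Mount= line per model part.

    blob_paths: hash-named part files on the host, in part order
                (hypothetical input; assumed to come from the metadata file).
    target_dir: directory inside the container (an assumed default).
    """
    total = len(blob_paths)
    lines = []
    for index, src in enumerate(blob_paths, start=1):
        # Inside the container, each part gets a glob-friendly name
        # even though the host file is named by its hash.
        target = PurePosixPath(target_dir) / split_part_name(model, index, total)
        lines.append(f"Mount=type=bind,src={src},target={target},ro")
    return lines


if __name__ == "__main__":
    # Placeholder host paths, for illustration only.
    for line in mount_lines("gpt-oss-120b",
                            ["/store/sha256-aaa", "/store/sha256-bbb", "/store/sha256-ccc"]):
        print(line)
```

The point of the design is that only the container-side target names need to follow the split-file pattern; the host-side blobs can keep their hash names.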
Have you considered any alternatives?
The only alternative I can see is hand-editing the generated Quadlet file, and my attempts to do that resulted in startup errors. Ramalama is in a better position than the admin to know the right thing to do here.
Additional context
No response