Quadlet generator should understand multi-part models #2017

@tangentsoft

Description

Feature request description

If you run this:

$ ramalama serve --generate quadlet gpt-oss:120b
$ grep -c ^Mount gpt-oss-120b-GGUF.container
2

…the output should be 4, because this is a 3-part model plus a small (17k) metadata file, all of which need to be bind-mounted into the container.

Since #1989, Ramalama can do this for file:// paths matching the pattern $MODEL-0000N-of-0000M.gguf, but it fails in this instance, where the parts are stored in OCI artifact form and are therefore named by SHA256 hash rather than by that pattern.
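
To make the naming pattern concrete, here is a minimal sketch of how a split-file name of that shape could be recognized. The regular expression and the `parse_split_name` helper are illustrative assumptions, not the code added in #1989; the five-digit width is taken from the `0000N-of-0000M` form quoted above.

```python
import re

# Hypothetical pattern for names like "<model>-00001-of-00003.gguf".
SPLIT_RE = re.compile(r"^(?P<base>.+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.gguf$")


def parse_split_name(filename):
    """Return (base, index, total) for a split-part name, or None if it does not match."""
    m = SPLIT_RE.match(filename)
    if m is None:
        return None
    return m.group("base"), int(m.group("index")), int(m.group("total"))
```

A blob named only by its hash would return `None` here, which is consistent with the parts not being picked up in the OCI artifact case.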

Suggest potential solution

Parse the small metadata file and generate Mount= declarations for each model part found, named so that the #1989 feature recognizes them by glob pattern.
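
A rough sketch of that idea follows, under loudly stated assumptions: the real layout of the metadata file is not shown in this issue, so the JSON keys `name`, `parts`, and `blob`, the `mount_lines` helper, the `/mnt/models` target directory, and the `ro,Z` mount options are all hypothetical placeholders. It only illustrates mapping hash-named blobs to pattern-named mount targets.

```python
import json
from pathlib import PurePosixPath


def mount_lines(metadata_json, blob_dir, target_dir="/mnt/models"):
    """Build one Quadlet Mount= line per model part.

    Assumes (hypothetically) metadata shaped like:
      {"name": "gpt-oss-120b", "parts": [{"blob": "sha256-..."}, ...]}
    with parts listed in order.
    """
    meta = json.loads(metadata_json)
    parts = meta["parts"]
    total = len(parts)
    lines = []
    for index, part in enumerate(parts, start=1):
        src = PurePosixPath(blob_dir) / part["blob"]
        # Name the target so it matches $MODEL-0000N-of-0000M.gguf.
        name = f"{meta['name']}-{index:05d}-of-{total:05d}.gguf"
        dst = PurePosixPath(target_dir) / name
        lines.append(f"Mount=type=bind,src={src},target={dst},ro,Z")
    return lines
```

For a 3-part model this would yield three `Mount=` lines; the mount for the metadata file itself would be emitted separately to reach the expected total of 4.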

Have you considered any alternatives?

The only alternative I can see is hand-editing the generated Quadlet file, and my attempts to do that resulted in startup errors. Ramalama is in a better position to know the right thing to do here than the admin.

Additional context

No response

Metadata

    Labels

    enhancement: New feature or request
