Multi-Modal model support  #1025

Open
@doberst

Description

We are very interested in integrating open-source, self-hosted multi-modal models into LLMWare. We have been watching the space closely and are looking for ideas and contributions for supporting open-source multi-modal models that work in conjunction with RAG and agent-based automation pipelines.

Our key criteria are that there must be a use case tied to a business objective (e.g., not just image generation), the model must work reasonably well, and it should be self-hostable (e.g., a maximum of 10-15B parameters).

To implement this, the key focus will be the construction of a new MultiModal model class and the design of the preprocessors and postprocessors required to handle multi-modal content, along with support for the underlying model packaging formats (e.g., GGUF, PyTorch, ONNX, OpenVINO). We would look to collaborate and will support the underlying inferencing technology required.
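
To make the intended shape more concrete, here is a minimal sketch of the kind of class structure described above. All names here (`MultiModalModel`, `Base64ImagePreprocessor`, `EchoBackend`, etc.) are hypothetical illustrations, not part of the current LLMWare API, and the backend is a stub standing in for a real GGUF / PyTorch / ONNX / OpenVINO runtime:

```python
import base64
from abc import ABC, abstractmethod


class MultiModalPreprocessor(ABC):
    """Turns raw multi-modal content (e.g., image bytes) into model inputs."""

    @abstractmethod
    def preprocess(self, content: bytes):
        ...


class Base64ImagePreprocessor(MultiModalPreprocessor):
    """Example preprocessor: base64-encodes image bytes, a common input
    convention for self-hosted vision models."""

    def preprocess(self, content: bytes) -> str:
        return base64.b64encode(content).decode("utf-8")


class EchoBackend:
    """Stub backend for illustration only; a real backend would wrap a
    GGUF (llama.cpp), PyTorch, ONNX Runtime, or OpenVINO session."""

    def generate(self, prompt: str, image: str) -> str:
        return f"(stub) prompt={prompt!r}, image_payload_chars={len(image)}"


class MultiModalModel:
    """Single inference interface over a packaged backend, delegating
    content handling to pluggable pre/postprocessors."""

    def __init__(self, backend, preprocessor: MultiModalPreprocessor):
        self.backend = backend
        self.preprocessor = preprocessor

    def inference(self, prompt: str, image: bytes) -> str:
        image_input = self.preprocessor.preprocess(image)
        raw_output = self.backend.generate(prompt=prompt, image=image_input)
        return self.postprocess(raw_output)

    def postprocess(self, raw_output: str) -> str:
        # Placeholder: a real postprocessor might parse structured output
        # for downstream RAG / agent pipeline steps.
        return raw_output.strip()


if __name__ == "__main__":
    model = MultiModalModel(EchoBackend(), Base64ImagePreprocessor())
    print(model.inference("Describe this chart.", b"\x89PNG\r\n..."))
```

The point of the split is that one MultiModal model class can serve any packaging format, with the format-specific content handling isolated in swappable preprocessor/postprocessor components.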

Metadata

Assignees: no one assigned
Labels: enhancement (New feature or request)
