Description
We are very interested in integrating open-source, self-hosted multi-modal models into LLMWare. We have been watching the space closely and are looking for ideas and contributions to support multi-modal models that work in conjunction with RAG and Agent-based automation pipelines.
Our key criteria are that the model must address a use case tied to a business objective (e.g., not just image generation), must work reasonably well, and must be self-hostable (e.g., a maximum of 10-15B parameters).
To implement this, the key focus will be the construction of a new MultiModal model class and the design of the preprocessors and postprocessors required to handle multi-modal content, along with support for the underlying model packaging (e.g., GGUF, PyTorch, ONNX, OpenVINO). We would look to collaborate and will support the underlying inferencing technology required.
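As a rough illustration of the shape this could take, here is a minimal Python sketch of a MultiModal model class with pluggable preprocessors, an inference backend, and a postprocessor. All class and method names here are illustrative assumptions for discussion, not part of the existing llmware API.

```python
# Hypothetical sketch only -- names below are assumptions for discussion,
# not the existing llmware API or a committed design.
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class MultiModalInput:
    """Container pairing text with optional non-text content (e.g., image bytes)."""
    text: str
    images: List[bytes] = field(default_factory=list)
    metadata: Dict[str, Any] = field(default_factory=dict)


class MultiModalModel:
    """Sketch of a model wrapper that routes multi-modal content through
    modality-specific preprocessors, a pluggable inference backend
    (e.g., GGUF, PyTorch, ONNX, OpenVINO), and a postprocessor."""

    def __init__(self,
                 backend: Callable[[Dict[str, Any]], Dict[str, Any]],
                 preprocessors: Optional[Dict[str, Callable]] = None,
                 postprocessor: Optional[Callable[[Dict[str, Any]], str]] = None):
        self.backend = backend                      # underlying inference engine
        self.preprocessors = preprocessors or {}    # keyed by modality, e.g. "image"
        self.postprocessor = postprocessor or (lambda out: out.get("text", ""))

    def inference(self, prompt: MultiModalInput) -> str:
        # 1. Preprocess each modality into backend-ready inputs.
        payload: Dict[str, Any] = {"text": prompt.text}
        if prompt.images and "image" in self.preprocessors:
            payload["images"] = [self.preprocessors["image"](img) for img in prompt.images]

        # 2. Run the packaged model (GGUF / PyTorch / ONNX / OpenVINO behind this callable).
        raw_output = self.backend(payload)

        # 3. Postprocess into plain text usable by RAG / Agent pipelines.
        return self.postprocessor(raw_output)


# Example wiring with stub components (no real model is loaded here):
if __name__ == "__main__":
    def stub_backend(payload):
        n = len(payload.get("images", []))
        return {"text": f"described {n} image(s) for: {payload['text']}"}

    model = MultiModalModel(backend=stub_backend,
                            preprocessors={"image": lambda b: b[:16]})
    result = model.inference(MultiModalInput(text="summarize this invoice",
                                             images=[b"\x89PNG..."]))
    print(result)
```

The intent of keeping the backend and the pre/postprocessors as separate pluggable pieces is that the same class could wrap different model packagings while the RAG and Agent pipelines only ever see plain text in and out.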