Support for OpenSource LLMs and VLMs #338

@chitrabhat04

Proposal

I'm currently conducting a benchmarking experiment involving a range of vision-language models (VLMs), including both open-source models like Qwen and paid services such as Gemini and GPT-4V.

Pezzo seems like a great tool for managing prompts, but at the moment there's no clear way to use it to test and compare prompts across these VLMs, especially when the prompt includes image input.

It would be extremely helpful to have support for:

  1. Managing and sending prompts to various VLMs, including those with visual input
  2. Connecting to these providers via API (e.g., Gemini, GPT-4V, Qwen-VL, etc.)
  3. Storing and comparing results to streamline benchmarking workflows
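
To make the three points above concrete, here is a rough sketch of the workflow I have in mind. Everything in it is hypothetical: `VisionPrompt`, `RunResult`, `run_benchmark`, `compare`, and `make_stub` are illustrative names and not part of Pezzo's API. The stub providers stand in for real API clients (Gemini, GPT-4V, Qwen-VL), so no network calls are made.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class VisionPrompt:
    """A managed prompt with optional visual input (hypothetical shape)."""
    name: str
    text: str
    image_path: Optional[str] = None


@dataclass(frozen=True)
class RunResult:
    """One stored output: which provider answered which prompt."""
    provider: str
    prompt: str
    output: str


# Assumption: a provider is any callable that maps a prompt to text.
# A real adapter would wrap the vendor's API client here.
Provider = Callable[[VisionPrompt], str]


def run_benchmark(
    prompts: List[VisionPrompt], providers: Dict[str, Provider]
) -> List[RunResult]:
    """Send every prompt to every provider and collect the outputs."""
    results: List[RunResult] = []
    for prompt in prompts:
        for provider_name, call in providers.items():
            results.append(RunResult(provider_name, prompt.name, call(prompt)))
    return results


def compare(results: List[RunResult]) -> Dict[str, Dict[str, str]]:
    """Group outputs by prompt so providers can be compared side by side."""
    table: Dict[str, Dict[str, str]] = {}
    for result in results:
        table.setdefault(result.prompt, {})[result.provider] = result.output
    return table


def make_stub(label: str) -> Provider:
    """Build a fake provider that echoes its input (placeholder for an API call)."""
    def call(prompt: VisionPrompt) -> str:
        source = prompt.image_path or "no image"
        return f"[{label}] {prompt.text} ({source})"
    return call


if __name__ == "__main__":
    prompts = [VisionPrompt("caption", "Describe the image.", "cat.png")]
    providers = {"qwen-vl": make_stub("qwen-vl"), "gemini": make_stub("gemini")}
    for prompt_name, outputs in compare(run_benchmark(prompts, providers)).items():
        print(prompt_name, outputs)
```

Keeping the provider behind a plain callable is only one possible design. The point is that prompts, provider adapters, and stored results would be separate pieces, so adding a new VLM means adding one adapter.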

This kind of functionality would make Pezzo an invaluable part of research and evaluation pipelines for multimodal models.

Use-Case

No response

Is this a feature you are interested in implementing yourself?

Maybe
