Skip to content

mtmd: refactor preprocessor, add mtmd_image_preproc_out#24736

Open
ngxson wants to merge 5 commits into
ggml-org:masterfrom
ngxson:xsn/mtmd_image_preproc_out
Open

mtmd: refactor preprocessor, add mtmd_image_preproc_out#24736
ngxson wants to merge 5 commits into
ggml-org:masterfrom
ngxson:xsn/mtmd_image_preproc_out

Conversation

@ngxson

@ngxson ngxson commented Jun 17, 2026

Copy link
Copy Markdown
Collaborator

Overview

  • refactor, add mtmd_image_preproc_out
  • add mtmd dev docs
  • remove various unused clip.cpp API

next PR: add batching support for llava-uhd models

Requirements

@ngxson ngxson requested a review from a team as a code owner June 17, 2026 22:17
@github-actions github-actions Bot added documentation Improvements or additions to documentation examples labels Jun 17, 2026
@ngxson

ngxson commented Jun 17, 2026

Copy link
Copy Markdown
Collaborator Author

Test results:

[vision] OK:   ggml-org/SmolVLM-500M-Instruct-GGUF:Q8_0
[vision] OK:   ggml-org/SmolVLM2-2.2B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/SmolVLM2-500M-Video-Instruct-GGUF:Q8_0
[vision] OK:   ggml-org/gemma-3-4b-it-GGUF:Q4_K_M
[vision] OK:   THUDM/glm-edge-v-5b-gguf:Q4_K_M
[vision] OK:   second-state/Llava-v1.5-7B-GGUF:Q2_K
[vision] OK:   cjpais/llava-1.6-mistral-7b-gguf:Q3_K_M
[vision] OK:   ibm-research/granite-vision-3.2-2b-GGUF:Q4_K_M
[vision] OK:   second-state/MiniCPM-Llama3-V-2_5-GGUF:Q2_K
[vision] OK:   openbmb/MiniCPM-V-2_6-gguf:Q2_K
[vision] OK:   openbmb/MiniCPM-o-2_6-gguf:Q4_0
[vision] OK:   bartowski/Qwen2-VL-2B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/Qwen2.5-VL-3B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/InternVL2_5-1B-GGUF:Q8_0
[vision] OK:   ggml-org/InternVL3-1B-Instruct-GGUF:Q8_0
[vision] OK:   ggml-org/Qwen2.5-Omni-3B-GGUF:Q4_K_M
[vision] OK:   ggml-org/LFM2-VL-450M-GGUF:Q8_0
[vision] OK:   ggml-org/granite-docling-258M-GGUF:Q8_0
[vision] OK:   ggml-org/LightOnOCR-1B-1025-GGUF:Q8_0
[vision] OK:   ggml-org/DeepSeek-OCR-GGUF:Q8_0
[vision] OK:   ggml-org/dots.ocr-GGUF:Q8_0
[vision] OK:   ggml-org/HunyuanOCR-GGUF:Q8_0
[vision] OK:   ggml-org/gemma-4-E2B-it-GGUF:Q8_0
[audio]  OK:   ggml-org/ultravox-v0_5-llama-3_2-1b-GGUF:Q8_0
[audio]  OK:   ggml-org/Qwen2.5-Omni-3B-GGUF:Q4_K_M
[audio]  OK:   ggml-org/Voxtral-Mini-3B-2507-GGUF:Q4_K_M
[audio]  OK:   ggml-org/LFM2-Audio-1.5B-GGUF:Q8_0
[audio]  OK:   ggml-org/gemma-4-E2B-it-GGUF:Q8_0
[audio]  OK:   ggml-org/Qwen3-ASR-0.6B-GGUF:Q8_0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation examples

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant