Open
Description
The mlx-community/SmolVLM-Instruct-4bit fails to describe an image. It uses the Idefics3ImageProcessor and this doesn't use the chat template and it doesn't produce the right structure for the prompt:
You are a helpful assistant who answers questions in English.
describe the picture describe<image>
If I force it to use the template then we get a Jinja error.
HuggingFaceTB/SmolVLM2-500M-Video-Instruct-mlx
works fine (and uses the SmolVLMImageProcessor)
Metadata
Metadata
Assignees
Labels
No labels