VLMs image resizing #50

Bezdarnost · 2025-02-20T07:24:50Z

Added an option, max_image_size, to UnslothVisionDataCollator that caps the number of pixels per image during training. For example:

from unsloth import FastVisionModel, is_bf16_supported
from unsloth.trainer import UnslothVisionDataCollator
from trl import SFTTrainer, SFTConfig

FastVisionModel.for_training(model)  # Enable for training!

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    data_collator=UnslothVisionDataCollator(
        model,  # Must use!
        tokenizer,
        max_image_size=16384 * 28 * 28,  # Max pixels per image; reduce if OOM occurs
    ),
    train_dataset=converted_dataset,
    ...

This helps prevent out-of-memory (OOM) errors when the dataset contains large images.
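To give a feel for what a pixel cap of this kind means, here is a minimal sketch of the arithmetic. It is not the code from this PR: the helper name fit_to_pixel_budget, its parameters, and the choice to round each side down to a multiple of 28 are all assumptions made for illustration (the 28 * 28 factor in the example above resembles the Qwen2-VL convention of expressing pixel budgets in 28 x 28 blocks, but that reading is also an assumption).

```python
import math

MAX_PIXELS = 16384 * 28 * 28  # 12,845,056 pixels, the value used in the example above


def fit_to_pixel_budget(width, height, max_pixels=MAX_PIXELS, block=28):
    """Hypothetical helper: return a (width, height) whose area fits max_pixels.

    Images already within the budget are returned unchanged. Larger images are
    scaled down uniformly, then each side is rounded down to a multiple of
    `block` (never below one block), so the aspect ratio is only approximately
    preserved. Extremely elongated images can still exceed the budget because
    of the one-block minimum.
    """
    if width * height <= max_pixels:
        return width, height
    # Uniform scale factor that brings the area down to exactly max_pixels.
    scale = math.sqrt(max_pixels / (width * height))
    new_width = max(block, math.floor(width * scale / block) * block)
    new_height = max(block, math.floor(height * scale / block) * block)
    return new_width, new_height


if __name__ == "__main__":
    # A 48-megapixel image is well above the budget and gets scaled down.
    print(fit_to_pixel_budget(8000, 6000))
    # A small image is left untouched.
    print(fit_to_pixel_budget(640, 480))
```

Lowering max_pixels in this sketch shrinks the largest images further while leaving small ones alone, which is the trade-off the "reduce if OOM occurs" comment points at.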

Bezdarnost added 2 commits February 20, 2025 13:15

Update vision_utils.py

978ba0d

Update vision_utils.py

183ea37
