
Conversation

@yonigozlan (Member) commented Nov 3, 2025

What does this PR do?

Fixes #41955.
Fixes an issue raised in #41954. Instead of setting attributes from kwargs after instantiating the image processor in from_pretrained, we update the image processor dict with the kwargs before instantiating the object. This allows custom logic in the init to take into account the custom kwargs passed to from_pretrained.

In the linked PR, the issue was that max_pixels is supposed to overwrite size["longest_edge"] when passed to the init; in from_pretrained, however, max_pixels was never passed to the init and was only set as an attribute after the image processor had been instantiated.
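
To make the before/after concrete, here is a minimal, self-contained sketch; FakeProcessor and the numbers are made up for illustration and are not the real Qwen2VL classes or defaults:

    class FakeProcessor:
        def __init__(self, size=None, max_pixels=None, **kwargs):
            self.size = dict(size) if size is not None else {"longest_edge": 12_845_056}
            # init-time logic: max_pixels is meant to override size["longest_edge"]
            if max_pixels is not None:
                self.size["longest_edge"] = max_pixels
            self.max_pixels = max_pixels

    config = {"size": {"longest_edge": 12_845_056}}
    kwargs = {"max_pixels": 200_000}

    # old behavior: kwargs set as attributes *after* __init__, so the override never runs
    old = FakeProcessor(**config)
    for key, value in kwargs.items():
        if hasattr(old, key):
            setattr(old, key, value)
    print(old.size)  # {'longest_edge': 12845056} -> max_pixels was silently ignored

    # new behavior: kwargs merged into the dict *before* __init__, so the override runs
    new = FakeProcessor(**{**config, **kwargs})
    print(new.size)  # {'longest_edge': 200000}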

@ReinforcedKnowledge commented Nov 3, 2025

I have a quick question about this fix:

        image_processor_dict.update(kwargs)
        image_processor = cls(**image_processor_dict)

Because you are adding user-given kwargs to image_processor_dict, which is then used to build the image processor, and maybe the class's init doesn't support those kwargs? (Not trying to criticize, just trying to learn more about transformers.)

I think the only way to check whether the kwargs are accepted by the init function is to inspect its signature, but I don't think that's clean and it doesn't solve the underlying pattern: processors can do whatever they want in their init, even use env vars for all they care, and from_pretrained can't solve those issues for them.
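
Just to illustrate what I mean by inspecting the signature (filter_kwargs_for_init is a hypothetical helper, not something in transformers):

    import inspect

    def filter_kwargs_for_init(cls, kwargs):
        """Keep only the kwargs that cls.__init__ can accept."""
        params = inspect.signature(cls.__init__).parameters
        # if __init__ already takes **kwargs, everything passes through anyway
        if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
            return dict(kwargs)
        return {k: v for k, v in kwargs.items() if k in params}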

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yonigozlan (Member, Author)

I have a quick question about this fix:

    image_processor_dict.update(kwargs)
    image_processor = cls(**image_processor_dict)

Because you are adding user-given kwargs to image_processor_dict, which is then used to build the image processor, and maybe the class's init doesn't support those kwargs? (Not trying to criticize, just trying to learn more about transformers.)

No problem, feel free to ask questions! The idea is that all the init logic (setting attributes and checking kwargs) should live in the class init, to avoid getting different results when loading models in different ways (from_pretrained, from_dict, direct instantiation, etc.).
Fast image processors are becoming the default in the library, and they all check kwargs in their init so that only the kwargs they should accept are kept. So this fix should work without issues: all fast image processors accept **kwargs in their init, then filter them and set as attributes only the kwargs that are in "Model"ImageProcessorKwargs.
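
Schematically, the pattern is something like this (a paraphrased sketch with made-up class names, not the actual base-class code):

    from typing import TypedDict

    class FakeImageProcessorKwargs(TypedDict, total=False):
        # the kwargs this (made-up) image processor is allowed to accept
        max_pixels: int
        min_pixels: int

    class FakeImageProcessorFast:
        valid_kwargs = FakeImageProcessorKwargs

        def __init__(self, **kwargs):
            # only keys declared in `valid_kwargs` become instance attributes;
            # anything else is ignored
            for key in self.valid_kwargs.__annotations__:
                if key in kwargs:
                    setattr(self, key, kwargs[key])

    proc = FakeImageProcessorFast(max_pixels=200_000, not_a_real_kwarg=1)
    print(proc.max_pixels)                     # 200000
    print(hasattr(proc, "not_a_real_kwarg"))   # False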

@ReinforcedKnowledge

I agree, if you can enforce some kind of contract where the __init__ works with any **kwargs the user passes, then it's good. But I don't think that's easily doable. Also, the user might pass kwargs that are not used in the __init__ but are used downstream in some other utility function of the processor, one that checks whether an attribute exists and branches on that, while the __init__ itself might not care about it. I don't know if my point is clear 🤔

And thank you for your openness!

@yonigozlan (Member, Author)

Added logic to more clearly use cls.valid_kwargs to update the image processor dict ;).

Also, the user might pass kwargs that are not used in the __init__ but are used downstream in some other utility function of the processor, one that checks whether an attribute exists and branches on that, while the __init__ itself might not care about it. I don't know if my point is clear 🤔

In that case, if the kwarg is in the valid_kwargs attribute of the processor class, it will still be set as an instance attribute (see the logic here in the base fast image processor class).
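
In other words, the from_dict flow is now roughly this (paraphrased sketch, not the exact implementation):

    def from_dict_sketch(cls, image_processor_dict, **kwargs):
        # merge only the kwargs declared in cls.valid_kwargs into the dict,
        # so __init__ (and any later logic reading these attributes) sees them
        valid_keys = set(cls.valid_kwargs.__annotations__)
        image_processor_dict = {
            **image_processor_dict,
            **{k: v for k, v in kwargs.items() if k in valid_keys},
        }
        return cls(**image_processor_dict)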

@ReinforcedKnowledge

Oh, perfect, thank you @yonigozlan! I didn't know about valid_kwargs; that's exactly what I wanted to do initially. Nice chatting with you!

@molbap (Contributor) left a comment

Looks fine to me, good use of valid_kwargs but imo we should also get rid of the mutated kwargs!

Comment on lines +368 to +371
    +        # Remove kwargs that are used to initialize the image processor attributes
    +        for key in list(kwargs):
                 if hasattr(image_processor, key):
    -                setattr(image_processor, key, value)
    -                to_remove.append(key)
    -        for key in to_remove:
    -            kwargs.pop(key, None)
    +                kwargs.pop(key)
Contributor

While we're at it, I'd prefer not to mutate the kwargs. Precedence will change after the first call, which can lead to weird situations (same class instance, two from_dict calls: the kwargs belong to the same object but are changed between the two calls).

Member Author (@yonigozlan)

Are you sure about this? I can't reproduce this issue on this branch. For example:

    from transformers import Qwen2VLImageProcessorFast

    kwargs = {"max_pixels": 200_000}
    processor = Qwen2VLImageProcessorFast.from_dict({}, **kwargs)
    print(kwargs)

prints

    {'max_pixels': 200000}

even though "max_pixels" is popped.

@github-actions bot commented Nov 3, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: pix2struct


Successfully merging this pull request may close the following issue:

max_pixels parameter ignored when loading Qwen2VL/Qwen3VL image processors via from_pretrained()