- 
                Notifications
    
You must be signed in to change notification settings  - Fork 31k
 
Fix issue with from pretrained and kwargs in image processors #41997
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix issue with from pretrained and kwargs in image processors #41997
Conversation
| 
           I have a quick question about the:         image_processor_dict.update(kwargs)
        image_processor = cls(**image_processor_dict)Fix. Because you are adding user given  I think the only way to check wether the kwargs are or are not in the init function is by inspecting its signature, but I don't think it's clean and it doesn't solve the underlying core pattern. Because processors can do whatever they want in their init, even use env vars for all they care while   | 
    
| 
           The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.  | 
    
          
 No problem feel free to ask questions! The idea is that all the init logic/setting of attributes and checking of kwargs should be in the class init, to avoid having different results when loading models in different ways (from_pretrained, from_dict, direct instantiation etc.).  | 
    
| 
           I agree, if you can enforce some kind of contract where the  And thank you for your openness!  | 
    
| 
           Added logic to more clearly use cls.valid_kwargs to update the image processor dict ;). 
 In that case, if the kwarg is in the valid_kwargs attribute of the processor class, it will still be set as an instance attribute (see the logic here in the base fast image processor class)  | 
    
| 
           Oh ok perfect thank you @yonigozlan didn't know about   | 
    
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine to me, good use of valid_kwargs but imo we should also get rid of the mutated kwargs!
| # Remove kwargs that are used to initialize the image processor attributes | ||
| for key in list(kwargs): | ||
| if hasattr(image_processor, key): | ||
| setattr(image_processor, key, value) | ||
| to_remove.append(key) | ||
| for key in to_remove: | ||
| kwargs.pop(key, None) | ||
| kwargs.pop(key) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we're at it, I'd prefer to not mutate the kwargs. Precedence will change after the first call, which can lead to weird situations (same class instance, two from_dict calls --> kwargs belong to same object but are changed between the 2 calls)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure about this? I can't reproduce this issue on this branch. For example:
from transformers import Qwen2VLImageProcessorFast
kwargs = {"max_pixels": 200_000}
processor = Qwen2VLImageProcessorFast.from_dict({}, **kwargs)
print(kwargs)prints
{'max_pixels': 200000}
even though "max_pixels" is popped
| 
           [For maintainers] Suggested jobs to run (before merge) run-slow: pix2struct  | 
    
What does this PR do?
Fixes #41955.
Fixes an issue raised in #41954. Instead of setting attributes from kwargs after instantiating the image processor in
from_pretrained, we update the image processor dict with the kwargs before instantiating the object. This allows custom logic in the init to take into account the custom kwargs passed tofrom_pretrained.In the PR linked, the issue was that
max_pixelsis supposed to overwritesize["longest_edge"]when passed to the init, but infrom_pretrained,max_pixelswas never passed to the init and only set as an attribute after instantiating the image processor.