[bnb] Small improvements on utils #18646
Conversation
- replace `modules_to_not_convert` by `module_to_not_convert`
Can confirm the tests pass!
So will there always be just one module not to convert? Wouldn't it be safer to keep `modules` and work with the list?
- changed variable name
- now outputs a list
- changed the error message
I have proposed a small refactoring that includes:
The bnb slow tests are passing with this fix!
From #18660, I also just added a commit to support having a custom list of the keys to ignore.
sgugger left a comment:
Thanks for working on this, I left some comments.
src/transformers/modeling_utils.py (Outdated)

    offload_state_dict = kwargs.pop("offload_state_dict", False)
    load_in_8bit = kwargs.pop("load_in_8bit", False)
    int8_threshold = kwargs.pop("int8_threshold", 6.0)
    no_load_in_8bit_modules = kwargs.pop("no_load_in_8bit_modules", None)
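For readers unfamiliar with the pattern: `kwargs.pop(key, default)` removes the key from the kwargs dict if present and returns the default otherwise, so each optional argument is consumed in turn and whatever remains can be validated later. A standalone sketch (the function name and arguments here are illustrative, not the actual `from_pretrained` signature):

```python
def fake_from_pretrained(**kwargs):
    # Pull known optional arguments out of kwargs, with defaults,
    # mirroring the pattern used in modeling_utils.py.
    load_in_8bit = kwargs.pop("load_in_8bit", False)
    int8_threshold = kwargs.pop("int8_threshold", 6.0)
    # Anything left in kwargs is unrecognized and could be flagged.
    return load_in_8bit, int8_threshold, kwargs

print(fake_from_pretrained(load_in_8bit=True, foo=1))
```

Unconsumed keys (like `foo` above) survive in the returned dict, which is what lets the real code detect unexpected keyword arguments.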
Would it make more sense to have this be a class variable of `PreTrainedModel` (like the no_split variable used for big model inference)? I'm afraid the user won't know what to set this to, and it looks like something we should handle automatically.
I don't have a strong opinion on that, but this argument is optional because the function `get_keys_not_to_convert` should automatically take care of it, except for some models like Jukebox where it is a bit trickier due to its architecture.
In that case the user will just have to manually set which modules should be kept in their native precision and specify them in the kwargs, so I feel this is a bit easier than having it as an attribute of `PreTrainedModel`, because then you would need to open a PR to add support for a new model.
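A minimal, library-agnostic sketch of the filtering this kwarg controls (the helper name and default below are illustrative, not the actual transformers implementation): modules whose qualified name matches an entry in the skip list keep their native precision, and everything else is marked for 8-bit conversion.

```python
def select_modules_to_convert(module_names, modules_to_not_convert=None):
    """Return the subset of module_names that should be converted to 8-bit.

    modules_to_not_convert: optional list of name fragments to keep in
    native precision. Hypothetical helper for illustration only.
    """
    if modules_to_not_convert is None:
        # Common default: keep the output head in fp16/fp32 for stability.
        modules_to_not_convert = ["lm_head"]
    return [
        name
        for name in module_names
        if not any(skip in name for skip in modules_to_not_convert)
    ]

names = ["transformer.h.0.attn.c_attn", "transformer.h.0.mlp.c_fc", "lm_head"]
print(select_modules_to_convert(names))
# lm_head is skipped; the attention and MLP linears are converted
```

Passing a custom list (e.g. for a model like Jukebox) simply replaces the default, which is why an optional kwarg covers the tricky architectures without touching the model class.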
Co-authored-by: stas00 <stas00@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
stas00 left a comment:
thank you for addressing the suggestions, @younesbelkada
Can confirm the slow tests pass after rebasing on
What does this PR do?
Fixes a small typo in `bitsandbytes.py`; should address huggingface/blog#463 (comment). I will have to test it first and then mark it as ready for review!