Fix local from_pretrained when CONFIG_NAME contains subdirectories #4067
ijiraq wants to merge 2 commits into huggingface:main
ModelHubMixin.from_pretrained used CONFIG_NAME in os.listdir(model_id) for local directories. listdir only returns top-level names, so nested paths such as codecs/image/config.json were never detected, whereas the remote path through hf_hub_download already supported such filenames. Use os.path.isfile(os.path.join(model_id, CONFIG_NAME)) instead. Adds a regression test with a temporary nested config path.
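The difference between the two checks can be reproduced with the standard library alone. The sketch below is illustrative and does not call huggingface_hub; the directory layout and the helper names has_config_listdir, has_config_isfile, and demo are assumptions made for this example.

```python
import os
import tempfile

# Hypothetical nested config path, mirroring the example in the description.
CONFIG_NAME = "codecs/image/config.json"


def has_config_listdir(model_dir, config_name):
    """Old-style check: only matches names at the top level of model_dir."""
    return config_name in os.listdir(model_dir)


def has_config_isfile(model_dir, config_name):
    """New-style check: resolves the joined path, so nested files are found."""
    return os.path.isfile(os.path.join(model_dir, config_name))


def demo():
    """Create a nested config in a temp dir and run both checks against it."""
    with tempfile.TemporaryDirectory() as model_dir:
        config_path = os.path.join(model_dir, CONFIG_NAME)
        os.makedirs(os.path.dirname(config_path))
        with open(config_path, "w", encoding="utf-8") as f:
            f.write("{}")
        return (
            has_config_listdir(model_dir, CONFIG_NAME),
            has_config_isfile(model_dir, CONFIG_NAME),
        )


if __name__ == "__main__":
    print(demo())  # (False, True)
```

os.listdir returns only the entry codecs for this layout, so the membership test fails even though the file exists, while the joined-path check succeeds.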
Hi @ijiraq, did you run into an actual issue with
@Wauplin This was a real-world problem encountered with AION when using aion.codex.CodecManager. This model has a set of pretrained weights and also weights for a codec. It is the loading of the codex weights that runs into problems when attempting to use a local repository. The request happens here; it appears to me that huggingface_hub does the correct thing (looking in the model subdirectory codex/images for config.json in this case) when doing a remote request, but somehow not for a locally cached file.
Oooh, I see. Sorry, I didn't read it correctly at first. So what you are doing is monkey-patching What I would suggest you do is drop Please let me know if you have any questions.
ModelHubMixin.from_pretrained used CONFIG_NAME in os.listdir(model_id) for local directories. listdir only returns top-level names, so nested paths such as codecs/XXXX/config.json are never detected; remote hf_hub_download already supported such filenames. This is a correction to make local behave like remote.
Here's a tag to help get to the heart of the issue: local from_pretrained + CONFIG_NAME containing /
Use os.path.isfile(os.path.join(model_id, CONFIG_NAME)) instead.
Adds a regression test with a temporary nested config path.
Note
Low Risk
Low risk bugfix limited to local config discovery in ModelHubMixin.from_pretrained, plus a regression test. Main risk is unintended behavior change for local directory loads if a file exists at the joined path.
Overview
Fixes ModelHubMixin.from_pretrained local loading to correctly detect configs when constants.CONFIG_NAME includes subdirectories by switching from a top-level os.listdir check to os.path.isfile(os.path.join(dir, CONFIG_NAME)), and updates the warning message accordingly.
Adds a regression test (with a small dummy model) that temporarily sets CONFIG_NAME to a nested path and verifies from_pretrained can load config.json from that nested location in a local folder.
Reviewed by Cursor Bugbot for commit 6835fbc.
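The shape of such a regression test can be sketched without huggingface_hub. This is not the test from the PR: the constants namespace, load_local_config, and run_nested_config_check are hypothetical stand-ins that only illustrate the pattern of temporarily patching a config-name constant and loading from a nested location.

```python
import json
import os
import tempfile
from types import SimpleNamespace
from unittest import mock

# Stand-in for a constants module; NOT the real huggingface_hub.constants.
constants = SimpleNamespace(CONFIG_NAME="config.json")


def load_local_config(model_dir):
    """Hypothetical loader: return the parsed config, or None if absent."""
    config_file = os.path.join(model_dir, constants.CONFIG_NAME)
    if not os.path.isfile(config_file):
        return None
    with open(config_file, encoding="utf-8") as f:
        return json.load(f)


def run_nested_config_check():
    """Write a nested config, patch the constant, and load through it."""
    nested_name = os.path.join("nested", "dir", "config.json")
    with tempfile.TemporaryDirectory() as model_dir:
        target = os.path.join(model_dir, nested_name)
        os.makedirs(os.path.dirname(target))
        with open(target, "w", encoding="utf-8") as f:
            json.dump({"hidden_size": 8}, f)
        # Patch the constant only for the duration of the check.
        with mock.patch.object(constants, "CONFIG_NAME", nested_name):
            loaded = load_local_config(model_dir)
    # The original value is restored once the patch context exits.
    return loaded, constants.CONFIG_NAME
```

Patching through a context manager keeps the changed constant from leaking into other tests, which matters when the value is a module-level global shared across a test suite.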