Fix: Add filename parameter support for ONNX file caching from Hugging Face Hub (Issue #2218) #2386

Open
ada-ggf25 wants to merge 7 commits into huggingface:main from ada-ggf25:Issue_2218

Conversation

ada-ggf25 commented Nov 29, 2025

Fix: Add filename parameter support for ONNX file caching from Hugging Face Hub

Fixes #2218

What does this PR do?

This PR fixes issue #2218 by adding filename and local_filename parameters for downloading and caching individual files from Hugging Face Hub repositories. This matters most for repositories such as the xenova ones, where cached ONNX files must follow specific naming conventions.

Problem

Previously, when downloading files from Hugging Face Hub using snapshot_download, there was no way to specify a custom local filename for the cached file. This caused issues when users needed to cache files (especially ONNX models) with specific filenames that differ from the original repository filename.

Solution

  • Added a new download_file_with_filename() utility function that uses hf_hub_download instead of snapshot_download when a specific filename is requested
  • Updated TasksManager.get_model_files() to accept filename and local_filename parameters
  • When filename is provided, the function now uses hf_hub_download, which supports the local_filename parameter for custom caching
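
For reviewers, here is a minimal sketch of the single-file path described above. It is not the code in this PR. The `downloader` argument is a hypothetical injection point (it falls back to `huggingface_hub.hf_hub_download`), and applying the custom name by copying the cached file is an assumption; the sketch does not rely on `hf_hub_download` accepting a `local_filename` argument.

```python
import shutil
from pathlib import Path
from typing import Callable, Optional


def download_file_with_filename(
    repo_id: str,
    filename: str,
    local_filename: Optional[str] = None,
    subfolder: str = "",
    cache_dir: Optional[str] = None,
    downloader: Optional[Callable[..., str]] = None,  # hypothetical, for testing
) -> str:
    """Fetch one file and optionally expose it under a different local name."""
    if downloader is None:
        # Imported lazily so the sketch can be exercised without the dependency.
        from huggingface_hub import hf_hub_download

        downloader = hf_hub_download

    cached_path = Path(
        downloader(
            repo_id=repo_id,
            filename=filename,
            subfolder=subfolder or None,
            cache_dir=cache_dir,
        )
    )

    # No custom name requested (or it already matches): return the cached file.
    if local_filename is None or local_filename == cached_path.name:
        return str(cached_path)

    # Assumption: the custom name is a copy placed next to the cached file.
    target = cached_path.with_name(local_filename)
    shutil.copy2(cached_path, target)
    return str(target)
```

Copying rather than renaming leaves the original cache entry in place, so later calls that use the default name would still find the cached file.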

Changes

  1. New function: optimum.utils.file_utils.download_file_with_filename()

    • Downloads a specific file from Hugging Face Hub with optional custom local filename
    • Supports subfolders, revisions, tokens, and different repo types
    • Uses hf_hub_download which properly handles the local_filename parameter
  2. Enhanced function: TasksManager.get_model_files()

    • Added filename parameter: when specified, downloads only that specific file
    • Added local_filename parameter: allows custom naming for cached files
    • Backward compatible: all existing code continues to work without changes
  3. Tests: Added comprehensive test suite in tests/utils/test_file_utils.py

    • Tests for default filename behaviour
    • Tests for custom local filename
    • Tests for subfolders, revisions, tokens, and different repo types
    • All 7 tests passing
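
The testing approach listed above (mocking Hub calls so that no network access is needed) can be illustrated roughly as follows. This is a sketch, not the test file from this PR: `fetch_single_file` is a hypothetical stand-in for the function under test, and the downloader is passed in as a `unittest.mock.Mock`.

```python
import tempfile
import unittest
from pathlib import Path
from unittest import mock


def fetch_single_file(downloader, repo_id, filename, subfolder=None,
                      revision=None, token=None, repo_type="model"):
    """Hypothetical stand-in: forwards its arguments to the downloader."""
    return downloader(
        repo_id=repo_id,
        filename=filename,
        subfolder=subfolder,
        revision=revision,
        token=token,
        repo_type=repo_type,
    )


class TestFetchSingleFile(unittest.TestCase):
    def test_returns_downloader_path(self):
        with tempfile.TemporaryDirectory() as tmp:
            expected = str(Path(tmp) / "model.onnx")
            downloader = mock.Mock(return_value=expected)
            result = fetch_single_file(downloader, "org/repo", "model.onnx")
            self.assertEqual(result, expected)

    def test_forwards_revision_token_and_repo_type(self):
        downloader = mock.Mock(return_value="/unused/path")
        fetch_single_file(
            downloader, "org/data", "data.bin",
            revision="main", token="secret", repo_type="dataset",
        )
        # The mock records the call, so the forwarded arguments can be checked.
        downloader.assert_called_once_with(
            repo_id="org/data", filename="data.bin", subfolder=None,
            revision="main", token="secret", repo_type="dataset",
        )


def run_tests():
    """Run the test case above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFetchSingleFile)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```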

Example Usage

from optimum.exporters.tasks import TasksManager

# Download a specific ONNX file with custom local filename
files, error = TasksManager.get_model_files(
    model_name_or_path="xenova/model-name",
    filename="model.onnx",
    local_filename="custom_model.onnx",  # Now works!
    cache_dir="./cache"
)

Or using the utility function directly:

from optimum.utils import download_file_with_filename

file_path = download_file_with_filename(
    repo_id="xenova/model-name",
    filename="model.onnx",
    local_filename="custom_model.onnx",
    cache_dir="./cache"
)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes? (Function docstrings included)
  • Did you write any new necessary tests? (7 new tests added)

Testing

All new tests pass successfully:

tests/utils/test_file_utils.py::TestDownloadFileWithFilename::test_download_file_with_default_filename PASSED
tests/utils/test_file_utils.py::TestDownloadFileWithFilename::test_download_file_with_custom_local_filename PASSED
tests/utils/test_file_utils.py::TestDownloadFileWithFilename::test_download_file_with_subfolder PASSED
tests/utils/test_file_utils.py::TestDownloadFileWithFilename::test_download_file_with_revision_and_token PASSED
tests/utils/test_file_utils.py::TestDownloadFileWithFilename::test_download_file_with_dataset_repo_type PASSED
tests/utils/test_file_utils.py::TestValidateFileExists::test_validate_file_exists_local_directory PASSED
tests/utils/test_file_utils.py::TestValidateFileExists::test_validate_file_exists_remote_repo PASSED

Files Changed

  • optimum/utils/file_utils.py: Added download_file_with_filename() function
  • optimum/utils/__init__.py: Exported new function in public API
  • optimum/exporters/tasks.py: Enhanced get_model_files() with filename support
  • tests/utils/test_file_utils.py: Added comprehensive test suite

Backward Compatibility

This change is fully backward compatible. All existing code will continue to work without modification. The new parameters are optional and only affect behaviour when explicitly provided.
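
As a rough illustration of why optional parameters keep existing callers working, consider the sketch below. `get_files`, `list_files`, and `download_file` are hypothetical names and are not the signatures used in this PR; the point is only that the listing path stays unchanged when filename is omitted.

```python
from typing import Callable, Iterable, List, Optional


def get_files(
    model_name_or_path: str,
    list_files: Callable[[str], Iterable[str]],
    download_file: Callable[[str, str, Optional[str]], str],
    filename: Optional[str] = None,
    local_filename: Optional[str] = None,
) -> List[str]:
    """Hypothetical dispatcher: list everything, or fetch a single file."""
    if filename is None:
        # Existing behaviour: callers that never pass filename land here.
        return list(list_files(model_name_or_path))

    # New behaviour: only reached when filename is explicitly provided.
    return [download_file(model_name_or_path, filename, local_filename)]
```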

…del_files method

Add support for downloading specific files from model repositories with
custom local filename handling in the TasksManager.get_model_files method.

Changes:
- Added filename parameter to allow downloading a specific file from
  the repository instead of listing all files
- Added local_filename parameter to specify a custom name for the
  cached file, useful for repositories with specific naming requirements
  (e.g., xenova)
- Implemented logic to use hf_hub_download via download_file_with_filename
  utility when filename is specified
- Added comprehensive docstring documentation for all parameters and
  return values
- Enhanced error handling for file download operations

This enhancement enables more flexible file retrieval from Hugging Face
Hub repositories, particularly for cases where custom local filenames
are required for compatibility with downstream tools.

Add exports for file utility functions to make them accessible from the
optimum.utils module.

Changes:
- Export download_file_with_filename function for downloading files
  with custom local filename support
- Export find_files_matching_pattern function for pattern-based file
  searching
- Export validate_file_exists function for file existence validation

This enables users to import these utilities directly from optimum.utils
instead of requiring direct imports from optimum.utils.file_utils.

…l filename support

Add new utility function to download files from Hugging Face Hub with
support for custom local filenames, enabling better compatibility with
repositories that require specific naming conventions.

Changes:
- Add download_file_with_filename function to file_utils module
- Import hf_hub_download from huggingface_hub to support custom
  local_filename parameter
- Implement support for subfolder paths in filename construction
- Add comprehensive docstring with parameter descriptions and usage
  example
- Support for optional cache directory, token authentication, and
  repository type specification

This function is particularly useful for repositories like xenova that
may have specific naming requirements for cached files, allowing users
to download files with custom local filenames while maintaining proper
caching behaviour.

Add test suite for file utility functions to ensure proper functionality
and edge case handling.

Changes:
- Add TestDownloadFileWithFilename class with test cases covering:
  * Downloading files with default filename
  * Downloading files with custom local filename
  * Downloading files from subfolders
  * Downloading files with revision and token parameters
  * Downloading files from different repository types (model, dataset)
- Add TestValidateFileExists class with test cases covering:
  * Validating file existence in local directories (root and subfolders)
  * Validating file existence in remote repositories
  * Handling subfolder paths in remote repositories
- Use unittest.mock for mocking Hugging Face Hub API calls
- Use tempfile for creating temporary test directories

These tests ensure the file utility functions work correctly across
different scenarios and provide confidence for future changes.

Clean up test file by removing unused imports and trailing whitespace.

Changes:
- Remove unused os import
- Remove unused pytest import
- Remove trailing blank line at end of file

This improves code cleanliness and follows best practices by only
importing what is actually used in the test file.

Move file_utils imports to be positioned earlier in the file, right
after constant imports, to improve import organisation and maintain
a more logical grouping of related imports.

Changes:
- Move file_utils imports from after input_generators to after constant
  imports
- Maintain alphabetical and logical grouping of utility imports

This improves code organisation and makes the import structure more
consistent and easier to navigate.

…iles

Remove unused downloaded_path variable assignment when calling
download_file_with_filename, as the return value is not used in the
subsequent code.

Changes:
- Remove unused downloaded_path variable assignment
- Keep the function call to maintain download functionality

This improves code cleanliness by removing unused variables and
follows best practices for code maintenance.
@github-actions
This PR has been marked as stale because it has been open for 90 days with no activity. This thread will be automatically closed in 30 days if no further activity occurs.

github-actions bot added the Stale label Feb 28, 2026

Development

Successfully merging this pull request may close these issues.

Caching xenova repo onnx files - filename parameter not working

1 participant