-
Notifications
You must be signed in to change notification settings - Fork 14
Add lifetime of package cache #108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add lifetime of package cache #108
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #108 +/- ##
==========================================
- Coverage 81.37% 81.25% -0.13%
==========================================
Files 9 9
Lines 204 224 +20
==========================================
+ Hits 166 182 +16
- Misses 38 42 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@nabenabe0928 Could you review this PR? |
| cache_dir_prefix = os.path.join(_conf.cache_home(), hostname, repo_owner, repo_name, ref) | ||
| package_cache_dir = os.path.join(cache_dir_prefix, dir_path) | ||
| use_cache = not force_reload and os.path.exists(package_cache_dir) | ||
| print(f"package_cache_dir: {package_cache_dir}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this print statement intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think load_module should not print anything. This function just loads the package, so side effects should be minimal.
|
NOTE: This PR needs discussion to merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a couple of issues with the current approach:
- Unreliable last_modified_time
- On macOS, Finder automatically creates .DS_Store files just by opening a directory.
- Text editors may also update file metadata unintentionally when viewing code.
- As a result, last_modified_time can change without any real modification.
- Risks of using rglob
- If a symbolic link exists inside the cache directory, rglob will follow it and traverse external files.
- This can even cause infinite loops in certain cases.
- rglob is also not efficient when the number of files becomes large, leading to unnecessary performance overhead.
A safer approach would be to create a dedicated file (e.g., last_update_time.txt) and explicitly store the update timestamp there.
| cache_dir_prefix = os.path.join(_conf.cache_home(), hostname, repo_owner, repo_name, ref) | ||
| package_cache_dir = os.path.join(cache_dir_prefix, dir_path) | ||
| use_cache = not force_reload and os.path.exists(package_cache_dir) | ||
| print(f"package_cache_dir: {package_cache_dir}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think load_module should not print anything. This function just loads the package, so side effects should be minimal.
|
Let me unassign the reviewer since this PR was marked as draft state. |
Motivation
The package cache currently is not updated unless a user sets
force_reload=Truewhen callingload_module().Description of the changes
I added a lifetime for the package cache.
When the lifetime expires, the package will be downloaded again.
Users can set the lifetime by setting the
OPTUNAHUB_CACHE_EXPIRATION_SECONDSenvironment variable (default is 30 days).