-
Notifications
You must be signed in to change notification settings - Fork 473
Removed zip compression when saving checkpoints #1210
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Removed zip compression when saving checkpoints #1210
Conversation
Signed-off-by: Charlelie Laurent <[email protected]>
|
/blossom-ci |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
This PR changes the checkpoint saving mechanism from zipfile.ZIP_DEFLATED to zipfile.ZIP_STORED, effectively disabling compression when creating checkpoint archives.
Key Changes:
- Modified
physicsnemo/models/module.py:544to useZIP_STOREDinstead ofZIP_DEFLATED - This follows the recent refactoring in #1185 that introduced zip format support
Trade-offs:
- Faster I/O: Removes CPU-intensive compression/decompression operations during checkpoint save/load
- Larger files: Checkpoint files will be larger on disk (PyTorch model weights are already somewhat compressed by
torch.save) - Better for training workflows: Training checkpoints are often written frequently, so faster save times can reduce interruptions
This is a reasonable optimization for ML workloads where checkpoint I/O speed is often more critical than storage space, especially when saving frequently during training.
Confidence Score: 5/5
- This PR is safe to merge with minimal risk
- The change is a simple one-line modification that switches from compressed to uncompressed zip format. The code is functionally identical otherwise - both
ZIP_DEFLATEDandZIP_STOREDare valid zip compression modes, and the loading logic handles both transparently. This is a performance optimization trading file size for speed, which is appropriate for ML checkpointing workflows. - No files require special attention
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| physicsnemo/models/module.py | 5/5 | Changed zip compression from ZIP_DEFLATED to ZIP_STORED (no compression). This trades file size for faster save/load performance. |
Sequence Diagram
sequenceDiagram
participant User
participant Module
participant ZipFile
participant FileSystem
User->>Module: save(file_name)
Module->>Module: Create temp file
Module->>ZipFile: ZipFile(path, "w", ZIP_STORED)
Note over ZipFile: No compression applied
Module->>ZipFile: writestr("model.pt", state_dict)
Module->>ZipFile: writestr("args.json", args)
Module->>ZipFile: writestr("metadata.json", metadata)
ZipFile-->>Module: Archive created
Module->>FileSystem: Upload to destination
Module->>Module: Clean up temp file
Module-->>User: Checkpoint saved
User->>Module: load(file_name)
Module->>FileSystem: Download checkpoint
Module->>ZipFile: ZipFile(path, "r")
Note over ZipFile: Reads uncompressed data
ZipFile->>Module: Read "model.pt"
Module->>Module: torch.load(model_bytes)
Module-->>User: Model loaded
1 file reviewed, no comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch and fix, lgtm!
|
/blossom-ci |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
This PR removes zip compression overhead from checkpoint saving by changing from zipfile.ZIP_DEFLATED to zipfile.ZIP_STORED in physicsnemo/models/module.py:544.
Key changes:
- Changed compression mode from
ZIP_DEFLATED(compress data) toZIP_STORED(store without compression) - This change reduces CPU overhead during checkpoint saves, trading faster save times for larger file sizes
- The change is backward compatible - existing checkpoint loading code works with both compressed and uncompressed zip archives
- PyTorch model state dicts (
.ptfiles) are already compressed internally, so additional zip compression provides minimal benefit
Confidence Score: 5/5
- This PR is safe to merge with minimal risk
- The change is a simple one-line modification that only affects compression mode, not the zip file format or structure. The loading code in
physicsnemo/models/module.py:700useszipfile.ZipFile(cached_file_name, "r")which automatically handles both compressed and uncompressed zip files, ensuring backward compatibility. The change trades disk space for performance, which is a reasonable optimization for large model checkpoints. - No files require special attention
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| physicsnemo/models/module.py | 5/5 | Changed from ZIP_DEFLATED to ZIP_STORED to eliminate compression overhead during checkpoint saves |
Sequence Diagram
sequenceDiagram
participant User
participant Module
participant ZipFile
participant TempFile
participant FileSystem
User->>Module: save(file_name)
Module->>TempFile: Create temporary .zip file
Module->>ZipFile: Open with ZIP_STORED (no compression)
Module->>Module: Serialize state_dict to BytesIO
Module->>ZipFile: writestr("model.pt", state_dict_bytes)
Module->>ZipFile: writestr("args.json", args_json)
Module->>ZipFile: writestr("metadata.json", metadata_json)
ZipFile->>TempFile: Write uncompressed archive
Module->>FileSystem: Upload temp file to destination
Module->>TempFile: Clean up temporary file
1 file reviewed, no comments
PhysicsNeMo Pull Request
Description
Checklist
Dependencies
Review Process
All PRs are reviewed by the PhysicsNeMo team before merging.
Depending on which files are changed, GitHub may automatically assign a maintainer for review.
We are also testing AI-based code review tools (e.g., Greptile), which may add automated comments with a confidence score.
This score reflects the AI’s assessment of merge readiness and is not a qualitative judgment of your work, nor is
it an indication that the PR will be accepted / rejected.
AI-generated feedback should be reviewed critically for usefulness.
You are not required to respond to every AI comment, but they are intended to help both authors and reviewers.
Please react to Greptile comments with 👍 or 👎 to provide feedback on their accuracy.