feat: inplace pin memory for safetensors in /dev/shm/ #58
Merged
blahgeek merged 10 commits into MoonshotAI:main on Dec 11, 2025
Pull request overview
This pull request adds an optimization for loading safetensors checkpoint files stored in /dev/shm/ by enabling in-place memory pinning, which avoids copying data and reduces memory consumption by half. When safetensors files are detected in /dev/shm/, the code now pins the memory-mapped file directly instead of allocating separate pinned memory and copying the tensors.
Key changes:
- Added inplace pin memory path for safetensors files in /dev/shm/ using CUDA's cudaHostRegister
- Implemented manual safetensors header parsing to extract tensor metadata without loading through the safetensors library
- Parallelized inplace pinning operations using ThreadPoolExecutor
- Preserved existing checkpoint loading path as fallback for non-safetensors files or files outside /dev/shm/
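For reviewers unfamiliar with the file layout, the manual header parsing mentioned above can be sketched as follows. This is a minimal illustration based on the public safetensors format (an 8-byte little-endian header length, a JSON header, then the raw tensor bytes), not the code in this PR; the function name `read_safetensors_header` and the shape of the returned dict are hypothetical.

```python
import json
import struct


def read_safetensors_header(path):
    """Return {tensor_name: metadata} without using the safetensors library.

    Hypothetical helper for illustration only. Each metadata dict holds the
    dtype string, the shape, the absolute byte offset of the tensor in the
    file, and its size in bytes.
    """
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))

    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)

    # data_offsets in the header are relative to the end of the JSON header.
    data_start = 8 + header_len
    tensors = {}
    for name, info in header.items():
        begin, end = info["data_offsets"]
        tensors[name] = {
            "dtype": info["dtype"],
            "shape": info["shape"],
            "offset": data_start + begin,
            "nbytes": end - begin,
        }
    return tensors
```

With the absolute offsets in hand, a loader can build tensor views directly over a mapped file buffer instead of copying each tensor out.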
blahgeek reviewed on Dec 8, 2025
resolve #60
Safetensors files have an aligned storage layout. If the safetensors files are in /dev/shm, we can pin them in place without copying, which avoids doubling memory usage.
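A rough sketch of the idea, under stated assumptions: the helper names (`can_pin_inplace`, `pin_file_inplace`, `pin_files`) and the worker count are hypothetical, and the use of `torch.from_file` plus `torch.cuda.cudart().cudaHostRegister` is one plausible way to map and register the file, not necessarily how this PR does it. The pinning function needs PyTorch with CUDA; the eligibility check and the thread pool do not.

```python
import os
from concurrent.futures import ThreadPoolExecutor

SHM_PREFIX = "/dev/shm/"


def can_pin_inplace(path):
    """Only safetensors files that live in /dev/shm take the in-place path."""
    full = os.path.abspath(path)
    return full.startswith(SHM_PREFIX) and full.endswith(".safetensors")


def pin_file_inplace(path):
    """Map the file as a shared uint8 buffer and register it with CUDA.

    Assumption: the mapped pages are page-locked where they are, so no
    second pinned buffer is allocated and no copy is made.
    """
    import torch  # imported lazily so the helpers above work without it

    size = os.path.getsize(path)
    buf = torch.from_file(path, shared=True, size=size, dtype=torch.uint8)
    # Flag 0 corresponds to cudaHostRegisterDefault.
    err = torch.cuda.cudart().cudaHostRegister(buf.data_ptr(), size, 0)
    if int(err) != 0:
        raise RuntimeError(f"cudaHostRegister failed for {path}: {err}")
    return buf


def pin_files(paths, pin_fn=pin_file_inplace, max_workers=8):
    """Pin several files concurrently; returns {path: pinned buffer}."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(paths, pool.map(pin_fn, paths)))
```

Files for which `can_pin_inplace` returns `False` would go through the existing loading path, which allocates pinned memory and copies into it.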