vllm plugin #35
Draft
sivanravidos wants to merge 17 commits into
- Refactored and simplified multimodal support and forward code
- Fixed a load_weights bug
- Improved docs throughout to explain the pipeline and batching behavior
- Downloads checkpoint from source HF repo
- Converts .ckpt to model.safetensors
- Extracts and saves config.json with placeholder for modifications
- Copies tokenizer files and README
- Adds reference to original model in README
- Uploads to new HF repository
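The file-staging part of the steps above (saving config.json, copying tokenizer files and README) could look roughly like the sketch below. It is not the script from this PR: `stage_repo_files`, the `TOKENIZER_FILES` list, and the `hparams` dictionary are hypothetical names chosen for illustration, and the checkpoint download, the .ckpt to safetensors conversion, and the upload are left out.

```python
import json
import shutil
from pathlib import Path

# Assumption: a typical set of tokenizer file names; the real script may copy others.
TOKENIZER_FILES = ("tokenizer.json", "tokenizer_config.json", "special_tokens_map.json")


def stage_repo_files(src_dir: Path, dst_dir: Path, hparams: dict) -> list[str]:
    """Write config.json and copy tokenizer files and README into dst_dir.

    Returns the names of the files that ended up in dst_dir, in staging order.
    """
    dst_dir.mkdir(parents=True, exist_ok=True)

    # Save the extracted hyperparameters; later steps may still modify this file.
    (dst_dir / "config.json").write_text(json.dumps(hparams, indent=2) + "\n")
    staged = ["config.json"]

    # Copy whichever of the expected files exist in the source snapshot.
    for name in (*TOKENIZER_FILES, "README.md"):
        src = src_dir / name
        if src.is_file():
            shutil.copy2(src, dst_dir / name)
            staged.append(name)
    return staged
```

Missing files are skipped here rather than treated as errors, which is a design choice of this sketch and may differ from the script's behavior.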
- Reduced from 306 to 143 lines
- Simplified logic and removed verbose print statements
- Cleaner error handling with try/except passes
- More Pythonic file operations
- Maintained all functionality
More descriptive name that clearly indicates the script's purpose
Sets architectures to ['BiomedRnaForSequenceEmbedding'], which is required for the vLLM plugin to properly register and load the model
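A minimal sketch of that config change, assuming config.json is a plain JSON file on disk. The helper name `set_architecture` is hypothetical and is not taken from the script in this PR.

```python
import json
from pathlib import Path

# Architecture name taken from the commit message above.
ARCHITECTURE = "BiomedRnaForSequenceEmbedding"


def set_architecture(config_path: Path, architecture: str = ARCHITECTURE) -> dict:
    """Overwrite the `architectures` field of a config.json and return the new config."""
    config = json.loads(config_path.read_text())
    # The field is a list with a single entry; any previous value is replaced.
    config["architectures"] = [architecture]
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

All other keys in the file are left untouched, so the function can be run again on an already converted config without changing it further.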
- Removed --token argument for better security
- Updated documentation to explain HF_TOKEN setup
- Token won't appear in shell history or process lists
- Follows HuggingFace CLI standard practice
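Reading the token from the environment instead of a command-line flag might be sketched as follows. `get_hf_token` is a hypothetical helper, and the optional `env` parameter exists only to make the function easy to test; the script in this PR may handle this differently.

```python
import os
from typing import Mapping, Optional


def get_hf_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Hugging Face token from HF_TOKEN, or raise if it is not set."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        # Fail early with a hint instead of attempting an unauthenticated upload.
        raise RuntimeError("HF_TOKEN is not set; export it before running the script.")
    return token
```

With this approach the token is set once in the shell (for example `export HF_TOKEN=...`) and never passed as an argument to the script itself.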
- Removed --- markers that were interpreted as invalid YAML
- Use markdown blockquote for reference instead
- Cleaner, more standard markdown format
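One way to add the reference as a blockquote, without any `---` markers, is sketched below. The function name, the wording of the note, and its placement at the end of the README are all assumptions for illustration; `source_repo` stands for a repository id of the form `owner/name`.

```python
def add_source_reference(readme: str, source_repo: str) -> str:
    """Append a markdown blockquote that points back to the original model repo."""
    note = f"> Converted from [{source_repo}](https://huggingface.co/{source_repo})."
    # Normalize trailing newlines so the note is separated by exactly one blank line.
    return readme.rstrip("\n") + "\n\n" + note + "\n"
```

Appending at the end leaves any metadata block at the top of the README untouched.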
Signed-off-by: Sivan Ravid <sivanra@il.ibm.com>
https://huggingface.co/sivanravid/biomed.rna.llama.47m.wced.multitask.v1.vllm/blob/main/config.json
converted with scripts/create_vllm_compatible_hf_model_repo.py

PR is still a draft, todos: