vllm plugin#35

Draft
sivanravidos wants to merge 17 commits into main from vllm

Conversation

@sivanravidos
Collaborator

@sivanravidos sivanravidos commented May 26, 2026

PR is still a draft, todos:

  • support pooling types, get it via request argument
  • add classification output
  • support MLM ckpt
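
To make the first todo concrete, here is a minimal sketch of selecting a pooling type by name. It is not the plugin's code and does not use any vLLM API: the function name `pool`, the `pooling_type` argument, and the set of supported types (`cls`, `last`, `mean`, `max`) are all assumptions for illustration. Plain Python lists stand in for tensors.

```python
from typing import Sequence


def pool(token_embeddings: Sequence[Sequence[float]], pooling_type: str = "mean") -> list[float]:
    """Reduce per-token embeddings to one sequence embedding.

    Hypothetical helper: `pooling_type` would come from the request.
    """
    if not token_embeddings:
        raise ValueError("cannot pool an empty sequence")
    if pooling_type == "cls":
        # Use the first token's embedding.
        return list(token_embeddings[0])
    if pooling_type == "last":
        # Use the final token's embedding.
        return list(token_embeddings[-1])
    # Transpose so each entry holds one embedding dimension across all tokens.
    columns = list(zip(*token_embeddings))
    if pooling_type == "mean":
        return [sum(col) / len(col) for col in columns]
    if pooling_type == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unsupported pooling type: {pooling_type!r}")
```

Rejecting unknown names with an error, rather than falling back silently to a default, keeps a mistyped request argument from returning embeddings the caller did not ask for.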

@sivanravidos sivanravidos changed the title vllm plugin first draft vllm plugin May 26, 2026
Sivan Ravid and others added 16 commits May 26, 2026 10:03
- refactored and simplified multi-modal support and forward code
- load_weights bug fix
- Improved docs throughout to explain the pipeline and batching behavior.
- Downloads checkpoint from source HF repo
- Converts .ckpt to model.safetensors
- Extracts and saves config.json with placeholder for modifications
- Copies tokenizer files and README
- Adds reference to original model in README
- Uploads to new HF repository
- Reduced from 306 to 143 lines
- Simplified logic and removed verbose print statements
- Cleaner error handling with try/except passes
- More Pythonic file operations
- Maintained all functionality
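
The conversion steps listed above (download, convert `.ckpt` to `model.safetensors`, upload) could look roughly like the sketch below. It is not the script from this PR. It assumes a Lightning-style checkpoint that nests weights under `state_dict`, and the function names and arguments are hypothetical. Third-party imports (`torch`, `safetensors`, `huggingface_hub`) are deferred into the functions that need them.

```python
from pathlib import Path


def unwrap_state_dict(checkpoint: dict) -> dict:
    """Return the weights mapping from a loaded checkpoint.

    Assumption: Lightning-style checkpoints keep weights under "state_dict";
    a plain state dict is returned unchanged.
    """
    return checkpoint.get("state_dict", checkpoint)


def convert_checkpoint(source_repo: str, ckpt_filename: str, out_dir: str) -> Path:
    """Download a .ckpt from the Hub and write model.safetensors into out_dir."""
    import torch
    from huggingface_hub import hf_hub_download
    from safetensors.torch import save_file

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    ckpt_path = hf_hub_download(repo_id=source_repo, filename=ckpt_filename)
    # weights_only=False unpickles arbitrary objects: only use on trusted files.
    checkpoint = torch.load(ckpt_path, map_location="cpu", weights_only=False)

    # Keep tensors only; safetensors cannot store other Python objects.
    # Note: save_file raises if tensors share memory (e.g. tied weights).
    tensors = {
        name: value.contiguous()
        for name, value in unwrap_state_dict(checkpoint).items()
        if isinstance(value, torch.Tensor)
    }
    target = out / "model.safetensors"
    save_file(tensors, str(target))
    return target


def upload(out_dir: str, target_repo: str) -> None:
    """Create the target repo if needed and upload the converted folder."""
    from huggingface_hub import HfApi

    api = HfApi()  # reads the token from the environment or local login
    api.create_repo(repo_id=target_repo, exist_ok=True)
    api.upload_folder(folder_path=out_dir, repo_id=target_repo)
```

Extracting `config.json` and copying the tokenizer files and README are omitted here because the commit messages do not say where those values live in the checkpoint.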
More descriptive name that clearly indicates the script's purpose
Sets architectures to ['BiomedRnaForSequenceEmbedding'] which is required
for vLLM plugin to properly register and load the model
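
A minimal sketch of that change, assuming the converted model directory already contains a `config.json`. The architecture name comes from the commit message; the helper names are hypothetical.

```python
import json
from pathlib import Path

# Architecture name taken from the commit message above.
ARCHITECTURE = "BiomedRnaForSequenceEmbedding"


def with_architectures(config: dict, architecture: str = ARCHITECTURE) -> dict:
    """Return a copy of config with the `architectures` field set."""
    updated = dict(config)
    updated["architectures"] = [architecture]
    return updated


def patch_config_file(config_path: str) -> None:
    """Rewrite config.json in place with the architectures field set."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    path.write_text(json.dumps(with_architectures(config), indent=2) + "\n")
```

The `architectures` field is a list, so the value is written as a one-element list rather than a bare string.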
- Removed --token argument for better security
- Updated documentation to explain HF_TOKEN setup
- Token won't appear in shell history or process lists
- Follows HuggingFace CLI standard practice
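
One way to read the token from the environment instead of a command-line flag is sketched below. The helper name `get_hf_token` and the error message are assumptions, not the script's actual code. The `env` parameter exists only to make the function easy to test.

```python
import os
from typing import Mapping, Optional


def get_hf_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Hugging Face token from HF_TOKEN, or fail with a clear message."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; export it before running the script")
    return token
```

Usage would be `export HF_TOKEN=...` once in the shell, then running the script with no token argument. `huggingface_hub` also reads `HF_TOKEN` on its own, so an explicit check like this mainly serves to fail early with a readable error.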
- Removed --- markers that were interpreted as invalid YAML
- Use markdown blockquote for reference instead
- Cleaner, more standard markdown format
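
A sketch of the blockquote approach, under the assumption that the reference is appended at the end of the README so that any YAML front matter at the top stays untouched. The function name and the wording of the note are hypothetical.

```python
def add_source_reference(readme: str, source_repo: str) -> str:
    """Append a blockquote pointing at the original model repository.

    No `---` markers are written, so nothing is parsed as YAML.
    """
    note = f"> Converted from [{source_repo}](https://huggingface.co/{source_repo})."
    return f"{readme.rstrip()}\n\n{note}\n"
```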
Signed-off-by: Sivan Ravid <sivanra@il.ibm.com>
Signed-off-by: Sivan Ravid <sivanra@il.ibm.com>
Signed-off-by: Sivan Ravid <sivanra@il.ibm.com>