[data][llm] Model Support divergence from vLLM #60780

@jiangwu300

Description

The current implementation of Ray Data LLM has begun to diverge from vLLM in terms of supported models: there are models that work with vLLM but break when used through Ray Data LLM. The cause is Ray Data LLM's dependency on the transformers library for loading the model config, tokenizer, etc. Newer model architectures like GLM-4.7-Flash (glm4_moe_lite) are not supported by the transformers version vLLM requires (<5.0.0); the architecture only exists in a newer transformers release (5.1.0). The same applies to DeepSeek v3.2 (#60056).
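For illustration, a minimal sketch of the failing path (the Hugging Face repo ID `zai-org/GLM-4.7-Flash` is hypothetical): because the config is resolved through transformers, an architecture unknown to the installed transformers version fails before any vLLM code runs.

```python
# Minimal repro sketch of the failing path. The repo ID below is hypothetical.
# With transformers < 5.1.0 the glm4_moe_lite model_type is not registered,
# so config resolution raises before vLLM is ever involved.
from transformers import AutoConfig

try:
    cfg = AutoConfig.from_pretrained("zai-org/GLM-4.7-Flash")
    print(cfg.model_type)
except (KeyError, ValueError) as exc:
    print(f"transformers cannot resolve this architecture: {exc}")
```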

These models work with vLLM serve, and even with Ray Serve, but break with Ray Data LLM, as sketched below.
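A minimal sketch of the kind of Ray Data LLM pipeline that hits this, based on the documented API (parameter names such as `model_source` and `engine_kwargs` may vary by Ray version; the repo ID remains hypothetical):

```python
import ray
from ray.data.llm import build_llm_processor, vLLMEngineProcessorConfig

# Hypothetical repo ID; any architecture newer than the pinned transformers
# version fails at config/tokenizer resolution, not inside vLLM itself.
config = vLLMEngineProcessorConfig(
    model_source="zai-org/GLM-4.7-Flash",
    engine_kwargs={"max_model_len": 8192},
    concurrency=1,
    batch_size=32,
)
processor = build_llm_processor(
    config,
    preprocess=lambda row: dict(
        messages=[{"role": "user", "content": row["prompt"]}],
        sampling_params=dict(temperature=0.3, max_tokens=128),
    ),
    postprocess=lambda row: dict(answer=row["generated_text"]),
)

ds = ray.data.from_items([{"prompt": "Hello!"}])
ds = processor(ds)  # raises for architectures unknown to transformers
print(ds.take_all())
```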

Question: Is it possible to remove the dependency on transformers and rely purely on vLLM?
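One possible direction, sketched under the assumption that vLLM's internal `transformers_utils` helpers could stand in for direct transformers calls. These are internal helpers, not a stable public API; their module paths and signatures change across vLLM releases, so treat this as an assumption rather than a drop-in fix.

```python
# Hedged sketch: resolve the config and tokenizer through vLLM, which carries
# its own fallbacks/registrations for architectures the installed transformers
# does not yet know. Module paths and signatures are assumptions.
from vllm.transformers_utils.config import get_config
from vllm.transformers_utils.tokenizer import get_tokenizer

model = "zai-org/GLM-4.7-Flash"  # hypothetical repo ID
config = get_config(model, trust_remote_code=False)
tokenizer = get_tokenizer(model)
print(type(config).__name__, type(tokenizer).__name__)
```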

Use case

This would keep the set of models supported by Ray in sync with vLLM, so users would not need to worry about whether a given model actually works on Ray. Currently, not all models supported by vLLM run on Ray, and that is a hindrance.

Metadata

Labels

community-backlog · enhancement (request for new feature and/or capability) · triage (needs triage, e.g. priority, bug/not-bug, and owning component)
