[data][llm] Model Support divergence from vLLM #60780

### Description

The current implementation of Ray Data LLM has begun to diverge from vLLM in terms of supported models. Some models that work with vLLM break when used with Ray Data LLM. The cause is Ray Data LLM's dependency on the transformers library for loading the model config, tokenizer, etc. Newer model architectures such as GLM-4.7-Flash (glm4_moe_lite) are not supported by the transformers version that vLLM requires (<5.0.0); the architecture only exists in a newer transformers release (5.1.0). The same applies to DeepSeek v3.2 (https://github.com/ray-project/ray/issues/60056).
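For anyone trying to confirm the mismatch locally, here is a minimal diagnostic sketch. It only asks whether the installed transformers registers a given `model_type`; it is not how Ray Data LLM loads configs. The helper names `installed_transformers_version` and `architecture_known` are hypothetical, and the sketch assumes `CONFIG_MAPPING` in `transformers.models.auto.configuration_auto` reflects the architectures the installed release knows about.

```python
from importlib import import_module
from importlib.metadata import PackageNotFoundError, version


def installed_transformers_version():
    """Return the installed transformers version string, or None if absent."""
    try:
        return version("transformers")
    except PackageNotFoundError:
        return None


def architecture_known(model_type):
    """Check whether the installed transformers registers `model_type`.

    Returns True or False when transformers is importable, None otherwise.
    Hypothetical helper for diagnosis only.
    """
    try:
        auto = import_module("transformers.models.auto.configuration_auto")
    except ImportError:
        return None
    # CONFIG_MAPPING maps model_type strings to config classes.
    return model_type in auto.CONFIG_MAPPING


if __name__ == "__main__":
    # "glm4_moe_lite" is the model_type named in this issue.
    print("transformers version:", installed_transformers_version())
    print("glm4_moe_lite known:", architecture_known("glm4_moe_lite"))
```

If the second line prints `False` in the environment where Ray Data LLM runs, the installed transformers does not register that architecture, which matches the mismatch described above.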

These examples work with vLLM serve, and even with Ray Serve, but break with Ray Data LLM.

Question: Is it possible to remove the dependency on transformers and rely purely on vLLM?


### Use case

Removing the dependency would keep the set of models supported by Ray in sync with vLLM, so users would not need to check whether a given model actually works on Ray. Currently, not every model supported by vLLM runs on Ray, which is a hindrance.
