Description
Currently, when using ray.data.llm with vLLM, users can specify logprobs=True (or logprobs=N) in the sampling_params dictionary, and vLLM successfully processes this parameter. However, the logprobs data is not returned in the final output from Ray Data.
The issue is that while vLLM generates the logprobs data (as evidenced by the SamplingParams being correctly parsed), this information is dropped during the conversion from vLLM's RequestOutput to Ray's vLLMOutputData format in the from_vllm_engine_output method.
Current behavior:
- `logprobs=True` can be specified in `sampling_params`, and vLLM processes the request with logprobs enabled
- The output shows `SamplingParams(n=1, …, logprobs=1, …)`, indicating vLLM received the parameter
- However, the actual logprobs data (the per-token probability distributions) is not present in the returned rows
Expected behavior:
- When `logprobs` is specified in `sampling_params`, the logprobs data should be included in the output rows
- Users should be able to access logprobs through the postprocessor function
Technical details:
The issue is in `python/ray/llm/_internal/batch/stages/vllm_engine_stage.py`:
- The `vLLMOutputData` model (lines 71-88) does not have a field for logprobs
- The `from_vllm_engine_output` method (lines 91-124) extracts `generated_tokens`, `generated_text`, and `metrics` from vLLM's output, but does not extract `logprobs` from `output.outputs[0].logprobs`
- vLLM's `CompletionOutput` (in `output.outputs[0]`) contains a `logprobs: SampleLogprobs | None` field that is currently being ignored
- Additionally, `output.prompt_logprobs` (of type `PromptLogprobs | None`) may also need to be exposed if users request prompt logprobs
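For context, vLLM's sample logprobs have the shape `list[dict[int, Logprob]]`: one dict per generated token, keyed by candidate token ID. The sketch below uses a stand-in `Logprob` dataclass (mimicking vLLM's, not imported from it) to illustrate that shape and one way the data could be converted to plain dicts so it can flow through Ray Data rows:

```python
from dataclasses import dataclass
from typing import Optional

# Stand-in for vLLM's Logprob dataclass; illustrative only, not imported from vLLM.
@dataclass
class Logprob:
    logprob: float                        # log-probability of this token
    rank: Optional[int] = None            # rank among candidates at this step
    decoded_token: Optional[str] = None   # detokenized text, if available

# SampleLogprobs shape: one dict per generated token, keyed by token ID.
sample_logprobs = [
    {1234: Logprob(logprob=-0.05, rank=1, decoded_token="Hello")},
    {5678: Logprob(logprob=-1.20, rank=2, decoded_token=" world")},
]

# A plain-dict form that could be carried in Ray Data rows
# (no custom classes, so it serializes cleanly).
serializable = [
    {tok_id: {"logprob": lp.logprob, "rank": lp.rank, "decoded_token": lp.decoded_token}
     for tok_id, lp in step.items()}
    for step in sample_logprobs
]
print(serializable[0][1234]["logprob"])  # -0.05
```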
Reproduction:
```python
from ray.data.llm import build_llm_processor

processor = build_llm_processor(
    config,  # an existing vLLM processor config
    preprocess=lambda row: dict(
        messages=[
            {"role": "system", "content": "You are a bot that responds with haikus."},
            {"role": "user", "content": row["item"]},
        ],
        sampling_params=dict(
            temperature=0.3,
            max_tokens=250,
            logprobs=True,  # This is parsed correctly by vLLM
        ),
    ),
    postprocess=lambda row: dict(**row),  # logprobs not found in row
)
```

Use case
Users need access to logprobs for downstream tasks like evaluation/analysis, filtering, debugging, research, etc.
Without access to logprobs, users are unable to perform these analyses even though vLLM supports this feature. This creates a gap between what vLLM can provide and what Ray Data LLM exposes to users.
The fix should be straightforward: extract the logprobs data from vLLM's output object and include it in the vLLMOutputData model so it flows through to the final output rows.
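A minimal sketch of what that fix could look like, using simplified stand-ins for vLLM's `RequestOutput`/`CompletionOutput` and Ray's `vLLMOutputData` (the real classes live in vLLM and Ray; the field names added here are assumptions, not the final API):

```python
from dataclasses import dataclass
from typing import Optional

# Simplified stand-in for vLLM's CompletionOutput (illustrative only).
@dataclass
class CompletionOutput:
    token_ids: list
    text: str
    logprobs: Optional[list] = None  # SampleLogprobs | None in vLLM

# Simplified stand-in for vLLM's RequestOutput (illustrative only).
@dataclass
class RequestOutput:
    outputs: list
    prompt_logprobs: Optional[list] = None  # PromptLogprobs | None in vLLM

# Sketch of the extra fields vLLMOutputData could carry (hypothetical names).
@dataclass
class vLLMOutputData:
    generated_tokens: list
    generated_text: str
    logprobs: Optional[list] = None
    prompt_logprobs: Optional[list] = None

    @classmethod
    def from_vllm_engine_output(cls, output: RequestOutput) -> "vLLMOutputData":
        completion = output.outputs[0]
        return cls(
            generated_tokens=completion.token_ids,
            generated_text=completion.text,
            # The key change: carry the logprobs through instead of dropping them.
            logprobs=completion.logprobs,
            prompt_logprobs=output.prompt_logprobs,
        )

out = RequestOutput(outputs=[CompletionOutput([1, 2], "hi", logprobs=[{1: -0.1}])])
data = vLLMOutputData.from_vllm_engine_output(out)
print(data.logprobs)  # [{1: -0.1}]
```

With a change along these lines, the logprobs would appear in each output row and be reachable from the `postprocess` function.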