refactor(model): sync whisper with SupportsTranscription interface (WIP) by rebel-eunji · Pull Request #668 · RBLN-SW/vllm-rbln

rebel-eunji · 2026-06-10T11:08:31Z

🚀 Summary of Changes

What does this PR do? What feature, fix, or improvement does it bring?

📌 Related Issues / Tickets

Resolves #
Related to #

✅ Type of Change

🚀 Release (release)
✨ Feature (feature)
🧠 Model support (model)
🧬 Core engine changes (core)
🛠 Bug fix (fix)
⚙️ Performance improvement (perf)
🔁 Refactor or code cleanup (refactor)
📄 Documentation (docs)
❓ Other (other): please describe

🧪 How to Test

Run ...
Verify output: ...
Edge case tested: ...

📸 Screenshots / Logs (if applicable)

📋 Checklist

PR title follows Conventional Commits format
This PR is linked to an existing issue
The test method is described, and the expected result is clearly stated
Relevant documentation has been updated (if applicable)

💬 Notes

- scheduler: accept hash_block_size from EngineCore and assert it equals block_size; set enable_return_routed_experts read by inherited update_from_output; drop dead num_cached_tokens fix-up (field removed from Request in 0.22)
- kv cache manager/coordinator: plumb max_num_batched_tokens down to get_manager_for_kv_cache_spec (new required args); replace use_eagle flag with eagle_group_ids (empty set + assert, eagle unsupported)
- model runner: InputBatch is_spec_decode -> num_spec_tokens; rework _get_prompt_logprobs_dict to per-request in_progress_prompt_logprobs_cpu
- input batch: upload renamed _make_prompt_token_ids_cpu_tensor result to device (caller-side upload in 0.22)
- worker: return CompilationTimes from compile_or_warm_up_model (executor unconditionally reads .language_model/.encoder)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
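The worker item above can be illustrated with a minimal sketch. It assumes only what the commit message states: the executor reads `.language_model` and `.encoder` on whatever `compile_or_warm_up_model` returns. `CompilationTimesSketch`, `WorkerSketch`, and `executor_collect` are hypothetical stand-ins, not vLLM's actual classes or signatures.

```python
from typing import NamedTuple


class CompilationTimesSketch(NamedTuple):
    """Hypothetical stand-in for the result type; field names come from the commit message."""
    language_model: float
    encoder: float


class WorkerSketch:
    """Hypothetical worker that returns a result object instead of None."""

    def compile_or_warm_up_model(self) -> CompilationTimesSketch:
        # Real warm-up or compilation work would run here.
        # Zeros are placeholder values for this sketch.
        return CompilationTimesSketch(language_model=0.0, encoder=0.0)


def executor_collect(worker) -> float:
    """Hypothetical executor side: reads both attributes unconditionally."""
    times = worker.compile_or_warm_up_model()
    return times.language_model + times.encoder


total = executor_collect(WorkerSketch())
```

If the worker returned `None` instead, the attribute reads in `executor_collect` would raise `AttributeError`, which is the failure mode the commit describes avoiding.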

vLLM 0.18's ECConnectorModelRunnerMixin.get_finished_ec_transfers() was removed in 0.22; its logic now lives inside the maybe_get_ec_connector_output context manager (get_finished + clear on exit). sample_tokens() still called the old method, raising AttributeError.

Drive the full EC lifecycle through the upstream context manager in execute_model(): bind metadata + load caches on entry, poll finished transfers into ec_connector_output on exit. Carry the result via ExecuteModelState and forward it to ModelRunnerOutput in sample_tokens(), matching upstream gpu_model_runner. Preserves RBLN's non-blocking decode / blocking prefill cache load.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
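The lifecycle described in this commit (bind and load on entry, poll finished transfers and clear on exit, then carry the result to the sampling step) can be sketched roughly as follows. Every class and function here is a hypothetical stand-in written for illustration; none of them is the upstream vLLM API, and the real context manager's signature may differ.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class FakeECConnectorOutput:
    """Hypothetical container for transfers that finished during the step."""
    finished_sending: set = field(default_factory=set)
    finished_recving: set = field(default_factory=set)


class FakeECConnector:
    """Hypothetical connector that records which lifecycle hooks were called."""

    def __init__(self):
        self.bound = None
        self.loaded = False
        self.cleared = False

    def bind_connector_metadata(self, metadata):
        self.bound = metadata

    def start_load_caches(self):
        self.loaded = True

    def get_finished(self):
        # Pretend one request finished sending during this step.
        return {"req-1"}, set()

    def clear_connector_metadata(self):
        self.bound = None
        self.cleared = True


@contextmanager
def ec_connector_output(connector, metadata):
    """Bind metadata and load caches on entry; poll and clear on exit."""
    output = FakeECConnectorOutput()
    connector.bind_connector_metadata(metadata)
    connector.start_load_caches()
    try:
        yield output
    finally:
        output.finished_sending, output.finished_recving = connector.get_finished()
        connector.clear_connector_metadata()


@dataclass
class FakeExecuteModelState:
    """Hypothetical state handed from execute_model() to sample_tokens()."""
    ec_connector_output: FakeECConnectorOutput


def execute_model(connector, metadata):
    with ec_connector_output(connector, metadata) as out:
        pass  # the forward pass would run here
    return FakeExecuteModelState(ec_connector_output=out)


def sample_tokens(state):
    # Forward the carried output instead of polling the connector again.
    return {"ec_connector_output": state.ec_connector_output}


connector = FakeECConnector()
result = sample_tokens(execute_model(connector, {"step": 1}))
```

In this sketch `sample_tokens` never touches the connector, so it does not depend on a polling method that may no longer exist.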

vLLM 0.22's cleanup_dist_env_and_memory() calls torch.accelerator.empty_cache() for any non-CPU platform. RBLN is non-CPU (OOT) but runs on a CPU-only torch build with no torch accelerator, so torch 2.11 raises "Cannot access accelerator device when none is available" and the EngineCore dies during shutdown cleanup.

Wrap torch.accelerator.empty_cache() to swallow that one case (matching its documented no-op contract); other errors propagate. Applied via register_ops() at plugin load.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
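A minimal sketch of the wrapping approach described in this commit, written without importing torch so it runs anywhere. It assumes the error surfaces as a `RuntimeError` whose message contains the quoted text; the helper name `tolerate_missing_accelerator` is hypothetical and is not taken from the PR.

```python
import functools

# Message quoted in the commit; matching on a substring is an assumption.
_NO_ACCELERATOR_MSG = "Cannot access accelerator device when none is available"


def tolerate_missing_accelerator(empty_cache):
    """Wrap an empty_cache-like callable so only the 'no accelerator' error is swallowed."""

    @functools.wraps(empty_cache)
    def wrapper(*args, **kwargs):
        try:
            return empty_cache(*args, **kwargs)
        except RuntimeError as exc:  # assumed exception type
            if _NO_ACCELERATOR_MSG in str(exc):
                return None  # treat as a no-op
            raise  # every other error still propagates

    return wrapper


def fake_empty_cache_no_accelerator():
    """Stand-in that fails the way the commit describes."""
    raise RuntimeError(_NO_ACCELERATOR_MSG)


def fake_empty_cache_other_error():
    """Stand-in that fails for an unrelated reason."""
    raise RuntimeError("unrelated failure")


swallowed = tolerate_missing_accelerator(fake_empty_cache_no_accelerator)()

# At plugin load, the patch might look like this (assumption, not the PR's code):
# torch.accelerator.empty_cache = tolerate_missing_accelerator(torch.accelerator.empty_cache)
```

Matching on the message rather than catching every `RuntimeError` keeps the wrapper narrow, in line with the commit's note that other errors propagate.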

rebel-eunji and others added 18 commits June 7, 2026 13:22

other: sync with vllm 0.19.1

6f31300

update pyproject.toml

10e10e5

clean

072f389

random_seed_all

2e4b7a9

type2 numa setting

17eb84f

remove unused param - seed in sampler

f3a3a2c

pre-commit

d10f3b0

add comment

49460ea

add comment space

05680f3

update optimum branch

ae32676

update vllm whl version

d4ca587

optimum-rbln==0.11.0a0

41efeae

resolve deps

863d966

update trigger on pr condition

5a7f50b

implement classmethods

42c04ec

rebel-eunji self-assigned this Jun 10, 2026

rebel-eunji force-pushed the dev-0.22.0-optimum branch 3 times, most recently from bab39ce to 8502914 Compare June 11, 2026 01:55

rebel-eunji wants to merge 18 commits into dev-0.22.0-optimum from eunji/refactor-whisper
