
Release v0.9.5 #308

Closed
rebel-eunji wants to merge 62 commits into main from dev

Conversation

@rebel-eunji
Collaborator

🚀 Summary of Changes

This PR is for Release v0.9.5.


📌 Related Issues / Tickets

  • Resolves #
  • Related to #

✅ Type of Change

  • ✨ Feature (feature)
  • 🧠 Model support (model)
  • 🧬 Core engine changes (core)
  • 🛠 Bug fix (bug-fix)
  • ⚙️ Performance improvement (perf)
  • 🔁 Refactor or code cleanup (refactor)
  • 📄 Documentation (docs)
  • ❓ Other (other): please describe

🧪 How to Test

  1. Run ...
  2. Verify output: ...
  3. Edge case tested: ...

📸 Screenshots / Logs (if applicable)


📋 Checklist

  • PR title follows Conventional Commits format
  • This PR is linked to an existing issue
  • The test method is described, and the expected result is clearly stated
  • Relevant documentation has been updated (if applicable)

💬 Notes


rebel-yskim and others added 30 commits December 15, 2025 13:13
- configure device ids and ranks for data parallel
Co-authored-by: rebel-jonghewk <jonghewk@rebellions.in>
* rm requirements and update get optimum version

* optimum rbln version

* simplify workflow

* add openai

* fix

* refactor: rm comments

* rm condition for check-code-quality

* debug patch

* debug patch

* simplify

* rm requirements.txt

* rm setup.py and requirements

* build without setup.py

* fix if

* fix dependency

* [skip core] [skip worker] fix installing vllm-rbln in pytest-arc

* [skip core] [skip worker] update runner

* [skip worker][skip core][skip openai] pypi mirror server

* [skip worker][skip core] for test

* [skip core][skip worker][skip openai]

* [skip core][skip worker][skip openai] separate each rsd cases

* [skip worker][skip openai][skip core]fix dispatch

* [skip worker][skip openai][skip core]fix dispatch

* [skip worker][skip core][skip openai] fix

* update runner

* name fix

* name fix

* move out build vllm-rbln

* fix path
* fix: rm requirements

* rm v prefix

* fix

---------

Co-authored-by: Sungho Shin <87514200+rebel-shshin@users.noreply.github.com>
rebel-shshin and others added 27 commits January 13, 2026 14:39
* enable profiler for optimum

* precommit

---------

Co-authored-by: rebel-jonghewk <jonghewk@rebellions.in>
* Update fsw pr ci

* Update rh ci

* Reflect comments

* Update git pat

* Fix github repository uppercase issue

* Revert token
Co-authored-by: rebel-jaebin <jaebin@rebellions.ai>
* other(test): add unit test for attention backend.
* replace triton with torch_triton

Signed-off-by: Jinseok Lee <jindol21@rebellions.ai>
* fix dp for v1

- remove DP padding support in v1 worker
- add validation for DP implementation constraints in v1 worker
- apply token mask to custom MOE kernel router logits
- update default environment variables:
  - VLLM_RBLN_DP_IMPL: "dummy_prefill" -> "padded_decode"
  - VLLM_RBLN_USE_MOE_TOKENS_MASK: False -> True
- fix DP metadata handling in forward context
- add is_prefills field to RBLNFlashAttentionMetadata
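The default-flip described above (VLLM_RBLN_DP_IMPL to "padded_decode", VLLM_RBLN_USE_MOE_TOKENS_MASK to True) can be sketched with a vLLM-style lazy env-var map. This is a minimal illustration, not the actual vllm-rbln code; the `parse_bool` helper is hypothetical.

```python
import os

# Hypothetical helper: interpret common truthy strings as booleans.
def parse_bool(value: str) -> bool:
    return value.strip().lower() in ("1", "true", "yes", "on")

# Sketch of vLLM-style lazy environment-variable defaults, reflecting
# the new defaults this PR describes: padded_decode DP implementation
# and the MoE tokens mask enabled by default.
environment_variables = {
    "VLLM_RBLN_DP_IMPL": lambda: os.getenv(
        "VLLM_RBLN_DP_IMPL", "padded_decode"
    ),
    "VLLM_RBLN_USE_MOE_TOKENS_MASK": lambda: parse_bool(
        os.getenv("VLLM_RBLN_USE_MOE_TOKENS_MASK", "True")
    ),
}

def get_env(name: str):
    """Resolve an env var lazily, falling back to its default."""
    return environment_variables[name]()
```

Evaluating lazily (via lambdas) matches how vLLM resolves its `envs` module attributes, so changes to the process environment are picked up at read time rather than import time.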

* fix test_rbln_envs.py

- VLLM_RBLN_DP_IMPL should be padded_decode by default

* fix DPMetadata for tokens mask

- remove is_prefills field and related logic from DP metadata
- fix get_tokens_mask() for non-DP case
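The tokens-mask idea above (masking router logits so DP padding tokens are never routed to experts) could look roughly like the sketch below. Names and shapes are illustrative only; the real change operates on tensors inside the custom MoE kernel, and one common masking approach (shown here) is setting padded tokens' logits to negative infinity.

```python
import math

def mask_router_logits(router_logits, tokens_mask):
    """Illustrative sketch: suppress MoE router logits for padded tokens.

    router_logits: [num_tokens][num_experts] nested lists of floats.
    tokens_mask:   [num_tokens] booleans, True for real tokens and
                   False for data-parallel padding tokens.
    Padded tokens get -inf logits so no expert can be selected for them.
    """
    masked = []
    for logits, is_real in zip(router_logits, tokens_mask):
        if is_real:
            masked.append(list(logits))
        else:
            masked.append([-math.inf] * len(logits))
    return masked

# Example: second token is DP padding and gets fully masked.
out = mask_router_logits([[0.1, 0.2], [0.3, 0.4]], [True, False])
```

Plain lists keep the sketch dependency-free; a tensor implementation would do the same with a broadcasted boolean mask.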

---------

Co-authored-by: rebel-jonghewk <142865404+rebel-jonghewk@users.noreply.github.com>
* added torch_triton mode in RBLN_KERNEL_MODE

Signed-off-by: Jinseok Lee <jindol21@rebellions.ai>
* fixed along with triton kernels in rebel_compiler

Signed-off-by: Jinseok Lee <jindol21@rebellions.ai>
Signed-off-by: Jinseok Lee <jindol21@rebellions.ai>
Signed-off-by: Jinseok Lee <jindol21@rebellions.ai>
