
Release v0.9.5 #308

Closed
rebel-eunji wants to merge 62 commits into main from dev

Conversation

@rebel-eunji
Collaborator

🚀 Summary of Changes

This PR is for Release v0.9.5.


📌 Related Issues / Tickets

  • Resolves #
  • Related to #

✅ Type of Change

  • ✨ Feature (feature)
  • 🧠 Model support (model)
  • 🧬 Core engine changes (core)
  • 🛠 Bug fix (bug-fix)
  • ⚙️ Performance improvement (perf)
  • 🔁 Refactor or code cleanup (refactor)
  • 📄 Documentation (docs)
  • ❓ Other (other): please describe

🧪 How to Test

  1. Run ...
  2. Verify output: ...
  3. Edge case tested: ...

📸 Screenshots / Logs (if applicable)


📋 Checklist

  • PR title follows Conventional Commits format
  • This PR is linked to an existing issue
  • The test method is described, and the expected result is clearly stated
  • Relevant documentation has been updated (if applicable)

💬 Notes


rebel-yskim and others added 30 commits December 15, 2025 13:13
- configure device ids and ranks for data parallel
Co-authored-by: rebel-jonghewk <jonghewk@rebellions.in>
* rm requirements and update get optimum version

* optimum rbln version

* simplify workflow

* add openai

* fix

* refactor: rm comments

* rm condition for check-code-quality

* debug patch

* debug patch

* simplify

* rm requirements.txt

* rm setup.py and requirements

* build without setup.py

* fix if

* fix dependency

* [skip core] [skip worker] fix installing vllm-rbln in pytest-arc

* [skip core] [skip worker] update runner

* [skip worker][skip core][skip openai] pypi mirror server

* [skip worker][skip core] for test

* [skip core][skip worker][skip openai]

* [skip core][skip worker][skip openai] separate each rsd cases

* [skip worker][skip openai][skip core]fix dispatch

* [skip worker][skip openai][skip core]fix dispatch

* [skip worker][skip core][skip openai] fix

* update runner

* name fix

* name fix

* move out build vllm-rbln

* fix path
* fix: rm requirements

* rm v prefix

* fix

---------

Co-authored-by: Sungho Shin <87514200+rebel-shshin@users.noreply.github.com>
rebel-shshin and others added 27 commits January 13, 2026 14:39
* enable profiler for optimum

* precommit

---------

Co-authored-by: rebel-jonghewk <jonghewk@rebellions.in>
* Update fsw pr ci

* Update rh ci

* Reflect comments

* Update git pat

* Fix github repository uppercase issue

* Revert token
Co-authored-by: rebel-jaebin <jaebin@rebellions.ai>
* other(test): add unit test for attention backend.
* replace triton with torch_triton

Signed-off-by: Jinseok Lee <jindol21@rebellions.ai>
* fix dp for v1

- remove DP padding support in v1 worker
- add validation for DP implementation constraints in v1 worker
- apply token mask to custom MOE kernel router logits
- update default environment variables:
  - VLLM_RBLN_DP_IMPL: "dummy_prefill" -> "padded_decode"
  - VLLM_RBLN_USE_MOE_TOKENS_MASK: False -> True
- fix DP metadata handling in forward context
- add is_prefills field to RBLNFlashAttentionMetadata
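The default-flip described above (VLLM_RBLN_DP_IMPL to "padded_decode", VLLM_RBLN_USE_MOE_TOKENS_MASK to True) can be sketched with a vLLM-style lazy env-var map. This is a minimal illustration, not the actual vllm-rbln code; the `parse_bool` helper is hypothetical.

```python
import os

# Hypothetical helper: interpret common truthy strings as booleans.
def parse_bool(value: str) -> bool:
    return value.strip().lower() in ("1", "true", "yes", "on")

# Sketch of vLLM-style lazy environment-variable defaults, reflecting
# the new defaults this PR describes: padded_decode DP implementation
# and the MoE tokens mask enabled by default.
environment_variables = {
    "VLLM_RBLN_DP_IMPL": lambda: os.getenv(
        "VLLM_RBLN_DP_IMPL", "padded_decode"
    ),
    "VLLM_RBLN_USE_MOE_TOKENS_MASK": lambda: parse_bool(
        os.getenv("VLLM_RBLN_USE_MOE_TOKENS_MASK", "True")
    ),
}

def get_env(name: str):
    """Resolve an env var lazily, falling back to its default."""
    return environment_variables[name]()
```

Evaluating lazily (via lambdas) matches how vLLM resolves its `envs` module attributes, so changes to the process environment are picked up at read time rather than import time.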

* fix test_rbln_envs.py

- VLLM_RBLN_DP_IMPL should be padded_decode by default

* fix DPMetadata for tokens mask

- remove is_prefills field and related logic from DP metadata
- fix get_tokens_mask() for non-DP case
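The tokens-mask idea above (masking router logits so DP padding tokens are never routed to experts) could look roughly like the sketch below. Names and shapes are illustrative only; the real change operates on tensors inside the custom MoE kernel, and one common masking approach (shown here) is setting padded tokens' logits to negative infinity.

```python
import math

def mask_router_logits(router_logits, tokens_mask):
    """Illustrative sketch: suppress MoE router logits for padded tokens.

    router_logits: [num_tokens][num_experts] nested lists of floats.
    tokens_mask:   [num_tokens] booleans, True for real tokens and
                   False for data-parallel padding tokens.
    Padded tokens get -inf logits so no expert can be selected for them.
    """
    masked = []
    for logits, is_real in zip(router_logits, tokens_mask):
        if is_real:
            masked.append(list(logits))
        else:
            masked.append([-math.inf] * len(logits))
    return masked

# Example: second token is DP padding and gets fully masked.
out = mask_router_logits([[0.1, 0.2], [0.3, 0.4]], [True, False])
```

Plain lists keep the sketch dependency-free; a tensor implementation would do the same with a broadcasted boolean mask.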

---------

Co-authored-by: rebel-jonghewk <142865404+rebel-jonghewk@users.noreply.github.com>
* added torch_triton mode in RBLN_KERNEL_MODE

Signed-off-by: Jinseok Lee <jindol21@rebellions.ai>
* fixed along with triton kernels in rebel_compiler

Signed-off-by: Jinseok Lee <jindol21@rebellions.ai>
Signed-off-by: Jinseok Lee <jindol21@rebellions.ai>
Signed-off-by: Jinseok Lee <jindol21@rebellions.ai>
