Releases: rebellions-sw/vllm-rbln
v0.9.2.post1
What's Changed
- fix(ci): check collaborator logic by @rebel-seinpark in #124
- feat: Support for Sort-free sampling by @rebel-eunji in #107
- fix(CI): fix output sync code in pr dispatch workflow by @rebel-jswoo in #125
- update: Bump version to v0.10.2 (optimum) by @rebel-eunji in #122
- fix(model): Fix pooling models in v0.10.2 by @rebel-eunji in #129
- other: Update the python version to 3.12 by @rebel-eunji in #131
- add(core): Support for Prefix Caching by @rebel-eunji in #85
- fix(model): pre-release bug fixes by @rebel-eunji in #133
- fix(model): fix whisper, qwen3-embedding/reranker, qwen2.5-vl model by @rebel-eunji in #134
- fix: clarify the semantics of attention custom ops by @rebel-jaehwang in #117
- fix(model): fix multi-modal processing and classification model by @rebel-eunji in #135
- update: Modify code to be compatible with v0.10.2 by @rebel-jiwoopark in #119
- fix(core): resolve prefix caching issue during preemption by @rebel-eunji in #136
- fix(other): Disable prefix caching for stability by @rebel-eunji in #137
- fix(other): turn off the prefix caching completely by @rebel-eunji in #138
- fix(other): fix pytest bugs by @rebel-eunji in #139
- fix(version): Release v0.9.2.post1 by @rebel-shshin in #141
- fix(version): update release version to v0.9.2.post1 by @rebel-shshin in #142
New Contributors
- @rebel-shshin made their first contribution in #141
Full Changelog: v0.9.2...v0.9.2.post1
v0.9.2
What's Changed
- feat(core): Enable causal attention kernel by @rebel-jiwoopark in #95
- fix: TP in V1 engine by @rebel-jiwoopark in #101
- other: Update the version of transformers to 4.53.1 by @rebel-eunji in #102
- fix(CI): add outer collaborator checker by @rebel-jswoo in #100
- other: add team members by @rebel-seinpark in #103
- feat(core): support for Multi-LoRA by @rebel-eunji in #48
- refactor: update variable names and logging for better code clarity by @pei0033 in #105
- other: Upgrade the version of optimum-rbln by @rebel-eunji in #109
- other: Script for logprobs validation by @rebel-eunji in #108
- fix(CI): Add cleanup job in dispatch pr ci by @rebel-jswoo in #104
- fix: checking is_collaborator logic in CI by @rebel-seinpark in #110
- Run patches in RblnPlatform.pre_register_and_update by @rebel-jaehwang in #43
- feat: compile lm_head using rbln backend (#96) by @rebel-jiwoopark in #99
- fix(core): timeout error of large models. by @rebel-jiwoopark in #111
- bug-fix: Skip using compile_context when RBLN_COMPILE_MODEL=0 (V0) by @rebel-eunji in #116
- add: initial works for enabling warmup in v1 engine by @huijjj in #84
- other: change PR trigger to dev by @rebel-jonghewk in #123
Full Changelog: v0.9.1...v0.9.2
v0.9.2a1
What's Changed
- refactor: update variable names and logging for better code clarity by @pei0033 in #105
- other: Upgrade the version of optimum-rbln by @rebel-eunji in #109
- other: Script for logprobs validation by @rebel-eunji in #108
- fix(CI): Add cleanup job in dispatch pr ci by @rebel-jswoo in #104
- fix: checking is_collaborator logic in CI by @rebel-seinpark in #110
- Run patches in RblnPlatform.pre_register_and_update by @rebel-jaehwang in #43
- feat: compile lm_head using rbln backend (#96) by @rebel-jiwoopark in #99
- fix(core): timeout error of large models. by @rebel-jiwoopark in #111
- bug-fix: Skip using compile_context when RBLN_COMPILE_MODEL=0 (V0) by @rebel-eunji in #116
- add: initial works for enabling warmup in v1 engine by @huijjj in #84
- other: change PR trigger to dev by @rebel-jonghewk in #123
Full Changelog: v0.9.2a0...v0.9.2a1
v0.9.2a0
What's Changed
- feat(core): Enable causal attention kernel by @rebel-jiwoopark in #95
- fix: TP in V1 engine by @rebel-jiwoopark in #101
- other: Update the version of transformers to 4.53.1 by @rebel-eunji in #102
- fix(CI): add outer collaborator checker by @rebel-jswoo in #100
- other: add team members by @rebel-seinpark in #103
- feat(core): support for Multi-LoRA by @rebel-eunji in #48
Full Changelog: v0.9.1...v0.9.2a0
v0.9.1
What's Changed
- update(dep): Update requirement version for optimum-rbln by @rebel-jonghewk in #62
- update(attn): support head size of 80 for attention. by @rebel-jiwoopark in #69
- Other(ci): add dispatch workflow - trigger pr ci by @rebel-jswoo in #61
- fix(model): Fix the block table in case of Sliding Window Attention by @rebel-eunji in #70
- feat(option): Introduce environment variables for vLLM models and binary caching. by @rebel-jiwoopark in #74
- fix(core): block allocation for torch compile path by @huijjj in #77
- fix: raise MemoryError when available_dram becomes negative by @junstar92 in #80
- other(ci): bugfix in trigger dispatch workflow by @rebel-jswoo in #72
- fix: get_maximum_num_blocks usage in USE_VLLM_MODEL=1 by @huijjj in #82
- fix: make RoPE more compatible with RBLN by @rebel-jaehwang in #83
- fix(core): apply changes to enable vLLM MoE models by @rebel-wonsubkim in #81
- feat: support eager mode by @rebel-jiwoopark in #66
- feat(model): support Qwen2-VL model by @rebel-eunji in #91
- Update Python version by @rebel-hjkim in #16
- other: Update optimum-rbln version in requirements.txt by @rebel-jonghewk in #98
New Contributors
- @rebel-jswoo made their first contribution in #61
- @junstar92 made their first contribution in #80
- @rebel-wonsubkim made their first contribution in #81
- @rebel-hjkim made their first contribution in #16
Full Changelog: v0.8.3...v0.9.1
v0.8.3
What's Changed
- fix(core): fix kv cache block table by @rebel-eunji in #42
- Support for flash attention by @rebel-jindol21 in #7
- update(docs): update CONTRIBUTING and PR Template by @rebel-jiwoopark in #40
- fix(core): sync num_gpu_blocks w/ estimated blocks by @rebel-jonghewk in #49
- fix(core): bug fix for block table logic by @rebel-jonghewk in #53
- fix: use config classes when computing max blocks by @huijjj in #56
- fix(ci): fix latest optimum version in PR CI by @rebel-seinpark in #54
- feat: support unit test using pytest by @rebel-seinpark in #52
- feat(model): update and refactor the sliding window attention by @rebel-eunji in #44
- fix(worker): clamp num available blocks by num required blocks by @rebel-jaehwang in #59
- fix(sampler): Fix sampler graph by @rebel-jongho in #58
- core: migrating to V1 Engine by @rebel-jiwoopark in #51
New Contributors
- @rebel-jindol21 made their first contribution in #7
- @huijjj made their first contribution in #56
- @rebel-jaehwang made their first contribution in #59
Full Changelog: v0.8.2...v0.8.3
v0.8.2
What's Changed
- [core] Update the sampler for model runner. by @rebel-jiwoopark in #2
- fix: cross encoder output by @rebel-sunwook in #3
- Support for Whisper model in vLLM by @rebel-eunji in #6
- fix: sync with optimum-rbln fix by @rebel-seinpark in #15
- feature: update attention layer for V0 engine by @rebel-jiwoopark in #14
- Add Qwen3 models by @rebel-eunji in #11
- Migrate to V1 Engine w/ Optimum-based by @rebel-eunji in #10
- Turn off Prefix caching option by @rebel-eunji in #22
- fix: use original vllm entrypoint for vllm cli by @rebel-jiwoopark in #20
- Fix shape of multi modal data in Gemma3 by @rebel-eunji in #23
- fix: get num_gpu_blocks logic in V1 by @rebel-seinpark in #29
- Fixes: gemma3's block_table scheduling bug due to padding by @rebel-thkim in #28
- feature: extend envs for RBLN environments. by @rebel-jiwoopark in #25
- Add logger and Refactor the V1 codes by @rebel-eunji in #33
- Update README with new logo by @rebel-eunji in #32
- Support for LlavaForConditionalGeneration models by @rebel-eunji in #31
- other: Add warmup runs when running with RBLNSampler() by @rebel-jonghewk in #37
- other: update optimum requirements for v0.8.2 by @rebel-jonghewk in #35
New Contributors
- @rebel-jiwoopark made their first contribution in #2
- @rebel-sunwook made their first contribution in #3
- @rebel-eunji made their first contribution in #6
- @rebel-seinpark made their first contribution in #15
- @rebel-thkim made their first contribution in #28
- @rebel-jonghewk made their first contribution in #37
Full Changelog: v0.8.1...v0.8.2
v0.8.1.post1
What's Changed
- [core] Update the sampler for model runner. by @rebel-jiwoopark in #2
- fix: cross encoder output by @rebel-sunwook in #3
New Contributors
- @rebel-jiwoopark made their first contribution in #2
- @rebel-sunwook made their first contribution in #3
Full Changelog: v0.8.1...v0.8.1.post1
v0.8.1
Full Changelog: https://github.com/rebellions-sw/vllm-rbln/commits/v0.8.1