Release v0.10.0 · RBLN-SW/vllm-rbln

What's Changed

fix: add guard_filter to rbln_sampler by @rebel-eunji in #241
fix environment variables for rbln ccl by @rebel-ykchoi in #242
other(ci): change the runner of PR CI by @rebel-eunji in #245
Revert "fix: add guard_filter to rbln_sampler" by @rebel-eunji in #251
update: Change log level in CI by @rebel-jonghewk in #250
other(ci): log the length of generated tokens when the request is preempted by @rebel-eunji in #248
fix(model): fix whisper model by @rebel-eunji in #247
other(CI): clean up workflow by @rebel-seinpark in #233
update(core): customize blockpool to increase prefix cache hit rate by @rebel-eunji in #205
fix(ci): pkg auto update by @rebel-seinpark in #259
fix(ci): pkg update by @rebel-seinpark in #260
other: add enable_expert_parallel in basic example by @rebel-jiwoopark in #255
Auto-update optimum-rbln to 0.9.5a0 by @rebel-shshin in #261
update(worker): Set NUMA aware CPU affinity and OMP_NUM_THREADS by @rebel-yskim in #232
other(ci): fix bug in sampler and update pytest for sampler by @rebel-eunji in #239
Add dev0.12 event type by @rebel-jaebin in #267
Auto-update optimum-rbln to 0.9.5a1 by @rebel-shshin in #268
update: enable profiler for optimum-rbln based vllm by @rebel-jonghewk in #256
fix: warm up for swa hybrid model by @rebel-jaehwang in #231
other(ci): remove explicit secrets by @rebel-seinpark in #276
Auto-update optimum-rbln to 0.9.5a2 by @rebel-shshin in #275
Update runner name by @rebel-jaebin in #281
other: repo migration (sw) by @rebel-seinpark in #271
Auto-update optimum-rbln to 0.9.5a4 by @rebel-develop in #282
other(test): add unit test for attention backend. by @rebel-jiwoopark in #273
Revert "other(test): add unit test for attention backend." by @rebel-jiwoopark in #285
Auto-update optimum-rbln to 0.9.5a5 by @rebel-develop in #286
update(triton): replace triton with torch_triton by @rebel-jindol21 in #274
fix moe data parallel for v1 engine by @rebel-ykchoi in #252
Auto-update optimum-rbln to 0.9.5a6 by @rebel-develop in #288
Revert "update(triton): replace triton with torch_triton" by @rebel-jindol21 in #289
feat: enable topk_topp_sampler by @rebel-eunji in #284
feat: prefill performance by request_id and exclude warmup requests from performance tracking by @rebel-eunji in #279
feat(triton): support torch_triton by @rebel-jindol21 in #292
fix(model): pad image tokens using an out-of-vocab index by @rebel-eunji in #291
Auto-update optimum-rbln to 0.9.5a7 by @rebel-develop in #296
fix: use rbln_sampler when both top_k and top_p are None by @rebel-eunji in #295
update(triton): fixed along with triton kernels in rebel_compiler by @rebel-jindol21 in #299
fix: apply argmax to greedy by @rebel-eunji in #300
fix(kernel): fix kvcache by @rebel-jindol21 in #301
Auto-update optimum-rbln to 0.9.5a8 by @rebel-develop in #302
fix(triton): alignment enforced by @rebel-jindol21 in #303
fix(attn): remove invalid attention param by @rebel-jiwoopark in #305
sync main-dev by @rebel-eunji in #314
Auto-update optimum-rbln to 0.10.0.post1 by @rebel-develop in #315
Release v0.10.0 by @rebel-eunji in #318

New Contributors

@rebel-yskim made their first contribution in #232
@rebel-develop made their first contribution in #282

Full Changelog: v0.9.4...v0.10.0
