Pull requests: HabanaAI/vllm-fork (forked from vllm-project/vllm)
[deepseek_r1] add scripts for benchmark throughput and serving (#1288, opened May 21, 2025 by yangulei)
Increase the default value of VLLM_MOE_SLICE_LENGTH to 100k (#1287, opened May 21, 2025 by czhu15)
[SW-225565] Enable triangular softmax with merged prefill (#1278, opened May 20, 2025 by kamil-kaczor, Draft)
Fix the incorrect output_tokens for penalty calculation in the sampler when delayed sampling is enabled (#1199, opened May 6, 2025 by ccrhx4)