TomerBN-Nvidia / vllm Public

forked from vllm-project/vllm

Pull requests: TomerBN-Nvidia/vllm

9 Open 10 Closed

Pull requests list

[Bugfix] Fix D2H buffer race in routed-experts async pipeline

#19 opened Apr 28, 2026 by TomerBN-Nvidia Owner

[Bugfix] Drop worker-side stop predicate in routed-experts extractor

#18 opened Apr 28, 2026 by TomerBN-Nvidia Owner

fix: skip w13 swap in FlashInfer CUTLASS path for non-gated MoE

#17 opened Apr 25, 2026 by djmmoss

3 tasks done

[WIP] Rollout routing replay + prefix caching

#16 opened Apr 25, 2026 by pjin-nvidia • Draft

[Core] Add monolithic kernel routing replay and prefix caching sentinel

#12 opened Apr 16, 2026 by TomerBN-Nvidia Owner • Draft

2 of 4 tasks

Implement multi-node vLLM health check

#7 opened Mar 25, 2026 by TomerBN-Nvidia Owner

5 tasks

[draft] Fix for reload_weights API for ModelOpt MXFP8

#4 opened Mar 16, 2026 by guyueh1

5 tasks

Routing replay chat completions + NeMo RL interface

#3 opened Mar 12, 2026 by pjin-nvidia

fix: Use fake mxfp8 quant on intermediate tensor of MoE

#1 opened Dec 1, 2025 by guyueh1

5 tasks
