
[pull] main from NVIDIA:main #361

Merged
pull[bot] merged 7 commits into yingguo-trt:main from NVIDIA:main
Apr 6, 2026
pull[bot] commented Apr 6, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

hnover-nv and others added 7 commits April 6, 2026 09:18
Signed-off-by: Harris Nover <249353502+hnover-nv@users.noreply.github.com>
…ampler (#12358)

Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
…ts (#12724)

Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
Signed-off-by: Govind Ramnarayan <105831528+govind-ramnarayan@users.noreply.github.com>
* Why?

We would like to be able to use a TorchLlmArgs config in
AutoDeploy's own version with minimal changes.

* What?

This commit removes the redefinition of:
- `model_kwargs`: existing usages already treat `None` the same way as
  an empty dict.
- `max_batch_size`: most unit tests set it explicitly; a few configs were
  updated to have the old default.
- `max_beam_width`: instead adds a validator for it.
- `att_backend`: although the defaults in the base class ("TRTLLM") and
  AutoDeploy ("flashinfer") differ, the
  `update_transforms_with_shortcuts` validator in practice reads the
  default from `default.yaml`, which is "flashinfer".
- `sampler`: the executor code already supported both. We just tweak it
  so that the "auto" value corresponds to the now removed default.
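The pattern in the list above (inherit a field rather than redefine it, and enforce any restriction with a validator) can be sketched roughly as follows. This is a minimal illustration using stdlib dataclasses, not the project's actual classes: `BaseArgs`, `AutoDeployArgs`, `resolved_model_kwargs`, the default values, and the rule that the beam width must equal 1 are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class BaseArgs:
    """Hypothetical stand-in for the base config class (not the real TorchLlmArgs)."""

    model_kwargs: Optional[Dict[str, Any]] = None  # illustrative default
    max_beam_width: int = 1  # illustrative default


@dataclass
class AutoDeployArgs(BaseArgs):
    """Hypothetical subclass that inherits the fields instead of redefining them."""

    def __post_init__(self) -> None:
        # Assumption: the validator rejects any beam width other than 1.
        # The commit message only says that a validator was added.
        if self.max_beam_width != 1:
            raise ValueError(
                f"max_beam_width must be 1, got {self.max_beam_width}"
            )

    def resolved_model_kwargs(self) -> Dict[str, Any]:
        # Callers treat None and an empty dict identically.
        return dict(self.model_kwargs or {})
```

With this shape, a config written for the base class loads into the subclass unchanged as long as it passes the validator; only unsupported values are rejected.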

It also removes `cuda_graph_batch_sizes` in favor of
`cuda_graph_config.batch_sizes`, with necessary adjustments to unit
tests and existing configs.
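For existing configs, the adjustment amounts to moving the flat key under `cuda_graph_config`. A rough sketch of that migration on a plain dict is shown here; the helper `migrate_cuda_graph_batch_sizes` is hypothetical and is not part of the repository, and the conflict check is an assumption about what a careful migration would do.

```python
from typing import Any, Dict


def migrate_cuda_graph_batch_sizes(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `config` with the flat key moved under `cuda_graph_config`.

    Hypothetical helper for illustration only.
    """
    migrated = dict(config)
    if "cuda_graph_batch_sizes" not in migrated:
        return migrated

    sizes = migrated.pop("cuda_graph_batch_sizes")
    nested = dict(migrated.get("cuda_graph_config") or {})

    # Assumption: refuse to silently overwrite a different nested value.
    if "batch_sizes" in nested and nested["batch_sizes"] != sizes:
        raise ValueError("conflicting batch sizes in old and new config keys")

    nested["batch_sizes"] = sizes
    migrated["cuda_graph_config"] = nested
    return migrated
```

For example, `{"cuda_graph_batch_sizes": [1, 2, 4]}` becomes `{"cuda_graph_config": {"batch_sizes": [1, 2, 4]}}`, and the input dict is left unmodified.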

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
pull[bot] locked and limited conversation to collaborators Apr 6, 2026
pull[bot] added the ⤵️ pull label Apr 6, 2026
pull[bot] merged commit 2b80f8d into yingguo-trt:main Apr 6, 2026
6 participants