
Conversation

@simone-chen simone-chen commented Jan 27, 2026

Overview:

Set cuda_graph_batch_sizes to a logarithmic series of batch sizes rather than capturing a graph for every batch size with step 1.

Details:

  • Refactored the _eval function in the rule engine to support a wider range of Python expressions (e.g. list comprehensions) for better flexibility. Instead of using jinja's expression compiler, the rules are now evaluated in Python with asteval.
    • Why: The jinja expression compiler supports only a limited subset of Python expressions and cannot express the creation of a logarithmic series (see the sketch after this list). Without this change, the series would have to be defined in every extra_engine_args*.yaml.j2.
      >>> from jinja2 import Environment
      >>> env = Environment()
      >>> expr = env.compile_expression("[x * 2 for x in numbers]")
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "/home/simonec/virtual_envs/venv_aic/lib/python3.12/site-packages/jinja2/environment.py", line 812, in compile_expression
          self.handle_exception(source=source)
        File "/home/simonec/virtual_envs/venv_aic/lib/python3.12/site-packages/jinja2/environment.py", line 942, in handle_exception
          raise rewrite_traceback_stack(source=source)
        File "<unknown>", line 1, in template
      jinja2.exceptions.TemplateSyntaxError: expected token ',', got 'for'
    
    • Intention: All computation logic should be centralized in .rule files instead of being scattered across jinja templates, ensuring consistency and maintainability.
  • Set cuda_graph_batch_sizes to a logarithmic series of batch sizes rather than capturing all batch sizes with step 1. This avoids excessive resource consumption at runtime.
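For contrast with the jinja failure above, here is a minimal interactive sketch of the same expression evaluating cleanly under asteval (assumes the asteval package is installed; the numbers symbol is a stand-in for illustration, not part of the actual rules):

      >>> from asteval import Interpreter
      >>> aeval = Interpreter()
      >>> aeval.symtable["numbers"] = [1, 2, 3]  # bind a sample variable into the evaluator's symbol table
      >>> aeval("[x * 2 for x in numbers]")      # the list comprehension jinja rejects
      [2, 4, 6]

asteval evaluates against its own restricted symbol table rather than the full Python environment, which fits the rule-engine use case better than a bare eval().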

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

copy-pr-bot bot commented Jan 27, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions github-actions bot added the fix label Jan 27, 2026
…on syntax support

Signed-off-by: Simone Chen <simonec@nvidia.com>
@simone-chen simone-chen force-pushed the simonec/trtc-199-aic-generator-fix-cuda-graph-batch-sizes branch from e8cbb5e to 740ad71 on January 27, 2026 at 20:07
agg enable_mixed_chunk = true

- agg_prefill_decode cuda_graph_batch_sizes = ((range(1, max_batch_size + 1) | list) if max_batch_size else [])
+ agg_prefill_decode cuda_graph_batch_sizes = ([x for x in [1,2,4,8,16,32,64,128,256,512,1024] if x < max_batch_size] + [max_batch_size] if max_batch_size else [])
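As a quick sanity check, the new expression can be evaluated by hand (max_batch_size = 48 is a hypothetical value chosen for illustration):

      >>> max_batch_size = 48
      >>> [x for x in [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024] if x < max_batch_size] + [max_batch_size] if max_batch_size else []
      [1, 2, 4, 8, 16, 32, 48]

With this input the engine would capture 7 graphs instead of the 48 produced by the old range(1, max_batch_size + 1) rule.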
A reviewer (Contributor) commented:
This part is still pending discussion in PR #245.
We need a meeting to finalize this fix.
