Adds a heuristic upper bound in the case of unbounded symints #4083

Open
narendasan wants to merge 2 commits into main from narendasan/push-tvzqrmnlsyzn
@narendasan
Collaborator

Description

Repeated calls to torch.compile without explicitly marking dynamic dimensions may produce unbounded symints (no max value). This PR adds a heuristic upper bound for that case and emits a warning, so that users are not blocked.

We now assume an upper bound of 2**16 for any single dimension, so that tensor volume calculations done in int32 are less likely to overflow.
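
A minimal sketch of the idea, not the code in this PR: the helper names (`resolve_dim_range`, `fits_in_int32`) are hypothetical, and an unbounded symint is modeled here as `max_val=None`, which is an assumption about representation. It shows the 2**16 cap being applied with a warning, and why the cap makes int32 overflow less likely rather than impossible.

```python
import math
import warnings
from typing import Optional, Sequence, Tuple

DEFAULT_MAX_DIM = 2**16   # heuristic cap stated in the description
INT32_MAX = 2**31 - 1     # largest value representable in a signed int32


def resolve_dim_range(min_val: int, max_val: Optional[int]) -> Tuple[int, int]:
    """Return (min, max) for a dimension, capping an unbounded max.

    Hypothetical helper: `max_val=None` stands in for "no max value".
    """
    if max_val is None:
        # Never pick a max below the known min.
        max_val = max(min_val, DEFAULT_MAX_DIM)
        warnings.warn(
            f"Unbounded symint: assuming an upper bound of {max_val}. "
            "Mark the dimension as dynamic with an explicit max to override."
        )
    return min_val, max_val


def fits_in_int32(shape: Sequence[int]) -> bool:
    """Check whether the tensor volume (product of dims) fits in int32."""
    return math.prod(shape) <= INT32_MAX


lo, hi = resolve_dim_range(1, None)   # unbounded -> (1, 65536)
print(fits_in_int32((hi, 128)))       # True: 2**16 * 2**7 = 2**23
print(fits_in_int32((hi, hi)))        # False: 2**16 * 2**16 = 2**32
```

An explicitly bounded dimension passes through unchanged, for example `resolve_dim_range(1, 32)` returns `(1, 32)`.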

Added test cases and integrated the heuristic into partitioning to generate intermediate shape ranges.

Supersedes: #4080 cc: @wenbingl

Type of change

Please delete options that are not relevant and/or add your own.

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that relevant reviewers are notified

…ork from back to front to prevent type mismatch

When there are multiple SymInt placeholders, once the first one is replaced with `sym_size`, subsequent SymInts may receive the wrong tensor argument. The cause is in the `remove_sym_nodes` function: popping from `sample_inputs` by index modifies the list while iterating over it, so every later index is shifted.

Popping in reverse order avoids this index misalignment.
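
The misalignment can be reproduced on a plain list. This is an illustrative sketch only: `pop_forward` and `pop_reverse` are hypothetical helpers, and the strings stand in for the real entries of `sample_inputs`; it is not the body of `remove_sym_nodes`.

```python
from typing import List, Sequence


def pop_forward(items: Sequence[str], indices: Sequence[int]) -> List[str]:
    """Buggy: each pop shifts later elements left, so later indices go stale."""
    out = list(items)
    for i in indices:
        out.pop(i)
    return out


def pop_reverse(items: Sequence[str], indices: Sequence[int]) -> List[str]:
    """Correct: popping the highest index first leaves lower indices valid."""
    out = list(items)
    for i in sorted(indices, reverse=True):
        out.pop(i)
    return out


# Tensors x, y, z interleaved with two SymInt placeholders s0 and s1.
sample_inputs = ["x", "s0", "y", "s1", "z"]
sym_indices = [1, 3]  # positions of the SymInt placeholders

print(pop_forward(sample_inputs, sym_indices))  # ['x', 'y', 's1']  (z lost, s1 kept)
print(pop_reverse(sample_inputs, sym_indices))  # ['x', 'y', 'z']
```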
@narendasan narendasan requested a review from apbose February 17, 2026 23:34
@meta-cla meta-cla bot added the cla signed label Feb 17, 2026
@github-actions github-actions bot added component: tests Issues re: Tests component: lowering Issues re: The lowering / preprocessing passes component: core Issues re: The core compiler component: api [Python] Issues re: Python API component: dynamo Issues relating to the `torch.compile` or `torch._dynamo.export` paths component: torch_compile labels Feb 17, 2026
@github-actions github-actions bot requested a review from zewenli98 February 17, 2026 23:34
@github-actions github-actions bot left a comment

There are some changes that do not conform to Python style guidelines:

--- /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/lowering/passes/pass_manager.py	2026-02-17 23:34:29.103514+00:00
+++ /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/lowering/passes/pass_manager.py	2026-02-17 23:35:05.140759+00:00
@@ -123,13 +123,13 @@
        self._validated = False

    def check_pass_names_valid(self, debug_pass_names: List[str]) -> None:
        pass_names_str = [p.__name__ for p in self.passes]
        for name in debug_pass_names:
-            assert name in pass_names_str, (
-                f"{name} is not a valid pass! Passes: {pass_names_str}"
-            )
+            assert (
+                name in pass_names_str
+            ), f"{name} is not a valid pass! Passes: {pass_names_str}"

    def __call__(self, gm: Any, settings: CompilationSettings) -> Any:
        self.validate()
        out = gm
        for _pass in self.passes:
--- /home/runner/work/TensorRT/TensorRT/tests/py/dynamo/models/test_unbounded_symint.py	2026-02-17 23:34:29.129089+00:00
+++ /home/runner/work/TensorRT/TensorRT/tests/py/dynamo/models/test_unbounded_symint.py	2026-02-17 23:35:11.671782+00:00
@@ -27,20 +27,19 @@
    class SimpleModel(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.linear1 = torch.nn.Linear(128, 64)

-
        def forward(self, x):
            return F.relu(self.linear1(x))

    model = SimpleModel().eval().cuda()

    # Create input with unbounded batch dimension
    input_tensor = torch.randn(4, 128).cuda()
    # Mark dimension 0 as dynamic with only min (no max = unbounded)
-    #torch._dynamo.mark_dynamic(input_tensor, 0, min=1)
+    # torch._dynamo.mark_dynamic(input_tensor, 0, min=1)

    compile_spec = {
        "device": torchtrt.Device("cuda:0"),
        "enabled_precisions": {torch.float},
        "min_block_size": 1,
@@ -246,17 +245,15 @@

    input_tensor = torch.randn(8, 16, 16).cuda()
    output_ref = model(input_tensor)
    output_trt = trt_model(input_tensor)

-
    cos_sim = cosine_similarity(output_ref, output_trt)
    assertions.assertTrue(
        cos_sim > COSINE_THRESHOLD,
        msg=f"Reshape with unbounded SymInt test failed. Cosine sim: {cos_sim}",
    )
-

    # Verify output shapes match
    assertions.assertEqual(output_ref.shape, output_trt.shape)

    torch._dynamo.reset()
@@ -360,10 +357,11 @@
            msg=f"Reasonable default test failed at batch_size={batch_size}. Cosine sim: {cos_sim}",
        )

    torch._dynamo.reset()

+
@pytest.mark.unit
def test_unbounded_symint_fallback():
    """
    Test that the default max (min * 128) is applied for unbounded SymInts.
    This test verifies the fallback behavior when no max is specified.
@@ -374,11 +372,10 @@
            super().__init__()
            self.linear1 = torch.nn.Linear(64, 128)
            self.linear2 = torch.nn.Linear(128, 64)
            self.linear3 = torch.nn.Linear(64, 32)
            self.relu = torch.nn.ReLU()
-

        def forward(self, x):
            x = self.relu(self.linear1(x))
            x = self.relu(self.linear2(x))
            x = self.linear3(x)
@@ -412,8 +409,7 @@
        )

    torch._dynamo.reset()


-
if __name__ == "__main__":
    pytest.main([__file__, "-v"])

@narendasan narendasan force-pushed the narendasan/push-tvzqrmnlsyzn branch from f2afd1d to 274d970 Compare February 18, 2026 06:17