Tracing a model introduces extra aten::Int ops
#1185
Replies: 4 comments 1 reply
-
@narendasan @peri044 When you have time, please take a look. Thanks.
-
watching
-
In my experience …
-
I found a blog post about the tracing process. It mentioned that expressions like …
-
Background: when I tried to convert the PyTorch Swin-Transformer model to TorchScript, `torch.jit.script` did not work because some ops in the model are not supported by scripting, so I had to trace the model instead. During tracing, however, extra `aten::Int` ops are introduced into the graph, and in some cases Torch-TensorRT does not convert them well. For Swin-Transformer I had to change the source code to keep tracing from introducing the extra `aten::Int` ops. I wonder whether there is a way to prevent tracing from inserting `aten::Int` into the TorchScript graph. An example of the case is below: the code snippet in question is widely used in Swin-Transformer. When I traced the model and fully compiled it with Torch-TensorRT, compilation failed; when I scripted the model and compiled it, it worked.

The script log is here: script_log.txt
The trace log is here: trace_full_compile.txt

In the scripted graph, `self.window_size` is an int value:
`%self.window_size : int = prim::Constant[value=7]()`
while in the traced graph, `self.window_size` is a Tensor value:
`%16 : Long(requires_grad=0, device=cpu) = prim::Constant[value={7}]()`
In my opinion, this is what introduces the `aten::Int` ops into the graph and causes the traced compilation to fail. So I wonder if there is a way to avoid recording `self.window_size` as a Tensor value during tracing; I think that would eliminate a lot of `aten::Int` ops in cases like this.
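As a minimal stand-in (an assumption modeled on the window-partition pattern used in Swin-Transformer, not the author's attached repro), the following sketch shows where tracing tends to record sizes as tensors and insert `aten::Int`, while scripting keeps them as ints:

```python
# Minimal sketch, assuming a simplified window-partition module; names and
# shapes here are illustrative, not the repro attached to this discussion.
import torch
import torch.nn as nn

class WindowPartition(nn.Module):
    def __init__(self, window_size: int = 7):
        super().__init__()
        self.window_size = window_size  # plain Python int attribute

    def forward(self, x):
        # x: (B, H, W, C)
        B, H, W, C = x.shape[0], x.shape[1], x.shape[2], x.shape[3]
        # Mixing traced sizes with self.window_size is where the tracer tends
        # to record window_size as a Long tensor constant and add aten::Int.
        x = x.view(B, H // self.window_size, self.window_size,
                   W // self.window_size, self.window_size, C)
        return x.permute(0, 1, 3, 2, 4, 5).contiguous().view(
            -1, self.window_size, self.window_size, C)

m = WindowPartition(7).eval()
example = torch.randn(1, 56, 56, 96)

traced = torch.jit.trace(m, example)
scripted = torch.jit.script(m)

# Compare the graphs: the traced one typically shows aten::size ->
# prim::NumToTensor -> tensor arithmetic -> aten::Int before the view call,
# while the scripted one keeps window_size as `int = prim::Constant[value=7]()`.
print(traced.graph)
print(scripted.graph)
```

Printing and diffing the two graphs is a quick way to spot the `Long(...)` constant for `window_size` and the `aten::Int` conversions that the trace log above describes.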
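As for the source-code change mentioned above, one possible workaround (a sketch under the assumption that the shape arithmetic can be factored into a standalone helper; this is not necessarily the change the author made) is to script just that helper so its sizes stay ints even while the rest of the model is traced, for example with `torch.jit.script_if_tracing`:

```python
# Hedged sketch of one possible workaround; window_partition here is a
# hypothetical stand-in for the Swin-Transformer helper, not the author's fix.
import torch
import torch.nn as nn

@torch.jit.script_if_tracing
def window_partition(x: torch.Tensor, window_size: int) -> torch.Tensor:
    # Inside a scripted function the sizes and window_size stay ints, so this
    # arithmetic should not need the tensor round-trip the tracer records.
    B, H, W, C = x.shape[0], x.shape[1], x.shape[2], x.shape[3]
    x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
    return x.permute(0, 1, 3, 2, 4, 5).contiguous().view(-1, window_size, window_size, C)

class Block(nn.Module):
    def __init__(self, window_size: int = 7):
        super().__init__()
        self.window_size = window_size

    def forward(self, x):
        return window_partition(x, self.window_size)

traced = torch.jit.trace(Block(7).eval(), torch.randn(1, 56, 56, 96))
# The shape arithmetic now comes from the scripted helper instead of being
# traced op-by-op.
print(traced.graph)
```

Whether Torch-TensorRT then converts the resulting graph cleanly still depends on the converters involved, but the traced graph should no longer need to round-trip `window_size` through a Long tensor constant.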