Fix tensor lifetime issue by SandSnip3r · Pull Request #4228 · pytorch/TensorRT

SandSnip3r · 2026-05-01T00:19:43Z

Description

This change fixes a correctness issue that I and others saw when running the FLUX2 diffusion model: when compiled with either TensorRT or TensorRT-RTX, the model produced garbage images.

The root cause was an incorrect input tensor lifetime: the input tensor's ref count dropped to 0 before the engine ran with enqueueV3(). This case was a bit of a perfect storm, because an output had the same size and shape and there was also an fp32->bf16 cast. When another tensor (the output tensor) was then allocated, it was given the address of the input tensor.
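As a rough illustration of the failure mode described above, here is a toy model that uses only the C++ standard library. It is not the PyTorch caching allocator or the Torch-TensorRT runtime; `ToyCachingAllocator` and `output_aliases_input` are hypothetical names made up for this sketch. It assumes an allocator that hands a freed block back out on the next request of the same size, which is what lets an output end up at the address that was bound as the input.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Toy stand-in for a caching allocator (hypothetical, for illustration only):
// released blocks go into a per-size free list and are reused on the next
// request of the same size.
class ToyCachingAllocator {
 public:
  std::shared_ptr<std::vector<float>> allocate(std::size_t n) {
    auto& free_list = free_blocks_[n];
    std::vector<float>* raw = nullptr;
    if (!free_list.empty()) {
      raw = free_list.back().release();  // reuse a cached block
      free_list.pop_back();
    } else {
      raw = new std::vector<float>(n);   // no cached block: fresh allocation
    }
    // When the last handle goes away, return the block to the cache
    // instead of freeing it.
    return std::shared_ptr<std::vector<float>>(
        raw, [this, n](std::vector<float>* p) { free_blocks_[n].emplace_back(p); });
  }

 private:
  std::map<std::size_t, std::vector<std::unique_ptr<std::vector<float>>>> free_blocks_;
};

// Returns true if the output buffer lands on the address that was bound as
// the engine input, i.e. the input would be clobbered before the engine runs.
bool output_aliases_input(bool keep_input_alive) {
  ToyCachingAllocator alloc;
  const float* bound_input_address = nullptr;
  std::shared_ptr<std::vector<float>> keep_alive;
  {
    auto input = alloc.allocate(1024);    // e.g. a temporary produced by a cast
    bound_input_address = input->data();  // only the raw address is recorded
    if (keep_input_alive) {
      keep_alive = input;                 // the fix: extend the lifetime
    }
  }  // without keep_alive, the ref count drops to 0 here
  auto output = alloc.allocate(1024);     // same size as the input
  return output->data() == bound_input_address;
}
```

In this model, `output_aliases_input(false)` reports aliasing because the released input block is reused for the output, while `output_aliases_input(true)` does not because the input handle is still held when the output is allocated.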

Type of change

Bug fix (non-breaking change which fixes an issue)

Checklist:

My code follows the style guidelines of this project (You can use the linters)
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas and hacks
I have made corresponding changes to the documentation
I have added tests to verify my fix or my feature
New and existing unit tests pass locally with my changes
I have added the relevant labels to my PR so that relevant reviewers are notified

SandSnip3r · 2026-05-01T00:34:53Z


    auto dims = core::util::toVec(out_shape);
    auto type = util::TRTDataTypeToScalarType(compiled_engine->exec_ctx->getEngine().getTensorDataType(name.c_str()));
    outputs[pyt_idx] = std::move(at::empty(dims, {at::kCUDA}).to(type).contiguous());


By the way, a separate cleanup should replace this line with:

outputs[pyt_idx] = at::empty(dims, at::TensorOptions().device(at::kCUDA).dtype(type));

This would reduce the work from two allocations and a dtype-conversion kernel to a single allocation.
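To make the allocation count concrete, here is a small counting model in plain C++. It is only a sketch: `ToyTensor`, `Stats`, `allocate_then_cast`, and `allocate_directly` are hypothetical names, and nothing here calls ATen. It assumes that creating a tensor without an explicit dtype yields float32, and that converting to a different dtype allocates a new buffer and launches one conversion kernel, while converting to the same dtype is a no-op.

```cpp
#include <cassert>
#include <cstddef>

enum class DType { Float32, BFloat16 };

// Counters for the work each allocation strategy performs.
struct Stats {
  int allocations = 0;
  int cast_kernels = 0;
};

// Hypothetical stand-in for a tensor; it only tracks dtype and bookkeeping.
struct ToyTensor {
  DType dtype;
  std::size_t numel;
  Stats* stats;

  static ToyTensor empty(std::size_t numel, DType dtype, Stats* stats) {
    ++stats->allocations;  // one buffer allocated
    return ToyTensor{dtype, numel, stats};
  }

  // Assumption: a same-dtype conversion returns the tensor unchanged;
  // otherwise a new buffer is allocated and a cast kernel runs.
  ToyTensor to(DType target) const {
    if (target == dtype) {
      return *this;
    }
    ++stats->allocations;
    ++stats->cast_kernels;
    return ToyTensor{target, numel, stats};
  }
};

// Models the current line: allocate as float32, then convert.
Stats allocate_then_cast(DType engine_dtype) {
  Stats s;
  ToyTensor out = ToyTensor::empty(1024, DType::Float32, &s).to(engine_dtype);
  (void)out;
  return s;
}

// Models the proposed line: allocate with the engine's dtype up front.
Stats allocate_directly(DType engine_dtype) {
  Stats s;
  ToyTensor out = ToyTensor::empty(1024, engine_dtype, &s);
  (void)out;
  return s;
}
```

Under these assumptions, a bf16 output costs two allocations and one cast kernel with the first strategy and a single allocation with the second; for a float32 output both strategies allocate once.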


narendasan

This looks good to me

narendasan · 2026-05-01T15:45:33Z


    auto dims = core::util::toVec(out_shape);
    auto type = util::TRTDataTypeToScalarType(compiled_engine->exec_ctx->getEngine().getTensorDataType(name.c_str()));
    outputs[pyt_idx] = std::move(at::empty(dims, {at::kCUDA}).to(type).contiguous());


I think this is the same line Shane identified as well.

meta-cla Bot added the cla signed label May 1, 2026

github-actions Bot added the component: tests, component: core, and component: runtime labels May 1, 2026

github-actions Bot requested a review from narendasan May 1, 2026 00:20

SandSnip3r commented May 1, 2026

narendasan approved these changes May 1, 2026

SandSnip3r force-pushed the fix-runtime-buffer-lifetime branch from 3b6cdb3 to 66c5a42 Compare May 4, 2026 18:39

Fix tensor lifetime issue

61b3003

SandSnip3r force-pushed the fix-runtime-buffer-lifetime branch from 66c5a42 to 61b3003 Compare May 5, 2026 20:25

SandSnip3r wants to merge 1 commit into pytorch:main from SandSnip3r:fix-runtime-buffer-lifetime
