Adding the stream.async.cast op to fix potential async correctness issue. #22803
Merged
AWoloszyn approved these changes on Dec 5, 2025
HAL ABI import/export lowering sometimes needs to assert that a resource must be usable with a different lifetime without scheduling a copy. Modeling that assertion as stream.async.transfer is too strong: a transfer is real async work, so an external fence chained from the original timeline can signal before the transfer executes and let a caller observe a returned tensor too early. Add stream.async.cast as a tied async-phase passthrough that carries lifetime constraints through ResourceUsageAnalysis. RefineUsage folds the cast when the source can be refined to the requested lifetime and lowers it to stream.async.transfer only when concrete lifetimes cannot match. This keeps lifetime-only ABI transitions from introducing accidental timeline work while preserving real copies where they are required.
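The fold-or-lower rule described in the commit message can be pictured with a small toy model. This is only a sketch under stated assumptions, not IREE's RefineUsage implementation: the `Lifetime` enum, the `resolve_cast` function, and the rule that only an unknown or already-matching source lifetime counts as "refinable" are all hypothetical simplifications made for illustration.

```python
from enum import Enum


class Lifetime(Enum):
    """Hypothetical, simplified set of resource lifetimes for this sketch."""
    UNKNOWN = "*"
    EXTERNAL = "external"
    TRANSIENT = "transient"
    CONSTANT = "constant"


def resolve_cast(source: Lifetime, target: Lifetime) -> str:
    """Decide what happens to a lifetime cast in this toy model.

    Assumption: a source is "refinable" to the target when its lifetime is
    still unknown or already equals the target. The real analysis considers
    far more than this.
    """
    if source is Lifetime.UNKNOWN or source is target:
        # Lifetime-only transition: the cast disappears, no work is scheduled.
        return "fold"
    # Two concrete, mismatched lifetimes: a real copy is required.
    return "lower to stream.async.transfer"


if __name__ == "__main__":
    for src in (Lifetime.UNKNOWN, Lifetime.EXTERNAL, Lifetime.TRANSIENT):
        print(src.value, "->", Lifetime.EXTERNAL.value, ":",
              resolve_cast(src, Lifetime.EXTERNAL))
```

The point of the model is that only the last branch introduces timeline work; every lifetime-only transition resolves to nothing at all.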
We had a potential correctness issue when using chained external fences and returning values (vs. writing to output arguments): we would insert a `stream.async.transfer` as effectively just a cast to external lifetime for returned tensors. The problem is that the `stream.timepoint.barrier` feeding the chain_external op came before the transfer, so a user who waited on the fence and then consumed the returned value could be consuming it before the transfer had executed. We're mostly saved today because most usage goes through the synchronous ABI or has torch placing results into outputs, and because most transfers are elided, but none of that was guaranteed.

The new `stream.async.cast` op just does lifetime assertions and pins values in usage refinement. This allows us to import/export and cast without any potential for copies to arise. Future changes will use this op in a timeline verification pass that checks that resources produced by every StreamableOp are consumed using an appropriate timeline.
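The ordering hazard can be illustrated with a toy check. This is a sketch only: it uses plain list order as a stand-in for timeline dependencies (the real IR tracks these through timepoint values), and the `work_after_barrier` helper and the two op sequences are hypothetical examples, not output from the compiler.

```python
from typing import List, Tuple

# (op name, whether the op schedules real async work on the timeline)
Op = Tuple[str, bool]

BARRIER = "stream.timepoint.barrier"


def work_after_barrier(ops: List[Op], barrier: str = BARRIER) -> List[str]:
    """Return names of ops that schedule async work after the barrier.

    Anything returned here is work that a fence chained from the barrier
    does not cover, so a caller waiting on that fence could observe the
    result before the work has executed.
    """
    names = [name for name, _ in ops]
    idx = names.index(barrier)
    return [name for name, does_work in ops[idx + 1:] if does_work]


# Hypothetical sequence before this change: the lifetime change is a transfer.
with_transfer: List[Op] = [
    ("stream.async.dispatch", True),
    (BARRIER, False),
    ("stream.async.transfer", True),
]

# Hypothetical sequence after this change: the lifetime change is a cast.
with_cast: List[Op] = [
    ("stream.async.dispatch", True),
    (BARRIER, False),
    ("stream.async.cast", False),
]

if __name__ == "__main__":
    print("uncovered with transfer:", work_after_barrier(with_transfer))
    print("uncovered with cast:", work_after_barrier(with_cast))
```

In this model the cast never appears in the uncovered list because it schedules nothing, which is the property the PR relies on.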