
Adding the stream.async.cast op to fix potential async correctness issue. #22803

Merged
benvanik merged 1 commit into main from users/benvanik/timeline-safety on Apr 28, 2026


@benvanik
Collaborator

@benvanik benvanik commented Dec 3, 2025

We had a potential correctness issue when using chained external fences and returning values (vs writing to output arguments): for returned tensors we'd insert a stream.async.transfer that was effectively just a cast to external lifetime. The problem is that the stream.timepoint.barrier feeding the chain_external op came before the transfer, so a user who waited on the fence and then consumed the returned value could be consuming it before the transfer had executed. We're mostly saved today because most usage goes through the synchronous ABI or has torch placing results into outputs, and because most transfers are elided, but none of that was guaranteed.
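To make the ordering hazard concrete, here is a toy Python model of the two schedules. It is only a sketch: the op names are plain strings chosen for illustration, `ops_not_covered_by_fence` is a hypothetical helper, and nothing here is an IREE API or reflects how the compiler represents timelines.

```python
# Toy model only: a schedule is an ordered list of op names on one timeline.
# All names here are illustrative assumptions, not IREE APIs.

def ops_not_covered_by_fence(schedule, fence):
    """Return the ops scheduled after `fence`.

    The fence signaling only says that ops before it have completed, so a
    caller waiting on it learns nothing about the ops returned here.
    """
    fence_index = schedule.index(fence)
    return schedule[fence_index + 1:]


# Before: the barrier feeding chain_external sits ahead of the transfer
# that was standing in for a lifetime cast.
old_schedule = ["dispatch", "barrier_and_chain_external", "transfer_to_external"]

# After: a lifetime-only cast schedules no async work, so nothing trails the fence.
new_schedule = ["dispatch", "barrier_and_chain_external"]

old_uncovered = ops_not_covered_by_fence(old_schedule, "barrier_and_chain_external")
new_uncovered = ops_not_covered_by_fence(new_schedule, "barrier_and_chain_external")

print("old:", old_uncovered)
print("new:", new_uncovered)
```

In the old schedule the transfer lands after the fence, which is exactly the window in which a caller could read the returned value too early.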

The new stream.async.cast op just does lifetime assertions and pins values in usage refinement. This allows us to import/export and cast without any potential for copies to arise. Future changes will use this op in a timeline verification pass that checks that resources produced by every StreamableOp are consumed using an appropriate timeline.

@benvanik benvanik requested a review from AWoloszyn December 3, 2025 01:50
@benvanik benvanik added the compiler/dialects Relating to the IREE compiler dialects (flow, hal, vm) label Dec 3, 2025
@benvanik benvanik force-pushed the users/benvanik/timeline-safety branch from e0a17de to 65c92dc Compare December 3, 2025 04:29
@benvanik benvanik changed the title Adding stream.resource.cast op to fix potential async correctness issue. Adding the stream.async.cast op to fix potential async correctness issue. Dec 3, 2025
@benvanik benvanik marked this pull request as ready for review December 3, 2025 05:45
@benvanik benvanik force-pushed the users/benvanik/timeline-safety branch from 65c92dc to 0e0e204 Compare April 28, 2026 05:52
HAL ABI import/export lowering sometimes needs to assert that a resource must be usable with a different lifetime without scheduling a copy. Modeling that assertion as stream.async.transfer is too strong: a transfer is real async work, so an external fence chained from the original timeline can signal before the transfer executes and let a caller observe a returned tensor too early.

Add stream.async.cast as a tied async-phase passthrough that carries lifetime constraints through ResourceUsageAnalysis. RefineUsage folds the cast when the source can be refined to the requested lifetime and lowers it to stream.async.transfer only when concrete lifetimes cannot match. This keeps lifetime-only ABI transitions from introducing accidental timeline work while preserving real copies where they are required.
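The fold-or-lower decision described above can be sketched in a few lines of Python. This is a hedged illustration, not the RefineUsage implementation: `Cast`, `resolve_cast`, and the use of `None` for a not-yet-refined lifetime are all assumptions made for the example, and lifetimes are modeled as plain strings.

```python
# Toy model only: Cast and resolve_cast are hypothetical names, not IREE code.
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Cast:
    source_lifetime: Optional[str]  # None models a lifetime that is still refinable
    target_lifetime: str            # the lifetime the cast asserts


def resolve_cast(cast: Cast) -> Tuple[str, str]:
    """Decide what happens to a cast in this simplified model.

    - If the source is still refinable, or already has the requested
      lifetime, the cast folds away and no async work is introduced.
    - If both lifetimes are concrete and differ, a real copy is needed,
      so the cast becomes a transfer.
    """
    if cast.source_lifetime is None or cast.source_lifetime == cast.target_lifetime:
        return ("fold", cast.target_lifetime)
    return ("transfer", cast.target_lifetime)


refinable = resolve_cast(Cast(None, "external"))
matching = resolve_cast(Cast("external", "external"))
mismatched = resolve_cast(Cast("transient", "external"))

print(refinable, matching, mismatched)
```

Only the last case introduces work on the timeline, which matches the intent stated above: lifetime-only transitions stay free, and copies appear only where concrete lifetimes cannot match.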
@benvanik benvanik force-pushed the users/benvanik/timeline-safety branch from 0e0e204 to 4b29940 Compare April 28, 2026 05:54
@benvanik benvanik enabled auto-merge (squash) April 28, 2026 05:55
@benvanik benvanik merged commit a7e476d into main Apr 28, 2026
62 of 65 checks passed
@benvanik benvanik deleted the users/benvanik/timeline-safety branch April 28, 2026 06:37
