fix: prevent panic in memory_cleanup on CUDA after linear forward #3932
After a linear layer forward call in autodiff mode, executing AutodiffBackend::memory_cleanup on CUDA panicked with 'The size should match'. The panic occurred because memory_cleanup ran before pending asynchronous operations had completed, leaving size mismatches in CubeCL's memory management.
The fix adds a sync() call before memory_cleanup in CubeBackend, ensuring all pending operations complete before cleanup runs. This resolves the panic on CUDA while having no effect on synchronous backends such as ndarray.
Added a test in burn-cubecl to verify memory_cleanup works after linear operations.
Pull Request Template

Checklist
  - cargo run-checks command has been executed.

Related Issues/PRs
  - Fixes #3927.

Changes
  - Synchronize pending asynchronous operations (sync()) before memory_cleanup in CubeBackend, as described above.

Testing
  - Added a test in burn-cubecl covering memory_cleanup after a linear forward pass in autodiff mode.