Multi forward MCH eviction fix #2836

Closed

aliafzal wants to merge 1 commit into pytorch:main from aliafzal:export-D71491003

Contributor

aliafzal commented Mar 19, 2025

Summary:

Issue:

Modifying embedding tensors directly during training with multiple forward passes invalidates tensors that PyTorch's autograd saved for the backward pass, causing the runtime error "one of the variables needed for gradient computation has been modified by an inplace operation".

Solution:

Apply the in-place updates through the `.data` accessor, so that evicted embeddings are reinitialized without invalidating gradient computation.
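A minimal sketch of the failure mode and the workaround, on a toy parameter rather than the actual TorchRec eviction code. The function names (`two_forward_passes`, `reinit_tracked`, `reinit_via_data`) and the squared-weight loss are assumptions chosen for illustration; the loss is picked only because its backward pass needs the saved weight.

```python
import torch


def two_forward_passes(weight, reinit):
    """Forward, reinitialize row 0, forward again, then one backward."""
    loss1 = (weight ** 2).sum()  # pow saves `weight` for its backward pass
    reinit(weight, torch.tensor([0]))  # hypothetical "eviction" of row 0
    loss2 = (weight ** 2).sum()
    (loss1 + loss2).backward()


def reinit_tracked(weight, rows):
    # In-place write on the parameter itself. Even under no_grad this bumps
    # the tensor's version counter, so the copy saved by the first forward
    # pass is detected as modified when backward runs.
    with torch.no_grad():
        weight[rows] = 0.0


def reinit_via_data(weight, rows):
    # Write through `.data`: same storage, but the version counter that
    # autograd checks on `weight` is left unchanged.
    weight.data[rows] = 0.0


if __name__ == "__main__":
    w = torch.nn.Parameter(torch.ones(4, 3))
    try:
        two_forward_passes(w, reinit_tracked)
    except RuntimeError as err:
        print("tracked write failed:", err)

    w = torch.nn.Parameter(torch.ones(4, 3))
    two_forward_passes(w, reinit_via_data)
    print("write through .data succeeded; grad shape:", tuple(w.grad.shape))
```

Note the trade-off: because autograd no longer sees the write, gradients for the reinitialized rows are computed from the new values without any warning. That is likely acceptable for rows that were just evicted, but it is worth keeping in mind when reusing the pattern elsewhere.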

Differential Revision: D71491003

facebook-github-bot added the CLA Signed label

Contributor

facebook-github-bot commented Mar 19, 2025

This pull request was exported from Phabricator. Differential Revision: D71491003

facebook-github-bot added the fb-exported label

aliafzal force-pushed the export-D71491003 branch from 29ea307 to 141e513 Compare

March 20, 2025 18:03

aliafzal added a commit to aliafzal/torchrec that referenced this pull request


          Multi forward MCH eviction fix (pytorch#2836)

141e513

aliafzal added a commit to aliafzal/torchrec that referenced this pull request


          Multi forward MCH eviction fix (pytorch#2836)

4a19949

aliafzal force-pushed the export-D71491003 branch 2 times, most recently from 4a19949 to 5e4633e Compare

March 20, 2025 18:36

aliafzal added a commit to aliafzal/torchrec that referenced this pull request


          Multi forward MCH eviction fix (pytorch#2836)

5e4633e

aliafzal force-pushed the export-D71491003 branch from 5e4633e to a7205ec Compare

March 25, 2025 08:21

aliafzal added a commit to aliafzal/torchrec that referenced this pull request


          Multi forward MCH eviction fix (pytorch#2836)

a7205ec

aliafzal added a commit to aliafzal/torchrec that referenced this pull request


          Multi forward MCH eviction fix (pytorch#2836)

7770bd8

aliafzal force-pushed the export-D71491003 branch from a7205ec to 7770bd8 Compare

March 26, 2025 06:22

aliafzal added a commit to aliafzal/torchrec that referenced this pull request


          Multi forward MCH eviction fix (pytorch#2836)

b591c04

aliafzal force-pushed the export-D71491003 branch from 7770bd8 to b591c04 Compare

March 26, 2025 06:25

aliafzal added a commit to aliafzal/torchrec that referenced this pull request


          Multi forward MCH eviction fix (pytorch#2836)

de77690

aliafzal force-pushed the export-D71491003 branch from b591c04 to de77690 Compare

March 26, 2025 12:42

aliafzal added a commit to aliafzal/torchrec that referenced this pull request


          Multi forward MCH eviction fix (pytorch#2836)

9958f65

Reviewed By: dstaay-fb

aliafzal force-pushed the export-D71491003 branch from de77690 to 9958f65 Compare

March 28, 2025 08:24

aliafzal added a commit to aliafzal/torchrec that referenced this pull request


          Multi forward MCH eviction fix (pytorch#2836)

69c230a

aliafzal force-pushed the export-D71491003 branch from 9958f65 to 69c230a Compare

March 28, 2025 08:25

aliafzal added a commit to aliafzal/torchrec that referenced this pull request


          Multi forward MCH eviction fix (pytorch#2836)

f445f59

aliafzal force-pushed the export-D71491003 branch from 69c230a to f445f59 Compare

March 28, 2025 08:26

aliafzal added a commit to aliafzal/torchrec that referenced this pull request


          Multi forward MCH eviction fix (pytorch#2836)

53a8095

aliafzal force-pushed the export-D71491003 branch from f445f59 to 53a8095 Compare

March 28, 2025 08:27

aliafzal added a commit to aliafzal/torchrec that referenced this pull request


          Multi forward MCH eviction fix (pytorch#2836)

83ad80f

aliafzal force-pushed the export-D71491003 branch from 53a8095 to 83ad80f Compare

March 28, 2025 08:30

aliafzal added a commit to aliafzal/torchrec that referenced this pull request


          Multi forward MCH eviction fix (pytorch#2836)

e1ed145

aliafzal force-pushed the export-D71491003 branch from 83ad80f to e1ed145 Compare

March 28, 2025 08:43

aliafzal added a commit to aliafzal/torchrec that referenced this pull request


          Multi forward MCH eviction fix (pytorch#2836)

ae905c0

aliafzal force-pushed the export-D71491003 branch from e1ed145 to ae905c0 Compare

March 28, 2025 12:49

aliafzal force-pushed the export-D71491003 branch from ae905c0 to d8d47d5 Compare

March 28, 2025 13:55

aliafzal added a commit to aliafzal/torchrec that referenced this pull request


          Multi forward MCH eviction fix (pytorch#2836)

d8d47d5

aliafzal added a commit to aliafzal/torchrec that referenced this pull request


          Multi forward MCH eviction fix (pytorch#2836)

e3a6227

aliafzal force-pushed the export-D71491003 branch from d8d47d5 to e3a6227 Compare

March 28, 2025 13:55

aliafzal added a commit to aliafzal/torchrec that referenced this pull request


          Multi forward MCH eviction fix (pytorch#2836)

c7fc837

aliafzal force-pushed the export-D71491003 branch from e3a6227 to c7fc837 Compare

March 28, 2025 13:59


          Multi forward MCH eviction fix (pytorch#2836)

af11926

aliafzal force-pushed the export-D71491003 branch from c7fc837 to af11926 Compare

March 28, 2025 14:06

facebook-github-bot closed this in

e00bbff

Labels

CLA Signed fb-exported