
Commit e583c2c

Add a stub for host offloading docs (#8656)
1 parent 93a2ba6

File tree: 1 file changed (+18 -2 lines)


examples/host_offloading/README.md

+18 -2
@@ -1,3 +1,19 @@
-This directory will contain a self-contained example for host offloading
-by the time of the 2.6 release.
+## Host offloading example
 
+When doing reverse-mode automatic differentiation, many tensors are saved
+during the forward pass to be used to compute the gradient during the backward pass.
+Previously you could use `torch_xla.utils.checkpoint` to discard tensors that are
+easy to recompute later, a technique called "checkpointing" or "rematerialization".
+Now PyTorch/XLA also supports a technique called "host offloading", i.e. moving
+tensors to the host and moving them back later, adding another tool in the arsenal
+to save memory. Use `torch_xla.experimental.stablehlo_custom_call.place_to_host` to
+move a tensor to the host and `torch_xla.experimental.stablehlo_custom_call.place_to_device`
+to move a tensor back to the device. For example, you can use this to move intermediate
+activations to the host during the forward pass, and move those activations back to
+the device during the corresponding backward pass.
+
+Because the XLA graph compiler aggressively reorders operations, host offloading is
+best used in combination with `scan`.
+
+TODO(yifeit): Clean up the example in https://github.com/tengyifei/playground/blob/master/graph_transforms/offloading.py
+and put that here.
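
The example referenced in the TODO is not part of this commit. Purely as a hedged illustration of the two calls the README names, here is a minimal sketch that offloads saved activations through PyTorch's saved-tensor hooks; the hook-based wiring and the toy model are assumptions of this sketch, not code from the repository.

```python
import torch
import torch_xla.core.xla_model as xm
from torch_xla.experimental.stablehlo_custom_call import (
    place_to_host,
    place_to_device,
)

device = xm.xla_device()

# Pack hook: annotate each tensor autograd saves for backward so the compiler
# places it in host memory. Unpack hook: bring it back when backward needs it.
def pack_to_host(tensor):
    return place_to_host(tensor)

def unpack_to_device(tensor):
    return place_to_device(tensor)

# A toy model purely for illustration (an assumption of this sketch).
model = torch.nn.Sequential(
    torch.nn.Linear(1024, 1024),
    torch.nn.ReLU(),
    torch.nn.Linear(1024, 1024),
).to(device)
x = torch.randn(8, 1024, device=device)

with torch.autograd.graph.saved_tensors_hooks(pack_to_host, unpack_to_device):
    loss = model(x).sum()
loss.backward()
xm.mark_step()  # materialize the traced forward/backward graph
```

As the README notes, the compiler may reorder operations around these annotations, so in practice this pattern is best combined with `scan`; the eventual example linked in the TODO is the authoritative reference.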
