Update deepspeed activation checkpointing docs #17621

Open
@avivbrokman

Description

📚 Documentation

In your documentation, you refer to the function `deepspeed.checkpointing.checkpoint`, but it looks like it no longer exists. Could you update that section?
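
For reference, this is roughly the pattern I understand the current docs to describe: checkpointing a submodule rather than the whole model. This is only a minimal sketch with made-up layers and shapes, and whether `deepspeed.checkpointing.checkpoint` is still the supported entry point is exactly what I'm asking about:

```python
import torch
import deepspeed
import pytorch_lightning as pl


class MyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.block = torch.nn.Sequential(
            torch.nn.Linear(32, 32),
            torch.nn.ReLU(),
            torch.nn.Linear(32, 32),
        )
        self.head = torch.nn.Linear(32, 2)

    def forward(self, x):
        # Checkpoint a submodule, not the whole model: activations of
        # `self.block` are discarded in the forward pass and recomputed
        # during backward to save memory.
        x = deepspeed.checkpointing.checkpoint(self.block, x)
        return self.head(x)
```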

While we're at it, could you provide a more common use case as an example? The guide warns against wrapping an entire model, but fine-tuning a pretrained language model from transformers is probably the most common use case. What if someone is just using GPT-2 or T5 with no additional layers on top? What should get wrapped then? One possibility I can think of is sketched below.
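
For the GPT-2 case, one workaround (an assumption on my part, not something the guide states) is to skip manual wrapping entirely and use the checkpointing built into transformers, which recomputes activations per transformer block:

```python
import torch
import pytorch_lightning as pl
from transformers import GPT2LMHeadModel


class LitGPT2(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.model = GPT2LMHeadModel.from_pretrained("gpt2")
        # Checkpoints each transformer block internally, which sidesteps
        # the question of which submodule to wrap manually.
        self.model.gradient_checkpointing_enable()

    def training_step(self, batch, batch_idx):
        out = self.model(input_ids=batch["input_ids"], labels=batch["input_ids"])
        self.log("train_loss", out.loss)
        return out.loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=5e-5)
```

If something like this is the recommended route when training with the DeepSpeed strategy, it would be great if the docs said so explicitly.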

cc @Borda @awaelchli
